CN111627502A - Single cell data visualization method, system, device and storage medium - Google Patents

Info

Publication number
CN111627502A
Authority
CN
China
Prior art keywords
data
user
module
cell
single cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010442036.0A
Other languages
Chinese (zh)
Inventor
李小平
项鹏
马远尘
刘欢瑶
李伟强
王涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202010442036.0A
Publication of CN111627502A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00: ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides a method, a system, a device and a storage medium for single cell data visualization. The system comprises a user side, a system terminal and a data memory. The system terminal comprises a preprocessing module, a data quality module, a PCA dimension reduction module, a dimension reduction clustering module and a cell population marker gene display module. The preprocessing module verifies the login information of a user, retrieves processed data and provides data selection; the data quality module detects the data quality of each cell; the PCA dimension reduction module performs dimension reduction on the high-dimensional data; the dimension reduction clustering module calculates the cell clusters obtained by different dimension reduction algorithms; and the cell population marker gene display module calculates, from the cell population clustering, a differential gene table and a cell marker gene enrichment table among the cell populations. The system can basically meet users' needs for single cell data analysis, greatly reducing time consumption and improving working efficiency.

Description

Single cell data visualization method, system, device and storage medium
Technical Field
The invention relates to the technical field of networks, in particular to a method, a system, a device and a storage medium for single cell data visualization.
Background
To let bioinformatics researchers observe the functions and types of single cells with higher precision, single cell data are currently analyzed and visualized with R or Python software packages available on the market. However, using these packages requires a solid programming background, and a single package can analyze only one aspect of single-cell data.
For researchers with no programming background, using off-the-shelf packages means spending a lot of time overcoming programming hurdles and selecting a package that suits their needs. Because these researchers have limited time and specific expertise, they end up spending much of that time in areas outside their strengths. Professional bioinformaticians are scarce, so there is still no way to provide individual single cell data visualization for every researcher who needs it. In addition, existing single-cell transcriptome data analysis software cannot perform customized data classification and data display for each researcher.
Disclosure of Invention
To address the defects of the prior art, the invention provides a method, a system, a device and a storage medium for single cell data visualization, which solve the problem that single cell data visualization can only be achieved when users write programs themselves.
The technical scheme of the invention is realized as follows: in a first aspect, a single cell data visualization system comprises a user terminal, a system terminal and a data memory; the system terminal comprises a preprocessing module, a data quality module, a PCA dimension reduction module, a dimension reduction clustering module and a cell group marker gene display module;
the preprocessing module is used for verifying login information of a user, calling processing data and providing data selection;
the data quality module is used for detecting the data quality condition of each cell;
the PCA dimension reduction module is used for performing dimension reduction calculation on the high-dimensional data;
the dimension reduction clustering module is used for calculating cell clustering conditions obtained by different dimension reduction algorithms;
the cell population marker gene display module is used for calculating a difference gene table and a cell marker gene enrichment table among the cell populations according to cell population clustering;
the user side is used for inputting user information and displaying a calculation result of the system terminal;
and the data memory is used for storing the user data and the single cell data.
Preferably, the system terminal further comprises a cell-specific gene module;
the cell-specific gene module comprises a surface protein marker unit and a transcription factor unit;
the surface protein marker unit is used for calculating the proportion of surface protein markers in different cell populations;
and the transcription factor unit is used for calculating the proportion of transcription factors in different cell populations.
Preferably, the user side comprises an information input module and an information display module;
the information input module is used for inputting user login information;
and the information display module is used for displaying the calculation result of the system terminal.
Preferably, the preprocessing module comprises a verification unit, a data display unit and a data set loading unit;
the verification unit is used for extracting the user information in the data storage, comparing the user information with the user login information of the information input module and sending a comparison result to the data display unit;
the data display unit is used for judging whether the processed single cell data exist according to the comparison result and displaying the processed single cell data through the user side;
and the load data set unit is used for selecting the data to be used and the species of origin of the data.
In a second aspect, a method for visualizing single cell data comprises the steps of:
after the user side logs in, the system terminal verifies the user information, and the user side enters the system terminal after the user information passes the verification;
the user side sends the received user operation instruction to the system terminal, and the system terminal skips to a corresponding function according to the user operation instruction;
the system terminal executes calculation according to the corresponding function, and sends the calculated result to the user terminal;
and the user side carries out visual display on the received result.
Preferably, the step in which the system terminal verifies the user information after the user side logs in, and the user side is entered after the verification passes, comprises the following substeps:
after the user logs in, the information input module sends the input user login information to a preprocessing module of the system terminal;
and the preprocessing module executes verification on the received user login information, and enters the user side after the verification is passed.
Preferably, the step in which the preprocessing module verifies the received user login information, and the user side is entered after the verification passes, comprises the following substeps:
the preprocessing module sends the received user login information to the verification unit to execute verification, and sends a verification passing signal to the data display unit and the information display module after the verification is passed;
the data display unit receives the verification passing signal, judges whether the processed single cell data exists or not, and displays the processed single cell data through the user side;
after receiving the verification passing signal, the information display module jumps to the user side for display;
after jumping to the user side, the loading data set unit selects the source of the used data and the species in the data according to the user instruction.
Preferably, the functions include calculating data quality, PCA dimensionality reduction, dimensionality reduction clustering, and cell population marker genes.
In a third aspect, an apparatus for single cell data visualization, comprising:
a memory for storing a program;
a processor for executing a program which causes the processor to perform a method of visualising single cell data according to any one of claims 5 to 8.
In a fourth aspect, a computer-readable storage medium comprising a computer program which, when run on a computer, causes a method of visualizing single cell data as claimed in any one of claims 5 to 8 to be performed.
Compared with the prior art, the invention has the following advantages: traditional single cell data analysis software generally requires a programming background, and its final data is displayed in a single way, so diversified data presentation cannot be achieved. The present system requires no programming background and provides various functional modules; the user can select the functional module that matches the work requirement, select a display mode to present the result after each function finishes executing, and customize that display mode. The system can basically meet users' needs for single cell data analysis; the multiple styles of result display further improve the user experience, time consumption is greatly reduced, and working efficiency is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of a system for visualizing single cell data according to the present invention;
FIG. 2 is a flow chart of a method for visualizing single cell data according to the present invention;
In the figures:
1. system terminal; 11. preprocessing module; 111. verification unit; 112. data display unit; 113. load data set unit; 12. data quality module; 13. PCA dimension reduction module; 14. dimension reduction clustering module; 15. cell population marker gene display module; 2. user side; 21. information input module; 22. information display module; 3. data memory.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
As shown in FIG. 1, this embodiment provides a single cell data visualization system, which includes a user side 2, a system terminal 1 and a data memory 3. The user side 2 exchanges data with the system terminal 1, and the system terminal 1 exchanges data with the data memory 3. The system terminal 1 comprises a preprocessing module 11, a data quality module 12, a PCA dimension reduction module 13, a dimension reduction clustering module 14 and a cell population marker gene display module 15. The user side 2 is used for inputting user information and displaying the calculation results of the system terminal 1; it comprises an information input module 21, used for inputting user login information, and an information display module 22, used for displaying the calculation results of the system terminal 1. The user side 2 thus serves two main functions: it provides a user login interface and temporarily stores the user's login information, and it displays the results of functions executed by the system terminal 1, offering a friendly operating experience and display design. The data memory 3 is used for storing user data and single cell data.
Specifically, the preprocessing module 11 is configured to verify the user's login information, retrieve processed data, and provide data selection. The preprocessing module 11 includes a verification unit 111, a data display unit 112, and a load data set unit 113. The verification unit 111 extracts the user information from the data memory 3, compares it with the user login information from the information input module 21, and sends the comparison result to the data display unit 112; it thereby verifies the user's login information and provides different permission levels for different users. The data display unit 112 determines, according to the comparison result, whether to retrieve single-cell data from the data memory 3, and displays that data through the user side 2: after the user's verification passes, the data display unit 112 checks whether processed single cell data exists in the system and displays it on the user side 2 in real time. The load data set unit 113 is used for selecting the data to be used and the species of origin of the data. Through the load data set unit 113 on the user side 2, the user can select the data to be used and the species of the currently selected data; the single cell data visualization system currently supports gene formats for two species, human and mouse.
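The verification and data-set selection described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `USER_STORE` dictionary (standing in for the data memory 3), the function names, and the salted SHA-256 comparison are all hypothetical. The source only states that login information is compared with stored user information and that human and mouse data are supported.

```python
import hashlib
import hmac

SUPPORTED_SPECIES = {"human", "mouse"}  # the two species named in the text


def hash_password(password: str, salt: str) -> str:
    """Salted SHA-256 digest (an assumed scheme; the source names none)."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


# Hypothetical stand-in for the user records held in data memory 3.
USER_STORE = {
    "alice": {
        "salt": "s1",
        "pw_hash": hash_password("secret", "s1"),
        "datasets": ["demo_dataset"],
    },
}


def verify_login(username: str, password: str) -> bool:
    """Role of verification unit 111: compare entered and stored credentials."""
    record = USER_STORE.get(username)
    if record is None:
        return False
    candidate = hash_password(password, record["salt"])
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate, record["pw_hash"])


def processed_datasets(username: str) -> list:
    """Role of data display unit 112: list already-processed data sets."""
    return list(USER_STORE[username]["datasets"])


def load_dataset(username: str, dataset: str, species: str) -> dict:
    """Role of load data set unit 113: pick a data set and its species."""
    if species not in SUPPORTED_SPECIES:
        raise ValueError(f"unsupported species: {species!r}")
    if dataset not in USER_STORE[username]["datasets"]:
        raise KeyError(dataset)
    return {"dataset": dataset, "species": species}
```

A production system would normally use a dedicated key-derivation function such as `hashlib.pbkdf2_hmac` rather than a single SHA-256 pass; the simpler hash is used here only to keep the sketch short.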
Specifically, the data quality module 12 is configured to detect the data quality of each cell; after the data quality function is selected, the system terminal 1 calls the data quality module 12 to perform the calculation. The data quality module 12 comprises a data quality unit, a differential gene display unit and a cell cycle score unit. Based on the data selected in the load data set unit 113, the data quality unit analyzes and calculates the data quality and generates a violin plot and a scatter plot. The violin plot shows basic data quality measures such as the mitochondrial content in each cell group and the total number of detected genes, and the user can choose to view each quality measure under different groupings. In the scatter plot, the user selects a minimum and a maximum gene count and observes how many points fall within the selected range. The differential gene display unit shows the relative specificity of gene expression in the current sample, so that the user can better judge whether the quality of the current cell data is reasonable. In some samples, cell cycle-related genes account for a considerable proportion of expression and can affect downstream analysis, so the cell cycle score unit calculates a cell cycle score to determine which phase (G1, G2M or S) each cell of the current sample is in.
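The per-cell quality measures named above (number of detected genes, total expression, mitochondrial content) and the minimum/maximum gene-count selection can be illustrated with a short sketch. The function names and the use of an "MT-" gene-name prefix to identify mitochondrial genes are assumptions made for illustration; the source does not say how mitochondrial genes are identified.

```python
def qc_metrics(counts, gene_names, mito_prefix="MT-"):
    """Compute basic quality metrics for each cell.

    counts: list of rows, one per cell, each a list of per-gene counts.
    gene_names: gene symbols, in the same order as the columns of counts.
    mito_prefix: assumed naming convention for mitochondrial genes; the
    comparison is case-insensitive so "mt-" style names also match.
    """
    prefix = mito_prefix.upper()
    mito_idx = [i for i, g in enumerate(gene_names) if g.upper().startswith(prefix)]
    metrics = []
    for row in counts:
        total = sum(row)
        n_genes = sum(1 for c in row if c > 0)
        mito = sum(row[i] for i in mito_idx)
        pct_mito = 100.0 * mito / total if total else 0.0
        metrics.append({"n_genes": n_genes, "total_counts": total, "pct_mito": pct_mito})
    return metrics


def cells_in_range(metrics, min_genes, max_genes):
    """Indices of cells whose detected-gene count lies in [min_genes, max_genes]."""
    return [i for i, m in enumerate(metrics) if min_genes <= m["n_genes"] <= max_genes]


# Toy example: three cells, four genes.
genes = ["MT-CO1", "CD3E", "ACTB", "GAPDH"]
counts = [[5, 0, 10, 5], [0, 2, 0, 0], [1, 1, 1, 1]]
metrics = qc_metrics(counts, genes)
kept = cells_in_range(metrics, min_genes=2, max_genes=3)  # -> [0]
```

The values in `metrics` are what a violin plot would summarize per group, and `cells_in_range` corresponds to counting the scatter-plot points that fall inside the selected range.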
Specifically, the PCA dimension reduction module 13 is configured to perform dimension reduction on the high-dimensional data; it includes a PCA data unit, and a PCA heat map and constituent gene display unit. Based on the data selected in the load data set unit 113, the PCA data unit calculates and presents the scatter plot formed after PCA dimension reduction, together with the differences among the principal components (PCs) used as dimension reduction features. These differences are shown as discrete points, and by observing the change from one point to the next the user can identify the first n PCs with the largest differences. The PCA heat map and constituent gene display unit shows which genes make up each PC used as a dimension reduction reference, and the heat map formed by these genes.
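As a rough illustration of the PCA step, the sketch below computes PC scores, per-PC explained variance ratios (the discrete values a user would inspect to choose the first n PCs), and the genes that contribute most to each PC. It assumes NumPy is available and uses a plain SVD on the centered matrix; the source does not specify which PCA implementation the system uses, and the function names are hypothetical.

```python
import numpy as np


def pca(expr, n_components):
    """PCA of a cells x genes matrix via singular value decomposition.

    Returns (scores, loadings, explained_variance_ratio), each truncated
    to the first n_components principal components.
    """
    expr = np.asarray(expr, dtype=float)
    centered = expr - expr.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    variances = s ** 2 / (expr.shape[0] - 1)
    ratio = variances / variances.sum()
    return scores, vt[:n_components], ratio[:n_components]


def top_genes_per_pc(loadings, gene_names, k):
    """For each PC, the k genes with the largest absolute loading."""
    return [
        [gene_names[i] for i in np.argsort(-np.abs(pc))[:k]]
        for pc in loadings
    ]
```

Plotting the returned ratios against the PC index gives the point-by-point view described above, and the genes returned by `top_genes_per_pc` are the natural rows for a per-PC heat map.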
Specifically, the dimension reduction clustering module 14 is configured to calculate the cell clusters obtained by different dimension reduction algorithms; it comprises a clustering unit, a sub-data dimension reduction clustering unit, a gene plotting unit and a cell proportion unit. Based on the data selected in the load data set unit 113, the clustering unit calculates and presents a dimension reduction clustering scatter plot. The user can configure the plot, including the classification mode, the point size, whether labels are displayed separately, and the label size. The user can also choose to view only part of the data: once the user has chosen which component to use and how to slice the data, the sub-data dimension reduction clustering unit selectively slices and displays the original plot according to the selected cell classification and the text labels within that classification. The gene plotting unit shows the relative expression of genes in a particular cell population. If the cell labels in the user's data sample do not completely overlap with the actual clusters, the cell proportion unit provides the proportion of each known cell type within each clustered cell group. The dimension reduction clustering module 14 can present results as violin plots, scatter plots, peak plots and bubble plots.
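The role of the cell proportion unit, reporting how known cell types are distributed within each computed cluster, amounts to a cross-tabulation. The following is a minimal sketch; the function name and input layout (two parallel lists, one entry per cell) are assumptions for illustration.

```python
from collections import Counter, defaultdict


def cell_type_proportions(cluster_ids, cell_labels):
    """Proportion of each known cell label within each cluster.

    cluster_ids and cell_labels are parallel sequences with one entry per cell.
    Returns {cluster: {label: fraction_of_cluster}}.
    """
    if len(cluster_ids) != len(cell_labels):
        raise ValueError("cluster_ids and cell_labels must have the same length")
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, cell_labels):
        per_cluster[cluster][label] += 1
    return {
        cluster: {label: n / sum(counter.values()) for label, n in counter.items()}
        for cluster, counter in per_cluster.items()
    }


# Toy example: cluster 0 mixes two labels, cluster 1 is pure.
proportions = cell_type_proportions([0, 0, 0, 1, 1], ["T", "T", "B", "B", "B"])
```

A cluster whose proportions are spread over several labels signals that the user's annotations and the computed clustering disagree for those cells.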
Specifically, the cell population marker gene display module 15 is used for calculating, from the cell population clustering, a differential gene table and a cell marker gene enrichment table among cell populations. It comprises a differential gene unit, a cell marker gene enrichment unit and a cell population labeling unit. The differential gene unit calculates the differential gene table and a heat map for examining how the differential genes differ. The cell marker gene enrichment unit performs cell type enrichment calculation to obtain a cell type enrichment table: the marker genes of each group are compared with the cell type markers in the database, the cell type with the greatest similarity is shown as a bubble plot, and the user can read the cell type from the cell marker gene enrichment table. The cell population labeling unit lets the user assign custom names to cell populations according to the expression distribution of specific genes.
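The comparison of a cluster's marker genes against a cell type marker database can be sketched as a set-overlap score. The Jaccard index used here is an assumed similarity measure, since the source only says the cell type with the greatest similarity is selected; the function name and the tiny `toy_db` database are illustrative, not the system's actual data.

```python
def best_matching_cell_type(cluster_markers, marker_db):
    """Score each reference cell type by Jaccard overlap with the cluster markers.

    cluster_markers: iterable of gene symbols found for one cluster.
    marker_db: {cell_type: iterable of reference marker genes}; must be non-empty.
    Returns (best_cell_type, {cell_type: score}).
    """
    genes = set(cluster_markers)
    scores = {}
    for cell_type, reference in marker_db.items():
        ref = set(reference)
        union = genes | ref
        scores[cell_type] = len(genes & ref) / len(union) if union else 0.0
    best = max(scores, key=scores.get)
    return best, scores


# Illustrative toy database, not the system's real marker database.
toy_db = {
    "T cell": ["CD3D", "CD3E", "CD2"],
    "B cell": ["CD79A", "MS4A1"],
    "Monocyte": ["LYZ", "CD14"],
}
best, scores = best_matching_cell_type(["CD3E", "CD3D", "IL7R"], toy_db)
```

The per-type scores are the kind of values that could drive bubble sizes in the enrichment display, with the highest-scoring type proposed as the cluster's label.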
Specifically, the system terminal 1 further includes a cell-specific gene module, which comprises a surface protein marker unit and a transcription factor unit. The surface protein marker unit is used for calculating the proportion of surface protein markers in different cell populations, and the transcription factor unit is used for calculating the proportion of transcription factors in different cell populations.
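One plausible reading of "proportion of a marker in a cell population" is the fraction of cells in each population that express the marker above a threshold. The sketch below implements that reading only; the interpretation, the threshold parameter and the function name are all assumptions, since the source does not define the calculation.

```python
def marker_positive_fraction(expression, cluster_ids, threshold=0.0):
    """Fraction of cells per cluster whose marker expression exceeds threshold.

    expression: per-cell expression values of one marker gene
                (a surface protein gene or a transcription factor).
    cluster_ids: per-cell cluster assignment, parallel to expression.
    """
    if len(expression) != len(cluster_ids):
        raise ValueError("expression and cluster_ids must have the same length")
    totals, positives = {}, {}
    for value, cluster in zip(expression, cluster_ids):
        totals[cluster] = totals.get(cluster, 0) + 1
        if value > threshold:
            positives[cluster] = positives.get(cluster, 0) + 1
    return {cluster: positives.get(cluster, 0) / n for cluster, n in totals.items()}
```

The same function serves both units, since the only difference between them in this reading is which gene list is supplied.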
Example two
As shown in fig. 2, the present embodiment provides a single cell data visualization method based on a single cell data visualization system, including the following steps:
Step S10: After the user side 2 logs in, the system terminal 1 verifies the user information, and the user side 2 is entered after the verification passes. When a user needs to process single cell data, the user must log in to the system terminal 1; the system terminal 1 then performs verification, and the user enters the user side 2 after the verification passes.
Step S101: After the user logs in, the information input module 21 sends the entered user login information to the preprocessing module 11 of the system terminal 1.
Step S102: The preprocessing module 11 verifies the received user login information, and the user enters the user side 2 after the verification passes.
Step S1021: The preprocessing module 11 sends the received user login information to the verification unit 111 for verification, and after the verification passes sends a verification passing signal to the data display unit 112 and the information display module 22.
Step S1022a: After receiving the verification passing signal, the data display unit 112 determines whether processed single cell data exists and displays the processed single cell data through the user side 2.
Step S1022b: After receiving the verification passing signal, the information display module 22 jumps to the user side 2 for display.
Step S1023: After jumping to the user side 2, the load data set unit 113 selects the data to be used and the species of the data according to the user instruction. In other words, when the information input module 21 sends the entered user login information to the preprocessing module 11 of the system terminal 1, the preprocessing module 11 calls the verification unit 111 to compare the user login information with the user information in the data memory 3. Once verification passes, a verification passing signal is sent to the data display unit 112 and the information display module 22. On receiving the signal, the information display module 22 jumps to the user side 2, and the data display unit 112 determines whether processed single cell data exists and displays it through the user side 2. After the jump to the user side 2, the user can select data in the load data set unit 113, which selects the data to be used and the species of the data according to the user instruction.
Step S20: The user side 2 sends the received user operation instruction to the system terminal 1, and the system terminal 1 jumps to the corresponding function according to the instruction. After entering the user side 2, the user can select the function needed in the system to perform the calculation.
Step S30: The system terminal 1 performs the calculation for the corresponding function and sends the result to the user side 2. The common functions the system implements include: visualization of data quality, visualization of PCA dimension reduction, dimension reduction clustering, cell population marker gene display, surface marker statistics, transcription factor statistics and the like. These common functions meet the needs of most bioinformatics personnel.
Step S40: The user side 2 visually displays the received result. For the data quality module 12, the results are presented as a scatter plot, a violin plot, a specific-gene scatter plot and a cell cycle scatter plot. For the PCA dimension reduction module 13, the results are presented as a dimension reduction scatter plot and a PC heat map. For the dimension reduction clustering module 14, the results are presented as two-dimensional dimension reduction clustering plots (t-SNE and UMAP) and cell relative expression plots (scatter plot, violin plot, peak plot, bubble plot, heat map). For the cell population marker gene display module 15, the results are presented as a summary table of differential genes by cell group, a heat map of the most discriminating genes in each cell group, the defined cell types (bubble plot), an enrichment statistics table of specific genes in a specified cell group, and the expression of specific genes in the cell group (scatter plot, violin plot, peak plot, heat map and bubble plot). For the surface protein marker unit, the result is the proportion of surface markers in each cell population (histogram). For the transcription factor unit, the result is the proportion of transcription factors in each cell population.
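Steps S20 to S40 describe a dispatch pattern: an instruction from the user side selects a function on the system terminal, and the result is returned together with the ways it can be displayed. The sketch below shows that pattern only. The instruction keys, handler structure and placeholder results are hypothetical; the display modes are abbreviated from the description of step S40.

```python
from typing import Callable, Dict, List

# Display modes per function, abbreviated from the step S40 description.
DISPLAY_MODES: Dict[str, List[str]] = {
    "data_quality": ["scatter", "violin", "specific-gene scatter", "cell cycle scatter"],
    "pca": ["dimension reduction scatter", "PC heat map"],
    "clustering": ["t-SNE", "UMAP", "scatter", "violin", "peak", "bubble", "heat map"],
    "marker_genes": ["differential gene table", "heat map", "bubble", "enrichment table"],
}


def _make_handler(name: str) -> Callable[[str], dict]:
    def handler(dataset: str) -> dict:
        # Placeholder: a real handler would run the module's calculation here.
        return {"function": name, "dataset": dataset, "display_modes": DISPLAY_MODES[name]}
    return handler


# Hypothetical instruction-to-function table held by the system terminal.
FUNCTIONS: Dict[str, Callable[[str], dict]] = {
    name: _make_handler(name) for name in DISPLAY_MODES
}


def handle_instruction(instruction: str, dataset: str) -> dict:
    """Jump to the function named by the instruction and return its result."""
    handler = FUNCTIONS.get(instruction)
    if handler is None:
        raise ValueError(f"unknown function: {instruction!r}")
    return handler(dataset)
```

Keeping the instruction table separate from the handlers means a new functional module can be added by registering one more entry, without touching the dispatch logic.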
The operations also involve parameter control, cell cycle phase selection and the like. Specifically, the main cell quality control parameters are the number of detected genes, the mitochondrial content, and the total expression level of all genes in the cell; the cell cycle phases are G1, G2M and S; and the statistical parameters include the P value, the log2 fold change, the relative expression values in the current population and in the other populations, and the adjusted P value.
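Two of the statistical parameters listed above can be made concrete with a short sketch: the log2 fold change between the current population and the other populations, and an adjusted P value. The Benjamini-Hochberg procedure and the pseudocount of 1 are assumptions chosen for illustration; the source does not state which adjustment method or fold-change convention the system uses.

```python
import math


def log2_fold_change(mean_in, mean_out, pseudocount=1.0):
    """log2 ratio of mean expression inside vs. outside the population.

    The pseudocount (an assumed convention) avoids division by zero.
    """
    return math.log2((mean_in + pseudocount) / (mean_out + pseudocount))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, `log2_fold_change(3.0, 1.0)` evaluates to 1.0 with the default pseudocount, meaning the gene is roughly twice as abundant in the current population after smoothing.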
In summary, traditional single cell data analysis software generally requires a programming background, and its final data is displayed in a single way, so diversified data presentation cannot be achieved. The present system requires no programming background and provides various functional modules; the user can select the functional module that matches the work requirement, select a display mode to present the result after each function finishes executing, and customize that display mode. The system can basically meet users' needs for single cell data analysis; the multiple styles of result display further improve the user experience, time consumption is greatly reduced, and working efficiency is improved.
According to an embodiment of the present invention, there is also provided a storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps according to the various exemplary embodiments of the invention described in the above-mentioned "exemplary methods" section of the present description. The program product may employ a portable compact disc read only memory (CD-ROM), include program code, and be run on a terminal device such as a personal computer. However, the program product of the present invention is not limited in this regard; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Similarly, according to an embodiment of the present invention, there is also provided a processor on which a computer program is stored, which is executable by the processor, and in some possible implementations, various aspects of the present invention may also be implemented in the form of a program product including program code for causing a terminal device to perform the steps according to various exemplary implementations of the present invention described in the above section "exemplary method" of the present specification, when the program product is run on the terminal device.
The following disclosure provides many different embodiments or examples for implementing different configurations of embodiments of the invention. In order to simplify the disclosure of embodiments of the invention, the components and arrangements of specific examples are described below. Of course, they are merely examples and are not intended to limit the present invention. Furthermore, embodiments of the invention may repeat reference numerals and/or reference letters in the various examples, which have been repeated for purposes of simplicity and clarity and do not in themselves dictate a relationship between the various embodiments and/or arrangements discussed. In addition, embodiments of the present invention provide examples of various specific processes and materials, but one of ordinary skill in the art may recognize applications of other processes and/or use of other materials.
In the description of the embodiments of the present invention, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like indicate orientations or positional relationships based on those shown in the drawings. They are used only for convenience in describing the embodiments of the present invention and simplifying the description; they do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation, and thus should not be construed as limiting the embodiments of the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features. In the description of the embodiments of the present invention, "a plurality" means two or more unless specifically limited otherwise.
In the description of the embodiments of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical, electrical, or communicative; and it may be direct, indirect through an intervening medium, internal to two elements, or any other relationship between them. The specific meanings of the above terms in the embodiments of the present invention can be understood by those of ordinary skill in the art according to the specific situation.
In embodiments of the invention, unless expressly stated or limited otherwise, a first feature being "on" or "under" a second feature may include the first and second features being in direct contact, or the first and second features not being in direct contact but being in contact via another feature between them. Also, a first feature being "on," "above," or "over" a second feature includes the first feature being directly above or obliquely above the second feature, or merely indicates that the first feature is at a higher level than the second feature. A first feature being "under," "below," or "beneath" a second feature includes the first feature being directly below or obliquely below the second feature, or merely indicates that the first feature is at a lower level than the second feature.
In the description of the present specification, reference to the terms "one embodiment", "some embodiments", "an illustrative embodiment", "an example", "a specific example" or "some examples" or the like means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
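As a hedged illustration of steps that may run in a different order or substantially concurrently, the sketch below computes two independent per-cell summaries both sequentially and through a thread pool and obtains the same result. The count matrix, the function names, and the choice of Python's `concurrent.futures` are assumptions made for this example and are not taken from the specification.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up count matrix: rows are cells, columns are genes.
counts = [[5, 0, 3], [0, 6, 1], [2, 2, 0]]


def totals_per_cell(matrix):
    """Sum of counts in each cell."""
    return [sum(row) for row in matrix]


def genes_detected_per_cell(matrix):
    """Number of genes with a nonzero count in each cell."""
    return [sum(value > 0 for value in row) for row in matrix]


def run_sequential(matrix):
    # The order "as shown": totals first, then detected genes.
    return totals_per_cell(matrix), genes_detected_per_cell(matrix)


def run_concurrent(matrix):
    # The two steps do not depend on each other, so they can be
    # submitted in reverse order and executed concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        detected = pool.submit(genes_detected_per_cell, matrix)
        totals = pool.submit(totals_per_cell, matrix)
        return totals.result(), detected.result()


if __name__ == "__main__":
    print(run_sequential(counts) == run_concurrent(counts))
```

Reordering is safe here only because neither function reads the other's output; steps with a data dependency would still have to run in sequence.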
The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processing module-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of embodiments of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A single cell data visualization system is characterized by comprising a user side, a system terminal and a data memory; the system terminal comprises a preprocessing module, a data quality module, a PCA dimension reduction module, a dimension reduction clustering module and a cell group marker gene display module;
the preprocessing module is used for verifying login information of a user, retrieving processed data and providing data selection;
the data quality module is used for detecting the data quality condition of each cell;
the PCA dimension reduction module is used for performing dimension reduction calculation on the high-dimensional data;
the dimension reduction clustering module is used for calculating cell clustering conditions obtained by different dimension reduction algorithms;
the cell population marker gene display module is used for calculating a differential gene table and a cell marker gene enrichment table among cell populations according to cell population clustering;
the user side is used for inputting user information and displaying a calculation result of the system terminal;
and the data memory is used for storing the user data and the single cell data.
2. A single cell data visualization system as recited in claim 1 wherein said system terminal further comprises a cell specific gene module;
the cell-specific gene module comprises a surface protein marker unit and a transcription factor unit;
the surface protein marker unit is used for calculating the proportion of the surface protein marker in different cell groups;
the transcription factor unit is used for calculating the proportion of the transcription factor in different cell groups.
3. The single cell data visualization system of claim 1, wherein said user side comprises an information entry module and an information presentation module;
the information input module is used for inputting user login information;
and the information display module is used for displaying the calculation result of the system terminal.
4. A single cell data visualization system as claimed in claim 3 wherein said preprocessing module comprises a validation unit, a data display unit and a load data set unit;
the verification unit is used for extracting the user information in the data storage, comparing the user information with the user login information of the information input module and sending a comparison result to the data display unit;
the data display unit is used for judging whether the processed single cell data exist according to the comparison result and displaying the processed single cell data through the user side;
the loading data set unit is used for selecting the source of the use data and the species in the data.
5. A single cell data visualization method is characterized by comprising the following steps:
after the user side logs in, the system terminal verifies the user information, and the user side enters the system terminal after the user information passes the verification;
the user side sends the received user operation instruction to a system terminal, and the system terminal skips to a corresponding function according to the user operation instruction;
the system terminal executes calculation according to the corresponding function, and sends a calculated result to the user side;
and the user side displays the received result in a visual way.
6. The method for visualizing the single cell data as claimed in claim 5, wherein the system terminal verifies the user information after the user logs in, and the step of entering the user terminal after the verification is passed comprises the following sub-steps:
after the user logs in, the information input module sends the input user login information to a preprocessing module of the system terminal;
and the preprocessing module executes verification on the received user login information, and the user enters the user side after the verification is passed.
7. The method for visualizing the single cell data as claimed in claim 6, wherein the preprocessing module performs verification on the received user login information, and the step of entering the user terminal after the verification is passed comprises the following substeps:
the preprocessing module sends the received user login information to a verification unit to execute verification, and sends a verification passing signal to the data display unit and the information display module after the verification is passed;
the data display unit receives the verification passing signal, judges whether the processed single cell data exists or not, and displays the processed single cell data through the user side;
after receiving the verification passing signal, the information display module jumps to the user side for display;
and after jumping to the user side, the data set loading unit selects the source of the used data and the species in the data according to the user instruction.
8. A method for visualizing single cell data as claimed in claim 5 wherein said functions comprise calculating data quality, PCA dimension reduction, dimension reduction clustering and cell population marker genes.
9. An apparatus for single cell data visualization, comprising:
a memory for storing a program;
a processor for executing the program, the program causing the processor to perform a method of visualizing the single cell data of any of claims 5 to 8.
10. A computer-readable storage medium comprising a computer program which, when run on a computer, causes a method of single cell data visualization as claimed in any one of claims 5 to 8 to be performed.
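
The flow recited in claims 5 to 8 (verify the login, jump to the requested function, compute, return the result for display) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the in-memory `DATA_MEMORY`, the function names, the toy count matrix, the plain-text password comparison, and the power-iteration PCA are all hypothetical choices made for the example.

```python
import math

# Assumed stand-in for the data memory: user records plus a tiny
# cell x gene count matrix (rows = cells, columns = genes).
# Plain-text passwords are for illustration only.
DATA_MEMORY = {
    "users": {"alice": "secret"},
    "counts": [
        [5, 0, 3, 0],
        [4, 1, 2, 0],
        [0, 6, 0, 7],
        [1, 5, 0, 6],
    ],
}


def verify_user(name, password):
    """Compare the entered login information with the stored user data."""
    return DATA_MEMORY["users"].get(name) == password


def data_quality(counts):
    """Per-cell quality summary: total counts and number of detected genes."""
    return [
        {"total_counts": sum(row), "genes_detected": sum(v > 0 for v in row)}
        for row in counts
    ]


def pca_first_component(counts, iterations=200):
    """Score each cell on the first principal component (power iteration)."""
    n_cells, n_genes = len(counts), len(counts[0])
    means = [sum(row[j] for row in counts) / n_cells for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in counts]
    # Gene-by-gene sample covariance matrix.
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n_cells - 1) for b in range(n_genes)]
        for a in range(n_genes)
    ]
    # Start from the first gene axis; this assumes the leading component
    # has a nonzero loading on that gene.
    vec = [1.0] + [0.0] * (n_genes - 1)
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(n_genes)) for a in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    return [sum(c[j] * vec[j] for j in range(n_genes)) for c in centered]


# Hypothetical mapping from a user operation instruction to a function.
FUNCTIONS = {"data_quality": data_quality, "pca": pca_first_component}


def handle_request(name, password, instruction):
    """System-terminal side: verify, jump to the function, return the result."""
    if not verify_user(name, password):
        return {"ok": False, "error": "verification failed"}
    func = FUNCTIONS.get(instruction)
    if func is None:
        return {"ok": False, "error": "unknown function"}
    return {"ok": True, "result": func(DATA_MEMORY["counts"])}


if __name__ == "__main__":
    print(handle_request("alice", "secret", "data_quality"))
    print(handle_request("alice", "secret", "pca"))
```

In this sketch the user side would only collect the login information and render the returned dictionary; clustering and marker-gene calculation could be registered in `FUNCTIONS` in the same way.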

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010442036.0A CN111627502A (en) 2020-05-22 2020-05-22 Single cell data visualization method, system, device and storage medium

Publications (1)

Publication Number Publication Date
CN111627502A true CN111627502A (en) 2020-09-04

Family

ID=72272389

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010442036.0A Pending CN111627502A (en) 2020-05-22 2020-05-22 Single cell data visualization method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN111627502A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109979538A (en) * 2019-03-28 2019-07-05 广州基迪奥生物科技有限公司 A kind of analysis method based on the unicellular transcript profile sequencing data of 10X
CN110149807A (en) * 2015-11-10 2019-08-20 细胞结构公司 Platform for visualization of synthetic genomic, microbiome, and metabolome data

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115083522A (en) * 2022-08-18 2022-09-20 天津诺禾致源生物信息科技有限公司 Method and device for predicting cell types and server
CN115083522B (en) * 2022-08-18 2022-10-28 天津诺禾致源生物信息科技有限公司 Method and device for predicting cell types and server
CN117079726A (en) * 2023-10-16 2023-11-17 浙江大学长三角智慧绿洲创新中心 Database visualization method based on single cells and related equipment
CN117079726B (en) * 2023-10-16 2024-01-30 浙江大学长三角智慧绿洲创新中心 Database visualization method based on single cells and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200904