CN106202936A - Disease risk prediction method and system - Google Patents

Disease risk prediction method and system

Info

Publication number
CN106202936A
Authority
CN
China
Prior art keywords
information
gene
disease
variation
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610552427.1A
Other languages
Chinese (zh)
Inventor
全雪萍
郝占平
吕彬彬
任永永
赵文涛
吴电云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shuo Medical Data Technology (beijing) Co Ltd
Original Assignee
Shuo Medical Data Technology (beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shuo Medical Data Technology (beijing) Co Ltd
Priority to CN201610552427.1A
Publication of CN106202936A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Public Health (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Operations Research (AREA)
  • Evolutionary Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Marketing (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Development Economics (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention relates to a disease risk prediction method and system. The method comprises: S1, receiving input gene sequencing result information and analyzing it to obtain all variant gene information; S2, looking up, in a precision medicine knowledge base, the disease gene variant information for the gene loci corresponding to the variant gene information, obtaining at least one item of disease gene variant information; S3, matching each item of variant gene information against all of the disease gene variant information to obtain the similarity between each item of variant gene information and every item of disease gene variant information; S4, obtaining a risk prediction table for the gene sequencing result information according to those similarities. By analyzing the associations among tumor susceptibility genes, variant sites, variation types and diseases, the invention predicts disease risk for healthy individuals, provides an overall risk level, and helps reduce future disease risk.

Description

Disease risk prediction method and system
Technical field
The present invention relates to a disease risk prediction method and system, and belongs to the technical field of bioinformatics.
Background
According to the World Cancer Report 2014, China had 3.065 million new cancer cases in 2012, about one fifth of the global total, and 2.205 million cancer deaths, about one quarter of the global total. Cancer is in essence a genetic disease, and early detection is critical to survival after treatment. Colorectal cancer, for example, is one of the most common malignant tumors in China and the fourth leading cause of cancer death there; the average lifetime risk of colorectal cancer in the Chinese population is 6%. Of these cases, 30% are attributable to genetic factors. People with a first-degree relative (sibling, parent or child) who has colorectal cancer have twice the average population risk. If those relatives developed the disease early (before age 60), or if several immediate relatives have been diagnosed with colorectal cancer, the risk is higher still. A small share of colorectal cancers (5%) occur in families with colon-related diseases such as colonic polyposis (in which the surface of the colon is covered with numerous polyps). By detecting the genetic information of a healthy person through gene sequencing, that person's tumor risk can be predicted from the sequencing results.
With the rapid development of high-throughput sequencing technology and of big data analysis tools, it has become possible to run genetic tests on healthy individuals, analyze their individual gene variations, clinically annotate their genetic features using an evidence-based precision medicine knowledge base that integrates biomedical literature data, public biomedical omics databases and FDA, CFDA and NCCN guidelines, and develop corresponding algorithms to predict tumor risk. How to integrate and use these omics resources has become a hot topic closely tied to personal health and precision medicine. Developed countries in Europe and America began research on processing and using big data for precision medicine at the beginning of this century; they have built databases with different functions and emphases and have formed a preliminary standardized system for precision medicine. China is at a world-leading level in gene sequencing technology, but how to effectively use and collect omics data for malignant tumor risk prediction is currently a top priority in China's precision medicine field.
Summary of the invention
The technical problem to be solved by the present invention is that the prior art does not use and collect omics data to build a large database system and a comprehensive data analysis system based on malignant tumor risk prediction. To address this, the invention provides a disease risk prediction method and system.
The technical solution of the present invention is a disease risk prediction method comprising the following steps:
S1, receiving input gene sequencing result information and analyzing it to obtain all variant gene information;
S2, looking up, in a precision medicine knowledge base, the disease gene variant information for the gene loci corresponding to the variant gene information, obtaining at least one item of disease gene variant information;
S3, matching each item of variant gene information against all of the disease gene variant information, obtaining the similarity between each item of variant gene information and every item of disease gene variant information;
S4, obtaining a risk prediction table for the gene sequencing result information according to the similarity between each item of variant gene information and the disease gene variant information.
The beneficial effects of the invention are as follows. Using a multi-dimensional data algorithm based on tumor driver genes, variation types and diseases, the invention lets healthy people understand the disease gene variant information that corresponds to their own genetic information. By analyzing the associations among susceptibility genes, variant sites, variation types and diseases, it predicts disease risk for healthy individuals, gives an overall disease risk level, and provides the client with prevention and health care knowledge, reducing future disease risk.
On the basis of the above technical solution, the present invention can be further improved as follows.
Further, the precision medicine knowledge base stores multiple items of disease name information and disease gene variant information;
the disease name information and the disease gene variant information are in one-to-one correspondence.
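As an illustration of the knowledge base and the lookup in step S2, the following minimal sketch stores one record per disease name and variant pair and indexes the records by gene locus. The record fields, the gene names `GENE_A` and `GENE_B`, and the positions are hypothetical placeholders and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiseaseVariant:
    """One knowledge-base record: a disease name paired with one variant."""
    disease: str   # disease name information
    gene: str      # hypothetical gene symbol
    chrom: str     # chromosome of the gene locus
    pos: int       # position of the gene locus (placeholder values below)
    ref: str       # reference allele
    alt: str       # variant allele


def build_locus_index(records):
    """Group knowledge-base records by gene locus (chrom, pos)."""
    index = {}
    for rec in records:
        index.setdefault((rec.chrom, rec.pos), []).append(rec)
    return index


def lookup(index, chrom, pos):
    """S2: return every disease variant record stored for the given locus."""
    return index.get((chrom, pos), [])


# Toy knowledge base with made-up loci, for illustration only.
KB = [
    DiseaseVariant("colorectal cancer", "GENE_A", "5", 1000, "C", "T"),
    DiseaseVariant("breast cancer", "GENE_B", "17", 2000, "G", "A"),
]
index = build_locus_index(KB)
hits = lookup(index, "5", 1000)
```

Indexing by locus keeps the lookup independent of the number of diseases stored; a real knowledge base would presumably sit behind a database rather than an in-memory dictionary.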
Further, S3 obtains the similarity between each item of variant gene information and all of the corresponding disease gene variant information, and builds a separate similarity comparison table for each item of variant gene information.
The benefit of this further scheme is that, by building a separate similarity comparison table for each item of variant gene information, a risk assessment across different diseases can be obtained for the same variant gene information, which provides guidance for subsequent treatment.
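The patent does not define how the similarity is computed. The sketch below assumes, purely for illustration, that similarity is the fraction of matching fields between a detected variant and a knowledge-base entry; the field names and sample values are hypothetical.

```python
# Assumed comparison fields; the patent does not specify them.
FIELDS = ("gene", "pos", "ref", "alt", "var_type")


def similarity(variant, disease_variant):
    """Fraction of FIELDS on which the two records agree (an assumption)."""
    matches = sum(variant[f] == disease_variant[f] for f in FIELDS)
    return matches / len(FIELDS)


def similarity_table(variant, disease_variants):
    """S3: build the similarity comparison table for one detected variant."""
    return [
        {"disease": d["disease"], "similarity": similarity(variant, d)}
        for d in disease_variants
    ]


detected = {"gene": "GENE_A", "pos": 1000, "ref": "C", "alt": "T", "var_type": "SNV"}
candidates = [
    {"disease": "colorectal cancer", "gene": "GENE_A", "pos": 1000,
     "ref": "C", "alt": "T", "var_type": "SNV"},
    {"disease": "gastric cancer", "gene": "GENE_A", "pos": 1000,
     "ref": "C", "alt": "G", "var_type": "SNV"},
]
table = similarity_table(detected, candidates)
```

Any other scoring rule could be substituted for `similarity` without changing the shape of the table.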
Further, S4 merges all of the similarity comparison tables obtained in S3 and sorts them by similarity in descending order to generate the risk prediction table, which shows the disease risk indicated by the gene sequencing result information.
The benefit of this further scheme is that the combined ranking lets ordinary users without medical knowledge see their disease risk at a glance from the size of the risk, without having to understand details such as the specific condition of each disease.
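The merge-and-sort step can be sketched as follows. The table layout (a dictionary from a variant label to a list of rows with `disease` and `similarity` keys) is an assumption made for this example.

```python
def risk_prediction_table(tables):
    """S4: merge per-variant similarity tables and sort by similarity, descending.

    `tables` maps a variant label to its list of {"disease", "similarity"} rows.
    """
    rows = [
        dict(row, variant=name)
        for name, table in tables.items()
        for row in table
    ]
    # sorted() is stable, so rows with equal similarity keep their input order.
    return sorted(rows, key=lambda r: r["similarity"], reverse=True)


tables = {
    "variant_1": [
        {"disease": "colorectal cancer", "similarity": 0.8},
        {"disease": "gastric cancer", "similarity": 0.4},
    ],
    "variant_2": [
        {"disease": "breast cancer", "similarity": 1.0},
    ],
}
report = risk_prediction_table(tables)
ranked_diseases = [row["disease"] for row in report]
```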
Further, the analysis of the gene sequencing result information in S1 specifically comprises the following steps:
selecting an analysis workflow according to the format and size of the gene sequencing result information;
using the selected analysis workflow to align all or part of the gene sequencing result information to a reference genome, obtaining the variant gene information.
The benefit of this further scheme is that genetic test results from different gene sequencing systems differ in information capacity; selecting a different analysis workflow according to that capacity extracts as much useful information as possible while preserving analysis accuracy.
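A minimal sketch of selecting a workflow from file format and size is shown below. The workflow labels and the size thresholds are hypothetical placeholders; the patent gives no concrete values.

```python
GB = 1024 ** 3


def select_workflow(filename, size_bytes, exome_min=5 * GB, genome_min=50 * GB):
    """Pick (platform workflow, analysis scope) from file format and size.

    The thresholds are illustrative assumptions, not values from the patent.
    """
    name = filename.lower()
    if name.endswith((".fastq", ".fq", ".fastq.gz", ".fq.gz")):
        platform = "illumina_fastq"
    elif name.endswith(".bam"):
        platform = "ion_torrent_bam"
    else:
        raise ValueError(f"unsupported sequencing file format: {filename}")

    if size_bytes >= genome_min:
        scope = "whole_genome"
    elif size_bytes >= exome_min:
        scope = "whole_exome"
    else:
        scope = "targeted_capture"
    return platform, scope


choice = select_workflow("sample.fastq.gz", 1 * GB)
```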
Further, the method of the present invention is also used to detect germline mutations in the human body, and the analysis workflow is compatible with targeted capture sequencing data, whole-exome sequencing data and whole-genome sequencing data. The analysis workflow is characterized in that it detects germline mutations in the human body, and its input data format is a fastq file from the Illumina platform or a bam file from the Ion Torrent platform.
Analysis workflow for fastq files from the Illumina platform: low-quality bases are removed; a sliding-window algorithm removes sequencing reads that contain many low-quality bases; adapter contamination is removed; the workflow then enters the alignment stage, where the sequencing reads are aligned to the human reference genome and poorly aligned sequences are filtered out, yielding a bam file; variant sites are then extracted to obtain gene variant information, including single nucleotide variations (SNVs) and insertions and deletions (Indels); for whole-exome and whole-genome sequencing data this also includes structural variations such as copy number variations (CNVs) and gene translocations. The result is a VCF file, which is passed to the precision medicine knowledge base, where automated clinical significance annotation is performed according to a defined search logic and a risk prediction report is generated.
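The sliding-window quality step can be illustrated with the sketch below. It assumes Phred+33 quality encoding, a window of 4 bases and a mean-quality threshold of 20, and it drops a read whose trimmed length falls below `min_len`; all of these parameters are assumptions, since the patent does not state them.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(c) - offset for c in quality_string]


def sliding_window_filter(seq, qual, window=4, min_mean_q=20, min_len=5):
    """Cut the read at the first window whose mean quality is too low.

    Returns (seq, qual) after trimming, or None if the read becomes too short.
    """
    scores = phred_scores(qual)
    cut = len(scores)
    for start in range(0, len(scores) - window + 1):
        if sum(scores[start:start + window]) / window < min_mean_q:
            cut = start
            break
    if cut < min_len:
        return None  # read is discarded
    return seq[:cut], qual[:cut]


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
kept = sliding_window_filter("ACGTACGTAC", "IIIIII####")
dropped = sliding_window_filter("ACGTACGTAC", "##########")
```

Cutting at the start of the first failing window is one common convention; other trimmers scan from the 3' end or trim both ends.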
Analysis workflow for bam files from the Ion Torrent platform: the bam file is first converted back to a fastq file for quality control; the workflow then proceeds to alignment and variant calling to obtain gene variant information, including single nucleotide variations (SNVs), insertions and deletions (Indels) and, for whole-exome sequencing, structural variations such as copy number variations (CNVs) and gene translocations. The result is a VCF file, which is passed to the precision medicine knowledge base, where automated clinical significance annotation is performed according to a defined search logic and a risk prediction report is generated.
The technical solution of the present invention also includes a disease risk prediction system, comprising:
an analysis module, which receives input gene sequencing result information and analyzes it to obtain all variant gene information;
a lookup module, which looks up, in a precision medicine knowledge base, the disease gene variant information for the gene loci corresponding to the variant gene information, obtaining at least one item of disease gene variant information;
a matching module, which matches each item of variant gene information against all of the disease gene variant information, obtaining the similarity between each item of variant gene information and every item of disease gene variant information;
a risk prediction module, which obtains a risk prediction table for the gene sequencing result information according to the similarity between each item of variant gene information and the disease gene variant information.
On the basis of the above technical solution, the present invention can be further improved as follows.
Further, the precision medicine knowledge base stores multiple items of disease name information and disease gene variant information;
the disease name information and the disease gene variant information are in one-to-one correspondence.
Further, the matching module obtains the similarity between each item of variant gene information and all of the corresponding disease gene variant information, and builds a separate similarity comparison table for each item of variant gene information.
The benefit of this further scheme is that, by building a separate similarity comparison table for each item of variant gene information, a risk assessment across different diseases can be obtained for the same variant gene information, which provides guidance for subsequent prevention.
Further, the risk prediction module merges all of the similarity comparison tables obtained by the matching module and sorts them by similarity in descending order to generate the risk prediction table, which shows the disease risk indicated by the gene sequencing result information.
The benefit of this further scheme is that the combined ranking lets ordinary users without medical knowledge see their disease risk at a glance from the size of the risk, without having to understand details such as the specific condition of each disease.
Further, the analysis of the gene sequencing result information by the analysis module specifically comprises the following steps:
selecting an analysis workflow according to the format and size of the gene sequencing result information;
using the selected analysis workflow to align all or part of the gene sequencing result information to a reference genome, obtaining the variant gene information.
The benefit of this further scheme is that genetic test results from different gene sequencing systems differ in format and size; selecting a different analysis workflow according to format and size improves analysis efficiency and avoids the serious slowdown caused by applying whole-genome analysis regardless of the amount of information.
Further, the system of the present invention is also used to detect germline mutations in the human body, and the analysis workflow is compatible with targeted capture sequencing data, whole-exome sequencing data and whole-genome sequencing data. The analysis workflow is characterized in that it detects germline mutations in the human body, and its input data format is a fastq file from the Illumina platform or a bam file from the Ion Torrent platform.
Analysis workflow for fastq files from the Illumina platform: low-quality bases are removed; a sliding-window algorithm removes sequencing reads that contain many low-quality bases; adapter contamination is removed; the workflow then enters the alignment stage, where the sequencing reads are aligned to the human reference genome and poorly aligned sequences are filtered out, yielding a bam file; variant sites are then extracted to obtain gene variant information, including single nucleotide variations (SNVs) and insertions and deletions (Indels); for whole-exome and whole-genome sequencing data this also includes structural variations such as copy number variations (CNVs) and gene translocations. The result is a VCF file, which is passed to the precision medicine knowledge base, where automated clinical significance annotation is performed according to a defined search logic and a risk prediction report is generated.
Analysis workflow for bam files from the Ion Torrent platform: the bam file is first converted back to a fastq file for quality control; the workflow then proceeds to alignment and variant calling to obtain gene variant information, including single nucleotide variations (SNVs), insertions and deletions (Indels) and, for whole-exome sequencing, structural variations such as copy number variations (CNVs) and gene translocations. The result is a VCF file, which is passed to the precision medicine knowledge base, where automated clinical significance annotation is performed according to a defined search logic and a risk prediction report is generated.
Brief description of the drawings
Fig. 1 is a flow chart of the disease risk prediction method described in Embodiment 1 of the present invention;
Fig. 2 is a structural diagram of the disease risk prediction system described in Embodiment 2 of the present invention.
In the drawings, the components denoted by the reference numerals are as follows:
1, analysis module; 2, lookup module; 3, matching module; 4, risk prediction module.
Detailed description
The principles and features of the present invention are described below with reference to the drawings. The examples serve only to explain the invention and are not intended to limit its scope.
As shown in Fig. 1, the disease risk prediction method described in Embodiment 1 of the present invention comprises the following steps:
S1, receiving input gene sequencing result information and analyzing it to obtain all variant gene information;
S2, looking up, in a precision medicine knowledge base, the disease gene variant information for the gene loci corresponding to the variant gene information, obtaining at least one item of disease gene variant information;
S3, matching each item of variant gene information against all of the disease gene variant information, obtaining the similarity between each item of variant gene information and every item of disease gene variant information;
S4, obtaining a risk prediction table for the gene sequencing result information according to the similarity between each item of variant gene information and the disease gene variant information.
The precision medicine knowledge base stores multiple items of disease name information and disease gene variant information; the disease name information and the disease gene variant information are in one-to-one correspondence.
S3 obtains the similarity between each item of variant gene information and all of the corresponding disease gene variant information, and builds a separate similarity comparison table for each item of variant gene information.
S4 merges all of the similarity comparison tables obtained in S3 and sorts them by similarity in descending order to generate the risk prediction table, which shows the disease risk indicated by the gene sequencing result information.
The analysis of the gene sequencing result information in step S1 specifically comprises the following steps:
selecting an analysis workflow according to the format and size of the gene sequencing result information; using the selected analysis workflow to align all or part of the gene sequencing result information to a reference genome, obtaining the variant gene information.
The method of the present invention is also used to detect germline mutations in the human body, and the analysis workflow is compatible with targeted capture sequencing data, whole-exome sequencing data and whole-genome sequencing data. The input data format of the analysis workflow is a fastq file from the Illumina platform or a bam file from the Ion Torrent platform.
Analysis workflow for fastq files from the Illumina platform: low-quality bases are removed; a sliding-window algorithm removes sequencing reads that contain many low-quality bases; adapter contamination is removed; the workflow then enters the alignment stage, where the sequencing reads are aligned to the human reference genome and poorly aligned sequences are filtered out, yielding a bam file; variant sites are then extracted to obtain gene variant information, including single nucleotide variations (SNVs) and insertions and deletions (Indels); for whole-exome and whole-genome sequencing data this also includes structural variations such as copy number variations (CNVs) and gene translocations. The result is a VCF file, which is passed to the precision medicine knowledge base, where automated clinical significance annotation is performed according to a defined search logic and a risk prediction report is generated.
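To illustrate how the gene variant information in the resulting VCF file might be read and sorted into SNVs, insertions and deletions, the sketch below parses the first five tab-separated columns of a VCF data line (CHROM, POS, ID, REF, ALT) and classifies each alternate allele by allele length. The classification rule and the sample line are simplifying assumptions, not the patent's variant-calling logic.

```python
def classify(ref, alt):
    """Classify one REF/ALT pair by allele length (a simplification)."""
    if alt.startswith("<"):
        return "Structural"          # symbolic allele such as <DEL> or <DUP>
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(alt) > len(ref) and alt.startswith(ref):
        return "Insertion"
    if len(ref) > len(alt) and ref.startswith(alt):
        return "Deletion"
    return "Indel"                   # mixed insertion and deletion


def parse_vcf_line(line):
    """Return one record per alternate allele on a VCF data line."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return [
        {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": a,
         "type": classify(ref, a)}
        for a in alt.split(",")
    ]


# Made-up data line with two alternate alleles.
records = parse_vcf_line("5\t1000\t.\tC\tT,CA\t50\tPASS\t.")
types = [r["type"] for r in records]
```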
Analysis workflow for bam files from the Ion Torrent platform: the bam file is first converted back to a fastq file for quality control; the workflow then proceeds to alignment and variant calling to obtain gene variant information, including single nucleotide variations (SNVs), insertions and deletions (Indels) and, for whole-exome sequencing, structural variations such as copy number variations (CNVs) and gene translocations. The result is a VCF file, which is passed to the precision medicine knowledge base, where automated clinical significance annotation is performed according to a defined search logic and a risk prediction report is generated.
As shown in Fig. 2, the disease risk prediction system described in Embodiment 2 of the present invention comprises:
an analysis module 1, which receives input gene sequencing result information and analyzes it to obtain all variant gene information;
a lookup module 2, which looks up, in a precision medicine knowledge base, the disease gene variant information for the gene loci corresponding to the variant gene information, obtaining at least one item of disease gene variant information;
a matching module 3, which matches each item of variant gene information against all of the disease gene variant information, obtaining the similarity between each item of variant gene information and every item of disease gene variant information;
a risk prediction module 4, which obtains a risk prediction table for the gene sequencing result information according to the similarity between each item of variant gene information and the disease gene variant information.
The precision medicine knowledge base stores multiple items of disease name information and disease gene variant information;
the disease name information and the disease gene variant information are in one-to-one correspondence.
The matching module 3 obtains the similarity between each item of variant gene information and all of the corresponding disease gene variant information, and builds a separate similarity comparison table for each item of variant gene information.
The risk prediction module 4 merges all of the similarity comparison tables obtained by the matching module 3 and sorts them by similarity in descending order to generate the risk prediction table, which shows the disease risk indicated by the gene sequencing result information.
The analysis of the gene sequencing result information by the analysis module 1 specifically comprises the following steps:
selecting an analysis workflow according to the format and size of the gene sequencing result information; using the selected analysis workflow to align all or part of the gene sequencing result information to a reference genome, obtaining the variant gene information.
The system of the present invention is also used to detect germline mutations in the human body, and the analysis workflow is compatible with targeted capture sequencing data, whole-exome sequencing data and whole-genome sequencing data. The input data format of the analysis workflow is a fastq file from the Illumina platform or a bam file from the Ion Torrent platform.
Analysis workflow for fastq files from the Illumina platform: low-quality bases are removed; a sliding-window algorithm removes sequencing reads that contain many low-quality bases; adapter contamination is removed; the workflow then enters the alignment stage, where the sequencing reads are aligned to the human reference genome and poorly aligned sequences are filtered out, yielding a bam file; variant sites are then extracted to obtain gene variant information, including single nucleotide variations (SNVs) and insertions and deletions (Indels); for whole-exome and whole-genome sequencing data this also includes structural variations such as copy number variations (CNVs) and gene translocations. The result is a VCF file, which is passed to the precision medicine knowledge base, where automated clinical significance annotation is performed according to a defined search logic and a risk prediction report is generated.
Analysis workflow for bam files from the Ion Torrent platform: the bam file is first converted back to a fastq file for quality control; the workflow then proceeds to alignment and variant calling to obtain gene variant information, including single nucleotide variations (SNVs), insertions and deletions (Indels) and, for whole-exome sequencing, structural variations such as copy number variations (CNVs) and gene translocations. The result is a VCF file, which is passed to the precision medicine knowledge base, where automated clinical significance annotation is performed according to a defined search logic and a risk prediction report is generated.
In a concrete example, the disease risk prediction method of the present invention proceeds as follows:
The gene sequencing results of a healthy individual are input to the sequencing data analysis platform, which finds all gene variant information. Through the precision medicine knowledge base management platform, the clinical annotation information in the precision medicine knowledge base and the case database is matched automatically. Using the annotation information found, disease risk is predicted and the client's overall disease risk is assessed; prevention and health care knowledge is provided to reduce future disease risk.
Precision medicine knowledge base: the precision medicine knowledge base is a storage system for biological omics data and clinical medical data. It mainly holds thousands of drugs, nearly 400 diseases, thousands of molecular markers and more than 600 cancer immunotherapies. The main function of the precision medicine knowledge base management platform is basic maintenance of the collected omics and clinical data, such as adding, modifying and deleting records; it can also perform targeted crawling of certain data. Its modules mainly cover diseases, biomarkers, variant sites and variation classes, clinical annotations, treatment plans, and so on.
Disease module: automatically matches the disease type in the precision medicine knowledge base according to patient information. It mainly covers solid tumors: non-small-cell lung cancer, colorectal cancer, melanoma, prostate cancer, breast cancer, gastrointestinal stromal tumor, glioma, liver cancer, gastric cancer, kidney cancer, head and neck cancer, thyroid cancer, soft tissue sarcoma, cervical cancer, ovarian cancer, esophageal cancer;
Hematological diseases: acute myeloid leukemia, myelodysplastic syndrome, myeloproliferative neoplasm, lymphoma, hemophagocytic syndrome.
Biomarker module: automatically matches biomarker sites, susceptibility genes and the like in the precision medicine knowledge base according to the test results.
Variant sites and variation types: matches the variant site and variation type information in the precision medicine knowledge base according to the test results, for example: single nucleotide variation (SNV), insertion (Insertion), deletion (Deletion), insertion-deletion (Indel), methylation (Methylation), microsatellite instability (MSI), differential expression (Differential Expression), copy number variation (CNV), fusion (Fusion), tandem repeats (Tandem Repeats), translocation (Translocation), region-based variation (Region-Based Variation), combination mutation (Combination Mutation), and so on.
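One way to represent the variation types listed above and to match a detected variant against knowledge-base entries by site and type is sketched below. The enum member names, the entry layout and the sample sites are assumptions for illustration only.

```python
from enum import Enum


class VariationType(Enum):
    SNV = "single nucleotide variation"
    INSERTION = "insertion"
    DELETION = "deletion"
    INDEL = "insertion-deletion"
    METHYLATION = "methylation"
    MSI = "microsatellite instability"
    DIFFERENTIAL_EXPRESSION = "differential expression"
    CNV = "copy number variation"
    FUSION = "fusion"
    TANDEM_REPEATS = "tandem repeats"
    TRANSLOCATION = "translocation"
    REGION_BASED = "region-based variation"
    COMBINATION = "combination mutation"


def match_entries(site, var_type, kb_entries):
    """Return knowledge-base entries whose site and variation type both match."""
    return [e for e in kb_entries if e["site"] == site and e["type"] is var_type]


# Hypothetical entries; sites are placeholder strings.
kb_entries = [
    {"site": "GENE_A:1000", "type": VariationType.SNV, "disease": "colorectal cancer"},
    {"site": "GENE_A:1000", "type": VariationType.CNV, "disease": "gastric cancer"},
]
matched = match_entries("GENE_A:1000", VariationType.SNV, kb_entries)
```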
Clinical annotation: matches the clinical annotations in the precision medicine knowledge base according to the test results, for example the correlation between a variant site and disease occurrence, the detection rate of the variant in patients, and so on.
1. case management system:
The case management system provides multiple query modes, which makes it convenient to search and manage cases. When a case is created, the individual's basic information is filled in.
Medical record management is divided into four steps: patient information entry, gene and detection scheme selection, gene and detection scheme confirmation, and submission to the laboratory. After the patient's basic information has been completed, click [Next] to enter gene and detection scheme selection.
In gene and detection scheme selection, select a suitable panel according to the user's requirements and click [Next] to confirm the detection scheme; once the scheme is confirmed to be correct, click [Next] to submit it to the laboratory.
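The four-step record workflow described above behaves like a small state machine in which each [Next] click advances one step. The sketch below illustrates that idea only; the class names `CaseStep` and `Case`, the attribute `panel`, and the validation rules are assumptions, not details given in the source.

```python
from enum import Enum


class CaseStep(Enum):
    PATIENT_INFO = 1         # patient information entry
    SCHEME_SELECTION = 2     # gene and detection scheme selection
    SCHEME_CONFIRMATION = 3  # gene and detection scheme confirmation
    SUBMITTED_TO_LAB = 4     # submitted to the laboratory


class Case:
    """Hypothetical case record that moves through the four steps in order."""

    def __init__(self, patient_info=None):
        self.patient_info = dict(patient_info or {})
        self.panel = None
        self.step = CaseStep.PATIENT_INFO

    def next_step(self):
        """Advance one step, mirroring a click on [Next]."""
        if self.step is CaseStep.PATIENT_INFO and not self.patient_info:
            raise ValueError("basic patient information is required")
        if self.step is CaseStep.SCHEME_SELECTION and self.panel is None:
            raise ValueError("a panel must be selected")
        if self.step is CaseStep.SUBMITTED_TO_LAB:
            raise ValueError("case has already been submitted")
        self.step = CaseStep(self.step.value + 1)
        return self.step
```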
2. Sequencing file upload and analysis module:
After genetic testing is finished, click [Upload/Analyze experimental results] to enter the upload/analysis interface, and enter the patient's medical record number or name. The state of the patient's electronic medical record is "result to be uploaded". Click [Upload]; after all genetic test results have been uploaded successfully, the [Analyze] button at the lower right of the screen changes from dimmed to highlighted. Click [Analyze], and the platform performs the analysis automatically, finds the variant sites, and marks the clinically effective sites. Uploaded test data must be in vcf, fastq, or bam format. Multiple samples can be analyzed at once, and the machine must not be powered off or shut down during analysis. Precision medicine knowledge base: the precision medicine knowledge base stores omics data and clinical medical data, including diseases, biomarkers, variation types, variant sites, clinical annotations of sites, and treatment schemes.
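The rule that [Analyze] becomes available only after every result file has been uploaded can be sketched as a simple check. This is an illustration under stated assumptions: the accepted extensions are taken to be vcf, fastq, and bam, and the function names `is_accepted` and `analysis_enabled` are hypothetical.

```python
import os

# Assumed set of accepted formats, based on the formats named in the description.
ACCEPTED_EXTENSIONS = {".vcf", ".fastq", ".bam"}


def is_accepted(filename):
    """True if the file extension (case-insensitive) is one of the accepted formats."""
    return os.path.splitext(filename.lower())[1] in ACCEPTED_EXTENSIONS


def analysis_enabled(expected_files, uploaded_files):
    """True only when every expected result file has been uploaded in an accepted format."""
    uploaded = {f for f in uploaded_files if is_accepted(f)}
    return bool(expected_files) and set(expected_files) <= uploaded
```

For example, `analysis_enabled(["a.vcf", "b.bam"], ["a.vcf"])` is false because one result is still missing, which corresponds to the button staying dimmed.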
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A disease risk prediction method, characterized by comprising the following steps:
S1, receiving input gene sequencing result information, and analyzing the gene sequencing result information to obtain all variant gene information;
S2, searching the precision medicine knowledge base, according to the gene site corresponding to the variant gene information, for the disease gene variation information of the corresponding gene site, to obtain at least one piece of disease gene variation information;
S3, matching each piece of variant gene information against all the disease gene variation information, to obtain the similarity between each piece of variant gene information and all the disease gene variation information;
S4, obtaining a risk prediction table for the gene sequencing result information according to the similarity between each piece of variant gene information and the disease gene variation information.
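Steps S2 and S3 can be sketched as a site lookup followed by a pairwise score. The claim does not define how similarity is computed, so the measure below (the fraction of compared fields that are equal) is purely an assumption, as are the record types `Variant` and `DiseaseVariant`, their fields, and the function names.

```python
from collections import namedtuple

# Hypothetical record layouts; the source does not specify the fields.
Variant = namedtuple("Variant", "gene site ref alt")
DiseaseVariant = namedtuple("DiseaseVariant", "disease gene site ref alt")

COMPARED_FIELDS = ("gene", "site", "ref", "alt")


def lookup_by_site(knowledge_base, variant):
    """S2: disease variants recorded at the same gene site as the detected variant."""
    return [
        d for d in knowledge_base
        if d.gene == variant.gene and d.site == variant.site
    ]


def similarity(variant, disease_variant):
    """S3 (assumed measure): fraction of compared fields that are equal."""
    same = sum(
        getattr(variant, f) == getattr(disease_variant, f) for f in COMPARED_FIELDS
    )
    return same / len(COMPARED_FIELDS)


def match_all(knowledge_base, variants):
    """Map each detected variant to a list of (disease, similarity) pairs."""
    return {
        v: [(d.disease, similarity(v, d)) for d in lookup_by_site(knowledge_base, v)]
        for v in variants
    }
```

Any other score that yields a comparable number per variant and disease pair would fit the same structure.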
2. The disease risk prediction method according to claim 1, characterized in that the precision medicine knowledge base is used to store multiple pieces of disease name information and disease gene variation information;
the disease name information and the disease gene variation information are in one-to-one correspondence.
3. The disease risk prediction method according to claim 1, characterized in that in S3 the similarity between each piece of variant gene information and all the corresponding disease gene variation information is obtained, and a similarity comparison table is established for each piece of variant gene information.
4. The disease risk prediction method according to claim 3, characterized in that S4 comprehensively sorts all the similarity comparison tables obtained in S3 in descending order of similarity to generate the risk prediction table, and the risk prediction table shows the risk of the gene sequencing result information.
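The comprehensive descending sort can be illustrated as flattening the per-variant similarity comparison tables into rows and ordering them by score. This is a minimal sketch; the function name `build_risk_table` and the row layout (variant, disease, similarity) are assumptions.

```python
def build_risk_table(similarity_tables):
    """Merge per-variant similarity tables and sort rows by similarity, highest first.

    similarity_tables maps a variant identifier to a list of (disease, similarity) pairs.
    """
    rows = [
        (variant, disease, score)
        for variant, table in similarity_tables.items()
        for disease, score in table
    ]
    # sorted() is stable, so rows with equal scores keep their original order.
    return sorted(rows, key=lambda row: row[2], reverse=True)
```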
5. The disease risk prediction method according to any one of claims 1-4, characterized in that analyzing the gene sequencing result information in S1 specifically comprises the following steps:
selecting an analysis flow according to the format and size of the gene sequencing result information;
aligning all or part of the gene sequencing result information against a reference genome using the selected analysis flow, to obtain the variant gene information.
6. The disease risk prediction method according to claim 5, characterized in that germline mutations in the human body are detected, and the analysis flow is compatible with targeted capture sequencing data, whole-exome sequencing data, and whole-genome sequencing data; the data structure of the analysis flow is a fastq file from the Illumina platform or a bam file from the Ion Torrent platform.
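Selecting an analysis flow from the format and size of the input can be sketched as follows. The claims do not state how size influences the choice, so the 1 GiB cutoff, the flow labels `fastq_flow` and `bam_flow`, the "all"/"part" scope values, and the function name are all hypothetical.

```python
import os

LARGE_INPUT_BYTES = 1024 ** 3  # hypothetical 1 GiB cutoff, not taken from the source


def select_analysis_flow(filename, size_bytes):
    """Return (flow, scope) for a sequencing file, based on its extension and size."""
    name = filename.lower()
    if name.endswith(".gz"):
        name = name[: -len(".gz")]  # look at the extension under the compression suffix
    extension = os.path.splitext(name)[1]
    if extension in (".fastq", ".fq"):
        flow = "fastq_flow"  # e.g. reads from an Illumina platform
    elif extension == ".bam":
        flow = "bam_flow"    # e.g. output from an Ion Torrent platform
    else:
        raise ValueError(f"unsupported sequencing file: {filename}")
    # Assumption: large inputs are aligned in part, small inputs in full.
    scope = "part" if size_bytes > LARGE_INPUT_BYTES else "all"
    return flow, scope
```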
7. A disease risk prediction system, characterized by comprising:
an analysis module, which receives input gene sequencing result information and analyzes the gene sequencing data to obtain variant gene information;
a search module, which searches the precision medicine knowledge base, according to the gene site corresponding to the variant gene information, for the disease gene variation information of the corresponding gene site, to obtain at least one piece of disease gene variation information;
a matching module, which matches each piece of variant gene information against all the disease gene variation information, to obtain the similarity between each piece of variant gene information and all the disease gene variation information;
a risk prediction module, which obtains a risk prediction table for the gene sequencing result information according to the similarity between each piece of variant gene information and the disease gene variation information.
8. The disease risk prediction system according to claim 7, characterized in that the matching module obtains the similarity between each piece of variant gene information and all the corresponding disease gene variation information, and establishes a similarity comparison table for each piece of variant gene information.
9. The disease risk prediction system according to claim 8, characterized in that the risk prediction module comprehensively sorts all the similarity comparison tables obtained by the matching module in descending order of similarity to generate the risk prediction table, and the risk prediction table shows the risk of the gene sequencing result information.
10. The disease risk prediction system according to any one of claims 7-9, characterized in that the analysis of the gene sequencing result information by the analysis module specifically comprises the following steps:
selecting an analysis flow according to the format and size of the gene sequencing result information;
aligning all or part of the gene sequencing result information against a reference genome using the selected analysis flow, to obtain the variant gene information.
CN201610552427.1A 2016-07-13 2016-07-13 A kind of disease risks Forecasting Methodology and system Pending CN106202936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610552427.1A CN106202936A (en) 2016-07-13 2016-07-13 A kind of disease risks Forecasting Methodology and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610552427.1A CN106202936A (en) 2016-07-13 2016-07-13 A kind of disease risks Forecasting Methodology and system

Publications (1)

Publication Number Publication Date
CN106202936A true CN106202936A (en) 2016-12-07

Family

ID=57476671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610552427.1A Pending CN106202936A (en) 2016-07-13 2016-07-13 A kind of disease risks Forecasting Methodology and system

Country Status (1)

Country Link
CN (1) CN106202936A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1255948A (en) * 1997-03-10 2000-06-07 医疗科学系统有限公司 Prediction of coronary artery disease
CN101379502A (en) * 2006-02-03 2009-03-04 沃尔科公司 Quantitative HIV phenotype or tropism assay
CN101617227A (en) * 2006-11-30 2009-12-30 纳维哲尼克斯公司 Genetic analysis systems and method
CN102542179A (en) * 2010-10-27 2012-07-04 三星Sds株式会社 Apparatus and method for extracting biomarkers
CN105229649A (en) * 2013-03-15 2016-01-06 百世嘉(上海)医疗技术有限公司 For the human genome analysis of variance of disease association and the system and method for report

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599613B (en) * 2016-12-15 2019-02-05 博奥生物集团有限公司 A method of determining the classification of hereditary tumor variant sites
CN106599613A (en) * 2016-12-15 2017-04-26 博奥生物集团有限公司 Method for judging genetic tumor variation site classification
CN106834107A (en) * 2017-03-10 2017-06-13 首度生物科技(苏州)有限公司 A kind of prediction tumour system for being based on the sequencing of two generations
CN106778049A (en) * 2017-03-13 2017-05-31 成都育芽科技有限公司 A kind of accurate medical rescue in community based on big data platform and its method
CN106951730A (en) * 2017-03-21 2017-07-14 为朔医学数据科技(北京)有限公司 A kind of pathogenic grade of genetic mutation determines method and device
CN106971071A (en) * 2017-03-27 2017-07-21 为朔医学数据科技(北京)有限公司 A kind of Clinical Decision Support Systems and method
CN107103207A (en) * 2017-04-05 2017-08-29 浙江大学 Based on the multigroup accurate medical knowledge search system and implementation method for learning variation features of case
CN107103207B (en) * 2017-04-05 2020-07-03 浙江大学 Accurate medical knowledge search system based on case multigroup variation characteristics and implementation method
CN107122607A (en) * 2017-04-28 2017-09-01 为朔医学数据科技(北京)有限公司 A kind of method and device for generating therapeutic regimen report
CN107267613A (en) * 2017-06-28 2017-10-20 安吉康尔(深圳)科技有限公司 Sequencing data processing system and SMN gene detection systems
CN107391908A (en) * 2017-06-28 2017-11-24 天方创新(北京)信息技术有限公司 Exercise risk appraisal procedure and device
CN107301323A (en) * 2017-08-14 2017-10-27 安徽医科大学第附属医院 A kind of construction method of the disaggregated model related to psoriasis
CN107610779A (en) * 2017-10-25 2018-01-19 医渡云(北京)技术有限公司 Disease Assessment Scale and risk appraisal procedure and device
CN107610779B (en) * 2017-10-25 2021-10-22 医渡云(北京)技术有限公司 Disease evaluation and disease risk evaluation method and device
CN108710782A (en) * 2018-05-16 2018-10-26 为朔医学数据科技(北京)有限公司 Genotype conversion method, device and electronic equipment
CN109545277A (en) * 2018-11-21 2019-03-29 广州市康健基因科技有限公司 A kind of methods of marking and system of scd gene mutation point
CN111602201B (en) * 2018-12-21 2023-08-01 北京哲源科技有限责任公司 Method for obtaining deterministic event in cell, electronic device and storage medium
CN111602201A (en) * 2018-12-21 2020-08-28 北京哲源科技有限责任公司 Method for obtaining deterministic events in cells, electronic device and storage medium
CN111370131B (en) * 2018-12-26 2023-06-09 陈治平 Method and system for screening biomarkers via disease trajectories
CN111370131A (en) * 2018-12-26 2020-07-03 陈治平 Method and system for screening biomarkers via disease trajectories
CN109817299A (en) * 2019-02-14 2019-05-28 北京安智因生物技术有限公司 A kind of relevant genetic test report automatic generating method of disease and system
CN111723261A (en) * 2019-03-22 2020-09-29 昆明逆火科技股份有限公司 Search engine-based DNA comparison algorithm
CN110301899A (en) * 2019-07-01 2019-10-08 湘南学院附属医院 A kind of cardiovascular and cerebrovascular disease information detecting system and method
CN110660055B (en) * 2019-09-25 2022-11-29 北京青燕祥云科技有限公司 Disease data prediction method and device, readable storage medium and electronic equipment
CN110660055A (en) * 2019-09-25 2020-01-07 北京青燕祥云科技有限公司 Disease data prediction method and device, readable storage medium and electronic equipment
CN111243661A (en) * 2020-01-13 2020-06-05 北京奇云诺德信息科技有限公司 Gene physical examination system based on gene data
CN113808663A (en) * 2021-09-01 2021-12-17 基诺莱(重庆)生物技术有限公司 Artificial intelligence-based gene variation site matching method, system and equipment
TWI823203B (en) * 2021-12-03 2023-11-21 臺中榮民總醫院 Automated multi-gene assisted diagnosis of autoimmune diseases

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20161207