JP6820621B2 - Method for identifying interdependence - Google Patents
Method for identifying interdependence
- Publication number
- JP6820621B2 (application number JP2019509406A)
- Authority
- JP
- Japan
- Prior art keywords
- event
- data
- samples
- fisher
- calculated
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Epidemiology (AREA)
- Biophysics (AREA)
- Bioethics (AREA)
- Biotechnology (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Public Health (AREA)
- Computational Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Physics (AREA)
- Operations Research (AREA)
- Probability & Statistics with Applications (AREA)
- Algebra (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Complex Calculations (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Description
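Examples include a program for causing a computer to function as:
(1) means for performing a step of acquiring data that contain information on a first event and information on a second event for N samples;
(2) means for performing a step of obtaining, from said data containing the information on the first event and the information on the second event for the N samples, a data set containing binary data on the first event and binary data on the second event;
(3) means for performing a step of determining, for each of the N samples, which type (cell) of a 2×2 contingency table the sample falls under, based on a criterion for the first event and a criterion for the second event, and classifying each of the N samples into one of those types;
(4) means for performing a step of repeating this classification for all N samples, counting the number of samples classified into each type, and thereby tallying the sample counts from the data set into the 2×2 contingency table;
(5) means for performing a step of calculating Fisher's exact probability P based on the 2×2 contingency table in which the sample counts have been tallied; and
(6) means for performing a step of calculating $-\log_{10} P / (N \log_{10} 2)$ from the calculated Fisher's exact probability P and said N.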
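Data on breast invasive carcinoma patients (BRCA), comprising 1019 samples, were downloaded from The Cancer Genome Atlas (TCGA) in the United States (http://cancergenome.nih.gov/). The data contained information on approximately 20,000 genes. For mRNA expression of the target gene CLSTN3 (Calsyntenin 3), each patient was classified into one of two types according to whether expression exceeded 2-fold relative to wild type or was 2-fold or less. The mRNA expression of each of the remaining genes was classified against the same criterion. Using the classified data, the number of patients was tallied in a 2×2 contingency table for CLSTN3 paired with each of the remaining genes. From the tallied counts, the mutual information between each gene and CLSTN3 was calculated using the definition of mutual information given earlier, and Fisher's exact probability p was also calculated for each gene. For each gene, the calculated mutual information with CLSTN3 and the value of −log(p) obtained from Fisher's exact probability p were plotted on a graph.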
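The mutual information used in this comparison can be computed directly from the four cell counts. The sketch below uses the standard definition with base-2 logarithms (bits). It is assumed, not confirmed by this excerpt, that this matches the definition the description refers to, and the function name is hypothetical. Note that $-\log_{10} P / (N \log_{10} 2)$ equals $-\log_2 P / N$, so it is on the same per-sample bit scale.

```python
from math import log2


def mutual_information(a, b, c, d):
    """Mutual information (bits) of a 2x2 table.

    Layout assumed: rows = first event (yes: a, b / no: c, d),
    columns = second event (yes: a, c / no: b, d).
    """
    n = a + b + c + d
    # Each entry: (cell count, its row total, its column total).
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    mi = 0.0
    for count, row_total, col_total in cells:
        if count > 0:  # empty cells contribute nothing (0 * log 0 := 0)
            mi += (count / n) * log2(count * n / (row_total * col_total))
    return mi
```

A perfectly dependent table such as `(4, 0, 0, 4)` carries one bit, while a table with independent margins such as `(2, 2, 2, 2)` carries none.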
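Similar results were obtained when the breast invasive carcinoma patients were classified by the presence or absence of point mutations.
Data were downloaded from TCGA (http://cancergenome.nih.gov/) for a total of 19 types of samples: acute myeloid leukemia, bladder urothelial carcinoma, breast invasive carcinoma, colon adenocarcinoma, glioblastoma multiforme, head and neck squamous cell carcinoma, kidney renal cell carcinoma, kidney papillary cell carcinoma, lung adenocarcinoma, lung squamous cell carcinoma, ovarian serous cystadenocarcinoma, pancreatic adenocarcinoma, prostate cancer, rectum adenocarcinoma, skin melanoma, stomach adenocarcinoma, thyroid cancer, endometrial cancer, and cancer cell lines (CCLE). The CCLE data are not case data but data obtained from 1021 established cancer cell lines. Each data set contained 66 to 1021 cases as samples and information on approximately 20,000 genes.
The same procedure as in Example 2 was followed, except that RB1 (RB Transcriptional Corepressor 1), IFNG (interferon gamma), and GRM1 (glutamate metabotropic receptor 1) were each used as the target gene. For each target gene, the genes ranked in descending order of the calculated value are shown in FIGS. 3 to 5.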
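For one week of sales at store A of a supermarket chain, purchase histories comprising approximately 5000 samples are downloaded from the POS system. The data contain information on the contents of each individual purchase. The 5000 samples are classified into two types according to whether a product in the "onigiri" (rice ball) category was purchased. Likewise, for each of the other product categories (approximately 300 categories), the samples are classified into two types according to whether a product in that category was purchased. In the same manner as in Example 1, the samples are tallied in a 2×2 contingency table for "onigiri" and each product category, and Fisher's exact probability P is calculated from the tallied result. This is done for all of the approximately 200 product categories.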
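The per-category loop in this example might look like the following sketch. Everything here is illustrative: the basket data are invented toy values, `rank_categories` and `fisher_two_sided` are hypothetical names, each purchase is assumed to be available as a set of category labels, and the two-sided p-value is an assumed reading of "Fisher's exact probability P".

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for a 2x2 table (assumed variant)."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    denom = comb(n, col1)
    probs = [comb(row1, k) * comb(n - row1, col1 - k) / denom
             for k in range(max(0, col1 - (n - row1)), min(row1, col1) + 1)]
    p_obs = comb(row1, a) * comb(n - row1, col1 - a) / denom
    return min(sum(p for p in probs if p <= p_obs * (1 + 1e-9)), 1.0)


def rank_categories(baskets, target):
    """Pair the target category with every other category and rank by P."""
    categories = set().union(*baskets) - {target}
    results = []
    for cat in sorted(categories):
        a = sum(1 for bk in baskets if target in bk and cat in bk)
        b = sum(1 for bk in baskets if target in bk and cat not in bk)
        c = sum(1 for bk in baskets if target not in bk and cat in bk)
        d = len(baskets) - a - b - c
        results.append((cat, fisher_two_sided(a, b, c, d)))
    # Smallest P first, i.e. strongest evidence of interdependence first.
    return sorted(results, key=lambda item: item[1])


if __name__ == "__main__":
    baskets = [  # toy purchases, one set of category labels per sample
        {"onigiri", "tea"}, {"onigiri", "tea"}, {"onigiri", "tea"},
        {"onigiri", "tea", "bread"}, {"bread"}, {"milk"},
        {"bread", "milk"}, {"milk"},
    ]
    for category, p in rank_categories(baskets, "onigiri"):
        print(category, p)
```

Sorting by P alone does not distinguish positive from negative association, so a category that is rarely bought together with the target can also rank highly.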
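Data on 2017 stock price movements are downloaded for the issues traded on the First Section of the Tokyo Stock Exchange (approximately 2000 issues). There were approximately 240 trading days in 2017, and each day is treated as a sample. Next, data on the 2017 dollar-yen exchange rate (the price of one dollar in yen) are downloaded. Using the exchange-rate data, the sample days are classified into two types according to whether the rate on that day was higher than the previous day's rate. Next, using the stock price data, the sample days are classified into two types for each company according to whether the stock price at the close of trading was higher than at the open. In the same manner as in Example 1, the samples are tallied in a 2×2 contingency table for the exchange-rate movement and the company's stock price movement, and Fisher's exact probability P is calculated from the tallied result. This is done for the stock prices of the approximately 2000 issues.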
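The two binarization rules in this example differ in one practical detail: the exchange-rate rule compares each day with the previous day, so the first day has no predecessor. The sketch below assumes that day is simply dropped from both series, which the text does not specify. All names and numbers are hypothetical.

```python
def fx_rose(rates):
    """True where the rate is higher than on the previous day.

    The first day has no previous day and is dropped (assumption).
    """
    return [today > prev for prev, today in zip(rates, rates[1:])]


def stock_rose(open_close):
    """True where the closing price is higher than the opening price."""
    return [close > open_ for open_, close in open_close]


def tally(first, second):
    """Count days per cell: (both up, FX only, stock only, neither)."""
    a = sum(1 for x, y in zip(first, second) if x and y)
    b = sum(1 for x, y in zip(first, second) if x and not y)
    c = sum(1 for x, y in zip(first, second) if not x and y)
    d = len(first) - a - b - c
    return a, b, c, d


if __name__ == "__main__":
    rates = [110.0, 111.0, 110.5, 112.0, 111.0]          # toy USD/JPY rates
    prices = [(100, 101), (101, 103), (103, 102),
              (102, 105), (105, 104)]                    # toy (open, close)
    fx = fx_rose(rates)
    stock = stock_rose(prices)[1:]  # drop day 1 to stay aligned with fx
    print(tally(fx, stock))
```

The resulting counts would then be passed to a Fisher exact test once per issue, as described above.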
Claims (4)
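- A method for identifying interdependence between a first event and a second event, the method comprising:
a step in which a computer obtains, from data containing information on the first event and information on the second event for N samples, a data set containing binary data on the first event and binary data on the second event;
a step in which the computer tallies the number of samples from the data set into a 2×2 contingency table;
a step in which the computer calculates Fisher's exact probability P based on the 2×2 contingency table; and
a step in which the computer calculates $-\log_{10} P / N$ from the Fisher's exact probability P and said N.
- The method according to claim 1, wherein the Fisher's exact probability P is calculated by the computer by a method comprising a step in which the computer integrates, using meta-analysis, a plurality of Fisher's exact probabilities including:
(1) a Fisher's exact probability P1 calculated by the computer, for N1 samples, from a data set containing binary data on the first event and binary data on the second event, based on a first criterion for the first event and a first criterion for the second event; and
(2) a Fisher's exact probability P2 calculated by the computer, for N2 samples, from a data set containing binary data on the first event and binary data on the second event, based on a second criterion for the first event and a second criterion for the second event.
- A computer program for causing the method according to claim 1 or 2 to be executed.
- A recording medium storing the computer program according to claim 3.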
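The claims do not name a specific meta-analysis technique for integrating P1 and P2. As one possible illustration only, the sketch below uses Fisher's combined probability method, in which $X = -2 \sum_i \ln P_i$ follows a chi-square distribution with $2k$ degrees of freedom under the null hypothesis when the k tests are independent. The choice of this method and the function name are assumptions, not something stated in the claims.

```python
from math import exp, factorial, log


def combine_p_values(p_values):
    """Combine independent p-values with Fisher's combined probability method.

    Uses the closed-form chi-square survival function for even degrees
    of freedom: sf(X; 2k) = exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # X / 2
    return exp(-half_stat) * sum(half_stat ** i / factorial(i)
                                 for i in range(k))


if __name__ == "__main__":
    p1, p2 = 0.05, 0.05  # toy values standing in for P1 and P2
    print(combine_p_values([p1, p2]))
```

With a single input the function returns that p-value unchanged, which is a quick sanity check on the formula.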
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017072904 | 2017-03-31 | ||
JP2017072904 | 2017-03-31 | ||
PCT/JP2018/013877 WO2018181988A1 (ja) | 2017-03-31 | 2018-03-30 | Method for identifying interdependence |
Publications (2)
Publication Number | Publication Date |
---|---|
JPWO2018181988A1 (ja) | 2020-04-23 |
JP6820621B2 (ja) | 2021-01-27 |
Family
ID=63678171
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2019509406A Active JP6820621B2 (ja) | Method for identifying interdependence | 2017-03-31 | 2018-03-30 |
Country Status (2)
Country | Link |
---|---|
JP (1) | JP6820621B2 (ja) |
WO (1) | WO2018181988A1 (ja) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1830289A1 (en) * | 2005-11-30 | 2007-09-05 | Institut National De La Sante Et De La Recherche Medicale (Inserm) | Methods for hepatocellular carninoma classification and prognosis |
JP2009048455A (ja) * | 2007-08-21 | 2009-03-05 | Nippon Hoso Kyokai <Nhk> | Inter-clause relation estimation device and computer program |
JP2009069911A (ja) * | 2007-09-10 | 2009-04-02 | Mizuho Information & Research Institute Inc | Gene association analysis device and gene association analysis program |
JP2013123420A (ja) * | 2011-12-15 | 2013-06-24 | World Fusion Co Ltd | Gene set creation method |
ES2731913T3 (es) * | 2014-01-30 | 2019-11-19 | Ares Genetics Gmbh | Genetic resistance testing |
-
2018
- 2018-03-30 JP JP2019509406A patent/JP6820621B2/ja active Active
- 2018-03-30 WO PCT/JP2018/013877 patent/WO2018181988A1/ja active Application Filing
Also Published As
Publication number | Publication date |
---|---|
JPWO2018181988A1 (ja) | 2020-04-23 |
WO2018181988A1 (ja) | 2018-10-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
2019-09-10 | A621 | Written request for application examination | JAPANESE INTERMEDIATE CODE: A621 |
2020-09-29 | A131 | Notification of reasons for refusal | JAPANESE INTERMEDIATE CODE: A131 |
2020-11-27 | A521 | Request for written amendment filed | JAPANESE INTERMEDIATE CODE: A523 |
| TRDD | Decision of grant or rejection written | |
2020-12-16 | A01 | Written decision to grant a patent or to grant a registration (utility model) | JAPANESE INTERMEDIATE CODE: A01 |
2020-12-23 | A61 | First payment of annual fees (during grant procedure) | JAPANESE INTERMEDIATE CODE: A61 |
| R150 | Certificate of patent or registration of utility model | Ref document number: 6820621; Country of ref document: JP; JAPANESE INTERMEDIATE CODE: R150 |
| R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |