CN111321202A - Gene fusion variation library construction method, detection method, device, equipment and storage medium - Google Patents

Info

Publication number
CN111321202A
Authority
CN
China
Prior art keywords
gene
fusion
reads
library
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911419273.9A
Other languages
Chinese (zh)
Inventor
黄晓强
刘菲菲
区小华
陈禹欣
杨娟
赵薇薇
于世辉
赵纤纤
冯菁华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kingmed Diagnostics Group Co ltd
Original Assignee
Guangzhou Kingmed Diagnostics Group Co ltd
Application filed by Guangzhou Kingmed Diagnostics Group Co ltd
Priority to CN201911419273.9A
Publication of CN111321202A

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Zoology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • Medicinal Chemistry (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Immunology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Chemical & Material Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a gene fusion mutation library construction method, a gene fusion mutation detection method and device, computer equipment and a computer storage medium. The library construction method, the detection method and the detection device are based on a multigene RNA targeted sequencing technology that uses DNA probe hybridization capture: target fusion genes are captured by hybridization with fusion gene capture probes, and a gene fusion mutation library is constructed. The invention further provides a quantitative analysis method for fusion genes, in which the variation ratio of the fusion gene is calculated to obtain an accurate expression value for the fusion gene.

Description

Gene fusion variation library construction method, detection method, device, equipment and storage medium
Technical Field
The invention relates to the technical fields of molecular biology and bioinformatics, and in particular to a gene fusion variation library construction method, a gene fusion variation detection method, a device, equipment and a storage medium.
Background
Cytogenetic studies have found that a range of hematological tumors, including AML, ALL, CML and NHLs, carry multiple chromosomal translocations. These translocations result in abnormal expression of oncogenes and/or transcription and expression of fusion genes, all of which promote the transformation and survival of cancer cells. The core driver genes involved (e.g., MLL, ALK) often have multiple fusion partners, and the same fusion gene may also have different breakpoints, forming different subtypes. For example, there are 54 known fusion partners for the MLL gene, and the COSMIC database contains up to 15 fusion subtypes for the KMT2A-AFF1 fusion gene (https://cancer.sanger.ac.uk/COSMIC/fusion/overview?fid=359723&gid=271430). These fusion gene variations affect clinical prognosis and can guide molecular typing and targeted therapy of hematological tumors. It is therefore desirable to develop a genomic detection reagent to identify gene fusion variations in hematological tumors.
RT-PCR and fluorescence in situ hybridization (FISH) are two commonly used gene fusion detection techniques. Both detect only a single, specific type of known gene fusion, so their application range is narrow and their efficiency is low, and they cannot detect new gene fusion variations. The shortcomings of fusion gene detection technology therefore still limit the auxiliary diagnosis and precision medicine of hematological tumors.
Disclosure of Invention
In view of the above, it is desirable to provide a gene fusion variation library construction method, a gene fusion variation detection method and device, a computer device, and a computer storage medium that have a wide application range and high detection efficiency and can detect newly discovered gene fusion variations.
A method for constructing a gene fusion variation library comprises the following steps:
extracting total RNA of the sample, and removing rRNA in the total RNA;
reverse transcribing the total RNA after rRNA is removed, synthesizing double-stranded cDNA, and synthesizing by using dUTP instead of dTTP when synthesizing a second strand of the double-stranded cDNA;
performing end repair on the synthesized double-stranded cDNA and ligating adapters;
enzymatically digesting the dUTP in the end-repaired, adapter-ligated double-stranded cDNA to generate gaps in the double-stranded cDNA;
amplifying the digested double-stranded cDNA to construct a cDNA pre-library;
hybridizing and capturing a target fusion cDNA in the cDNA pre-library using a fusion gene capture probe, the target fusion cDNA being formed by fusing at least two different genes, the fusion gene capture probe comprising a sequence capable of complementary pairing with a sequence of one of the genes of the target fusion cDNA;
and amplifying the captured target fusion cDNA to obtain the gene fusion variation library.
In one embodiment, the fusion gene capture probe is designed as follows:
(1) the fusion gene capture probe is designed against a core gene in the target fusion cDNA, wherein the core gene is a gene that has multiple fusion partners and is prone to fusion variation, a key gene in a cell growth or proliferation signaling pathway, or a driver gene;
(2) the fusion gene capture probe is designed against the transcript sequence of the core gene;
(3) the fusion gene capture probe is designed against the core gene in the hg19 reference genome, with a coverage density of 2× tiling;
(4) the length of the fusion gene capture probe is 120 bp;
(5) during design, the fusion gene capture probe is aligned to the human transcriptome sequence and all BLAST hits are counted; if the number of BLAST hits is not more than 50, the probe is qualified; if the number of BLAST hits is more than 50, the probe is redesigned by replacing mismatched bases until it matches the target gene sequence as closely as possible while the number of BLAST hits is not more than 50.
In one embodiment, the 5' end of the fusion gene capture probe is labeled with a linker for capture;
optionally, the linker is biotin or streptavidin.
In one embodiment, the sample total RNA is total RNA of a peripheral blood or bone marrow sample.
In one embodiment, the end repair is the addition of one dATP at the 3' end of the synthesized double-stranded cDNA;
the adapter structure introduced by adapter ligation is P5-Real1primer-DNAINSERT-IndexReadprimer-index-P7, specifically: 5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-[DNA fragment sequence to be detected]-GTTCGTCTTCTGCCGTATGCTCTA-[index]-CACTGACCTCAAGTCTGCACACGAGAAGGCTAG-P, wherein P5 and P7 are adapters, Real1primer and IndexReadprimer are primer sequences, DNAINSERT is the DNA fragment sequence to be detected, index is a unique 12-nt sample label, and P is a phosphate group.
In one embodiment, the amplification of the digested double-stranded DNA and the captured target fusion cDNA is performed using primers paired with the adaptor P5 and P7 sequences.
A gene fusion variation detection method comprises the following steps:
obtaining sequencing data of a gene fusion variation library, wherein the gene fusion variation library is an amplification library of a target fusion gene obtained by hybridizing and capturing a transcription sequence of a sample to be detected through a fusion gene capture probe, the target fusion gene is formed by fusing at least two different genes, and the fusion gene capture probe contains a sequence which can be complementarily paired with a sequence of one gene of the target fusion gene;
comparing the sequencing data with human transcriptome and genome data, and screening reads capable of being matched with at least two genes simultaneously;
and analyzing whether the reads which can be matched with at least two genes simultaneously meet the preset threshold requirement, and if so, indicating that a plurality of genes contained in the reads are subjected to gene fusion.
In one embodiment, before the step of comparing the sequencing data with human transcriptome and genome data and screening reads that can be matched to at least two genes simultaneously, the method further comprises:
performing quality evaluation on the sequencing data, and removing low-quality reads to obtain clean sequencing data.
In one embodiment, removing the low-quality reads comprises:
removing reads containing the adapter sequence;
removing reads in which bases with a quality value lower than 15 account for ≥ 50% of the read;
removing reads with an N content greater than 1%.
In one embodiment, the method further comprises, after comparing the sequencing data with human transcriptome and genome data, the step of removing false positive events from the clean sequencing data according to preset control criteria;
specifically, the screened gene fusion variation events are annotated and false positives are filtered out, and gene fusion variation events meeting any of the following criteria are removed:
the different genes of the fusion gene are paralogs of each other;
the different genes of the fusion gene are pseudogenes;
the gene fusion variation has been detected in normal healthy persons.
In one embodiment, the preset threshold requirement refers to: if the fusion gene variation has clinical significance, the number of unique spanning reads matched with the two genes is more than 3; if the fusion gene variation is of unknown clinical significance, more than 10 unique spanning reads can be matched to the two genes simultaneously.
In one embodiment, the method further comprises the following steps:
calculating the variation ratio of the fusion gene according to the following formula:
Figure RE-GDA0002461271430000051
wherein,
Figure RE-GDA0002461271430000052
the fusion supporting read pairs refers to the read pairs supporting the gene fusion;
the # mappable reads refers to the number of reads aligned to the genome;
the weighted-average of insert-size-read length refers to the weighted average length of the cDNA fragments inserted into the library;
the refgeneFPKM is the normalized expression value of the reference gene;
the FPKM stands for Fragments Per Kilobase of exon model per Million mapped fragments, i.e., the number of fragments aligned per 1 kb of exon per 1 million aligned fragments.
A gene fusion mutation detection device comprising:
a sequencing data acquisition module, used for acquiring sequencing data of a gene fusion variation library, wherein the gene fusion variation library is an amplification library of a target fusion gene obtained by hybridizing and capturing a transcription sequence of a sample to be detected through a fusion gene capture probe, the target fusion gene is formed by fusing at least two different genes, and the fusion gene capture probe contains a sequence which can be complementarily paired with a sequence of one gene of the target fusion gene;
a comparison screening module, used for comparing the sequencing data with human transcriptome and genome data and screening reads which can be matched with at least two genes simultaneously; and
a fusion analysis module, used for analyzing whether the reads which can be matched with at least two genes simultaneously meet the preset threshold requirement, and if so, indicating that the genes contained in the reads have undergone gene fusion.
In one embodiment, the device further comprises:
a variation ratio calculation module, used for calculating the variation ratio of the fusion gene according to the following formula:
Figure RE-GDA0002461271430000061
wherein,
Figure RE-GDA0002461271430000062
the fusion supporting read pairs refers to the read pairs supporting the gene fusion;
the # mappable reads refers to the number of reads aligned to the genome;
the refgeneFPKM is the normalized expression value of the reference gene;
the weighted-average of insert-size-read length refers to the weighted average length of the cDNA fragments inserted into the library;
the FPKM stands for Fragments Per Kilobase of exon model per Million mapped fragments, i.e., the number of fragments aligned per 1 kb of exon per 1 million aligned fragments.
A computer device having a processor and a memory, the memory storing a computer program, the processor implementing the steps of the method for detecting a genetic fusion mutation as described in any of the above embodiments when executing the computer program.
A computer storage medium having stored thereon a computer program that, when executed, performs the steps of the method for detecting a genetic fusion mutation as described in any of the above embodiments.
A single driver gene may fuse with multiple other genes (partner genes); after the fused gene is transcribed, a junction (i.e., breakpoint) is formed between an exon of the core gene and an exon of the partner gene. The gene fusion mutation library construction method, the gene fusion mutation detection method and the gene fusion mutation detection device are based on a multigene RNA targeted sequencing technology that uses DNA probe hybridization capture: target fusion genes are captured by hybridization with the fusion gene capture probes, and a gene fusion mutation library is constructed.
The gene fusion mutation library construction method, the gene fusion mutation detection method and the device can be used to detect known or newly discovered gene rearrangements, gene deletions, gene duplications and other gene mutation information related to hot-spot fusion genes in various hematological tumors. Compared with the traditional fluorescence quantitative method, the technical solution of the invention is more comprehensive, more efficient and more economical.
The invention further provides a quantitative analysis method for fusion genes, in which the variation ratio of the fusion gene is calculated to obtain an accurate expression value for the fusion gene.
Drawings
FIG. 1 is a schematic flow chart of a method for detecting a mutation in a fusion gene according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a fusion gene variation detection apparatus according to an embodiment of the present invention.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The fusion gene refers to a new gene formed when genes at different genomic coordinates are joined together by mechanisms such as chromosomal rearrangement, and which is transcribed to form a new fusion protein. It is written as gene A/gene B or gene A-gene B, e.g., BCR-ABL1, wherein gene A and gene B are fusion partners.
The selected genes are key core fusion genes. A core gene is a gene with a high frequency of fusion variation that has been found to have multiple fusion partners, or a key gene in a cell growth or proliferation signaling pathway, or a driver gene.
The "reads" refers to a sequence fragment obtained by high-throughput sequencing.
The sequencing quality refers to the accuracy of the base in the read sequence.
The "human transcriptome" is the combination of the products of all gene expressions in human cells.
The human reference genome used is hg19.
Paralogs are homologous genes within the same species that arose through gene duplication and may have evolved new functions related to the original one.
A pseudogene can be considered a non-functional copy of genomic DNA whose sequence closely resembles that of a protein-coding gene.
The Body Map 2.0 is transcriptome sequencing data for a panel of human normal tissues.
The "gene distance" refers to the distance between the gene coordinates of two genes.
The invention provides a method for constructing a gene fusion variation library, which comprises the following steps:
extracting total RNA of the sample, and removing rRNA in the total RNA;
performing reverse transcription on the total RNA from which the rRNA is removed to synthesize double-stranded cDNA, and synthesizing by using dUTP instead of dTTP when synthesizing a second strand of the double-stranded cDNA;
performing end repair on the synthesized double-stranded cDNA and ligating adapters;
enzymatically digesting the dUTP in the end-repaired, adapter-ligated double-stranded cDNA to generate gaps in the double-stranded cDNA;
amplifying the digested double-stranded cDNA to construct a cDNA pre-library;
hybridizing and capturing a target fusion cDNA in the cDNA pre-library using a fusion gene capture probe, the target fusion cDNA being formed by the fusion of at least two different genes, the fusion gene capture probe comprising a sequence capable of complementary pairing with the sequence of one of the genes of the target fusion cDNA;
amplifying the captured target fusion cDNA to obtain a gene fusion variation library.
In a specific example, the sample total RNA is total RNA of a peripheral blood or bone marrow sample. After the total RNA of the sample is extracted, the method preferably further comprises the step of determining the concentration of the nucleic acid and the A260/A280 value.
In a specific example, the rRNA is removed by hybridizing synthetic single-stranded DNA probes targeting rRNA to the rRNA in the total RNA and then removing the hybridized rRNA.
In one specific example, end repair is the addition of one dATP at the 3' end of the synthesized double-stranded cDNA; the adapter structure introduced by adapter ligation is: P5-Real1primer-DNAINSERT-IndexReadprimer-index-P7. Specifically, the adapter sequence is: 5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-[DNA fragment sequence to be detected]-GTTCGTCTTCTGCCGTATGCTCTA-[index]-CACTGACCTCAAGTCTGCACACGAGAAGGCTAG-P, wherein P5 (5'-AATGATACGGCGACCACCGA-3', SEQ ID NO:1) and P7 (5'-CAAGCAGAAGACGGCATACGAGAT-3', SEQ ID NO:2) are adapters, Real1primer (GATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, SEQ ID NO:3) and IndexReadprimer (GTTCGTCTTCTGCCGTATGCTCTA, SEQ ID NO:4) are primer sequences, DNAINSERT is the DNA fragment sequence to be detected, index is a unique 12-nt sample label, and P is a phosphate group.
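As a minimal sketch of the layout described above, the following Python snippet concatenates the listed sequences in the stated order around an insert and a 12-nt index. The sequences are copied from the text with stray spaces removed; the helper name `build_construct`, the constant names and the validation rules are illustrative assumptions, not part of the described method.

```python
# Sequences copied from the text (SEQ ID NO:1, 3 and 4, plus the 3' portion
# written after the index). Names are hypothetical, chosen for illustration.
P5 = "AATGATACGGCGACCACCGA"
REAL1_PRIMER = "GATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
INDEX_READ_PRIMER = "GTTCGTCTTCTGCCGTATGCTCTA"
TAIL_3PRIME = "CACTGACCTCAAGTCTGCACACGAGAAGGCTAG"

INDEX_LENGTH = 12  # the text specifies a unique 12-nt sample label


def build_construct(insert: str, index: str) -> str:
    """Assemble 5'-P5-Real1primer-insert-IndexReadprimer-index-tail-3'."""
    insert, index = insert.upper(), index.upper()
    if len(index) != INDEX_LENGTH:
        raise ValueError("index must be 12 nt long")
    for name, seq in (("insert", insert), ("index", index)):
        # Assumed validation: only unambiguous bases are accepted.
        if not seq or set(seq) - set("ACGT"):
            raise ValueError(f"{name} must be a non-empty A/C/G/T string")
    return P5 + REAL1_PRIMER + insert + INDEX_READ_PRIMER + index + TAIL_3PRIME
```

Calling `build_construct("ACGTACGT", "AAAACCCCGGGG")` returns one string in which the insert sits between Real1primer and IndexReadprimer, which makes the relative order of the elements easy to check.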
In one specific example, the fusion gene capture probe is designed as follows: (1) the fusion gene capture probe is designed against a core gene in the target fusion cDNA, wherein the core gene is a gene that has multiple fusion partners and is prone to fusion variation, a key gene in a cell growth or proliferation signaling pathway, or a driver gene;
(2) the fusion gene capture probe is designed against the transcript sequence of the core gene;
(3) the fusion gene capture probe is designed against the core gene in the hg19 reference genome, with a coverage density of 2× tiling;
(4) the length of the fusion gene capture probe is 120 bp;
(5) during design, the fusion gene capture probe is aligned to the human transcriptome sequence and all BLAST hits are counted; if the number of BLAST hits is not more than 50, the probe is qualified; if the number of BLAST hits is more than 50, the probe is redesigned by replacing mismatched bases until it matches the target gene sequence as closely as possible while the number of BLAST hits is not more than 50.
For example, in some specific examples, 54 core genes can be selected for hematological tumors (leukemias and lymphomas): ABL, CREBBP, CRLF, MECOM, TP, TSLP, LMO, PRDM, MYC, ETV, RARA, NUP214, BCL, MYB, IRF, CBFB, CEBPB, ZNF384, RUNX, FGFR, MALT, ERG, NPM, PAX, JAK, PICALM, FLT, GLIS, PDGFRB, PML, TLX, ITK, FGFR, IL2, TAL, NTRK, NUP, EPOR, RBM, CSF1, KMT2, BCL, BCR, LYN, TLX, ccbpa, TCF, cend, ABL, ALK, pdgfr, IGLL, IGHA. The transcript accession numbers are taken from the Ensembl database, and 2× tiling probes labeled at the 5' end are designed from the transcript sequences to obtain the capture probes.
Further, the 5' -end of the fusion gene capture probe is labeled with a linker for capture, for example, a linker for immobilization on a substrate, such as biotin or streptavidin.
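The probe design rules above (120-bp probes, 2× tiling, at most 50 BLAST hits) can be sketched in Python as follows. This is only an illustration under stated assumptions: 2× tiling is interpreted here as a step of half the probe length, the final probe is shifted to cover the 3' end of the transcript, and the BLAST hit count is taken as an input rather than computed. The function names are hypothetical.

```python
from typing import List


def tile_probes(transcript: str, probe_len: int = 120, coverage: int = 2) -> List[str]:
    """Return overlapping probes so that interior bases are covered `coverage` times."""
    if len(transcript) < probe_len:
        return []
    step = probe_len // coverage  # 60-nt step for 2x tiling of 120-nt probes
    last_start = len(transcript) - probe_len
    starts = list(range(0, last_start + 1, step))
    # Assumption: add one extra probe so the 3' end is not left uncovered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [transcript[s:s + probe_len] for s in starts]


def probe_is_qualified(blast_hit_count: int, max_hits: int = 50) -> bool:
    """A probe qualifies when its BLAST hit count is not more than `max_hits`."""
    return blast_hit_count <= max_hits
```

For a 300-nt transcript this yields probes starting at positions 0, 60, 120 and 180, so every base except those in the first and last 60 nt is covered twice.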
In one specific example, amplification of digested double-stranded DNA and amplification of captured target fusion cDNA is performed using primers paired with adaptor P5 and P7 sequences.
As shown in FIG. 1, the present invention also provides a method for detecting gene fusion variation, which comprises the following steps:
step S110: obtaining sequencing data of a gene fusion variation library, wherein the gene fusion variation library is an amplification library of a target fusion gene obtained by hybridizing and capturing a transcription sequence of a sample to be detected through a fusion gene capture probe, the target fusion gene is formed by fusing at least two different genes, and the fusion gene capture probe contains a sequence which can be complementarily paired with a sequence of one gene of the target fusion gene;
step S120: comparing the sequencing data with human transcriptome and genome data, and screening reads capable of being matched with at least two genes simultaneously;
step S130: analyzing whether reads which can be matched with at least two genes simultaneously meet the preset threshold requirement, and if so, indicating that a plurality of genes contained in the reads are subjected to gene fusion.
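The screening idea in step S120 can be illustrated with a short Python sketch. It assumes the aligner output has already been reduced to `(read_id, gene)` records, one per aligned segment; that input format and the function name `candidate_fusions` are assumptions made for illustration, not the format of any particular aligner.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def candidate_fusions(
    alignments: Iterable[Tuple[str, str]]
) -> Dict[Tuple[str, str], Set[str]]:
    """Map each gene pair to the reads whose segments align to both genes."""
    genes_by_read: Dict[str, Set[str]] = defaultdict(set)
    for read_id, gene in alignments:
        genes_by_read[read_id].add(gene)

    support: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for read_id, genes in genes_by_read.items():
        if len(genes) >= 2:  # the read matches at least two genes simultaneously
            for pair in combinations(sorted(genes), 2):
                support[pair].add(read_id)
    return dict(support)
```

The number of supporting reads per gene pair, `len(support[pair])`, is the quantity that would then be compared against the preset threshold in step S130.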
In one specific example, the gene fusion variation library can be subjected to high-throughput sequencing using, but not limited to, a NovaSeq 6000 high-throughput sequencer, and the sequencing depth can be, but is not limited to, 5000×.
In one specific example, before the step of comparing the sequencing data with the human transcriptome and genome data and screening reads that can be matched to at least two genes simultaneously, the method further comprises:
performing quality evaluation on the sequencing data, and removing low-quality reads to obtain clean sequencing data.
Specifically, the raw data can be converted using, but not limited to, the bcl2fastq software to obtain raw fastq files; quality evaluation can be performed on the raw fastq data using the FastQC software; and low-quality reads can be removed using, but not limited to, the Trimmomatic software to obtain the clean sequencing data.
Further, in one particular example, removing the low-quality reads includes:
removing reads containing the adapter sequence;
removing reads in which bases with a quality value lower than 15 account for ≥ 50% of the read;
removing reads with an N content greater than 1%.
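The three removal criteria above can be expressed as a small Python predicate. This is a simplified sketch rather than the behavior of any specific trimming tool: adapter detection is reduced to an exact substring match, quality values are assumed to be already decoded to integer Phred scores, and the function and parameter names are hypothetical.

```python
from typing import Iterable, Sequence


def is_low_quality(
    seq: str,
    quals: Sequence[int],
    adapters: Iterable[str],
    min_quality: int = 15,
    max_low_fraction: float = 0.5,
    max_n_fraction: float = 0.01,
) -> bool:
    """Return True if the read should be removed under the three criteria."""
    if not seq or len(seq) != len(quals):
        return True  # assumed handling of empty or malformed records
    seq = seq.upper()
    # Criterion 1: the read contains an adapter sequence.
    if any(adapter.upper() in seq for adapter in adapters):
        return True
    # Criterion 2: bases with quality < 15 make up >= 50% of the read.
    low_quality_bases = sum(1 for q in quals if q < min_quality)
    if low_quality_bases / len(quals) >= max_low_fraction:
        return True
    # Criterion 3: N content is greater than 1%.
    return seq.count("N") / len(seq) > max_n_fraction
```

Reads for which the predicate returns `False` make up the clean sequencing data passed on to alignment.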
In a specific example, the gene fusion mutation detection method further comprises, after comparing the sequencing data with the human transcriptome and genome data, the step of removing false positive events from the clean sequencing data according to preset control criteria;
specifically, the screened gene fusion variation events are annotated and false positives are filtered out, and gene fusion variation events meeting any of the following criteria are removed:
the different genes of the fusion gene are paralogs of each other;
the different genes of the fusion gene are pseudogenes;
the gene fusion variation has been detected in normal healthy persons (e.g., Body Map 2.0 is transcriptome data from normal human tissues, so a gene fusion variation detected by analyzing that data is judged to be a false positive).
Specifically, all reads can be aligned to the human transcriptome and genome using, but not limited to, software such as Bowtie, STAR and SPOTLIGHT, and reads that match the transcripts of two genes simultaneously can be screened. False positive events are then eliminated by a series of criteria such as paralogs, pseudogenes, Body Map 2.0 and gene distance. If the reads matched to two genes at the same time exceed the preset threshold requirement, the two genes are determined to have undergone gene fusion.
More specifically, the preset threshold requirement refers to: if the fusion gene variation has clinical significance, the number of unique spanning reads matched to both genes is more than 3 (spanning reads are reads that cover the junction of the gene fusion); if the fusion gene variation is of unknown clinical significance, the number of unique spanning reads matched to both genes simultaneously is more than 10.
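The false-positive criteria and the read-count thresholds can be combined into one decision function, sketched below in Python. Several points are assumptions made for illustration: an event is rejected if either partner is a pseudogene, the paralog and normal-tissue lists are supplied as precomputed sets of gene pairs, and the gene distance criterion is omitted because the text gives no cutoff for it. All names are hypothetical.

```python
from typing import FrozenSet, Set


def is_false_positive(
    gene_a: str,
    gene_b: str,
    paralog_pairs: Set[FrozenSet[str]],
    pseudogenes: Set[str],
    normal_tissue_fusions: Set[FrozenSet[str]],
) -> bool:
    """Apply the paralog, pseudogene and normal-tissue criteria."""
    pair = frozenset((gene_a, gene_b))
    if pair in paralog_pairs:
        return True
    if gene_a in pseudogenes or gene_b in pseudogenes:
        return True
    return pair in normal_tissue_fusions  # e.g. seen in Body Map 2.0 data


def call_fusion(
    gene_a: str,
    gene_b: str,
    unique_spanning_reads: int,
    clinically_significant: bool,
    paralog_pairs: Set[FrozenSet[str]],
    pseudogenes: Set[str],
    normal_tissue_fusions: Set[FrozenSet[str]],
) -> bool:
    """Report a fusion only if it survives filtering and exceeds its threshold."""
    if is_false_positive(gene_a, gene_b, paralog_pairs, pseudogenes, normal_tissue_fusions):
        return False
    # More than 3 reads for clinically significant fusions, more than 10 otherwise.
    threshold = 3 if clinically_significant else 10
    return unique_spanning_reads > threshold
```

Using a stricter threshold for fusions of unknown clinical significance reflects the text's requirement for more supporting evidence before reporting an uncharacterized event.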
Further, the gene fusion mutation detection method provided by the invention also comprises the step of calculating the mutation proportion of the fusion gene according to the following formula:
Figure RE-GDA0002461271430000121
wherein,
Figure RE-GDA0002461271430000122
the fusion supporting read pairs refers to the read pairs supporting the gene fusion;
the # mappable reads refers to the number of reads aligned to the genome;
the weighted-average of insert-size-read length refers to the weighted average length of the cDNA fragments inserted into the library;
the refgeneFPKM is the normalized expression value of the reference gene;
the FPKM stands for Fragments Per Kilobase of exon model per Million mapped fragments, i.e., the number of fragments aligned per 1 kb of exon per 1 million (10^6) aligned fragments.
The quantitative model for gene transcripts is calculated with the StringTie software and is mainly intended for expression quantification from paired-end sequencing. The difference between FPKM and RPKM is that one counts fragments and the other counts reads. For single-end sequencing data, FPKM is equivalent to RPKM, since Cufflinks counts each read as a fragment (RPKM = total exon reads / (mapped reads (millions) × exon length (kb))). For paired-end sequencing, if both reads of a pair are aligned, the pair counts as one fragment; if only one read of the pair is aligned and the other is not, the aligned read also counts as one fragment.
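The fragment-counting rule and the standard FPKM definition can be sketched in Python as below. This illustrates only the general FPKM calculation, not the variation ratio formula of the invention, which appears in the figures and is not reproduced here. The function names and the input representation (one pair of booleans per read pair) are assumptions.

```python
from typing import Iterable, Tuple


def count_fragments(pairs: Iterable[Tuple[bool, bool]]) -> int:
    """Count fragments from paired-end data.

    Each item is (mate1_aligned, mate2_aligned). A pair counts as one fragment
    when both mates align, and also when only one of the two mates aligns.
    """
    return sum(1 for mate1, mate2 in pairs if mate1 or mate2)


def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("lengths and totals must be positive")
    # 1e9 = 1e3 (bp per kb) * 1e6 (fragments per million)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)
```

For example, 100 fragments on a gene with 2,000 bp of exon in a library of 1,000,000 mapped fragments gives `fpkm(100, 2000, 1_000_000) == 50.0`.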
Based on the same idea as the above detection method, as shown in fig. 2, the present invention also provides a gene fusion mutation detection apparatus 200, comprising:
a sequencing data acquisition module 210, configured to acquire sequencing data of a gene fusion variant library, where the gene fusion variant library is an amplification library of a target fusion gene obtained by capturing a transcription sequence of a sample to be detected through hybridization with a fusion gene capture probe, the target fusion gene is formed by fusing at least two different genes, and the fusion gene capture probe contains a sequence that can be complementarily paired with a sequence of one of the genes of the target fusion gene;
a comparison screening module 220 for comparing the sequencing data with the human transcriptome and genome data, screening reads that can be matched to at least two genes simultaneously; and
a fusion analysis module 230, configured to analyze whether reads that can be simultaneously matched to at least two genes meet a preset threshold requirement, and if so, indicate that gene fusion has occurred among the multiple genes included in the reads.
Optionally, the gene fusion mutation detection apparatus 200 further includes:
a variation ratio calculating module 240, configured to calculate a variation ratio of the fusion gene according to the following formula:
Figure RE-GDA0002461271430000141
wherein,
Figure RE-GDA0002461271430000142
the fusion supporting read pairs refers to the number of read pairs supporting the gene fusion;
the # mappable reads refers to the number of reads aligned to the genome;
the weighted-average of insert-size-read length refers to the weighted average length of cDNA fragments inserted into the library;
the refgeneFPKM is the normalized expression value of the reference gene;
the FPKM is defined as fragments (reads) per kilobase of exon model per million mapped reads, i.e., the number of reads aligned to every 1 kilobase of exon per 1 million (10^6) aligned reads.
Based on the above embodiments, the present invention further provides a computer device for genetic fusion mutation detection, which has a processor and a memory, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the genetic fusion mutation detection method according to any of the above embodiments.
It will be understood by those skilled in the art that all or part of the processes of the above methods may be implemented by a computer program, which may be stored in a non-volatile computer-readable storage medium, and in the embodiments of the present invention, the program may be stored in the storage medium of a computer system and executed by at least one processor in the computer system to implement the processes of the embodiments including the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Accordingly, the present invention also provides a computer storage medium for genetic fusion mutation detection, wherein a computer program is stored thereon, and when being executed, the computer program implements the steps of the genetic fusion mutation detection method according to any of the above embodiments.
A single driver gene may fuse with multiple other genes (partner genes); after transcription of the fused gene, a junction (i.e., a breakpoint) is formed between an exon of the core gene and an exon of the partner gene. The gene fusion variation library construction method, the gene fusion variation detection method and the device of the present invention are based on multi-gene RNA targeted sequencing with DNA probe hybridization capture: the target fusion genes are captured by hybridization with the fusion gene capture probes, and a gene fusion variation library is constructed.
The gene fusion variation library construction method, the gene fusion variation detection method and the device can be used to detect known or newly discovered gene rearrangements, gene deletions, gene duplications and other variation information related to hot-spot fusion genes of various hematological tumors. Compared with the conventional fluorescence quantitative method, the technical concept of the invention is more comprehensive, efficient and economical.
Furthermore, the invention also designs a fusion gene quantitative analysis method, which can obtain the variation proportion of the fusion gene through calculation so as to obtain the accurate expression quantity value of the fusion gene.
The construction method and detection method of the gene fusion variant library of the present invention will be described in further detail below with reference to the specific case of library construction and detection methods.
1) DNA probe design based on mRNA sequence
The transcript sequences of the fusion genes are captured by DNA probe hybridization and subjected to high-throughput sequencing, and the hot-spot or novel fusion forms in which the fusion genes participate can be obtained by bioinformatic analysis.
For hematological tumors (leukemias and lymphomas), 54 core genes, i.e., ABL1, CREBBP, CRLF 1, MECOM, TP 1, TSLP, LMO 1, PRDM1, MYC, ETV 1, RARA, NUP214, BCL 1, MYB, IRF 1, CBFB, CEBPB, ZNF384, RUNX1, FGFR1, MALT1, ERG, NPM1, PAX 1, JAK 1, PICALM, FLT 1, GLIS 1, frpdgb, PML, TLX1, itbpa, FGFR1, IL 21, TAL1, NTRK 1, NUP 1, EPOR, RBM1, CSF 11, KMT2 72, BCL 1, BCR, bnn, TLX1, ccmbl 1, tcmbl 1, pdslnd 1, seq id 1, mrna 1, pdgf 1, mrna 1.
2) Total RNA extraction from samples
Total RNA was extracted from peripheral blood or bone marrow samples of leukemia or lymphoma patients using the QIAGEN QIAsymphony RNA Kit (Cat # 931636). The specific operation steps are detailed in the manufacturer's instructions.
(1) The nucleic acid concentration and the A260/A280 ratio (expected to be between 1.9 and 2.1) were measured using a NanoDrop spectrophotometer; (2) the nucleic acid concentration was determined using a Qubit™ RNA HS Assay Kit (Cat. # Q32855).
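The purity check and the input amount used in the next step can be expressed as a small calculation. The sketch below assumes the acceptance window of 1.9 to 2.1 stated above and the 500 ng input stated in step 3); the function names and example readings are hypothetical.

```python
def rna_purity_ok(a260: float, a280: float, low: float = 1.9, high: float = 2.1) -> bool:
    """Return True when the A260/A280 ratio lies within the expected window."""
    if a280 <= 0:
        raise ValueError("a280 must be positive")
    return low <= a260 / a280 <= high


def input_volume_ul(concentration_ng_per_ul: float, input_ng: float = 500.0) -> float:
    """Volume of total RNA (in microliters) that contains `input_ng` nanograms."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return input_ng / concentration_ng_per_ul
```

For example, a sample measured at 100 ng/µL would need 5 µL to provide the 500 ng input.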
3) Removal of ribosomal RNA (rRNA)
500 ng of the total RNA extracted in step 2) was hybridized with single-stranded DNA probes designed against rRNA, and the rRNA was digested with RNase H; the specific operation steps are detailed in the instructions of the NEBNext rRNA Depletion Kit. The rRNA-depleted RNA samples were purified using
Figure RE-GDA0002461271430000161
XP beads.
4) Reverse transcription to synthesize cDNA
The RNA was fragmented by incubation at 94 ℃ for 6 minutes in a PCR instrument. The fragmented RNA was reverse transcribed into single-stranded cDNA using reverse transcriptase.
5) Synthesis of second Strand of cDNA
The single-stranded cDNA was converted into double-stranded cDNA using DNA Polymerase I, Large (Klenow) Fragment. dUTP was used here instead of dTTP, so that dUTP was incorporated into the second cDNA strand. The double-stranded cDNA was purified using AMPure XP Beads.
6) End repair
Double stranded cDNA was treated with NEBNext Ultra II End Prep Enzyme Mix and one dATP was added at the 3' End.
7) Adapter ligation
Ligase, index adapters containing a 12-nt unique sequence, and the end-repaired cDNA were mixed and incubated at 16 ℃ for 60 minutes in a PCR instrument to obtain an adapter-ligated cDNA library.
Adapter format: P5-Read1 primer-DNA insert-Index Read primer-index-P7.
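As a consistency check on the adapter layout, the sequences given in the sequence listing of this document (SEQ ID NO: 1 to 4) can be compared programmatically. The sketch below shows that the 5' portion of the adapter is SEQ ID NO: 1 followed by SEQ ID NO: 3, and that SEQ ID NO: 4 is the base-by-base complement of the P7 primer (SEQ ID NO: 2). Reading SEQ ID NO: 3 as the Read1 primer region and SEQ ID NO: 4 as the complementary strand of P7 is an interpretation made here, not a statement from the source.

```python
P5 = "AATGATACGGCGACCACCGA"                       # SEQ ID NO: 1
P7 = "CAAGCAGAAGACGGCATACGAGAT"                   # SEQ ID NO: 2
SEQ3 = "GATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"   # SEQ ID NO: 3
SEQ4 = "GTTCGTCTTCTGCCGTATGCTCTA"                 # SEQ ID NO: 4

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.upper().translate(_COMPLEMENT)


def five_prime_adapter() -> str:
    """5' side of the adapter: P5 followed by SEQ ID NO: 3."""
    return P5 + SEQ3


# SEQ ID NO: 4 pairs base-for-base with the P7 primer.
p7_pairs_with_seq4 = complement(P7) == SEQ4
```

The concatenated 58-nt string equals the sequence written before the DNA fragment in claim 5.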
8) Generation of gaps in the second cDNA strand by enzymatic digestion
Uracil DNA Glycosylase (UDG) and Endonuclease VIII mix were added to the above system; the two enzymes act together to digest the dUTP sites in the cDNA library fragments and produce gaps.
9) Library amplification
The above cDNA library was amplified in a PCR instrument using KAPA HiFi HotStart ReadyMix and primers (P5: 5'-AATGATACGGCGACCACCGA-3', SEQ ID NO: 1; P7: 5'-CAAGCAGAAGACGGCATACGAGAT-3', SEQ ID NO: 2) that pair with the P5 and P7 adapter sequences. The cDNA pre-library was purified using AMPure XP Beads.
10) Probe capture hybridization
100 ng of the prepared cDNA library was mixed with
Figure RE-GDA0002461271430000171
Universal Blockers-TS Mix and Human Cot-1 DNA, and dried to a powder using a vacuum filtration system (60 ℃). Hybridization buffer and the fusion gene probe library were then added and mixed, and the mixture was incubated in a PCR instrument at 95 ℃ for 30 seconds and then hybridized at 65 ℃ for 16-18 hours.
The above system was mixed with streptavidin magnetic beads (
Figure RE-GDA0002461271430000181
M-270 Streptavidin beads) and incubated in a PCR instrument at 65 ℃ for 45 min, with remixing every 15 min, so as to capture all transcript fragments containing the fusion genes.
The cDNA library captured by the above hybridization was amplified in a PCR instrument using KAPA HiFi HotStart ReadyMix and primers (P5: 5'-AATGATACGGCGACCACCGA-3', SEQ ID NO: 1; P7: 5'-CAAGCAGAAGACGGCATACGAGAT-3', SEQ ID NO: 2) that pair with the P5 and P7 adapter sequences. The target cDNA library was purified using AMPure XP Beads to obtain the library to be sequenced.
11) Illumina platform sequencing
The library to be sequenced was sequenced on a NovaSeq 6000 high-throughput sequencer to an average sequencing depth of 5000×. The sequencing procedure is detailed in the manufacturer's instructions.
12) Sequencing data analysis
A. Sequencing data preprocessing
The raw data were converted with bcl2fastq software to obtain raw fastq files, the quality of the raw fastq data was evaluated with FastQC software, and low-quality reads were removed with Trimmomatic software to obtain clean fastq files.
B. Identification of fusion genes
All reads were aligned to the human transcriptome and genome using BOWTIE, STAR and SPOTLIGHT software, and reads that matched the transcripts of two genes simultaneously were screened. False positive events were then eliminated using a series of criteria such as paralogs, pseudogenes, Body Map 2.0 and gene distance. If the number of reads matched to two genes simultaneously exceeds the set threshold, the two genes are determined to have undergone gene fusion.
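The screening step can be illustrated with a small counting routine. This is a simplified sketch, not the logic of the alignment software named above: it assumes the alignments have already been reduced to a mapping from read identifier to the gene names the read matched, and `GENE_X` is a placeholder partner name.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def count_two_gene_reads(read_to_genes: Dict[str, Iterable[str]]) -> Counter:
    """Count reads that match exactly two different genes, per gene pair."""
    counts: Counter = Counter()
    for genes in read_to_genes.values():
        distinct = sorted(set(genes))
        if len(distinct) == 2:
            # Sorting makes (A, B) and (B, A) fall into the same bucket.
            counts[(distinct[0], distinct[1])] += 1
    return counts


def candidate_fusions(counts: Counter, threshold: int) -> List[Tuple[str, str]]:
    """Gene pairs whose supporting read count exceeds the set threshold."""
    return sorted(pair for pair, n in counts.items() if n > threshold)


# Hypothetical input: read id -> genes matched by that read.
hits = {
    "r1": ["BCR", "ABL1"],
    "r2": ["ABL1", "BCR"],
    "r3": ["BCR"],
    "r4": ["KMT2A", "GENE_X"],
}
pair_counts = count_two_gene_reads(hits)
```

Reads that match a single gene, such as `r3`, do not contribute to any pair, and the false positive criteria would be applied to the remaining pairs before a fusion is reported.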
C. Example of results of analysis of fused Gene assay data
Using the present invention, we tested 3 leukemia samples and obtained the following results:
All 3 samples carried fusion genes involving MLL (KMT2A). Using only probes targeting the MLL gene transcript sequence, the breakpoint sequences of the MLL gene and its partner genes were captured simultaneously, so that the specific fusion forms were identified by alignment analysis, and fusionFPKM was calculated as an index of their expression levels.
The results are shown in Table 1 below.
TABLE 1
Figure RE-GDA0002461271430000191
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several embodiments of the present invention, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Sequence listing
<110> Guangzhou Kingmed Diagnostics Group Co., Ltd.
<120> gene fusion variant library construction method, detection method, apparatus, device and storage medium
<140>2019114192739
<141>2019-12-31
<160>7
<170>SIPOSequenceListing 1.0
<210>1
<211>20
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>1
aatgatacgg cgaccaccga 20
<210>2
<211>24
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>2
caagcagaag acggcatacg agat 24
<210>3
<211>38
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>3
gatctacact ctttccctac acgacgctct tccgatct 38
<210>4
<211>24
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>4
gttcgtcttc tgccgtatgc tcta 24
<210>5
<211>86
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>5
tccccgccca agtatccctg taaaacaaaa accaaaagaa aagtctgaac aacccagtcc 60
tgccagctcc agctccagct ccagct 86
<210>6
<211>86
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>6
tccccgccca agtatccctg taaaacaaaa accaaaagaa aaggaaatga cccattcatg 60
gccgcctcct ttgacagcaa tacata 86
<210>7
<211>86
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>7
aattccagca gatggagtcc acaggatcag agtggacttt aaggattctg tttcactgag 60
gccatctatc cgatttcaag gaagcc 86

Claims (16)

1. A method for constructing a gene fusion variation library is characterized by comprising the following steps:
extracting total RNA of the sample, and removing rRNA in the total RNA;
reverse transcribing the total RNA after rRNA is removed, synthesizing double-stranded cDNA, and synthesizing by using dUTP instead of dTTP when synthesizing a second strand of the double-stranded cDNA;
performing end repair and adding a connecting joint to the synthesized double-stranded cDNA;
digesting dUTP in the double-stranded DNA after the end repair and the addition of the connecting joint by enzyme digestion to generate a gap in the double-stranded cDNA;
amplifying the double-stranded DNA after enzyme digestion to construct a cDNA pre-library;
hybridizing and capturing a target fusion cDNA in the cDNA pre-library using a fusion gene capture probe, the target fusion cDNA being formed by fusing at least two different genes, the fusion gene capture probe comprising a sequence capable of complementary pairing with a sequence of one of the genes of the target fusion cDNA;
and amplifying the captured target fusion cDNA to obtain the gene fusion variation library.
2. The method of claim 1, wherein the fusion gene capture probe is designed according to the following rules:
(1) the fusion gene capture probe is designed aiming at a core gene in target fusion cDNA, wherein the core gene refers to a gene which has a plurality of gene partners and is easy to generate fusion variation, or a key gene in a cell growth or proliferation signal pathway, or a driving gene;
(2) the fusion gene capture probe is designed aiming at the transcript sequence of the core gene;
(3) the fusion gene capture probe is designed against the core gene in the hg19 reference genome, with a coverage density of 2× tiling;
(4) the length of the fusion gene capture probe is 120 bp;
(5) the fusion gene capture probe needs to be compared to a human transcriptome sequence during design, the number of all Blast matches is counted, if the number of the Blast matches is not more than 50, the fusion gene capture probe is qualified, and if the number of the Blast matches is more than 50, the fusion gene capture probe is redesigned in a mode of replacing mismatched bases until the highest matching performance of the target gene sequence is obtained and the number of the Blast matches is not more than 50.
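Rules (3) to (5) of claim 2 can be sketched as a tiling routine plus a specificity check. The sketch assumes that 2× tiling means a step of half the probe length, so that most bases are covered by two probes; that reading, the function names and the test transcript are assumptions. The BLAST hit count is taken as an input, since running BLAST is outside the scope of the example.

```python
from typing import List, Tuple


def tile_probes(transcript: str, probe_len: int = 120, tiling: int = 2) -> List[Tuple[int, str]]:
    """Return (start, probe_sequence) tuples tiled across a transcript."""
    if probe_len <= 0 or tiling <= 0:
        raise ValueError("probe_len and tiling must be positive")
    if len(transcript) < probe_len:
        return []  # transcript too short for a full-length probe
    step = max(1, probe_len // tiling)
    last_start = len(transcript) - probe_len
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the 3' end is covered
    return [(s, transcript[s:s + probe_len]) for s in starts]


def probe_passes_specificity(blast_match_count: int, max_matches: int = 50) -> bool:
    """A probe qualifies when its number of BLAST matches is not more than 50."""
    return blast_match_count <= max_matches
```

With a 300-nt transcript and the default settings, probes start at positions 0, 60, 120 and 180. A probe that fails the specificity check would be redesigned as described in rule (5).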
3. The method of constructing a library of fusion variants according to claim 1 or 2 wherein the 5' end of the fusion gene capture probe is labeled with a linker for capture;
optionally, the linker is biotin or streptavidin.
4. The method of constructing a library of fused variants according to claim 1 or 2, wherein the total RNA in the sample is total RNA in a peripheral blood or bone marrow sample.
5. The method of constructing a library of fusion variants according to claim 1 or 2 wherein the end-repair is the addition of a dATP to the 3' end of the double-stranded cDNA synthesized;
the adapter format introduced by adding the connecting joint is P5-Read1primer-DNAINSERT-IndexReadprimer-index-P7, specifically: 5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-DNA fragment sequence to be detected-GTTCGTCTTCTGCCGTATGCTCTA-index-CACTGACCTCAAGTCTGCACACGAGAAGGCTAG-P, wherein P5 and P7 are adapters, Read1primer and IndexReadprimer are primer sequences, DNAINSERT is the DNA fragment sequence to be detected, index is a unique 12-nt sample label, and P is a phosphate group.
6. The method of constructing a library of fusion variants according to claim 5 wherein the amplification of the digested double-stranded DNA and the captured target fusion cDNA is performed using primers that pair to the sequences of adaptors P5 and P7.
7. A gene fusion mutation detection method is characterized by comprising the following steps:
obtaining sequencing data of a gene fusion variation library, wherein the gene fusion variation library is an amplification library of a target fusion gene obtained by hybridizing and capturing a transcription sequence of a sample to be detected through a fusion gene capture probe, the target fusion gene is formed by fusing at least two different genes, and the fusion gene capture probe contains a sequence which can be complementarily paired with a sequence of one gene of the target fusion gene;
comparing the sequencing data with human transcriptome and genome data, and screening reads capable of being matched with at least two genes simultaneously;
and analyzing whether the reads that can be matched to at least two genes simultaneously meet a preset threshold requirement, and if so, determining that the genes to which the reads are matched have undergone gene fusion.
8. The method of detecting genetic fusion mutations according to claim 7, wherein the step of comparing the sequencing data to human transcriptome and genome data and screening for reads that match to at least two genes simultaneously further comprises:
and performing quality evaluation on the sequencing data, and removing low-quality reads to obtain clean sequencing data.
9. The method of detecting genetic fusion mutations according to claim 8, wherein said removing low-quality reads comprises:
removing reads containing the adapter sequence;
removing reads in which bases with a quality value lower than 15 account for ≥ 50% of the read;
removing reads with an N content greater than 1%.
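The three removal criteria of claim 9 can be sketched as a single predicate. This is an illustrative filter, not the Trimmomatic configuration used in the description; it reads the second criterion as "bases with quality below 15 make up at least 50% of the read", which is an interpretation of the claim wording. Quality values are assumed to be already decoded to integer Phred scores.

```python
from typing import Iterable, Sequence


def passes_quality_filter(
    seq: str,
    quals: Sequence[int],
    adapters: Iterable[str],
    min_quality: int = 15,
    max_low_quality_fraction: float = 0.5,
    max_n_fraction: float = 0.01,
) -> bool:
    """Return True when a read survives all three removal criteria."""
    if not seq or len(seq) != len(quals):
        raise ValueError("seq and quals must be non-empty and of equal length")
    bases = seq.upper()
    # 1) Remove reads containing an adapter sequence.
    if any(adapter.upper() in bases for adapter in adapters):
        return False
    # 2) Remove reads whose low-quality bases reach 50% of the read.
    low_quality = sum(1 for q in quals if q < min_quality)
    if low_quality / len(quals) >= max_low_quality_fraction:
        return False
    # 3) Remove reads whose N content is greater than 1%.
    if bases.count("N") / len(bases) > max_n_fraction:
        return False
    return True
```

A read is kept only when the function returns `True`; applying it to every read of a fastq file yields the clean sequencing data referred to in claim 8.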
10. The method of detecting genetic fusion mutations according to claim 9 further comprising the step of rejecting false positive events in the clean sequencing data according to a predetermined control criterion after comparing the sequencing data to human transcriptome and genomic data;
specifically, the screened gene fusion variation events are annotated to separate true events from false ones, and the gene fusion variation events meeting the following criteria are removed:
different genes of the fusion gene are paralogous with each other;
different genes of the fusion gene are pseudogenes;
this gene fusion variation has been detected in normal healthy persons.
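The exclusion criteria of claim 10 can be sketched as a filter over annotated events. The record type and field names are hypothetical; the sketch assumes each candidate event has already been annotated with the three flags, and it treats an event as a false positive when any one criterion applies.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FusionEvent:
    gene_a: str
    gene_b: str
    paralogous: bool           # the partner genes are paralogs of each other
    involves_pseudogene: bool  # a partner gene is annotated as a pseudogene
    seen_in_healthy: bool      # the fusion has been detected in healthy persons


def is_false_positive(event: FusionEvent) -> bool:
    """True when the event meets at least one exclusion criterion."""
    return event.paralogous or event.involves_pseudogene or event.seen_in_healthy


def remove_false_positives(events: Iterable[FusionEvent]) -> List[FusionEvent]:
    """Keep only events that meet none of the exclusion criteria."""
    return [event for event in events if not is_false_positive(event)]
```

Only the events returned by `remove_false_positives` would go on to the threshold check.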
11. The method of detecting genetic fusion mutation according to claim 10 wherein the predetermined threshold requirement is: if the fusion gene variation has clinical significance, the number of unique spanning reads matched with the two genes is more than 3; if the fusion gene variation is of unknown clinical significance, more than 10 unique spanning reads can be matched to the two genes simultaneously.
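The two-tier threshold of claim 11 reduces to a short decision function. The function name is an assumption; the cut-offs (more than 3 and more than 10 unique spanning reads) are taken from the claim.

```python
def meets_threshold(unique_spanning_reads: int, clinically_significant: bool) -> bool:
    """Apply the read-support threshold for a candidate fusion.

    Fusions of known clinical significance need more than 3 unique spanning
    reads; fusions of unknown clinical significance need more than 10.
    """
    required = 3 if clinically_significant else 10
    return unique_spanning_reads > required
```

The stricter requirement for fusions of unknown significance means that, for example, 8 unique spanning reads would support a known clinically significant fusion but not a novel one.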
12. The method of detecting a genetic fusion mutation according to any one of claims 7 to 11, further comprising:
calculating the variation ratio of the fusion gene according to the following formula:
Figure FDA0002351935760000041
wherein,
Figure FDA0002351935760000042
the fusion supporting read pairs refers to the number of read pairs supporting the gene fusion;
the # mappable reads refers to the number of reads aligned to the genome;
the weighted-average of insert-size-read length refers to the weighted average length of cDNA fragments inserted into the library;
the refgeneFPKM is the normalized expression value of the reference gene;
the FPKM is defined as fragments (reads) per kilobase of exon model per million mapped reads, i.e., the number of reads aligned to every 1 kilobase of exon per 1 million aligned reads.
13. A gene fusion mutation detection device, comprising:
a sequencing data acquisition module, configured to acquire sequencing data of a gene fusion variation library, wherein the gene fusion variation library is an amplification library of a target fusion gene obtained by hybridizing and capturing a transcription sequence of a sample to be detected through a fusion gene capture probe, the target fusion gene is formed by fusing at least two different genes, and the fusion gene capture probe contains a sequence which can be complementarily paired with a sequence of one gene of the target fusion gene;
the comparison screening module is used for comparing the sequencing data with human transcriptome and genome data and screening reads which can be matched with at least two genes simultaneously; and
a fusion analysis module, configured to analyze whether the reads that can be matched to at least two genes simultaneously meet the preset threshold requirement, and if so, to indicate that the genes to which the reads are matched have undergone gene fusion.
14. The apparatus for detecting gene fusion mutation according to claim 13, further comprising:
and the variation ratio calculation module is used for calculating the variation ratio of the fusion gene according to the following formula:
Figure FDA0002351935760000051
wherein,
Figure FDA0002351935760000052
the fusion supporting read pairs refers to the number of read pairs supporting the gene fusion;
the # mappable reads refers to the number of reads aligned to the genome;
the weighted-average of insert-size-read length refers to the weighted average length of cDNA fragments inserted into the library;
the refgeneFPKM is the normalized expression value of the reference gene;
the FPKM is defined as fragments (reads) per kilobase of exon model per million mapped reads, i.e., the number of reads aligned to every 1 kilobase of exon per 1 million aligned reads.
15. A computer device having a processor and a memory, the memory storing a computer program, the processor implementing the steps of the gene fusion mutation detection method according to any one of claims 7 to 12 when executing the computer program.
16. A computer storage medium having a computer program stored thereon, wherein the computer program when executed implements the steps of the method of detecting a genetic fusion mutation according to any one of claims 7 to 12.
CN201911419273.9A 2019-12-31 2019-12-31 Gene fusion variation library construction method, detection method, device, equipment and storage medium Pending CN111321202A (en)

Publications (1)

Publication Number Publication Date
CN111321202A true CN111321202A (en) 2020-06-23

Family

ID=71165123

Country Status (1)

Country Link
CN (1) CN111321202A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104894111A (en) * 2015-05-18 2015-09-09 李卫东 DNA targeted capture array for leukemia chromosome aberration high-throughput sequencing
CN106929504A (en) * 2015-12-30 2017-07-07 安诺优达基因科技(北京)有限公司 Detect the kit of acute promyelocytic leukemia correlation fusion gene
WO2019144582A1 (en) * 2018-01-26 2019-08-01 厦门艾德生物医药科技股份有限公司 Probe and method for high-throughput sequencing targeted capture target region used for detecting gene mutations as well as known and unknown gene fusion types
CN108486235A (en) * 2018-03-07 2018-09-04 北京圣谷智汇医学检验所有限公司 A kind of method and system of high-efficiency and economic detection fusion gene

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fang Xiangdong et al. (eds.), Tianjin: Tianjin Science and Technology Translation Publishing Co., Ltd. *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111979307A (en) * 2020-08-31 2020-11-24 伯科生物科技有限公司 Targeted sequencing method for detecting gene fusion
CN111979307B (en) * 2020-08-31 2022-07-08 伯科生物科技有限公司 Targeted sequencing method for detecting gene fusion
WO2022089033A1 (en) * 2020-10-29 2022-05-05 无锡臻和生物科技有限公司 Method and device for detecting genetic mutation and expression
CN112397144B (en) * 2020-10-29 2021-06-15 无锡臻和生物科技股份有限公司 Method and device for detecting gene mutation and expression quantity
CN112397144A (en) * 2020-10-29 2021-02-23 无锡臻和生物科技有限公司 Method and device for detecting gene mutation and expression quantity
CN112662771A (en) * 2020-12-30 2021-04-16 苏州大学附属第一医院 Targeting capture probe of tumor fusion gene and application thereof
CN112662771B (en) * 2020-12-30 2024-04-02 苏州大学附属第一医院 Targeting capture probe of tumor fusion gene and application thereof
EP4400599A3 (en) * 2021-05-20 2024-08-28 Sophia Genetics S.A. Capture probes and uses thereof
CN114300051A (en) * 2021-12-22 2022-04-08 北京吉因加医学检验实验室有限公司 Method and device for calculating fusion gene frequency
CN114395619A (en) * 2021-12-29 2022-04-26 福建和瑞基因科技有限公司 High-throughput sequencing method and internal reference quality control product
CN114395619B (en) * 2021-12-29 2024-04-30 福建和瑞基因科技有限公司 High-throughput sequencing method and internal reference quality control product
CN115083516A (en) * 2022-07-13 2022-09-20 北京先声医学检验实验室有限公司 Panel design and evaluation method for detecting gene fusion based on targeted RNA sequencing technology
CN115662520A (en) * 2022-10-27 2023-01-31 黑龙江金域医学检验实验室有限公司 Detection method of BCR/ABL1 fusion gene and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination