CN115083516B - Panel design and evaluation method for detecting gene fusion based on targeted RNA sequencing technology - Google Patents

Publication number
CN115083516B
Authority
CN
China
Prior art keywords
exon
probe
depth
gene
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210818316.6A
Other languages
Chinese (zh)
Other versions
CN115083516A (en)
Inventor
巩小芬
邓望龙
杨雪雨
张超
李诗濛
任用
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Xiansheng Medical Diagnosis Co ltd
Nanjing Xiansheng Medical Laboratory Co ltd
Beijing Xiansheng Medical Examination Laboratory Co ltd
Original Assignee
Jiangsu Xiansheng Medical Diagnosis Co ltd
Nanjing Xiansheng Medical Laboratory Co ltd
Beijing Xiansheng Medical Examination Laboratory Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Xiansheng Medical Diagnosis Co ltd, Nanjing Xiansheng Medical Laboratory Co ltd, Beijing Xiansheng Medical Examination Laboratory Co ltd
Priority to CN202210818316.6A
Publication of CN115083516A
Application granted
Publication of CN115083516B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30 Drug targeting using structural data; Docking or binding prediction
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Chemical & Material Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Genetics & Genomics (AREA)
  • Theoretical Computer Science (AREA)
  • Molecular Biology (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The application belongs to the technical field of bioinformatics analysis, and particularly relates to a Panel design and evaluation method for detecting gene fusion based on targeted RNA sequencing technology.

Description

Panel design and evaluation method for detecting gene fusion based on targeted RNA sequencing technology
Technical Field
The application belongs to the technical field of bioinformatics analysis, and particularly relates to a method for designing and evaluating a Panel for gene fusion detection based on targeted RNA sequencing technology.
Background
Gene fusion refers to the joining of partial or complete sequences of two different genes into a new gene through some mechanism, such as genomic variation (FIG. 1). Fusion genes are important for clinical diagnosis, drug treatment and prognosis. At the RNA level, a fusion gene appears as exons of the two partner genes joined head to tail, and the fusion site is relatively fixed. RNA-level detection is therefore not limited by intronic probe design, can find more novel fusion types, and has higher sensitivity.
RNA is transcribed from DNA. Because of complex transcript processing such as alternative splicing (FIG. 2), one gene may have multiple transcript sequences; different transcripts differ in exon composition and may yield protein products with different functions. To enrich multiple transcripts of a target gene, the traditional approach merges the exon regions of all transcripts of the gene and designs probes against the genome sequence. This approach has two problems. First, as shown in FIG. 2, when different transcripts have similar exon composition, many regions can be covered by the same set of probes, so probe coverage is redundant and cost increases. Second, as shown in FIG. 3, if an exon region is short, no probe with sufficient binding length can be designed for it, and it can only be covered indirectly by probes on adjacent exons, which may cause fusions to be missed.
Evaluating RNA probe capture performance is also difficult. The RNA capture library of a sample is strongly affected by gene expression: highly expressed genes have high sequencing depth and lowly expressed genes have low depth, so a reasonable evaluation of probe capture performance cannot be obtained. A gDNA capture library removes the influence of differing expression levels on probe metrics, but the regions targeted by RNA probes are discontinuous on the genome, so shorter exon regions cannot be captured effectively from a gDNA library, and whether the probes capture every targeted region uniformly cannot be evaluated.
In summary, the design of a targeted RNA fusion detection Panel is limited by poor probe binding on short exons and by redundant probe coverage where different transcripts share the same exons. Meanwhile, evaluation of probe capture performance is affected by differing gene expression levels, and gDNA cannot effectively evaluate coverage of shorter exons.
In view of this, the present application is specifically made.
Disclosure of Invention
To address these technical problems, the application develops, on an NGS platform, a design method for a targeted RNA gene fusion detection Panel, and evaluates probe enrichment by combining RNA and gDNA capture libraries, thereby ensuring that the design and the capture performance evaluation of the targeted RNA fusion gene detection Panel are scientific and reasonable.
The application first provides a method for designing a gene fusion detection Panel based on targeted RNA sequencing technology, comprising the following steps:
1) Determining candidate genes;
2) Determining the longest transcript of each gene according to the gene structure annotation information of the candidate genes, and performing probe coverage on all CDS regions of the longest transcript; if an untranslated region (UTR) is involved in a gene fusion of definite clinical significance, probe coverage is also performed on that UTR;
3) According to the gene structure annotation information of the candidate genes, comparing the CDSs of each non-longest transcript of each gene one by one with the CDSs of the longest transcript; if a CDS differs, it is retained and covered with probes;
4) For candidate genes involved in special types of gene fusion, additionally performing probe coverage on the region near the breakpoint of the special-type fusion to increase detection sensitivity;
5) Designing probes based on the above probe coverage, obtaining the bed file of the capture intervals, and assembling the gene fusion detection Panel.
Further, if a CDS sequence retained in step 3) is not long enough to bind a probe, it is extended into the upstream and downstream CDSs so that the probe has sufficient binding length.
Further, the special types of gene fusion in step 4) include, but are not limited to: MET exon14 skipping, AR-V7, EGFR vIII, EGFR KDD.
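Steps 2) and 3) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: transcripts are given as lists of CDS intervals in BED-style coordinates, "longest" is taken to mean the largest total CDS length, and a CDS "differs" when its exact coordinates are absent from the longest transcript. The function names and the toy gene are hypothetical and are not taken from the application.

```python
from typing import Dict, List, Set, Tuple

# (chrom, start, end), 0-based half-open as in a BED file
Interval = Tuple[str, int, int]


def cds_length(cds: List[Interval]) -> int:
    """Total CDS length of one transcript."""
    return sum(end - start for _, start, end in cds)


def select_capture_regions(
    transcripts: Dict[str, List[Interval]],
) -> Tuple[str, List[Interval]]:
    """Return the longest transcript ID and the CDS intervals to cover for one gene.

    Assumption: the longest transcript is the one with the largest total CDS length.
    """
    longest_id = max(transcripts, key=lambda tid: cds_length(transcripts[tid]))
    kept: List[Interval] = list(transcripts[longest_id])  # step 2): all CDS of the longest transcript
    seen: Set[Interval] = set(kept)
    for tid, cds in transcripts.items():
        if tid == longest_id:
            continue
        for interval in cds:
            # step 3): retain only CDS that differ from what is already kept
            if interval not in seen:
                seen.add(interval)
                kept.append(interval)
    return longest_id, sorted(kept)


# Toy gene with two transcripts that share their first CDS
gene = {
    "tx1": [("chr1", 100, 300), ("chr1", 500, 700), ("chr1", 900, 1000)],
    "tx2": [("chr1", 100, 300), ("chr1", 500, 640)],
}
longest, regions = select_capture_regions(gene)
print(longest, regions)
```

In the toy gene, the shared CDS is kept once, and only the CDS of tx2 whose coordinates differ is added to the regions of the longest transcript.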
The application also relates to a data analysis method for evaluating the capture performance of the probe, which comprises the following steps:
1) Constructing libraries from a sample using the gene fusion detection Panel prepared by the above method to obtain the corresponding gDNA capture library and RNA capture library, and counting the sequencing depth of the probe coverage regions in each, using an uncaptured rRNA-depleted strand-specific library (rRNA-depleted total RNA library) as the reference;
2) Calculating the enrichment factor FC of the sample at each exon, and at the same time calculating the lower quartile LQ of the enrichment factors of all exons. The enrichment factor FC is the ratio, per unit amount of sequencing data, of the sequencing depth of the targeted RNA library at each exon to the sequencing depth of the rRNA-depleted strand-specific library (rRNA-depleted total RNA library) without probe capture, in the same sample:
FC=(target RNA depth/raw data)/(RNA depth/raw data);
3) Calculating the relative sequencing depth RD of the sample at each exon to obtain a DNA sample score gScore and an RNA sample score rScore;
the relative sequencing depth RD is the sequencing depth (Raw Depth) of each bed interval divided by the median (Median Depth) of the sequencing depths of all bed intervals of the sample;
RD=(bed Raw Depth)/(sample bed Median Depth);
determining a composite score, eScore, from gScore and rScore;
eScore=max(gScore,rScore).
Further, in step 2), if the enrichment factor FC of an exon is greater than the lower quartile LQ of all exon enrichment factors, the probe capture performance for that exon is considered good, and the exon is not included in the subsequent evaluation.
Further, in step 3), for intervals with exon length greater than 120bp, the eScore is the larger of gScore and rScore; for intervals with exon length less than 120bp, the eScore is the maximum of gScore, rScore, and the gScore of the two adjacent exons.
Further, in step 3), eScore >= 0.2 indicates that the interval is captured well in the gDNA or the RNA sample.
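The RD, gScore, rScore and eScore definitions above can be sketched in Python as follows. This is a minimal sketch, assuming per-interval depths are supplied as dictionaries keyed by a hypothetical interval ID; the function names and the toy depths are illustrative and not part of the application. The length-dependent rule for short exons is left out here.

```python
from statistics import median
from typing import Dict


def relative_depth(raw_depth: Dict[str, float]) -> Dict[str, float]:
    """RD = (bed Raw Depth) / (sample bed Median Depth)."""
    med = median(raw_depth.values())
    if med == 0:
        raise ValueError("median depth is zero; RD is undefined")
    return {interval: depth / med for interval, depth in raw_depth.items()}


def escore(gdna_depth: Dict[str, float], rna_depth: Dict[str, float]) -> Dict[str, float]:
    """eScore = max(gScore, rScore), with gScore = RD(gDNA) and rScore = RD(RNA)."""
    g_score = relative_depth(gdna_depth)
    r_score = relative_depth(rna_depth)
    return {interval: max(g_score[interval], r_score[interval]) for interval in g_score}


# Toy depths for three intervals (illustrative values only)
gdna = {"e1": 100, "e2": 200, "e3": 10}
rna = {"e1": 40, "e2": 50, "e3": 5}

scores = escore(gdna, rna)
# Intervals below the 0.2 cutoff named in the text
poorly_captured = [interval for interval, s in scores.items() if s < 0.2]
print(scores, poorly_captured)
```

Taking the maximum means an interval is only flagged when both the gDNA and the RNA library show low relative depth, which is the stated purpose of the combined score.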
The application also relates to a probe capture performance evaluation system, which comprises a module for executing the steps of any one of the methods.
The present application also relates to a computer-readable medium, in which a computer program is stored, which computer program, when being executed by a processor, is adapted to carry out any of the above-mentioned methods.
The application also relates to an electronic device comprising a processor and a memory, wherein one or more readable instructions are stored on the memory, and when executed by the processor, the one or more readable instructions implement any of the above methods.
The beneficial technical effect of this application:
1) The application constructs a comprehensive yet streamlined design strategy for an RNA-level gene fusion Panel: every transcript is effectively covered, redundant probes that repeatedly cover identical regions are removed, and the sensitivity of detecting clinically important fusions is improved.
2) The application resolves the difficulty of evaluating the capture performance of RNA-level gene fusion probes through the eScore evaluation algorithm, which combines gDNA and RNA.
Drawings
In order to more clearly illustrate the detailed description of the present application or the technical solutions in the prior art, the drawings needed to be used in the detailed description of the present application or the prior art description will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1, schematic diagram of gene fusion;
FIG. 2, schematic diagram of alternative splicing;
FIG. 3, conventional design scheme of the capture region for RNA-level gene fusion detection;
FIG. 4, schematic diagram of coverage extension;
FIG. 5, coverage result for a shorter exon;
FIG. 6, coverage result for a supplemented region;
FIG. 7, sequencing depth of some exons of the MET gene;
FIG. 8, sequencing depth of some exons of the AR gene;
FIG. 9, enrichment factor of the probe coverage intervals.
Detailed Description
The technical solutions of the present application will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The following terms or definitions are provided solely to aid in the understanding of the present application. These definitions should not be construed to have a scope less than understood by those skilled in the art.
Unless defined otherwise below, all technical and scientific terms used in the detailed description of the present application are intended to have the same meaning as commonly understood by one of ordinary skill in the art. While the following terms are believed to be well understood by those skilled in the art, the following definitions are set forth to better explain the present application.
As used in this application, the terms "comprising," "including," "having," "containing," or "involving" are inclusive or open-ended and do not exclude additional unrecited elements or method steps. The term "consisting of …" is considered to be a preferred embodiment of the term "comprising". If in the following a certain group is defined to comprise at least a certain number of embodiments, this should also be understood as disclosing a group which preferably only consists of these embodiments.
Where an indefinite or definite article is used when referring to a singular noun e.g. "a" or "an", "the", this includes a plural of that noun.
The terms "about" and "substantially" in this application denote the interval of accuracy that a person skilled in the art can understand while still guaranteeing the technical effect of the feature in question. The term generally means ± 10%, preferably ± 5% of the indicated value.
Furthermore, the terms first, second, third, (a), (b), (c), and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments described herein are capable of operation in other sequences than described or illustrated herein.
According to some aspects of the present application, the present application relates to a method for designing gene fusion detection Panel based on a targeted RNA sequencing technology, which comprises the following steps:
1) Determining a candidate gene;
2) Determining the longest transcript of each gene according to the gene structure annotation information of the candidate genes, and performing probe coverage on all CDS regions of the longest transcript; if an untranslated region (UTR) is involved in a gene fusion of definite clinical significance, probe coverage is also performed on that UTR;
3) According to the gene structure annotation information of the candidate genes, comparing the CDSs of each non-longest transcript of each gene one by one with the CDSs of the longest transcript; if a CDS differs, it is retained and covered with probes;
4) For candidate genes involved in special types of gene fusion, additionally performing probe coverage on the region near the breakpoint of the special-type fusion to increase detection sensitivity;
5) Designing probes based on the above probe coverage and assembling the gene fusion detection Panel.
In some embodiments, if a CDS sequence retained in step 3) is not long enough to bind a probe, it is extended into the upstream and downstream CDSs so that the probe has sufficient binding length.
In some embodiments, the special types of gene fusion in step 4) include, but are not limited to: MET exon14 skipping, AR-V7, EGFR vIII, EGFR KDD.
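The extension of a short retained CDS into its upstream and downstream CDSs can be sketched in Python as follows. This is a hedged sketch, not the application's implementation: the minimum binding length of 120 bases is an assumption (the text does not state a probe length), only the immediately neighbouring CDSs are used, and the missing length is split evenly between the two sides. The function name and coordinates are hypothetical.

```python
from typing import List, Tuple

# (start, end), 0-based half-open, ordered along one transcript
Segment = Tuple[int, int]


def extend_short_cds(cds: List[Segment], idx: int, min_len: int = 120) -> List[Segment]:
    """Return the genomic pieces that a probe for cds[idx] would span.

    If cds[idx] is shorter than min_len (an assumed value), bases are borrowed
    from the end of the upstream CDS and the start of the downstream CDS.
    """
    start, end = cds[idx]
    missing = min_len - (end - start)
    if missing <= 0:
        return [cds[idx]]  # already long enough, no extension needed

    left_need = missing // 2
    right_need = missing - left_need
    pieces: List[Segment] = [cds[idx]]

    if idx > 0 and left_need > 0:
        up_start, up_end = cds[idx - 1]
        take = min(left_need, up_end - up_start)
        pieces.insert(0, (up_end - take, up_end))  # tail of the upstream CDS
    if idx + 1 < len(cds) and right_need > 0:
        dn_start, dn_end = cds[idx + 1]
        take = min(right_need, dn_end - dn_start)
        pieces.append((dn_start, dn_start + take))  # head of the downstream CDS
    return pieces


# A 24-base CDS flanked by two longer ones (illustrative coordinates)
transcript_cds = [(1000, 1200), (5000, 5024), (9000, 9300)]
pieces = extend_short_cds(transcript_cds, idx=1)
total = sum(e - s for s, e in pieces)
print(pieces, total)
```

Because the pieces are contiguous in the spliced transcript but not on the genome, a probe built this way binds RNA-derived molecules across the exon junctions, which is why short exons are hard to assess from a gDNA library alone.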
According to some aspects of the present application, the data analysis method for probe capture performance evaluation of the present application comprises the following steps:
1) Constructing libraries from a sample using the gene fusion detection Panel prepared by the above method to obtain the corresponding gDNA capture library and RNA capture library, and counting the sequencing depth of the probe coverage regions in each, using an uncaptured rRNA-depleted total RNA library as the reference;
2) Calculating the enrichment factor FC of the sample at each exon, and at the same time calculating the lower quartile LQ of the enrichment factors of all exons. The enrichment factor FC is the ratio, per unit amount of sequencing data, of the sequencing depth of the targeted RNA library at each exon to the sequencing depth of the rRNA-depleted total RNA library without probe capture, in the same sample:
FC=(target RNA depth/raw data)/(RNA depth/raw data);
3) Calculating the relative sequencing depth RD of the sample at each exon to obtain a DNA sample score gScore and an RNA sample score rScore; the relative sequencing depth RD is the sequencing depth (Raw Depth) of each exon divided by the median (Median Depth) of the sequencing depths of all exons of the sample;
RD=(bed Raw Depth)/(sample bed Median Depth);
determining a composite score, eScore, from gScore and rScore;
eScore=max(gScore,rScore).
In some embodiments, in step 2), if the enrichment factor FC of an exon is greater than the lower quartile LQ of all exon enrichment factors, the probe capture performance for that exon is considered good, and the exon is not included in the subsequent evaluation.
In some embodiments, in step 3), for intervals with exon length greater than 120bp, the eScore is the larger of gScore and rScore; for intervals with exon length less than 120bp, the eScore is the maximum of gScore, rScore, and the gScore of the two adjacent exons.
In some embodiments, in step 3), eScore >= 0.2 indicates that the interval is captured well in the gDNA or the RNA sample.
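The enrichment-factor filter of step 2) can be sketched in Python as follows. This is a sketch under assumptions: "raw data" is taken as the total sequencing output of each library in any consistent unit, the lower quartile is computed with the inclusive method of `statistics.quantiles` (the application does not say which quartile convention it uses), and how to treat an exon with zero depth in the uncaptured library is not specified, so the sketch simply rejects it. All names and numbers are illustrative.

```python
from statistics import quantiles
from typing import Dict, List, Tuple


def enrichment_factor(
    target_depth: float, target_raw: float, total_depth: float, total_raw: float
) -> float:
    """FC = (target RNA depth / raw data) / (RNA depth / raw data)."""
    if total_depth <= 0 or target_raw <= 0 or total_raw <= 0:
        raise ValueError("depth of the uncaptured library and raw data must be positive")
    return (target_depth / target_raw) / (total_depth / total_raw)


def split_by_lower_quartile(fc: Dict[str, float]) -> Tuple[float, List[str]]:
    """Return LQ and the exons that still need further evaluation (FC <= LQ)."""
    lq = quantiles(list(fc.values()), n=4, method="inclusive")[0]
    needs_eval = sorted(exon for exon, value in fc.items() if value <= lq)
    return lq, needs_eval


# (target RNA depth, uncaptured total RNA depth) per exon, illustrative values
depths = {"a": (1000, 10), "b": (500, 10), "c": (50, 10), "d": (800, 8), "e": (900, 10)}
TARGET_RAW, TOTAL_RAW = 1.0, 4.0  # assumed raw data amounts, same unit for both libraries

fc = {exon: enrichment_factor(t, TARGET_RAW, r, TOTAL_RAW) for exon, (t, r) in depths.items()}
lq, needs_eval = split_by_lower_quartile(fc)
print(fc, lq, needs_eval)
```

Exons with FC above LQ are treated as well captured and dropped from further evaluation; only the remaining ones go on to the gScore, rScore and eScore step.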
Embodiments of the present application will be described in detail below with reference to examples, but those skilled in the art will appreciate that the following examples are only illustrative of the present application and should not be construed as limiting the scope of the present application. The examples, in which specific conditions are not specified, were conducted under conventional conditions or conditions recommended by the manufacturer. The reagents or instruments used are not indicated by manufacturers, and are all conventional products available on the market.
Experimental example: establishment of the method and system of the present application
1. Panel design method
Based on the hg19 reference genome and annotated transcript information, the design method was as follows:
1) Determining candidate genes according to the purpose;
2) Determining the longest transcript of each gene according to a gene structure annotation file (e.g. Ensembl release-75, http://ftp.ensembl.org/pub/grch37/), performing probe coverage on all CDS regions of the longest transcript, and supplementing the UTR (untranslated region) if it is involved in a fusion of definite clinical relevance;
3) The CDSs of each non-longest transcript of the gene are compared one by one with the regions retained in step 2); where there is a difference, the corresponding CDS is retained. As shown in FIG. 4, if a supplemented CDS of a non-longest transcript is short, it is extended into the upstream and downstream CDSs so that the probe has sufficient binding length.
4) For special types of intragenic fusion to be detected, taking MET exon14 skipping and AR-V7 as examples, probe coverage is added near the fusion breakpoint of the specific detection region on top of the above design strategy, so as to increase detection sensitivity.
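Once the regions from steps 2) to 4) are collected, they have to be written out as capture intervals. The Python sketch below shows one plausible way to do this, by merging overlapping or abutting intervals and emitting BED lines. Merging is an assumption made for illustration (the application only states that redundant probes covering identical regions are removed), and the function names and coordinates are hypothetical.

```python
from typing import List, Tuple

# (chrom, start, end), 0-based half-open as in a BED file
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or abutting intervals on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def to_bed(intervals: List[Interval]) -> str:
    """Format intervals as tab-separated BED lines."""
    return "".join(f"{chrom}\t{start}\t{end}\n" for chrom, start, end in intervals)


# Regions collected from several transcripts (illustrative coordinates)
regions = [("chr7", 100, 300), ("chr7", 250, 400), ("chr7", 500, 640), ("chrX", 10, 50)]
capture = merge_intervals(regions)
covered_bp = sum(end - start for _, start, end in capture)
print(to_bed(capture), covered_bp)
```

The summed length of the merged intervals gives the size of the probe region, the quantity that Table 1 compares between the conventional design and this method.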
2. Probe evaluation method
Data analysis for probe capture performance evaluation (taking white blood cell (WBC) samples as an example):
1) Using the above method, libraries were built from WBC samples, and the sequencing depth of the probe coverage regions was counted in the resulting gDNA capture library, the RNA capture library (target RNA), and the uncaptured rRNA-depleted total RNA library, the last serving as the control.
2) The enrichment factor FC of the sample at each exon was calculated, i.e. the ratio, per unit amount of sequencing data, of the sequencing depth of the target RNA library at each exon to that of the rRNA-depleted total RNA library without probe capture, in the same sample. The lower quartile LQ of all exon enrichment factors was calculated at the same time. LQ is used as the filtering threshold in the present application because, as FIG. 9 shows, the lower quartile of all FC values clearly separates the abnormally low enrichment factors. When the enrichment factor of an exon interval is greater than LQ, the probe capture performance of that interval is considered good, and the interval is not included in the subsequent evaluation;
FC=(target RNA depth/raw data)/(RNA depth/raw data).
3) The relative sequencing depth RD of the sample in each bed interval was calculated, i.e. the sequencing depth (Raw Depth) of each bed interval divided by the median (Median Depth) of the sequencing depths of all exons of the sample; gScore and rScore are the relative sequencing depths of the DNA sample and the RNA sample, respectively. For intervals with exon length greater than 120bp, the eScore is the larger of gScore and rScore. For intervals with exon length less than 120bp, because the probes are tiled continuously along the RNA sequence, the eScore is the maximum of gScore, rScore, and the gScore of the two adjacent exons. eScore >= 0.2 indicates that the interval is captured well in at least one of the gDNA and RNA samples.
RD=(bed Raw Depth)/(sample bed Median Depth);
gScore=RD(gDNA);
rScore=RD(RNA);
eScore=max(gScore,rScore).
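The length-dependent eScore rule in step 3) can be sketched in Python as follows. This is only a sketch: exons are assumed to be listed in transcript order so that list neighbours are the adjacent exons, and an exon of exactly 120bp (which the text does not cover) is treated here as a long exon. The function name and the scores are illustrative.

```python
from typing import Dict, List, Tuple


def escore_by_length(
    exons: List[Tuple[str, int]],
    g_score: Dict[str, float],
    r_score: Dict[str, float],
    short_len: int = 120,
) -> Dict[str, float]:
    """eScore per exon; short exons also consider the gScore of adjacent exons.

    exons: (exon_id, length_bp) in transcript order (an assumption of this sketch).
    """
    scores: Dict[str, float] = {}
    for i, (exon_id, length) in enumerate(exons):
        candidates = [g_score[exon_id], r_score[exon_id]]
        if length < short_len:
            # The probe for a short exon extends into its neighbours,
            # so their gDNA relative depth is informative as well.
            if i > 0:
                candidates.append(g_score[exons[i - 1][0]])
            if i + 1 < len(exons):
                candidates.append(g_score[exons[i + 1][0]])
        scores[exon_id] = max(candidates)
    return scores


# Toy transcript with a 24bp middle exon (illustrative scores)
exons = [("e1", 200), ("e2", 24), ("e3", 150)]
g = {"e1": 1.1, "e2": 0.05, "e3": 0.9}
r = {"e1": 0.8, "e2": 0.1, "e3": 0.15}

scores = escore_by_length(exons, g, r)
passing = [exon_id for exon_id, s in scores.items() if s >= 0.2]
print(scores, passing)
```

In this toy case the short exon e2 would fail on its own gScore and rScore, but passes because the flanking exons are well captured in gDNA.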
Example 1 data simulation
Panel capture region design results and comparisons with conventional designs
This example used the Panel design method described above. Based on the hg19 reference genome and gene structure annotation information, 329 transcript sequences of 148 genes were finally obtained, covering 1852 exon regions on the genome. All transcript regions of these genes are covered, which ensures reliable hybridization capture and sensitive fusion detection; removing redundant probes that repeatedly cover identical segments makes the coverage of each transcript more reasonable, improves probe utilization, and reduces cost.
For comparison, probes were also designed on the exons of all transcripts of the 148 genes according to the conventional method; the metrics are compared in Table 1. The probe region laid out by the conventional design is 1,777,265bp, whereas after streamlining according to the present method the probe region is only 372,151bp, removing 79% of the sequence as redundant.
Table 1. Panel coverage area comparison
[Table 1 is provided as an image in the original publication: Figure BDA0003743014820000081]
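The 79% figure can be checked directly from the two probe-region sizes quoted in the text. The short Python snippet below only reproduces that arithmetic; the variable names are illustrative.

```python
conventional_bp = 1_777_265  # probe region of the conventional design (from the text)
streamlined_bp = 372_151     # probe region after streamlining (from the text)

removed_bp = conventional_bp - streamlined_bp
removed_fraction = removed_bp / conventional_bp

print(f"removed {removed_bp:,} bp ({removed_fraction:.0%} of the conventional design)")
```

The result is consistent with the reduction reported in the example.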
Example 2 Probe coverage Performance evaluation
1) The evaluation shows that shorter exons are well covered. As shown in FIG. 5, exon 3 of the gene GOPC is only 24bp long (NM_020399), yet the capture result shows good coverage, consistent with the coverage level of the exons on either side. The exons of supplemented non-longest transcripts are also well covered: as shown in FIG. 6, for the gene NRG1, the exon of the supplemented transcript NM_001159996 is 40bp long and still has good coverage depth after supplementation.
2) Probes added for special fusion forms give good coverage. As shown in FIG. 7, for the MET exon14 skipping event, the capture region over exon13-exon15 was increased, and the coverage depth of exon12, exon13, exon15 and exon16 is clearly higher than that of the other exons. Similarly, when designing probes for AR-V7 (exon 4-8 deletion), additional probes were added over exon3-3'UTR; FIG. 8 shows the coverage depth of some exons of the AR gene, where the coverage depth of exon2, exon3, exon8 and the 3'UTR is clearly higher than that of the other exons. The capture regions designed for these 2 special rearrangement types ensure good coverage near the special sites, thereby increasing the detection sensitivity for special fusion types.
Example 3 evaluation of Probe Capture Performance
The higher the enrichment factor, the better the capture performance of the probe. FIG. 9 shows the distribution of enrichment factors over the probe coverage intervals; the lower quartile LQ of the enrichment factor FC over all intervals is 322, i.e. at least 75% of the intervals have an enrichment factor of 322 or more. Regions with a low enrichment factor were further evaluated with three metrics: gScore, rScore and eScore. The rScore is easily affected by expression level, so for genes that are not expressed or only lowly expressed, the capture performance of the probe in an interval cannot be evaluated by rScore; the gScore is affected by exon length. The eScore combines rScore and gScore: with the eScore cutoff set to 0.2, an effective evaluation result is obtained for 99.78% of intervals (1848), much higher than the 94% (1741) for rScore and 98.38% (1822) for gScore.
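The percentages above can be related to the interval counts with a short Python check. It assumes the denominator is the 1852 exon regions reported in Example 1, which the text does not state explicitly here; the variable names are illustrative.

```python
total_intervals = 1852  # exon regions covered by the Panel (Example 1); assumed denominator

# Number of intervals with an effective evaluation result at a cutoff of 0.2 (from the text)
evaluable = {"rScore": 1741, "gScore": 1822, "eScore": 1848}

percent = {name: round(100 * n / total_intervals, 2) for name, n in evaluable.items()}
not_evaluable = {name: total_intervals - n for name, n in evaluable.items()}
print(percent, not_evaluable)
```

Under that assumption the counts reproduce the reported 99.78% and 98.38%, and give about 94% for rScore, leaving 4 intervals without an effective result under the combined eScore.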
The foregoing descriptions of specific exemplary embodiments of the present application have been presented for purposes of illustration and description. It is not intended to limit the application to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the present application and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the present application and various alternatives and modifications thereof. It is intended that the scope of the application be defined by the claims and their equivalents.

Claims (6)

1. A data analysis method for evaluating probe capture performance is characterized by comprising the following steps:
1) Constructing libraries from a sample using a gene fusion detection Panel to obtain the corresponding gDNA capture library and RNA capture library, and counting the sequencing depth of the probe coverage regions in each, using an uncaptured rRNA-depleted strand-specific library as the reference;
2) According to the position information stored in the bed file, calculating the enrichment factor FC of the sample at each exon and the lower quartile LQ of all exon enrichment factors; the enrichment factor FC is the ratio, per unit amount of sequencing data, of the sequencing depth of the targeted RNA library at each exon to the sequencing depth of the rRNA-depleted strand-specific library without probe capture, in the same sample:
FC=(target RNA depth/raw data)/(RNA depth/raw data);
3) Calculating the relative sequencing depth RD of the sample in each bed interval to obtain a DNA sample score gScore and an RNA sample score rScore;
the relative sequencing depth RD is the sequencing depth (Raw Depth) of each exon divided by the median (Median Depth) of the sequencing depths of all exons of the sample;
RD=(bed Raw Depth)/(sample bed Median Depth);
determining a composite score, eScore, from gScore and rScore;
eScore=max(gScore,rScore);
when eScore >= 0.2, the capture performance of the interval in the gDNA or RNA sample is good;
the gene fusion detection Panel is prepared by the following method:
a. determining a candidate gene;
b. determining the longest transcript of each gene according to the gene structure annotation file of the candidate genes, and performing probe coverage on all CDS regions of the longest transcript; if an untranslated region (UTR) is involved in a gene fusion of definite clinical significance, probe coverage is also performed on that UTR;
c. according to the gene structure annotation information of the candidate genes, comparing the CDSs of each non-longest transcript of each gene one by one with the CDSs of the longest transcript; if a CDS differs, it is retained and covered with probes;
d. for candidate genes involved in special types of gene fusion, additionally performing probe coverage on the region near the breakpoint of the special-type fusion;
e. designing probes based on the above probe coverage, obtaining the bed file of the capture intervals, and assembling the gene fusion detection Panel.
2. The method according to claim 1, wherein in step 2), when the enrichment factor FC of an exon is greater than the lower quartile LQ of all exon enrichment factors, the probe capture performance for that exon is good and the exon is not included in the evaluation range.
3. The method according to claim 1, wherein in step 3), for intervals with exon length greater than 120bp, the eScore is the larger of gScore and rScore; for intervals with exon length less than 120bp, the eScore is the maximum of gScore, rScore, and the gScore of the two adjacent exons.
4. A probe capture performance evaluation system comprising means for performing the steps of the method of any one of claims 1-3.
5. A computer-readable medium, in which a computer program is stored which, when being executed by a processor, carries out the method of any one of claims 1 to 3.
6. An electronic device comprising a processor and a memory, the memory having stored thereon one or more readable instructions which, when executed by the processor, implement the method of any of claims 1-3.
CN202210818316.6A 2022-07-13 2022-07-13 Panel design and evaluation method for detecting gene fusion based on targeted RNA sequencing technology Active CN115083516B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210818316.6A CN115083516B (en) 2022-07-13 2022-07-13 Panel design and evaluation method for detecting gene fusion based on targeted RNA sequencing technology

Publications (2)

Publication Number Publication Date
CN115083516A CN115083516A (en) 2022-09-20
CN115083516B CN115083516B (en) 2023-03-21

Family

ID=83260583

Country Status (1)

Country Link
CN (1) CN115083516B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110719957A (en) * 2017-04-06 2020-01-21 王磬 Methods and kits for targeted enrichment of nucleic acids
CN111647648A (en) * 2020-05-21 2020-09-11 北斗生命科学(广州)有限公司 Gene panel for detecting breast cancer gene mutation and detection method and application thereof
CN112397144A (en) * 2020-10-29 2021-02-23 无锡臻和生物科技有限公司 Method and device for detecting gene mutation and expression quantity

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1231587C (en) * 2002-12-11 2005-12-14 中国农业大学 Beet black withered virus as expression carrier of foreigh gene
CA2583690C (en) * 2004-10-12 2016-04-05 The Rockefeller University Microrna constructs for the suppression of the expression of targeted genes or the down regulation of targeted genes and methods therefore
AU2006291054B2 (en) * 2005-09-12 2011-10-13 The Brigham And Women's Hospital, Inc. Recurrent gene fusions in prostate cancer
CN103667438B (en) * 2013-01-07 2015-04-01 赵晨 Method for screening HRDs disease-causing mutation and gene chip hybridization probe designing method involved in same
CN104657628A (en) * 2015-01-08 2015-05-27 深圳华大基因科技服务有限公司 Proton-based transcriptome sequencing data comparison and analysis method and system
CN111321202A (en) * 2019-12-31 2020-06-23 广州金域医学检验集团股份有限公司 Gene fusion variation library construction method, detection method, device, equipment and storage medium
CN111696627B (en) * 2020-03-26 2024-02-23 上海生物芯片有限公司 Design method of long-chain RNA specific probe
CN113249469B (en) * 2021-07-05 2021-10-15 迈杰转化医学研究(苏州)有限公司 Clonal eosinophilia hypereosinophilia fusion gene detection probe composition, kit and application thereof

Also Published As

Publication number Publication date
CN115083516A (en) 2022-09-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant