AU2018391843A1 - Sequencing data-based ITD mutation ratio detecting apparatus and method - Google Patents

Sequencing data-based ITD mutation ratio detecting apparatus and method

Info

Publication number
AU2018391843A1
Authority
AU
Australia
Prior art keywords
itd
detection results
detection
test samples
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
AU2018391843A
Other versions
AU2018391843B2 (en
Inventor
Ruilin JING
Dawei Li
Hailiang Wang
Juan Wang
Zhaoling Xuan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Annoroad Gene Technology (beijing) Co Ltd
Original Assignee
ANNOROAD GENE TECHNOLOGY BEIJING CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ANNOROAD GENE TECHNOLOGY BEIJING CO Ltd
Publication of AU2018391843A1
Application granted
Publication of AU2018391843B2
Priority to AU2022218581A (AU2022218581B2)
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50: Mutagenesis

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Physics & Mathematics (AREA)
  • Organic Chemistry (AREA)
  • Pathology (AREA)
  • Zoology (AREA)
  • Biophysics (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Wood Science & Technology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Oncology (AREA)
  • General Engineering & Computer Science (AREA)
  • Hospice & Palliative Care (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A sequencing data-based ITD mutation ratio detecting method. Said method comprises: acquiring sequencing data of a sample to be detected; extracting the ITD characteristics of said sample to be detected; and obtaining an ITD mutation ratio on the basis of an ITD characteristic coefficient and the ITD characteristics of said sample to be detected.

Description

SEQUENCING DATA-BASED ITD MUTATION RATIO DETECTING APPARATUS AND METHOD
TECHNICAL FIELD
The present invention relates to the field of gene mutation detection, and
particularly relates to a method, an apparatus and an electronic device for the
quantitative detection of ITD mutation based on sequencing data.
BACKGROUND
With the rapid development of next-generation sequencing technology and its
increasing use in scientific research and clinical detection in the field of
cancer, we have reached a new level of understanding of the occurrence and
development, clinical manifestations and pathogenesis of cancer. Numerous studies
have shown that the occurrence of cancer is closely related to somatic mutations,
and these mutations often appear solely in subclones of certain tumors. The study
of subclonal mutations provides a new direction for research on disease
progression and prognostic stratification.
High-depth next-generation sequencing technology can detect subclonal
mutations. Sequencing specific genome regions with target sequence capture
sequencing achieves great depth at low cost, so that data from a large number
of samples can be accumulated. This provides favorable conditions for accurately
estimating the distribution of the false positive rate at specific mutation
sites in the genome.
The most common type of FLT3 gene mutation in acute myeloid leukemia (AML) is the internal tandem duplication (ITD), followed by the tyrosine kinase domain (TKD) mutation. FLT3-ITD mutation usually involves exons 14 and 15, or exons 11 and/or 12. AML patients are often found to have internal tandem duplications (ITDs) in the juxtamembrane (JM) region of FLT3, i.e., the insertion of several repeating oligonucleotide elements in an end-to-end order. These elements can be copies of a variable number of nucleotide bases, but the number of bases is usually a multiple of 3 so that the reading frame remains intact, thereby extending the JM region while other domains are unaffected. These mutations play an important role in the pathogenesis of AML. The incidence rate of FLT3-ITD in AML patients is about 24% for adults, 10%-15% for children, and about 15% for secondary AML patients. The NCCN Clinical Practice Guidelines in Oncology: Acute Myeloid Leukemia (2016) pointed out that patients carrying FLT3-ITD mutations with a normal karyotype have a poor prognosis and should be stratified into the high-risk group. Also, the section on the treatment of patients with relapsed refractory AML mentioned that a demethylating drug (5-azacytidine or decitabine) together with sorafenib may be considered for patients with FLT3-ITD mutation [NCCN-AML]. Therefore, quantitative detection of FLT3-ITD is essential for AML patients.
Currently, the predominant FLT3-ITD detection method is cDNA-based PCR amplification. The limitation of this approach is that only quantitative detection can be performed; the position and sequence information of the ITD mutation cannot be acquired simultaneously. If sequence information is needed, Sanger sequencing must be performed, and a result can be acquired only when the allele frequency of the ITD mutation is above 10%. That is, for patients with an ITD mutation ratio below 10%, only the quantitative information can be obtained, and the sequence and position information cannot.
Another FLT3-ITD detection method is to use NGS sequencing and determine the ITD mutation with a bioinformatics algorithm (e.g., PINDEL). This method is based on high-depth target sequence capture sequencing and relies on the information from the captured target regions. However, since the capture process may lose the target region in which the ITD is located, the presently available NGS-based algorithms (e.g., Pindel) are suitable only for qualitative analysis, and accurate quantitative results cannot be obtained. In addition, this method has the following inherent limitations: 1. because the template sequence content of an ITD differs significantly from the normal genome, it has a great impact on the capture process. A sample with a relatively high proportion of ITDs could end up with only a limited number of sequencing reads supporting the mutation after target capture, which makes it hard to reach a correct conclusion; 2. because of the inevitable sequencing bias in target capture, the resulting ITDs can only be measured qualitatively, and accurate quantitative results cannot be obtained. Moreover, the clinical detection field is currently more inclined to use a single test on AML patients to obtain more detailed information, and NGS can fully meet this requirement: only one blood test is needed to complete multiple mutation detections of SNV, CNV, INDEL and ITD. Therefore, there is an urgent need to develop a new algorithm that enables the quantitative detection of FLT3-ITD on the NGS platform.
SUMMARY OF THE INVENTION
Technical problem to be solved by the present invention
In view of the aforementioned problems in the prior art, the present invention develops a method for the quantitative detection of ITD mutation based on sequencing data. Dual platforms (the NGS platform and a standard detection platform) are used to test a large number of collected ITD-positive samples, i.e., PCR amplification-capillary electrophoresis is used as the gold standard to obtain the quantitative ITD mutation results, and the lengths of the detected ITDs are used to find their matches in the NGS data. Then, through supervised machine learning, these samples are used as the training set, and finally a sequencing data-based model capable of directly predicting the quantitative ITD mutation outcome is obtained, which achieves the purpose of quantitatively detecting ITDs by next-generation sequencing. That is, based on a large number of samples with gold-standard ITD quantitative detection results, the present invention screens the characteristics related to ITD quantitative detection and performs model training, thereby obtaining a model by which an accurate ITD quantitative detection result of the sample to be tested can be determined using only the sequencing data and the ITD-related characteristics. That is, the present invention includes:
1. An apparatus for the quantitative detection of ITD mutation based on sequencing data, including:
a data acquisition module, which is used to acquire the sequencing data of samples to be tested;
a data pre-processing module, which is connected to the data acquisition module, and is configured to extract characteristics of ITD of samples to be tested, wherein the ITD characteristics are the ITD characteristics of the whole region of nucleotide sequences or the ITD characteristics of specific regions of nucleotide sequences;
a quantification module, which is connected to the data pre-processing module, and is configured to obtain the ITD mutation allele frequency based on the ITD characteristic coefficients and the ITD characteristics of samples to be tested; and
a detection result output module, which is connected to the quantification module, and is configured to output the ITD mutation allele frequency as the quantitative detection result of the ITD mutation.
In the present invention, the sequencing data may be fastq data obtained by converting, with existing software, the NGS data acquired from a high-throughput sequencer. The sequencing approach may be to first capture the target sequence in the exon region or other specified regions of the sample and then sequence it (i.e., target sequence capture sequencing) using a high-throughput sequencer. Alternatively, whole-genome sequencing (WGS) of the sample is also feasible.
In the present invention, the ITD mutation may come from samples of species that usually have ITD mutations. Preferably, samples are from mammals, more preferably humans. Specified regions are usually selected for ITD mutation detection, such as internal tandem duplications (ITDs) of the juxtamembrane region of the Fms-like tyrosine kinase 3 (FLT3) gene in patients with acute myeloid leukemia (AML) (hg19, NCBI version 37; chr3:28608000-28608600).
In the present invention, the quantitative detection of a sample is the detection of the variant allele frequency (VAF) of the ITD mutation of the sample. The gold standard ITD detection method is generally a method accepted by those skilled in the art and capable of accurately obtaining the VAF of the ITD mutation, the position of the ITD mutation or the length of the ITD mutation (including the total length of the ITD and the length of the ITD repeats), etc. For example, the current gold standard for ITD quantitative detection may be specific amplification by polymerase chain reaction (PCR) in combination with capillary electrophoresis (CE) (PCR-CE) to detect the VAF of ITD mutations.
2. The apparatus according to item 1, wherein the quantification module includes
a quantitative model sub-module for obtaining an ITD mutation allele frequency based on the ITD characteristic coefficients and ITD characteristics of samples to be tested, wherein the quantitative model set in the quantitative model sub-module is represented by the following equation (1),
$$\hat{y}(w, x) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n \quad (1)$$
in the equation (1), $\hat{y}(w, x)$ represents the ITD mutation allele frequency, $w_{0-n}$ represent the ITD characteristic coefficients, and $x_{0-n}$ represent the ITD characteristics.
3. The apparatus according to item 2, wherein the quantitative model is configured by a coefficient training sub-module, and is configured to acquire the ITD characteristic coefficients $w_{0-n}$, and wherein the coefficient training sub-module includes:
a detection result acquisition unit, which is configured to acquire first detection results of first test samples and second detection results of first test samples as a training set,
a machine learning unit, which is configured to use the first detection results of first test samples and second detection results of first test samples as a training set, and obtain the ITD characteristic coefficients $w_{0-n}$ through machine learning of the training set,
wherein the first detection results are the high-throughput sequencing data, and the second detection results are the mutation allele frequency values obtained from the ITD standard detection, such as the gold-standard ITD detection method.
4. The apparatus according to item 2, wherein the quantitative model is configured by a coefficient training sub-module, and is connected to the data pre-processing module for acquiring the ITD characteristic coefficients $w_{0-n}$, the coefficient training sub-module includes:
a detection result acquisition unit, which is configured to acquire first detection results of first test samples and second detection results of first test samples, and acquire first detection results of second test samples and second detection results of the second test samples,
a machine learning unit, which is configured to use the first detection results of first test samples and second detection results of first test samples as a training set, and obtain the ITD characteristic coefficients $w_{0-n}$ by the machine learning of the training set,
a machine learning test unit, which is configured to perform tests using the first detection results of second test samples, and then compare the mutation allele frequency value calculated by the equation (1) with the second detection results of second test samples,
a test result assessment unit, which is configured to assess whether the result of comparison meets the expectation,
a machine learning revise unit, which is configured to determine the values of the ITD characteristic coefficients $w_{0-n}$ when the comparison result meets the expectation, and to modify (such as increase, decrease or re-provide) the ITD characteristics $x_{0-n}$ adopted in the equation (1) when the comparison result does not meet the expectation, and reset the ITD characteristic coefficients $w_{0-n}$,
wherein the first detection results are the high-throughput sequencing data, and the second detection results are the mutation allele frequency values obtained from the ITD standard detection, such as the gold-standard ITD detection method.
5. The apparatus according to item 3 or 4, wherein the assessment is to be made
whether the comparison result meets the expectation according to the following equation (2):
$$R^2 = 1 - \frac{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \hat{y}_i)^2}{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \bar{y})^2} \quad (2)$$
in the equation (2), $y_i$ represents the second detection results of second test samples, $\hat{y}_i$ represents the mutation allele frequency value calculated by equation (1), and $\bar{y}$ represents the mean value of the second detection results of the second test samples.
Specifically, the assessment method includes: setting an expected value
(set value) for $R^2$; the comparison result is concluded as meeting the requirement
if $R^2$ is above the set value, and as not meeting the requirement if $R^2$ is less
than the set value. A preferred set value is 0.9.
6. The apparatus according to any one of items 1-5, wherein the ITD
characteristic is selected from one or two or more of the following characteristics: the
position of the occurring ITD, the length of the ITD, the nucleotide sequence
characteristics of the ITD, the nucleotide sequence characteristics before and after the
position of the occurring ITD, and the nucleotide sequence characteristics of a
particular sequence.
In the apparatus of the present invention, the length of the ITD representing the
ITD characteristic may include, but is not limited to, the total length of the ITD
segment or the length of the repeating segment. Sequence characteristics representing
a particular sequence may include, but are not limited to, sequence complexity.
Nucleotide sequence characteristics representing the ITD characteristic may include,
but are not limited to, sequence complexity or GC content, and the like. Sequence
complexity can be evaluated by using BLAST software (with different parameters).
In the apparatus of the present invention, the number of ITD characteristics is not
particularly limited, and the number of ITD characteristics that may be selected is, for
example, 500 to 2000, and preferably, the number of the ITD characteristics is, for
example, about 1,500.
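As a rough illustration of the kinds of characteristics named above (ITD position, ITD length, GC content, and a sequence-complexity measure), the following Python sketch computes them for one inserted sequence. The function names, the example position and sequence, and the use of Shannon entropy over k-mers as a stand-in for sequence complexity are all assumptions made for illustration; the text itself only states that complexity can be evaluated with BLAST software.

```python
import math
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G/C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def kmer_entropy(seq: str, k: int = 2) -> float:
    """Shannon entropy (bits) of the k-mer distribution.

    This is an assumed, simple proxy for sequence complexity,
    not the BLAST-based evaluation mentioned in the text.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def itd_features(position: int, inserted_seq: str) -> dict:
    """Collect a few hypothetical ITD characteristics into one record."""
    return {
        "position": position,
        "length": len(inserted_seq),
        # ITD lengths are usually a multiple of 3 (reading frame kept intact).
        "in_frame": len(inserted_seq) % 3 == 0,
        "gc_content": gc_content(inserted_seq),
        "kmer_entropy": kmer_entropy(inserted_seq),
    }


# Hypothetical position and inserted sequence, for demonstration only.
features = itd_features(position=28608250, inserted_seq="GATTTCAGAGAATATGAATATGAT")
```

In practice the selected characteristics would number in the hundreds or more, as stated above; this sketch only shows the shape of a single feature record.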
7. A method for the quantitative ITD mutation detection based on sequencing
data, including:
acquiring the sequencing data of samples to be tested; extracting ITD characteristics of the sequencing data of samples to be tested, wherein the ITD characteristics are the ITD characteristics of the whole region of the nucleic acid sequence or the ITD characteristics of specific regions of nucleic acid sequences; quantitatively detecting the ITD mutation allele frequency of samples to be tested, and obtaining the ITD mutation allele frequency (the quantitative detection result) based on the ITD characteristic coefficients and the ITD characteristics of samples to be tested.
8. The method according to item 7, wherein the quantitative detection step is
performed by a quantitative model represented by the following equation (1),
$$\hat{y}(w, x) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n \quad (1)$$
in the equation (1), $\hat{y}(w, x)$ represents the ITD mutation allele frequency, $w_{0-n}$ represent the ITD characteristic coefficients, and $x_{0-n}$ represent the ITD characteristics.
9. The method according to item 8, wherein the method for acquiring the ITD
characteristic coefficients $w_{0-n}$ of the quantitative model includes:
acquiring first detection results of first test samples and second detection results of first test samples as a training set,
obtaining the ITD characteristic coefficients $w_{0-n}$ by the machine learning of the
training set, wherein,
the first detection result is the high-throughput sequencing data, and the second
detection result is the mutation allele frequency value of the ITD standard detection, such as the ITD gold-standard detection method.
10. The method according to item 8, wherein the method for acquiring the ITD characteristic coefficients $w_{0-n}$ of the quantitative model includes:
acquiring first detection results of first test samples and second detection results of first test samples, and acquiring first detection results of second test samples and second detection results of second test samples,
using the first detection results of first test samples and the second detection results of first test samples as a training set, and obtaining the ITD characteristic coefficients $w_{0-n}$ by the machine learning of the training set,
the first detection results of second test samples are used for testing, and the mutation allele frequency value calculated by the equation (1) is compared with the second detection results of second test samples to assess whether the comparison result meets the expectation,
if the comparison result meets the expectation, the ITD characteristic coefficients $w_{0-n}$ are determined; if the comparison result does not meet the expectation, the ITD characteristics $x_{0-n}$ adopted in the equation (1) are modified (such as increased, decreased or re-provided), and the ITD characteristic coefficients $w_{0-n}$ are reset,
wherein the first detection result is the high-throughput sequencing data, and the second detection result is the mutation ratio value of the ITD standard detection, such as the ITD gold-standard detection method.
11. The method according to item 9 or 10, wherein the assessment is to be made
whether the comparison result meets the expectation according to the following
equation (2):
$$R^2 = 1 - \frac{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \hat{y}_i)^2}{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \bar{y})^2} \quad (2)$$
in the equation (2), $y_i$ represents the second detection results of the second test samples, $\hat{y}_i$ represents the mutation allele frequency value calculated by the equation (1), and $\bar{y}$ represents the mean value of the second detection results of second test samples.
12. The method according to any one of items 7-11, wherein the ITD
characteristic is selected from one or two or more of the following characteristics: the
position of the occurring ITD, the length of the ITD, the nucleotide sequence characteristics of the ITD, the nucleotide sequence characteristics before and after the
position of the ITD, and the nucleotide sequence characteristics of a particular sequence.
In the method of the present invention, the length of the ITD representing the ITD characteristics may include, but is not limited to, the total length of the ITD segment or the length of the repeating segment. Sequence characteristics representing a particular sequence may include, but are not limited to, sequence complexity. Nucleotide sequence characteristics representing the ITD characteristic may include, but are not limited to, sequence complexity or GC content, and the like. Sequence complexity can be evaluated by using BLAST software (with different parameters).
In the method of the present invention, the number of ITD characteristics is not particularly limited, and the number of ITD characteristics that may be selected is, for example, 500 to 2000, and preferably, the number of the ITD characteristics is, for example, about 1,500.
13. An electronic device, including:
a processor; and
a memory, in which the computer program instructions are stored, and when the computer program instructions are executed by the processor, the method for the
quantitative ITD mutation detection based on sequencing data according to any one of
items 7-12 is performed by the processor.
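The training and assessment procedure of items 9 to 11 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the text only says "machine learning", so ordinary least squares via `numpy.linalg.lstsq` is an assumed choice of learner, the function names are hypothetical, and the arrays are synthetic placeholder numbers rather than real sequencing or PCR-CE results.

```python
import numpy as np


def fit_coefficients(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return [w0, w1, ..., wn] minimizing the squared error of equation (1).

    Ordinary least squares is an assumption; the text does not name a learner.
    """
    design = np.hstack([np.ones((X.shape[0], 1)), X])  # column of 1s carries w0
    w, *_ = np.linalg.lstsq(design, y, rcond=None)
    return w


def predict(w: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Equation (1): y_hat = w0 + w1*x1 + ... + wn*xn for every row of X."""
    return w[0] + X @ w[1:]


def r_squared(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Equation (2): 1 - (residual sum of squares) / (total sum of squares)."""
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - np.mean(y_true)) ** 2))
    return 1.0 - ss_res / ss_tot


def meets_expectation(r2: float, set_value: float = 0.9) -> bool:
    """The comparison meets the expectation when R^2 is above the set value."""
    return r2 > set_value


# Synthetic placeholder data (NOT real measurements): 3 features per sample.
rng = np.random.default_rng(0)
true_w = np.array([0.05, 0.30, -0.10, 0.20])
X_train = rng.random((40, 3))               # "first test samples" features
y_train = true_w[0] + X_train @ true_w[1:]  # stand-in for gold-standard VAF
X_test = rng.random((10, 3))                # "second test samples" features
y_test = true_w[0] + X_test @ true_w[1:]

w = fit_coefficients(X_train, y_train)
r2 = r_squared(y_test, predict(w, X_test))
accepted = meets_expectation(r2)
```

If `accepted` were false, the procedure described in item 10 would change the set of characteristics and refit; that loop is left out here because the text does not specify how characteristics are modified.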
BRIEF DESCRIPTION OF THE DRAWINGS
Various other advantages and benefits of the present application will become apparent to those skilled in the art by reading the detailed description of the preferred embodiments below. The drawings are only for the purpose of illustrating the preferred embodiments, and should not be considered as a limitation on this application.
Fig. 1 is a schematic diagram showing an apparatus for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application;
Fig. 2 is a schematic diagram showing a quantification module in the apparatus for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application;
Fig. 3 is a schematic diagram showing a coefficient training sub-module in the apparatus for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application;
Fig. 4 is a schematic diagram showing a coefficient training sub-module in the apparatus for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application;
Fig. 5 is a flow chart showing a method for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application;
Fig. 6 is a schematic diagram showing an electronic device according to an embodiment of the present application;
Fig. 7 is a graph showing the detection result according to a preferred embodiment of the present application.
DETAILED DESCRIPTION OF THE INVENTION
The technical terms mentioned in the specification have the same meanings as
those generally understood by those skilled in the art, and if there is a conflict, the
definition in the present specification shall prevail. In general, the terms used in this specification have the following meanings.
Machine learning: machine learning is a branch of artificial intelligence. Research in artificial intelligence has followed a natural and clear thread from focusing on "reasoning" to focusing on "knowledge", and then to focusing on "learning". Machine learning is thus one way to realize artificial intelligence, i.e., to solve problems in artificial intelligence by means of machine learning. In the past 30 years, machine learning has developed into an interdisciplinary subject involving many areas, such as probability theory, statistics, approximation theory, convex analysis, and computational complexity theory. Machine learning theory is primarily about designing and analyzing algorithms that allow computers to "learn" automatically. Machine learning algorithms are a class of algorithms that automatically analyze data to obtain patterns from it and use those patterns to make predictions on unknown data. Since learning algorithms draw on a large body of statistical theory, machine learning is closely related to inferential statistics and is also known as statistical learning theory. In terms of algorithm design, machine learning theory focuses on achievable, effective learning algorithms. Many inference problems are intractable, so part of machine learning research is to develop approximation algorithms that are tractable.
Machine learning has been widely used in areas such as data mining, computer vision, natural language processing, biometrics, search engines, medical diagnostics, detection of credit card fraud, securities market analysis, DNA sequencing, speech and handwriting recognition, strategy games and robotics.
Target sequence capture sequencing: probes specific to the genomic regions of interest
are custom-designed and hybridized with genomic DNA on a sequence capture chip (or in solution); after the DNA segments of the target genomic regions are enriched, they are
sequenced using next-generation sequencing technology.
ITD: internal tandem duplication.
Summary of the application
As mentioned above, there is currently a need to quantitatively detect the ITD mutation based solely on sequencing data, but existing software or algorithms cannot obtain accurate quantitative detection results (an ITD mutation allele frequency close to or substantially consistent with that of the ITD standard detection method) from sequencing data alone. The ITD standard detection method described in the present invention can be, for example, a gold-standard detection method such as PCR amplification-capillary electrophoresis, or another commonly recognized detection method capable of accurately obtaining the ITD mutation allele frequency. Sequencing as described herein generally refers to next-generation sequencing, i.e., NGS sequencing.
The existing method for directly detecting the ITD mutation ratio by NGS sequencing, such as PINDEL, is based on high-depth target region capture sequencing,
and it is realized by using the information from the captured target regions. Since the capture process may cause missing capture of the segments where the ITD occurs,
accurate quantitative results cannot be obtained.
The inventors of the present application found that, through collecting the
characteristics related to the ITD and the corresponding characteristic coefficients in
the sequencing data, the quantitative detection results of ITD mutations of samples to be detected can be nearly consistent with the detection results of the gold-standard.
The quantitative ITD mutation detection of samples to be tested may employ, for example, the ITD-related characteristics and the corresponding characteristic
coefficients described in the present application, and the acquisition of the ITD-related characteristics and corresponding characteristic coefficients may also
employ, for example, the machine learning method described herein.
Therefore, the basic idea of the present application is to solve the above technical problems, and quantitatively determine the ITD mutations by obtaining the ITD
related characteristics and the corresponding characteristic coefficients.
Specifically, the present application provides an apparatus, a method, and an electronic device for the quantitative ITD mutation detection based on sequencing data, in which the sequencing data of samples to be tested is first acquired, the ITD characteristics of the sequencing data of the samples to be tested are then extracted, and the ITD mutation allele frequency of the samples to be tested is obtained based on ITD characteristic coefficients and the ITD characteristics of the samples to be tested, wherein the ITD characteristics are the ITD characteristics of the whole region of a nucleic acid sequence or the ITD characteristics of a specific region of a nucleic acid sequence.
Herein, those skilled in the art can understand that the apparatus, method and electronic device for quantitatively detecting ITD mutation based on sequencing data provided by the present application can be used for the quantitative ITD mutation detection of various sequencing data, for example, data from whole genome sequencing, target sequence capture sequencing, etc., as long as the sequencing method is one commonly used for detecting ITD mutation. Therefore, although sequencing data from target sequence capture is mainly described below as an example, the embodiments of the present application are not limited thereto.
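Since the sequencing data is described earlier as possibly being fastq data, the acquisition step can be pictured with a minimal reader for plain four-line FASTQ records. The function name `read_fastq` and the two example reads are hypothetical, and the sketch assumes uncompressed, unwrapped records; it is not a replacement for an established parsing library.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        if not header.strip():  # tolerate blank lines between records
            continue
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:].strip(), seq, qual


# Hypothetical two-read input, for demonstration only.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nFFFF\n")
records = list(read_fastq(example))
```

The same generator would work on a file opened with `open(path)`, so reads can be streamed into the characteristic-extraction step without loading the whole file.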
After introduction of the basic principles of the present application, the exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It is apparent that the described embodiments are only a part of the embodiments of the present application, and are not intended to show all embodiments. It should be understood that the present application is not limited by the exemplary embodiments described herein.
Exemplary Apparatus
Fig. 1 is a schematic diagram showing an apparatus for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application. As shown in Fig. 1, an apparatus 1708 for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application includes:
a data acquisition module 100 for samples to be tested, which is configured to acquire the sequencing data of samples to be tested;
a data pre-processing module 200 for samples, which is connected to the data
acquisition module, and is configured to extract the ITD characteristics of samples to
be tested, wherein said ITD characteristics are the ITD characteristics of the whole region of a nucleic acid sequence or the ITD characteristics of the specific region of a
nucleic acid sequence;
a quantification module 300, which is connected to the data pre-processing module, and is configured to obtain ITD mutation allele frequency based on the ITD
characteristic coefficients and the ITD characteristics of samples to be tested; and
a detection result output module 400, which is connected to the quantification module, and is configured to output the ITD mutation allele frequency as a detection
result of ITD mutation allele frequency of samples to be tested.
In the embodiment of the present application shown in Fig. 2, the quantification module 300 further includes a quantitative model sub-module 310 for obtaining the ITD mutation allele frequency based on the ITD characteristic coefficients and the ITD characteristics of samples to be tested, wherein the quantitative model set in the quantitative model sub-module is represented by the following equation (1):
$$\hat{y}(w, x) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n \quad (1)$$
In equation (1), $\hat{y}(w, x)$ represents the ITD mutation allele frequency, $w_{0-n}$ represent the ITD characteristic coefficients, and $x_{0-n}$ represent the ITD characteristics.
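As a minimal sketch of how equation (1) could be evaluated, assuming each ITD characteristic has already been encoded as a number, the following may help. The function name `predict_allele_frequency` and the numeric values are illustrative assumptions and do not come from the application.

```python
from typing import Sequence


def predict_allele_frequency(w: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate equation (1): w_0 + w_1*x_1 + ... + w_n*x_n.

    w holds the intercept w_0 followed by n coefficients; x holds the n
    numerically encoded ITD characteristics x_1..x_n (an assumed encoding).
    """
    if len(w) != len(x) + 1:
        raise ValueError("need an intercept plus one coefficient per characteristic")
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


# Illustrative numbers only; these are not coefficients from the application.
example = predict_allele_frequency([0.01, 0.002, 0.5], [21.0, 0.1])
```

Keeping the intercept as the first element of `w` mirrors the way equation (1) lists $w_0$ before the weighted terms.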
The quantitative model sub-module 310 is configured to acquire the ITD characteristic coefficients $w_{0-n}$, or to acquire the ITD characteristics $x_{0-n}$ and the ITD characteristic coefficients $w_{0-n}$ corresponding thereto. Particularly, in the embodiment of the present application, as shown in Fig. 2, the ITD characteristic coefficients, or the ITD characteristics and the ITD characteristic coefficients corresponding thereto, of the quantitative model sub-module 310 are configured by a coefficient training sub-module 320.
In a preferred example of the present invention, as shown in Fig. 3, the coefficient training sub-module 320 includes a data acquisition unit 321 for sequencing samples, which is configured to acquire the first detection results of first test samples and the second detection results of first test samples as a training set; and a machine learning unit 322, which is configured to use the first detection results of first test samples and the second detection results of first test samples as a training set, and to obtain the ITD characteristic coefficients $w_{0-n}$ by machine learning on the training set, wherein the first detection result is high-throughput sequencing data, and the second detection result is the mutation allele frequency value obtained by the ITD gold-standard detection method.
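The application does not name the learning algorithm. Because equation (1) is linear in the coefficients, one plausible choice is ordinary least squares, and the sketch below assumes exactly that. The function name `fit_coefficients` and the toy training values are hypothetical; the rows stand for ITD characteristics already extracted from the first detection results, and the targets stand for gold-standard allele frequencies.

```python
from typing import List, Sequence


def fit_coefficients(features: Sequence[Sequence[float]],
                     targets: Sequence[float]) -> List[float]:
    """Least-squares estimate of w_0..w_n for equation (1) (assumed learner).

    Solves the normal equations (A^T A) w = A^T y by Gauss-Jordan
    elimination with partial pivoting, using only the standard library.
    """
    rows = [[1.0, *row] for row in features]      # leading 1.0 carries w_0
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("characteristics are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Toy training set generated from known coefficients (not patent data).
train_x = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
train_y = [0.1 + 0.2 * a + 0.3 * b for a, b in train_x]
w = fit_coefficients(train_x, train_y)
```

On this noise-free toy data the fit recovers the generating coefficients up to rounding; a regularized or non-linear learner could be substituted without changing the surrounding workflow.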
In still another preferred example of the present invention, as shown in Fig. 4, the coefficient training sub-module 320 includes:
a data acquisition unit 323 for sequencing samples, which is configured to acquire the first detection results of first test samples and the second detection results of first test samples, and to acquire the first detection results of second test samples and the second detection results of second test samples;
a machine learning unit 324, which is configured to use the first detection results of first test samples and the second detection results of first test samples as a training set, and to obtain the ITD characteristic coefficients $w_{0-n}$ by machine learning on the training set;
a machine learning detection unit 325, which is configured to perform a test by using the first detection results of second test samples, and to compare the mutation allele frequency value calculated by equation (1) with the second detection results of second test samples;
a test result assessment unit 326, which is configured to assess whether the comparison result meets the expectation; and
a machine learning revise unit 327, which is configured to determine the values of the ITD characteristic coefficients $w_{0-n}$ when the comparison result meets the expectation, and, when the comparison result does not meet the expectation, to revise the ITD characteristics $x_{0-n}$ adopted in equation (1) (re-providing or re-selecting them) and to determine the ITD characteristic coefficients $w_{0-n}$ again;
wherein the first detection result is high-throughput sequencing data, and the second detection result is the mutation allele frequency value obtained by the ITD gold-standard detection method.
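The train, test, assess and revise cycle described for units 323 to 327 can be sketched as a search over subsets of candidate characteristics. This is only one possible reading: the application does not say how characteristics are revised, so trying subsets from largest to smallest, the `threshold` parameter, and the injected `fit` and `score` callables are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Sample = Dict[str, float]  # characteristic name -> numeric value (assumed encoding)


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Equation (1): intercept plus weighted characteristics."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def train_with_revision(
    names: Sequence[str],
    train: Sequence[Sample], train_y: Sequence[float],
    test: Sequence[Sample], test_y: Sequence[float],
    fit: Callable[[List[List[float]], List[float]], List[float]],
    score: Callable[[Sequence[float], Sequence[float]], float],
    threshold: float,
) -> Optional[Tuple[List[str], List[float]]]:
    """Return the first characteristic subset whose held-out score meets the expectation."""
    for size in range(len(names), 0, -1):
        for subset in combinations(names, size):
            # Learn coefficients on the first test samples.
            w = fit([[s[n] for n in subset] for s in train], list(train_y))
            # Test on the second test samples and compare with their reference values.
            predicted = [predict(w, [s[n] for n in subset]) for s in test]
            if score(test_y, predicted) >= threshold:
                return list(subset), w          # expectation met: keep these coefficients
    return None                                  # no subset met the expectation


# Toy stand-ins so the loop can be exercised without a real learner.
def unit_fit(rows: List[List[float]], targets: List[float]) -> List[float]:
    return [0.0] + [1.0] * len(rows[0])          # pretends every coefficient is 1

def exact_match(reference: Sequence[float], predicted: Sequence[float]) -> float:
    return 1.0 if all(abs(r - p) < 1e-9 for r, p in zip(reference, predicted)) else 0.0

chosen = train_with_revision(
    ["a", "b"],
    [{"a": 0.1, "b": 5.0}], [0.1],
    [{"a": 0.3, "b": 2.0}], [0.3],
    unit_fit, exact_match, threshold=1.0,
)
```

In this toy run the two-characteristic model fails the check, so the loop falls back to the single characteristic "a". In practice `fit` would be a real learner and `score` a goodness-of-fit measure.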
As described above, the apparatus 1708 for detecting ITD mutation allele frequency based on sequencing data according to examples of the present application
can be used in various terminal devices, such as servers for targeted capture
sequencing, and the like. In one example, the apparatus 1708 according to the present example can be integrated into a terminal device as a software module and/or
hardware module. For example, the apparatus 1708 may be a software module in an
operating system of the terminal device, or may be an application program developed
for the terminal device; of course, the apparatus 1708 may also be one of a number of hardware modules of the terminal device.
Alternatively, in another example, the apparatus 1708 for the quantitative ITD mutation detection based on sequencing data and the terminal device can be separate devices, in which case the apparatus 1708 is connected to the terminal device via a wired and/or wireless network and exchanges information with it in an agreed data format.
Exemplary Method
Fig. 5 is a flowchart showing a method for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application. As shown in Fig. 5, a method for the quantitative ITD mutation detection based on sequencing data according to an embodiment of the present application includes:
S100, acquiring the sequencing data of samples to be tested;
S200, extracting ITD characteristics of the sequencing data of samples to be tested, wherein the ITD characteristics are the ITD characteristics of the whole region of a nucleic acid sequence or the ITD characteristics of the specific region of a nucleic acid sequence;
S300, quantitatively detecting the ITD mutation allele frequency of samples to be tested, and obtaining the quantitative ITD mutation detection result of samples to be tested based on the ITD characteristic coefficients and the ITD characteristics of samples to be tested.
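Steps S100 to S300 can be read as a three-stage pipeline. The sketch below shows only the control flow; `quantify_sample`, `fake_acquire` and `fake_extract` are hypothetical names, and the stand-in acquisition and extraction functions return made-up values rather than real sequencing data or real ITD characteristics.

```python
from typing import Callable, List, Sequence


def quantify_sample(
    source: str,
    acquire: Callable[[str], List[str]],
    extract: Callable[[List[str]], List[float]],
    coefficients: Sequence[float],
) -> float:
    """Run S100 (acquire), S200 (extract), S300 (quantify) for one sample."""
    reads = acquire(source)                      # S100: sequencing data of the sample
    characteristics = extract(reads)             # S200: numeric ITD characteristics
    if len(coefficients) != len(characteristics) + 1:
        raise ValueError("need an intercept plus one coefficient per characteristic")
    # S300: allele frequency from coefficients and characteristics, as in equation (1)
    return coefficients[0] + sum(
        w * x for w, x in zip(coefficients[1:], characteristics)
    )


# Stand-ins for illustration only; a real system would read sequencing files
# and derive ITD characteristics from alignments.
def fake_acquire(source: str) -> List[str]:
    return ["ACGT", "ACGTACGT"]

def fake_extract(reads: List[str]) -> List[float]:
    return [float(len(reads)), float(max(len(r) for r in reads))]

frequency = quantify_sample("sample.fastq", fake_acquire, fake_extract, [0.0, 0.01, 0.005])
```

Passing the acquisition and extraction stages in as callables keeps the sketch independent of any particular file format or aligner.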
Exemplary Electronic Device
Hereinafter, an electronic device according to an embodiment of the present application will be described with reference to Fig. 6.
Fig. 6 illustrates a block diagram of an electronic device according to an
embodiment of the present application. As shown in Fig. 6, an electronic device 10 includes one or more processors 11
and memory 12.
The processor 11 may be a central processing unit (CPU) or other form of
processing unit with data processing capability and/or instruction executing capability, and may control other components in the electronic device 10 to perform desired
functions.
The memory 12 may include one or more computer program products, which may include various forms of computer readable storage media, such as a volatile
memory and/or a nonvolatile memory. The volatile memory may include, for example,
a random access memory (RAM), and/or a cache, and the like. The nonvolatile memory may include, for example, a read only memory (ROM), a hard disk, a flash
memory, and the like. One or more computer program instructions can be stored in the
computer readable storage medium, and the processor 11 can execute the program instructions to realize the method for the quantitative ITD mutation detection based on
sequencing data in each of the above embodiments according to the present application, and/or other desired functions. Various contents such as the
above-described ITD characteristics, ITD characteristic coefficients, and the like can also be stored in the computer readable storage medium.
In one example, electronic device 10 may also include an input apparatus 13 and an output apparatus 14 that are interconnected by a bus system and/or other form of
connections (not shown).
The input apparatus 13 can include, for example, a keyboard, a mouse, and the like.
The output apparatus 14 can output various kinds of information to the outside, such as the detection result of the quantitative ITD mutation detection and the like.
The output apparatus 14 can include, for example, a display, a speaker, a printer, and a
communication network and the remote output apparatus connected thereto, and the like.
Of course, for simplicity, only some components of the electronic device 10
related to the present application are shown in Fig. 6 and the components such as the bus, the input/output interface, and the like are omitted. In addition to this, the
electronic device 10 may also include any other suitable components depending on
the concrete cases of the applications.
Exemplary Computer Program Product and Computer Readable Storage Medium
In addition to the method and apparatus described above, an embodiment of the present application can also be a computer program product including computer program instructions which, when executed by a processor, cause the processor to perform the steps of the method for the quantitative ITD mutation detection based on sequencing data according to each of the embodiments of the present application described in the "Exemplary Method" section of this specification.
As for the computer program product, any combination of one or more programming languages can be used for writing the program codes for performing the
operations of embodiments of the present application, and the programming languages include object-oriented programming languages, such as Java, C++, etc., and also
include conventional procedural programming languages, such as the "C" language, or similar programming languages. The program codes can be executed entirely on the user's computing device, partially on the user's device, as a stand-alone software package, partially on the user's computing device while partially on the remote computing device, or entirely on the remote computing device or server.
Furthermore, an embodiment of the present application can also be a computer readable storage medium with computer program instructions stored therein which, when executed by a processor, cause the processor to perform the steps of the method for the quantitative detection of ITD mutation based on sequencing data according to each of the embodiments of the present application described in the "Exemplary Method" section of this specification.
The computer readable storage medium can employ any combination of one or
more readable mediums. The readable medium may be a readable signal medium or a
readable storage medium. A readable storage medium can include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or
semiconductor system, apparatus, or element, or any combination of the above. More
concrete examples (non-exhaustively listed) of readable storage media include: an
electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read only memory (ROM), an erasable programmable
read-only memory (EPROM, or flash memory), an optical fiber, a portable compact
disk read only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
Embodiments
Hereinafter, a concrete embodiment of the apparatus for the quantitative ITD mutation detection based on sequencing data according to the present application will be described with reference to Figs. 1-5, so as to explain the present invention in more detail, but the present invention is not limited by these embodiments.
First, the targeted capture sequencing data and the PCR-CE ITD quantitative test results of the FLT3 gene ITD frequently-occurring region (chr13: 28608000-28608600) from 80 patients with acute myeloid leukemia (AML) are collected and divided equally into ten groups. Nine of the groups (the first test samples) form the training set for establishing the quantitative model, and the coefficient training sub-module is used to obtain the ITD characteristics and the corresponding ITD characteristic coefficients for the present embodiment. The remaining group (the second test samples) is used to check whether the test result of the above quantitative model meets the expectation. When the test result does not meet the expectation, the ITD characteristics adopted by the quantitative model are revised (increased, decreased, or re-provided), and the ITD characteristic coefficients are determined again.
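With 80 samples and ten equal groups, each group holds 8 samples, so 72 samples are used for training and 8 are held out. A minimal sketch of such a split follows; the application does not say how samples are assigned to groups, so the seeded shuffle and the name `split_into_groups` are assumptions, and integer indices stand in for patients.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_groups(samples: Sequence[T], n_groups: int, seed: int = 0) -> List[List[T]]:
    """Shuffle the samples reproducibly and cut them into equally sized groups."""
    if n_groups <= 0 or len(samples) % n_groups:
        raise ValueError("samples must divide equally into the requested groups")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)        # assumed: random assignment
    size = len(shuffled) // n_groups
    return [shuffled[i * size:(i + 1) * size] for i in range(n_groups)]


groups = split_into_groups(range(80), 10)        # indices stand in for the 80 patients
first_test_samples = [s for g in groups[:9] for s in g]   # nine groups: training set
second_test_samples = groups[9]                             # remaining group: held out
```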
Therefore, it can be understood that the quantification module of the embodiment includes a quantitative model sub-module, wherein the quantitative model set in the quantitative model sub-module is represented by the following equation:
$$\hat{y}(w, x) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n$$
wherein $\hat{y}(w, x)$ represents the ITD mutation allele frequency, $w_{0-n}$ represent the ITD characteristic coefficients, and $x_{0-n}$ represent the ITD characteristics.
In this embodiment, the quantitative model is configured by the coefficient training sub-module, which is used to acquire the ITD characteristic coefficients $w_{0-n}$ and the ITD characteristics $x_{0-n}$.
Particularly, in this embodiment, the coefficient training sub-module is configured with:
a detection result acquisition unit, which is configured to acquire the first detection results of the first test samples and the second detection results of the first test samples, and to acquire the first detection results of the second test samples and the second detection results of the second test samples;
a machine learning unit, which is configured to use the first detection results of the first test samples and the second detection results of the first test samples as a training set, and to obtain the ITD characteristic coefficients $w_{0-n}$ by machine learning on the training set;
a machine learning test unit, which is configured to perform tests by using the first detection results of the second test samples, and to compare the mutation allele frequency value calculated by the quantitative model with the second detection results of the second test samples;
a test result assessment unit, which is configured to assess whether the comparison result meets the expectation; and
a machine learning revise unit, which is configured to determine the values of the ITD characteristic coefficients $w_{0-n}$ when the comparison result meets the expectation, and, when the comparison result does not meet the expectation, to revise the ITD characteristics $x_{0-n}$ adopted by the quantitative model and to determine the ITD characteristic coefficients $w_{0-n}$ again;
wherein the first detection result is high-throughput sequencing data, and the second detection result is the mutation allele frequency value obtained by the ITD standard detection method.
Whether the comparison result meets the expectation is assessed by the following equation (2):
$$R^2 = 1 - \frac{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \hat{y}_i)^2}{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \bar{y})^2} \quad (2)$$
In equation (2), $y_i$ represents the second detection results of the second test samples, $\hat{y}_i$ represents the mutation allele frequency value calculated by equation (1), and $\bar{y}$ represents the mean value of the second detection results of the second test samples.
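Equation (2) is the coefficient of determination $R^2$ between the reference values and the model output. A small standard-library sketch is given below; the function name `r_squared` and the example numbers are illustrative, and the guard against a zero denominator is an added assumption about how the degenerate case should be handled.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((y - mean) ** 2 for y in y_true)                # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


# Illustrative values only: a perfect prediction scores exactly 1.0.
perfect = r_squared([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])
```

A value close to 1 means the model output tracks the reference measurements closely, while predicting the mean for every sample gives 0.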
The quantitative model, the ITD characteristics $x_{0-n}$ and the characteristic coefficients $w_{0-n}$ of the present embodiment are obtained by the above apparatus and operation.
Then, the targeted capture sequencing data (in FASTQ format) and the PCR-CE ITD quantitative test results of the FLT3 gene ITD frequently-occurring region (chr13: 28608000-28608600) from 30 patients with acute myeloid leukemia (AML) are collected.
Next, the ITD characteristics of the sequencing data from the above samples are extracted by the data pre-processing module, wherein the ITD characteristics are the ITD characteristics of the target capture region. These ITD characteristics include, but are not limited to: the length of the insert segment (ITD mutant segment), the complexity of the insert segment, the number of sequencing reads supporting the insert segment, the position of the insert segment, and the depth of the insert segment.
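One way to hold the listed characteristics and turn them into the numeric vector used by the quantitative model is sketched below. The application does not define how complexity is measured, so the Shannon entropy of the base composition is used purely as an assumed stand-in. The class name, field names and example values are likewise hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ItdCharacteristics:
    position: int            # genomic position of the insert segment
    insert_sequence: str     # bases of the insert (ITD mutant) segment
    supporting_reads: int    # number of reads supporting the insert segment
    depth: int               # depth at the insert segment

    def __post_init__(self) -> None:
        if not self.insert_sequence:
            raise ValueError("insert_sequence must not be empty")

    @property
    def length(self) -> int:
        return len(self.insert_sequence)

    @property
    def complexity(self) -> float:
        """Shannon entropy in bits of the base composition (assumed definition)."""
        total = len(self.insert_sequence)
        counts = Counter(self.insert_sequence)
        return -sum(c / total * math.log2(c / total) for c in counts.values())

    def as_vector(self) -> List[float]:
        """Characteristics in a fixed order, ready to be weighted by coefficients."""
        return [float(self.length), self.complexity, float(self.supporting_reads),
                float(self.position), float(self.depth)]


# Made-up values for illustration; not taken from any sample in the application.
example = ItdCharacteristics(position=100, insert_sequence="ACGT",
                             supporting_reads=12, depth=200)
vector = example.as_vector()
```

Deriving length and complexity from the stored sequence avoids keeping values that could drift out of sync with it.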
Finally, the detection result of the ITD mutation allele frequency is obtained by the quantitative calculation module based on the ITD characteristic coefficients and the ITD characteristics of samples to be tested. The detection result is output by an output module, as shown in the following example:
Chr13 28608251 21_ TGAGATCATATTCATATTCTC INS 0.0809866666666666
In the displayed example of the output detection result, the first and second items give the absolute position of the ITD in the genome, the third item is the length of the ITD, the fourth item is the sequence of the ITD, the fifth item is the type of the ITD, and the final item is the quantitative detection result, i.e. the ITD mutation allele frequency (the ITD quantitative result of this example is 8.09%).
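A consumer of this output could parse each line into named fields. The sketch below assumes the six items are separated by whitespace and tolerates the trailing underscore that appears after the length in the printed example. `ItdResult` and `parse_result_line` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItdResult:
    chromosome: str
    position: int
    length: int
    sequence: str
    mutation_type: str
    allele_frequency: float


def parse_result_line(line: str) -> ItdResult:
    """Split one output line into its six items and convert the numeric ones."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    chrom, pos, length, seq, kind, freq = fields
    # The printed example shows the length as "21_"; strip the underscore if present.
    result = ItdResult(chrom, int(pos), int(length.rstrip("_")), seq, kind, float(freq))
    if result.length != len(result.sequence):
        raise ValueError("length field does not match the ITD sequence")
    return result


record = parse_result_line(
    "Chr13 28608251 21_ TGAGATCATATTCATATTCTC INS 0.0809866666666666"
)
```

The length check is a cheap consistency test: in the example line, the stated length of 21 matches the 21 bases of the reported sequence.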
By using the apparatus and method for the quantitative ITD mutation detection based on the sequencing data according to the present embodiment, the quantitative detection results of all 30 samples are obtained, as shown in Fig. 7. In Fig. 7, the abscissa is the sample number, and the ordinate is the ITD mutation allele frequency value. The model prediction curve is the ITD mutation allele frequency value obtained by using the apparatus and method for detecting the ITD mutation allele frequency based on sequencing data according to the preferred embodiment; the NGS result curve is the ITD mutation allele frequency calculated directly without the model training; and the gold standard curve is the quantitative detection result obtained by the PCR-CE method. The $R^2$ values of the model prediction curve and the NGS curve, obtained by equation (2) of the present embodiment, are 0.9951 and 0.875 respectively. It can be seen from this that the ITD mutation allele frequency obtained by the apparatus and method for the quantitative ITD mutation detection based on sequencing data according to the preferred embodiment correlates more closely with the gold standard and has a higher degree of conformity with it.
The fundamentals of the present application have been described above in conjunction with particular embodiments. However, it should be noted that the benefits, advantages, effects, and the like mentioned in the present application are merely examples and not limitations, and are not to be considered as required in each of the embodiments of the present application. In addition, the specific details of the above disclosure are given only for the purpose of illustration and for ease of understanding, and are not intended to limit the present application; the application need not be implemented with those specific details.
INDUSTRIAL APPLICABILITY
According to the present invention, provided are an apparatus and a method capable of detecting an ITD mutation quantitatively while acquiring the ITD
sequencing information.

Claims (13)

1. An apparatus for the quantitative ITD mutation detection based on sequencing data,
including:
a data acquisition module, which is configured to acquire sequencing data of samples to be tested;
a data pre-processing module, which is connected to the data acquisition module, and is
configured to extract ITD characteristics of samples to be tested, wherein the ITD characteristics are ITD characteristics of the whole region of nucleotide sequences or ITD characteristics of specific
regions of nucleotide sequences;
a quantification module, which is connected to the data pre-processing module, and is configured to obtain ITD mutation allele frequency based on the ITD characteristic coefficients and
ITD characteristics of samples to be tested; and
a detection result output module, which is connected to the quantification module, and is configured to output the ITD mutation allele frequency as the quantitative detection result of the ITD mutation of samples to be tested.
2. The apparatus according to claim 1, wherein the quantification module includes a quantitative model sub-module for obtaining ITD mutation allele frequency based on the ITD characteristic coefficients and ITD characteristics of samples to be tested, wherein the quantitative model set in the quantitative model sub-module is represented by the following equation (1),
$$\hat{y}(w, x) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n \quad (1)$$
in the equation (1), $\hat{y}(w, x)$ represents ITD mutation allele frequency, $w_{0-n}$ represent ITD characteristic coefficients, and $x_{0-n}$ represent ITD characteristics.
3. The apparatus according to claim 2, wherein the quantitative model is configured by a coefficient training sub-module, and is configured to acquire the ITD characteristic coefficients $w_{0-n}$, and wherein the coefficient training sub-module includes:
a detection result acquisition unit, which is configured to acquire first detection results of first test samples and second detection results of first test samples as a training set,
a machine learning unit, which is configured to use the first detection results of first test samples and second detection results of first test samples as a training set, and obtain the ITD characteristic coefficients $w_{0-n}$ through machine learning of the training set,
wherein, the first detection results are high-throughput sequencing data, and the second detection results are the mutation allele frequency value of the ITD standard detection.
4. The apparatus according to claim 2, wherein the quantitative model is configured by a coefficient training sub-module, and is connected to the data pre-processing module for acquiring the ITD characteristic coefficients $w_{0-n}$, wherein the coefficient training sub-module includes:
a detection result acquisition unit, which is configured to acquire first detection results of first test samples and second detection results of first test samples, and acquire first detection results of second test samples and second detection results of second test samples,
a machine learning unit, which is configured to use the first detection results of first test samples and second detection results of first test samples as a training set, and obtain the ITD characteristic coefficients $w_{0-n}$ by the machine learning of the training set,
a machine learning test unit, which is configured to perform tests using the first detection results of second test samples, and compare the mutation allele frequency value calculated by the equation (1) with the second detection results of second test samples,
a test result assessment unit, which is configured to assess whether the comparison result meets the expectation,
a machine learning revise unit, which is configured to determine the values of ITD characteristic coefficients $w_{0-n}$ when the comparison result meets the expectation, and to modify the ITD characteristics $x_{0-n}$ adopted in the equation (1) when the comparison result does not meet the expectation, and reset the ITD characteristic coefficients $w_{0-n}$,
wherein, the first detection results are high-throughput sequencing data, and the second detection results are the mutation allele frequency of the ITD standard detection.
5. The apparatus according to claim 3 or 4, wherein the assessment is to be made whether the comparison result meets the expectation according to the following equation (2):
$$R^2 = 1 - \frac{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \hat{y}_i)^2}{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \bar{y})^2} \quad (2)$$
in the equation (2), $y_i$ represents the second detection results of second test samples, $\hat{y}_i$ represents the mutation allele frequency calculated by the equation (1), $\bar{y}$ represents the mean value of second detection results of second test samples.
6. The apparatus according to any one of claims 1-5, wherein the ITD characteristic is selected from one or two or more of the following characteristics: the position of the occurring ITD, the length of the ITD, the nucleotide sequence characteristics of the ITD, the nucleotide sequence characteristics before and after the position of the occurring ITD, and the nucleotide sequence characteristics of a particular sequence.
7. A method for the quantitative ITD mutation detection based on sequencing data, including:
acquiring sequencing data of samples to be tested;
extracting ITD characteristics of sequencing data of samples to be tested, wherein the ITD characteristics are ITD characteristics of the whole region of the nucleic acid sequences or ITD characteristics of specific regions of nucleic acid sequences;
quantitatively detecting the ITD mutation allele frequency of samples to be tested, and obtaining the quantitative detection result of samples to be tested based on ITD characteristic coefficients and ITD characteristics of samples to be tested.
8. The method according to claim 7, wherein the quantitative detection step is performed by a quantitative model represented by the following equation (1),
$$\hat{y}(w, x) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n \quad (1)$$
in the equation (1), $\hat{y}(w, x)$ represents the ITD mutation allele frequency, $w_{0-n}$ represent the ITD characteristic coefficients, and $x_{0-n}$ represent the ITD characteristics.
9. The method according to claim 8, wherein the method for acquiring the ITD characteristic coefficients $w_{0-n}$ of the quantitative model includes:
acquiring first detection results of first test samples and second detection results of first test samples as a training set,
obtaining the ITD characteristic coefficients $w_{0-n}$ by the machine learning of the training set,
wherein, the first detection result is the high-throughput sequencing data, and the second detection result is the mutation allele frequency of the ITD standard detection.
10. The method according to claim 8, wherein the method for acquiring the ITD characteristic coefficients $w_{0-n}$ of the quantitative model includes:
acquiring first detection results of first test samples and second detection results of first test samples, and acquiring first detection results of second test samples and second detection results of second test samples,
using the first detection results of first test samples and second detection results of first test samples as a training set, and obtaining the ITD characteristic coefficients $w_{0-n}$ by the machine learning of the training set,
using the first detection results of second test samples for testing, and comparing the mutation allele frequency value calculated by the equation (1) with the second detection results of second test samples to assess whether the comparison result meets the expectation,
if the comparison result meets the expectation, determining the ITD characteristic coefficients $w_{0-n}$; if the comparison result does not meet the expectation, modifying the ITD characteristics $x_{0-n}$ adopted in the equation (1) and resetting the ITD characteristic coefficients $w_{0-n}$,
wherein, the first detection result is the high-throughput sequencing data, and the second detection result is the mutation allele frequency of the ITD standard detection.
11. The method according to claim 9 or 10, wherein the assessment is to be made whether the comparison result meets the expectation according to the following equation (2):
$$R^2 = 1 - \frac{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \hat{y}_i)^2}{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \bar{y})^2} \quad (2)$$
in the equation (2), $y_i$ represents the second detection results of second test samples, $\hat{y}_i$ represents the mutation allele frequency calculated by the equation (1), $\bar{y}$ represents the mean value of the second detection results of second test samples.
12. The method according to any one of claims 7-11, wherein the ITD characteristic is selected from one or two or more of the following characteristics: the position of the occurring ITD, the
length of the ITD, the nucleotide sequence characteristics of the ITD, the nucleotide sequence
characteristics before and after the position of the occurring ITD, and the nucleotide sequence characteristics of particular sequences.
13. An electronic device, including:
a processor; and
a memory, in which computer program instructions are stored, wherein, when the computer program instructions are executed by the processor, the method for the quantitative ITD mutation detection based on sequencing data according to any one of claims 7-12 is performed by the processor.
AU2018391843A 2017-12-21 2018-12-20 Sequencing data-based ITD mutation ratio detecting apparatus and method Active AU2018391843B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2022218581A AU2022218581B2 (en) 2017-12-21 2022-08-18 Sequencing data-based itd mutation ratio detecting apparatus and method

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201711395827.7 2017-12-21
CN201711395827 2017-12-21
PCT/CN2018/122394 WO2019120254A1 (en) 2017-12-21 2018-12-20 Sequencing data-based itd mutation ratio detecting apparatus and method

Related Child Applications (1)

Application Number Title Priority Date Filing Date
AU2022218581A Division AU2022218581B2 (en) 2017-12-21 2022-08-18 Sequencing data-based itd mutation ratio detecting apparatus and method

Publications (2)

Publication Number Publication Date
AU2018391843A1 (en) 2020-08-06
AU2018391843B2 AU2018391843B2 (en) 2022-07-07

Family

ID=66994429

Family Applications (2)

Application Number Title Priority Date Filing Date
AU2018391843A Active AU2018391843B2 (en) 2017-12-21 2018-12-20 Sequencing data-based ITD mutation ratio detecting apparatus and method
AU2022218581A Active AU2022218581B2 (en) 2017-12-21 2022-08-18 Sequencing data-based itd mutation ratio detecting apparatus and method

Family Applications After (1)

Application Number Title Priority Date Filing Date
AU2022218581A Active AU2022218581B2 (en) 2017-12-21 2022-08-18 Sequencing data-based itd mutation ratio detecting apparatus and method

Country Status (4)

Country Link
CN (1) CN109943635A (en)
AU (2) AU2018391843B2 (en)
NZ (1) NZ766350A (en)
WO (1) WO2019120254A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424664B (en) * 2022-11-07 2023-03-10 北京雅康博生物科技有限公司 Method and device for evaluating man-made mutation degree

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040024532A1 (en) * 2002-07-30 2004-02-05 Robert Kincaid Method of identifying trends, correlations, and similarities among diverse biological data sets and systems for facilitating identification
CN101560564A (en) * 2009-04-08 2009-10-21 北京华生恒业科技有限公司 Detection device and detection system
CN105331606A (en) * 2014-08-12 2016-02-17 焦少灼 Nucleic acid molecule quantification method applied to high-throughput sequencing
CN105969856B (en) * 2016-05-13 2019-11-12 万康源(天津)基因科技有限公司 A kind of unicellular exon sequencing tumour somatic mutation detection method
CN106845155B (en) * 2016-12-29 2021-11-16 安诺优达基因科技(北京)有限公司 Device for detecting internal series repetition
EP3378950A1 (en) * 2017-03-21 2018-09-26 Sequencing Multiplex SLK Easy one-step amplification and labeling (eosal)

Also Published As

Publication number Publication date
AU2022218581B2 (en) 2023-09-28
WO2019120254A1 (en) 2019-06-27
NZ766350A (en) 2022-05-27
AU2018391843B2 (en) 2022-07-07
AU2022218581A1 (en) 2022-09-15
CN109943635A (en) 2019-06-28

Similar Documents

Publication Publication Date Title
JP5171254B2 (en) Automated analysis of multiple probe target interaction patterns: pattern matching and allele identification
CN109411015B (en) Tumor mutation load detection device based on circulating tumor DNA and storage medium
CN112768089B (en) Method, apparatus and storage medium for predicting drug sensitivity status
CN108038352B (en) Method for mining whole genome key genes by combining differential analysis and association rules
RU2517286C2 (en) Classification of samples data
JP2005531853A (en) System and method for SNP genotype clustering
KR102044094B1 (en) Method for classifying cancer or normal by deep neural network using gene expression data
CN109887546B (en) Single-gene or multi-gene copy number detection system and method based on next-generation sequencing
CN110268072A (en) Determine the method and system of paralog gene
US20220277811A1 (en) Detecting False Positive Variant Calls In Next-Generation Sequencing
US20180196924A1 (en) Computer-implemented method and system for diagnosis of biological conditions of a patient
Wu et al. Aro: a machine learning approach to identifying single molecules and estimating classification error in fluorescence microscopy images
AU2022218581B2 (en) Sequencing data-based itd mutation ratio detecting apparatus and method
CN113823353B (en) Gene copy number amplification detection method, device and readable medium
CN101517579A (en) Method of searching for protein and apparatus therefor
Ziegler et al. MiMSI - a deep multiple instance learning framework improves microsatellite instability detection from tumor next-generation sequencing
WO2014083018A1 (en) Method and system for processing data for evaluating a quality level of a dataset
KR20210044400A (en) Method and apparatus for discovering biomarker for predicting cancer prognosis using heterogeneous platform of DNA methylation data
Qiu et al. Genomic processing for cancer classification and prediction - A broad review of the recent advances in model-based genomic and proteomic signal processing for cancer detection
CN113159529A (en) Risk assessment model and related system for intestinal polyp
Zhang et al. Radio-iBAG: Radiomics-based integrative Bayesian analysis of multiplatform genomic data
JPWO2002048915A1 (en) Methods for detecting associations between genes
CN114694752B (en) Method, computing device and medium for predicting homologous recombination repair defects
Lauria Rank-based miRNA signatures for early cancer detection
US20230005569A1 (en) Chromosomal and Sub-Chromosomal Copy Number Variation Detection

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)