CN108070656A - Lung cancer marker and its application - Google Patents

Info

Publication number
CN108070656A
Authority
CN
China
Prior art keywords
lung cancer
marker
fusobacterium
lung
risk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711114229.8A
Other languages
Chinese (zh)
Other versions
CN108070656B (en)
Inventor
王子榕
刘华勇
袁剑颖
李英镇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huada Yinyuan Pharmaceutical Technology Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd filed Critical BGI Shenzhen Co Ltd
Priority to CN201711114229.8A priority Critical patent/CN108070656B/en
Publication of CN108070656A publication Critical patent/CN108070656A/en
Application granted granted Critical
Publication of CN108070656B publication Critical patent/CN108070656B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides a set of lung cancer markers. The lung cancer markers include at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, and Bacteroides_salanitronis. The lung cancer markers according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offer high sensitivity, high specificity, and good reproducibility.

Description

Lung cancer marker and its application
Technical field
The present invention relates to the field of biological detection, in particular to a lung cancer marker and its application. More specifically, it relates to a lung cancer marker, a kit, the use of a reagent in preparing a kit, and a system for diagnosing lung cancer or assessing lung cancer risk.
Background art
Lung cancer is currently one of the most common malignant tumors in China, and its incidence and mortality there are rising year by year. Data released by the national cancer registry in 2014 show that in 2010 there were 605,900 new lung cancer cases in China, ranking first among malignant tumors and accounting for 19.59% of new malignant tumor cases. Most patients are already at an advanced stage when the disease is found and have lost the chance of early treatment. Lung cancer falls mainly into two types: small cell lung cancer (SCLC) (16.8%) and non-small cell lung cancer (NSCLC) (80.4%). Non-small cell lung cancer has three main classes: squamous cell carcinoma, lung adenocarcinoma, and large cell lung cancer. Lung adenocarcinoma is the most common form of lung cancer (30%-65%). The cause of lung cancer has not yet been fully established.
However, existing methods for diagnosing lung cancer suffer from shortcomings such as insufficient specificity and low accuracy and sensitivity. A diagnostic method with higher specificity, accuracy, and sensitivity is therefore a key problem that researchers urgently need to solve.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
In a first aspect, the present invention provides a set of lung cancer markers. According to embodiments of the invention, the lung cancer markers include at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, and Bacteroides_salanitronis. The lung cancer markers according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offer high sensitivity, high specificity, and good reproducibility.
In a second aspect, the present invention provides a set of lung cancer markers. According to embodiments of the invention, the markers have at least one of the nucleotide sequences shown in SEQ ID NO: 1 to 10. The lung cancer markers according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offer high sensitivity, high specificity, and good reproducibility.
In a third aspect, the present invention provides a kit. According to embodiments of the invention, the kit includes a reagent for detecting at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, and Bacteroides_salanitronis. The kit according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offers high sensitivity, high specificity, and good reproducibility.
In a fourth aspect, the present invention provides the use of a reagent in preparing a kit, the kit being used to diagnose lung cancer or assess lung cancer risk, and the reagent being used to detect at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, and Bacteroides_salanitronis. The inventors found that these bacteria can serve as lung cancer markers that are effective for assessing lung cancer risk or for early diagnosis of lung cancer, with high sensitivity, high specificity, and good reproducibility.
In a fifth aspect, the present invention provides a system for diagnosing lung cancer or assessing lung cancer risk. According to embodiments of the invention, the system includes: a measurement device for determining the relative abundance of the aforementioned markers in a sample from the subject to be diagnosed; and a determining device for determining the diagnostic result for the subject based on the marker relative abundance obtained by the measurement device. The system according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offers high sensitivity, high specificity, and good reproducibility.
Brief description of the drawings
Fig. 1 shows a system for diagnosing lung cancer or assessing lung cancer risk according to an embodiment of the invention;
Fig. 2 shows the result of using the markers to identify lung cancer patients in the first cohort according to an embodiment of the invention; and
Fig. 3 shows the result of using the markers to identify lung cancer patients in the second cohort according to an embodiment of the invention.
Detailed description
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings. The embodiments described with reference to the drawings are exemplary; they are intended to explain the invention and should not be construed as limiting it.
Lung cancer marker
In one aspect, the present invention provides a set of lung cancer markers. According to embodiments of the invention, the lung cancer markers include at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, and Bacteroides_salanitronis. The lung cancer markers according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offer high sensitivity, high specificity, and good reproducibility.
In another aspect, the present invention provides a set of lung cancer markers. According to embodiments of the invention, the markers have at least one of the nucleotide sequences shown in SEQ ID NO: 1 to 10.
Here, SEQ ID NO: 1 represents the nucleotide sequence of the genome of Haemophilus_influenzae, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_014922.1;
SEQ ID NO: 2 represents the nucleotide sequence of the genome of Corynebacterium_argentoratense, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_022198.1;
SEQ ID NO: 3 represents the nucleotide sequence of the genome of Fusobacterium_sp._4_8, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_021281.1;
SEQ ID NO: 4 represents the nucleotide sequence of the genome of Capnocytophaga_ochracea, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_013162.1;
SEQ ID NO: 5 represents the nucleotide sequence of the genome of Staphylococcus_epidermidis, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_002976.3;
SEQ ID NO: 6 represents the nucleotide sequence of the genome of Campylobacter_concisus, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_009802.1;
SEQ ID NO: 7 represents the nucleotide sequence of the genome of Streptococcus_sp._I-P16, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_022582.1;
SEQ ID NO: 8 represents the nucleotide sequence of the genome of Fusobacterium_nucleatum, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_003454.1;
SEQ ID NO: 9 represents the nucleotide sequence of the genome of Acidovorax_sp._JS42, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_008782.1;
SEQ ID NO: 10 represents the nucleotide sequence of the genome of Bacteroides_salanitronis, available at https://www.ncbi.nlm.nih.gov/nuccore/NC_015164.1.
The lung cancer markers according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offer high sensitivity, high specificity, and good reproducibility.
It should be noted that the lung cancer markers according to embodiments of the invention are bacterial species, or bacterial genomes, that research has found to be associated with lung cancer. Because each of the above species is associated with lung cancer, each can be used alone or in combination for lung cancer detection or risk assessment where judgment accuracy is not a concern or where the accuracy requirement is low. In a preferred embodiment of the application, however, the markers are used together as a panel for lung cancer detection or risk assessment, as described in detail in the preferred technical solutions below.
It should further be noted that the biomarker combination for lung cancer detection or risk assessment according to embodiments of the invention does not detect lung cancer or assess risk directly from the presence or absence of the biomarkers. Instead, once the biomarker combination has been detected, the relative abundance of each marker is analyzed and entered into a multivariate statistical model, such as a random forest model. The probability output by the random forest model is then used to judge whether the subject under test has lung cancer or to assess the subject's risk of lung cancer, as described in detail in the technical solutions below.
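The step from relative abundances to a probability can be sketched as follows. This is a minimal illustration, not the model of the invention: it assumes that the forest's probability is the fraction of trees voting for the lung cancer class (one common convention), and the one-split "trees" and their cut-off values are hypothetical and invented purely for the example.

```python
from typing import Callable, Mapping, Sequence

# A "tree" here is any function that maps a relative-abundance profile
# to a vote: 1 for lung cancer, 0 for control.
Tree = Callable[[Mapping[str, float]], int]


def forest_probability(trees: Sequence[Tree], abundance: Mapping[str, float]) -> float:
    """Return the fraction of trees that vote for the lung cancer class."""
    if not trees:
        raise ValueError("the forest must contain at least one tree")
    votes = [tree(abundance) for tree in trees]
    return sum(votes) / len(votes)


# Hypothetical one-split trees; the cut-offs are NOT taken from the patent.
toy_trees = [
    lambda a: int(a.get("Fusobacterium_nucleatum", 0.0) > 0.02),
    lambda a: int(a.get("Haemophilus_influenzae", 0.0) > 0.05),
    lambda a: int(a.get("Capnocytophaga_ochracea", 0.0) > 0.01),
    lambda a: int(a.get("Staphylococcus_epidermidis", 0.0) > 0.03),
]

# Hypothetical relative-abundance profile of one sample.
sample = {
    "Fusobacterium_nucleatum": 0.04,
    "Haemophilus_influenzae": 0.01,
    "Capnocytophaga_ochracea": 0.02,
    "Staphylococcus_epidermidis": 0.05,
}
probability = forest_probability(toy_trees, sample)
```

A real forest would be fitted to labeled training samples; the point of the sketch is only that the output is a probability rather than a presence/absence call.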
It should be noted that the 10 species in the marker combination according to embodiments of the invention represent 10 microorganisms of the lung. According to embodiments of the invention, the content of these 10 microorganisms in the lung is measured, the relationship between their relative abundance and lung cancer is analyzed statistically, and a random forest model is built, which is then used to judge whether the subject under test has lung cancer or is at risk of lung cancer.
It should also be noted that the lung harbors far more than 10 kinds of microorganisms. According to embodiments of the invention, however, 10 microorganisms were selected by the random forest model as biomarkers for lung cancer detection, providing a new approach to the detection and assessment of lung cancer.
Kit
In another aspect, the present invention provides a kit. According to embodiments of the invention, the kit includes a reagent for detecting at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, and Bacteroides_salanitronis. The kit according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offers high sensitivity, high specificity, and good reproducibility.
The reagent for detecting the above bacteria is not particularly limited, provided it detects them effectively. According to a specific embodiment of the invention, the reagent includes universal primers suitable for whole-genome sequencing. In whole-genome sequencing, microbial genomic DNA is randomly broken into small fragments of 500 bp, universal primers are added at both ends of the fragments, PCR amplification and sequencing are performed, and the resulting reads are aligned to a reference database to determine the relative content of the detected microorganisms.
Use of a reagent in preparing a kit
In another aspect, the present invention provides the use of a reagent in preparing a kit, the kit being used to diagnose lung cancer or assess lung cancer risk, and the reagent being used to detect at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, and Bacteroides_salanitronis. The inventors found that these bacteria can serve as lung cancer markers that are effective for assessing lung cancer risk or for early diagnosis of lung cancer, with high sensitivity, high specificity, and good reproducibility.
It will be understood that the marker combination according to embodiments of the invention was developed specifically for lung cancer and can naturally be used for lung cancer detection or risk assessment. The marker combination can also be integrated into a kit or instrument dedicated to lung cancer detection, to make detection and assessment more convenient; any such use of the biomarker combination of the application falls within the scope of protection of the application. In addition, because the biomarker combination can detect lung cancer or assess lung cancer risk, it can also be used to compare lung cancer status or risk before and after a drug is administered, so as to judge whether the drug is effective and thereby screen drugs.
In another aspect, the application discloses the use of a method of judging lung cancer by detecting biomarkers in preparing a kit or instrument for lung cancer detection or risk assessment, wherein the biomarkers are the lung cancer marker combination of the application.
According to embodiments of the invention, the method of judging lung cancer by detecting biomarkers includes the following steps:
1) collect a sample from the subject under test, detect the biomarker combination of the application in the collected sample, and analyze the level of each species in the biomarker combination;
2) compare the level of each species measured in step 1) with a reference data set or reference values to obtain the detection result.
Preferably, the level of each species is its relative abundance, and the reference data set or reference values are the levels of each species of the biomarker combination in lung cancer patients and non-lung-cancer controls.
More preferably, the reference data set or reference values in step 2) are those of Table 3. Comparing the level of each species with the reference data set or reference values to obtain the detection result specifically includes calculating the probability of disease with a multivariate statistical model; preferably, the multivariate statistical model is a random forest model.
In yet another aspect, the application discloses a method of screening drug candidates for treating lung cancer, including the following steps:
1) measure the marker combination of the application in samples taken before and after administration of the drug, and analyze the level of each nucleic acid in the marker combination;
2) compare the level of each species in the samples before and after administration, and judge the drug candidate accordingly.
In step 2), comparing the level of each species in the samples before and after administration specifically includes calculating the probability of disease with a multivariate statistical model; preferably, the multivariate statistical model is a random forest model.
System for diagnosing lung cancer or assessing lung cancer risk
In another aspect, the present invention provides a system for diagnosing lung cancer or assessing lung cancer risk. According to embodiments of the invention, and with reference to Fig. 1, the system includes: a measurement device 100 for determining the relative abundance of the aforementioned markers in a sample from the subject to be diagnosed; and a determining device 200 for determining the diagnostic result for the subject based on the marker relative abundance obtained by the measurement device. The system according to embodiments of the invention can be used effectively to assess lung cancer risk or for early diagnosis of lung cancer, and offers high sensitivity, high specificity, and good reproducibility.
According to embodiments of the invention, the determining device 200 includes a module adapted to perform the following operations: input the marker relative abundance into a multivariate statistical model to obtain a probability of disease; and compare the probability of disease with a predetermined threshold, a probability of disease higher than the predetermined threshold indicating that the subject has lung cancer or is at high risk of lung cancer.
According to embodiments of the invention, the multivariate statistical model is a random forest model.
According to embodiments of the invention, the random forest model is built from known lung cancer patients and non-lung-cancer controls.
According to embodiments of the invention, the predetermined threshold is 0.5.
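The comparison performed by the determining device can be sketched as a single function. This is a minimal sketch under stated assumptions: the function name and the label strings are hypothetical; only the threshold of 0.5 and the "higher than" comparison come from the text.

```python
PREDETERMINED_THRESHOLD = 0.5  # value stated in the description


def judge(probability: float, threshold: float = PREDETERMINED_THRESHOLD) -> str:
    """Turn a model-output probability of disease into an indication."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # The text says "higher than" the threshold, so a probability exactly
    # equal to the threshold is not flagged.
    if probability > threshold:
        return "lung cancer or high risk"
    return "not flagged"
```

For example, `judge(0.73)` returns `"lung cancer or high risk"`, while `judge(0.5)` returns `"not flagged"`.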
According to embodiments of the invention, the sample is bronchoalveolar lavage fluid. As a sample for biomarker detection, bronchoalveolar lavage fluid is convenient to collect, simple to handle, and allows continuous in vitro detection.
According to embodiments of the invention, the measurement device includes a module adapted to perform sequencing.
The lung cancer markers of the application provide a new approach to lung cancer detection or risk assessment. They can be used for early diagnosis of lung cancer and avoid the delays in diagnosis or treatment that come from relying on conventional detection by symptoms or imaging. Other major advantages of the application include:
(a) The biomarkers of the application offer high sensitivity and high specificity for lung cancer detection or risk assessment and have important application value.
(b) As a sample for biomarker detection, bronchoalveolar lavage fluid is convenient to collect, simple to handle, and allows continuous in vitro detection.
(c) The markers of the application give reproducible results in lung cancer detection or risk assessment.
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings. The embodiments described with reference to the drawings are exemplary; they are intended to explain the invention and should not be construed as limiting it.
Example
1. Materials and methods
1.1 Sample collection
Sample collection for this example was carried out with the assistance of the respiratory medicine and thoracic surgery departments of West China Hospital. The subjects were mainly patients attending those two departments and were grouped into early-stage lung cancer patients, patients with benign lesions, and healthy individuals. Samples were taken following the standard clinical procedure for bronchoalveolar lavage fluid sampling, 10-15 ml per sample, for microbiome research. For every individual meeting the criteria below, detailed phenotypic information was recorded, covering medical history, family history, medication history, lifestyle, and so on, and each individual signed an informed consent form.
(1) Inclusion and exclusion criteria for each group
1) Lung cancer group
Inclusion criteria:
Patients aged 18-80 with pathologically confirmed lung cancer and no other underlying disease or systemic immune disease (no severe heart, liver, or kidney disease; not pregnant; no history of extrapulmonary tuberculosis, diabetes, tumor, mental illness, or epilepsy)
No antibiotic or immunosuppressant treatment within 2 weeks before sampling
Exclusion criteria:
Basic patient information missing
Diagnosis not established
Serious underlying cardiopulmonary disease or immune system disease
Antibiotic treatment within the last 2 weeks
2) Healthy group
Inclusion criteria:
Volunteers aged 18-80
Normal chest imaging
No HIV or other underlying disease
No antibiotic or immunosuppressant treatment within 2 weeks before sampling
No symptoms of acute respiratory infection (fever, cough, etc.) and no history of respiratory medication use within 4 weeks before sampling
Exclusion criteria:
Current lung disease or a past history of lung disease
Abnormal lung function values
Abnormal findings on chest imaging
Antibiotic or immunosuppressant use within the last 2 weeks
Respiratory symptoms or respiratory medication use within the last 4 weeks
Based on the above criteria, 85 suitable subjects were selected as the first cohort. The first cohort comprised 65 lung cancer patients and 20 non-lung-cancer individuals.
(2) Sample preparation and transport
1) Standard procedure for bronchoalveolar lavage fluid
Take 10 ml of physiological saline directly from the saline bottle into a specimen cup (neat saline);
Inject 10 ml of physiological saline through the bronchoscope with a sterile syringe and collect it in a specimen cup (scope saline).
Note: when bronchoalveolar lavage fluid is collected from the healthy group, all enrolled subjects must refrain from smoking for 12 h before fiberoptic bronchoscopy.
2) Sample transport and processing
Samples were transported at 4 °C and processed within 30 min; after centrifugation they were frozen at -80 °C.
1.2 DNA extraction and NGS sequencing
(1) DNA extraction
Qiagen ion exchange columns were used: samples were centrifuged at 500 g and 4 °C for 10 min to remove cells, and DNA was then extracted with a Qiagen kit.
(2) Library preparation
Nucleic acid fragmentation: 500 ng of nucleic acid was sheared in an ultrasonic shearing instrument into fragments concentrated at 150-200 bp, and the shearing result was checked on a 2100 instrument.
End repair: 100 ng of the sheared product from the previous step was end-repaired. Adapter ligation: each sample was given its own adapter so that samples could be told apart. PCR amplification: the adapter-ligated product underwent 8-10 cycles of PCR amplification to yield the finished library.
(3) Sequencing
Sequencing was carried out on the BGIseq-100 platform following the standard BGIseq-100 workflow. For whole-genome sequencing, the data volume was 1 G bases per sample.
The library construction and sequencing described above were performed by BGI Guangzhou.
1.3 Standard data processing
Data preprocessing and quality control: to improve the accuracy of the analysis, the data were first preprocessed by removing reads shorter than 35 bp and low-quality reads (number of N bases >= 10, or proportion of N bases >= 10%).
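The read filter described above can be sketched as a small predicate. The three cut-offs are those given in the text; the function and parameter names are hypothetical, and the sketch looks only at read length and N content because no other quality criterion is stated.

```python
def passes_qc(read: str, min_len: int = 35, max_n: int = 10, max_n_frac: float = 0.10) -> bool:
    """Return True if a read survives the length and N-content filters."""
    if len(read) < min_len:
        return False  # shorter than 35 bp
    n_count = read.upper().count("N")
    if n_count >= max_n:
        return False  # 10 or more N bases
    if n_count / len(read) >= max_n_frac:
        return False  # N bases make up 10% or more of the read
    return True


# Hypothetical reads used only to illustrate the three rules.
reads = ["A" * 40, "A" * 34, "A" * 36 + "N" * 4, "A" * 97 + "N" * 3]
kept = [r for r in reads if passes_qc(r)]
```

Here the second read is dropped for length and the third because 4 of its 40 bases are N, so `kept` holds the first and last reads.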
Removal of human sequences: because the samples contain a large amount of human DNA, and to reduce alignment errors and unnecessary effects on downstream analysis, the data were first aligned to the human reference genome (hg19) with the Burrows-Wheeler Aligner (BWA) software and the human reads were removed.
Alignment to the bacterial database (1494 bacterial species): the reads were aligned to the bacterial database with the Burrows-Wheeler Aligner (BWA) software to obtain the alignment results, from which the number of reads aligned to each marker was obtained and the relative abundance of each species in each sample was calculated. The relative abundance of a species in a sample is the ratio of the abundance of that species to the sum of the abundances of all species in the sample.
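The relative abundance defined above is a simple normalization. The sketch below assumes that the "abundance" of a species is its count of aligned reads; the text does not say whether any further correction (for genome length, for instance) was applied, and the counts shown are hypothetical.

```python
def relative_abundance(abundance: dict[str, float]) -> dict[str, float]:
    """Divide each species' abundance by the total over all species in the sample."""
    total = sum(abundance.values())
    if total <= 0:
        raise ValueError("the sample has no aligned reads")
    return {species: value / total for species, value in abundance.items()}


# Hypothetical aligned-read counts for one sample.
counts = {
    "Fusobacterium_nucleatum": 30,
    "Haemophilus_influenzae": 10,
    "other_species": 60,
}
profile = relative_abundance(counts)
```

By construction the values in `profile` sum to 1, so profiles from samples sequenced to different depths are comparable.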
1.4 Random forest classifier
To build a model that can identify abnormal samples, the randomForest package in R (3.3.2 RC) was used, with default parameters, to fit the relative abundance of the species in each sample against lung cancer status. Only species present in at least 10% of the samples were used; that is, species at each site that were detected in fewer than 10% of all samples under test were discarded. Five rounds of 10-fold cross-validation were then performed and the error curves of the five rounds were averaged. The minimum error of the averaged curve plus the standard error at that point was taken as the threshold of acceptable error. Among the species sets whose classification error was below this threshold, the set with the fewest species was taken as the optimal species combination, that is, as the biomarker combination for identifying lung cancer.
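The two selection rules in this section, the 10% prevalence filter and the "minimum error plus one standard error, fewest species" rule, can be sketched without any model fitting. This is an illustration under assumptions: function names and the error values are hypothetical, "detected" is taken to mean a relative abundance above zero, and a set whose mean error equals the threshold is accepted so that the minimum-error set always qualifies.

```python
from math import sqrt
from statistics import mean, stdev


def prevalent_species(samples: list[dict[str, float]], min_fraction: float = 0.10) -> set[str]:
    """Keep species detected (abundance > 0) in at least min_fraction of samples."""
    all_species = set().union(*samples) if samples else set()
    n = len(samples)
    return {
        sp for sp in all_species
        if sum(1 for s in samples if s.get(sp, 0.0) > 0) / n >= min_fraction
    }


def pick_species_count(cv_errors: dict[int, list[float]]) -> int:
    """cv_errors maps a number of species to the error from each CV round.

    Returns the smallest number of species whose mean error is within one
    standard error of the minimum mean error.
    """
    avg = {k: mean(v) for k, v in cv_errors.items()}
    best_k = min(avg, key=avg.get)
    std_err = stdev(cv_errors[best_k]) / sqrt(len(cv_errors[best_k]))
    threshold = avg[best_k] + std_err
    return min(k for k, err in avg.items() if err <= threshold)


# Hypothetical data: species "b" appears in only 1 of 20 samples (5%).
toy_samples = [{"a": 0.5, "b": 0.5}] + [{"a": 1.0} for _ in range(19)]
# Hypothetical error rates from 5 rounds of cross-validation.
toy_errors = {
    5: [0.30, 0.32, 0.28, 0.31, 0.29],
    10: [0.195, 0.215, 0.175, 0.205, 0.185],
    20: [0.19, 0.21, 0.17, 0.20, 0.18],
}
chosen = pick_species_count(toy_errors)
```

In the toy data the 20-species set has the lowest mean error, but the 10-species set falls within one standard error of it, so the smaller set is chosen.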
1.5 Biomarker validation
To validate the biomarkers obtained in this example, a separate, independent test cohort, the second cohort, was used. The second cohort comprised 27 lung cancer patients and 10 non-lung-cancer individuals.
2. Experimental results
2.1 Disease-associated microorganisms
To obtain bacterial species biomarkers for identifying lung cancer, a random forest model was built in this example. The specific steps were: (1) with species relative abundance as the input features, build a random forest model on the first cohort; (2) design a 10-fold cross-validation procedure for the random forest model, with the first cohort divided into two classes, lung cancer individuals and non-lung-cancer individuals, obtain the ROC curve of the random forest model, and use the area under each ROC curve (AUC) as the evaluation metric.
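The partitioning step of 10-fold cross-validation can be sketched as follows. This is a generic illustration rather than the procedure actually run in R: the function name and seed are hypothetical, the split is not stratified by class, and only the cohort size of 85 comes from the text.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> list[list[int]]:
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


folds = k_fold_indices(85)  # the first cohort has 85 subjects

# Each fold is held out once while the model is trained on the other nine.
splits = []
for held_out in range(len(folds)):
    test_idx = folds[held_out]
    train_idx = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
    splits.append((train_idx, test_idx))
```

With 85 subjects, five folds hold 9 samples and five hold 8, and every subject is used for testing exactly once per round.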
This example utilizes Random Forest model, and combines 10 folding cross validations, has obtained the optimal biological marker in each position Object, as shown in table 1, for differentiating lung cancer.Table 2 is the enrichment information of marker in the sample, and table 3 is marker in the first group The relative abundance information of sample.In this example, biomarker differentiate patients with lung cancer as a result, as shown in Figure 2.
Table 1:Biomarker
According to table 1, when carrying out sample detection, the relative abundance of each species is calculated, relative abundance input is random gloomy Woods model is obtained as a result, determining whether lung cancer.
Table 2: Abundance information of each marker species
In Table 2, the lung cancer group refers to the samples from the 65 subjects with lung cancer in the first group, and the control group refers to the samples from the 20 subjects without lung cancer in the first group.
Table 3: Relative abundance of each marker species in the first group
Fig. 2 shows the discrimination of lung cancer by the markers. Panel a shows, as the number of species increases, the distribution of error rates from 5 runs of 10-fold cross-validation of the random forest discriminating lung cancer. The model was trained on the relative abundance of the species in the samples, using bronchoalveolar lavage fluid samples from 65 lung cancer individuals and 20 non-lung-cancer individuals in total. The black line represents the average of the 5 runs, the gray lines represent the 5 individual runs, and the black vertical line marks the number of species in the optimal combination. Panel b shows the receiver operating characteristic (ROC) curve of the cross-validated combination; the area under the curve (AUC) is 0.8623, the shaded area represents the 95% confidence interval, and the diagonal represents the curve with an AUC of 0.5.
As the results in Fig. 2 show, the bacterial species biomarker set can distinguish lung cancer individuals from non-lung-cancer individuals; the area under the ROC curve (AUC) is 0.8623. The larger the AUC, that is, the closer it is to 1, the stronger the discriminating power and the more accurate the judgment.
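To make the AUC interpretation concrete, the sketch below computes the area under the ROC curve as the fraction of (lung cancer, non-lung cancer) sample pairs in which the lung cancer sample receives the higher predicted probability, with ties counted as one half. This pairwise form is a standard equivalent of the ROC area; it is not necessarily how the AUC in this example was computed, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from labels (1 or 0) and predicted scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))
```

A value of 1.0 means every lung cancer sample is ranked above every non-lung-cancer sample, while 0.5 corresponds to the diagonal in panel b, that is, no discrimination.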
2.2 Biomarker validation
The species biomarkers obtained by the random forest were validated in the samples of the second group; the results are shown in Table 4. Table 4 lists the probability, predicted by the markers, that each individual has lung cancer; the ROC curve obtained from these probabilities is shown in Fig. 3. In Table 4, a probability > 0.5 is taken to mean that the markers at that site judge the individual to be at risk of lung cancer or to have lung cancer.
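The 0.5 decision rule can be sketched as follows. The function names are hypothetical, and the tally of correct and incorrect calls is added only for illustration; the source reports the predicted probabilities and the AUC, not these counts.

```python
def call_lung_cancer(probability, threshold=0.5):
    """Return True when the predicted probability exceeds the threshold."""
    return probability > threshold


def confusion_counts(labels, probabilities, threshold=0.5):
    """Tally calls against known status (1 = lung cancer, 0 = non-lung cancer)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(labels, probabilities):
        called = call_lung_cancer(p, threshold)
        if called and y == 1:
            counts["tp"] += 1  # lung cancer correctly flagged
        elif called and y == 0:
            counts["fp"] += 1  # non-lung cancer flagged
        elif not called and y == 0:
            counts["tn"] += 1  # non-lung cancer correctly not flagged
        else:
            counts["fn"] += 1  # lung cancer missed
    return counts
```

Because the comparison is strict, a probability of exactly 0.5 is not flagged, matching the "> 0.5" wording above.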
Table 4: Marker-predicted probability of lung cancer for the samples of the second group
The results in Fig. 3 show that the AUC of the marker-based prediction of lung cancer probability is 0.8537. The markers therefore have a relatively high discriminating ability and can be used in the detection of lung cancer, which is consistent with the results in Table 4.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of different embodiments or examples.
Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (10)

1. A group of lung cancer markers, characterized in that it comprises at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, Bacteroides_salanitronis.
2. A group of lung cancer markers, characterized in that it has at least one of the nucleotide sequences shown in SEQ ID NO: 1 to 10.
3. A kit, characterized in that it comprises a reagent, the reagent being used to detect at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, Bacteroides_salanitronis.
4. The kit according to claim 3, characterized in that the reagent is a universal primer suitable for genome sequencing.
5. Use of a reagent in the preparation of a kit, the kit being for diagnosing lung cancer or assessing lung cancer risk, and the reagent being used to detect at least one of the following bacteria: Haemophilus_influenzae, Corynebacterium_argentoratense, Fusobacterium_sp._4_8, Capnocytophaga_ochracea, Staphylococcus_epidermidis, Campylobacter_concisus, Streptococcus_sp._I-P16, Fusobacterium_nucleatum, Acidovorax_sp._JS42, Bacteroides_salanitronis.
6. A system for diagnosing lung cancer or assessing lung cancer risk, characterized in that it comprises:
a measurement device for determining the relative abundance of the markers of claim 1 or 2 in a sample from a subject to be diagnosed; and
a determining device for determining the diagnostic result of the subject based on the marker relative abundance obtained by the measurement device.
7. The system according to claim 6, characterized in that the determining device comprises a module adapted to perform the following operations:
inputting the marker relative abundance into a multivariate statistical model to obtain a disease probability;
comparing the disease probability with a predetermined threshold, wherein a disease probability higher than the predetermined threshold indicates that the subject has lung cancer or is at high risk of lung cancer,
optionally, the multivariate statistical model is a random forest model,
optionally, the random forest model is built from known lung cancer patients and a non-lung-cancer control population.
8. The system according to claim 7, characterized in that the predetermined threshold is 0.5.
9. The system according to claim 6, characterized in that the sample is bronchoalveolar lavage fluid.
10. The system according to claim 6, characterized in that the measurement device comprises a module adapted to perform sequencing.
CN201711114229.8A 2017-11-13 2017-11-13 Lung cancer marker and application thereof Active CN108070656B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711114229.8A CN108070656B (en) 2017-11-13 2017-11-13 Lung cancer marker and application thereof


Publications (2)

Publication Number Publication Date
CN108070656A true CN108070656A (en) 2018-05-25
CN108070656B CN108070656B (en) 2021-11-09

Family

ID=62159828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711114229.8A Active CN108070656B (en) 2017-11-13 2017-11-13 Lung cancer marker and application thereof

Country Status (1)

Country Link
CN (1) CN108070656B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011100483A1 (en) * 2010-02-10 2011-08-18 The Regents Of The University Of California Salivary biomarkers for lung cancer detection
CN105473738A (en) * 2013-08-06 2016-04-06 深圳华大基因科技有限公司 Biomarkers for colorectal cancer
CN105603066A (en) * 2016-01-13 2016-05-25 金锋 Mental disorder related intestinal tract microbial marker and application thereof
WO2016097769A1 (en) * 2014-12-19 2016-06-23 Aberystwyth University A method for diagnosing lung cancer


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SIMON J. S. CAMERON等: "A pilot study using metagenomic sequencing of the sputum microbiome suggests potential bacterial biomarkers for lung cancer", 《PLOS ONE》 *
STROUTS,F.R.等: "ACCESSION No.NC_014922.1 Haemophilus influenzae F3047 complete genome", 《GENBANK数据库》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113913333A (en) * 2021-10-20 2022-01-11 南京世和基因生物技术股份有限公司 Lung cancer diagnosis marker and application
CN113913333B (en) * 2021-10-20 2022-09-02 南京世和基因生物技术股份有限公司 Lung cancer diagnosis marker and application
CN117004744A (en) * 2022-04-27 2023-11-07 数字碱基(南京)科技有限公司 Lung cancer prognosis evaluation method and model based on plasma microorganism DNA characteristics
CN117004744B (en) * 2022-04-27 2024-05-24 数字碱基(南京)科技有限公司 Lung cancer prognosis evaluation method and model based on plasma microorganism DNA characteristics



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1251020

Country of ref document: HK

TA01 Transfer of patent application right

Effective date of registration: 20191030

Address after: Building w2a, building B, building a, building 201203a5, high tech Industrial Village, No. 025, South 4th Road, high tech Zone, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province

Applicant after: Shenzhen Huada Yinyuan Pharmaceutical Technology Co., Ltd

Address before: 7, 7 floor, 518083 floor, Hua Da comprehensive garden, No. 21 Hong An street, Yantian District, Shenzhen, Guangdong,

Applicant before: BGI SHENZHEN CO LTD

GR01 Patent grant