WO2024015493A1 - Methods and systems for malignancy prediction of indeterminate pulmonary nodules - Google Patents

Methods and systems for malignancy prediction of indeterminate pulmonary nodules

Info

Publication number
WO2024015493A1
WO2024015493A1 PCT/US2023/027597 US2023027597W WO2024015493A1 WO 2024015493 A1 WO2024015493 A1 WO 2024015493A1 US 2023027597 W US2023027597 W US 2023027597W WO 2024015493 A1 WO2024015493 A1 WO 2024015493A1
Authority
WO
WIPO (PCT)
Prior art keywords
subject
concentration
values
machine learning
score
Prior art date
Application number
PCT/US2023/027597
Other languages
French (fr)
Inventor
Gerard Davis
Original Assignee
Abbott Laboratories
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Abbott Laboratories
Publication of WO2024015493A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57423 Specifically defined cancers of lung
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/60 Complex ways of combining multiple protein biomarkers for diagnosis

Definitions

  • the present disclosure relates to methods and systems for determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely to be malignant or likely not to be malignant.
  • the methods and systems of the present disclosure utilize certain subject values including: (i) a subject smoking pack years value; and (ii) a measurement of the size of the IPN in the subject; and certain assay values including: (iii) (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a
  • the subject values and the assay values are applied to at least one machine learning algorithm which is used to produce or output a score (a machine learning score) for the subject.
  • the machine learning score for the subject is compared to a reference score to determine whether an IPN is likely to be malignant or likely not to be malignant.
  • Lung cancer is the second most common cancer worldwide. In 2020, there were an estimated 2.2 million cases of lung cancer and over 1.7 million deaths globally (See, Sung H, Ferlay J, Siegel R, et al., Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 countries, CA Cancer J Clin 71 (2021) 209-249). In the United States, the most common form of lung cancer is non-small cell lung cancer (NSCLC), which accounts for approximately 80-85% of all lung cancers. Additionally, more than half of all NSCLC patients are also diagnosed with advanced and/or metastatic disease (NIH Surveillance, Epidemiology, and End Results (SEER program)).
  • NLST National Lung Screening Trial
  • High risk lung cancer populations that could benefit from this screening have been defined by the American Cancer Society as apparently healthy patients between 55 and 75 years, who have at least a 20-pack-year smoking history and who currently smoke or have quit smoking within the past 15 years (Smith R, Andrews K, Brooks D, et al., Cancer Screening in the United States, 2018: A Review of Current American Cancer Society Guidelines and Current Issues in Cancer Screening, CA Cancer J Clin 68,4 (2018) 297-316).
  • LDCT will not detect all lung cancers and can generate a high frequency of false positive findings.
  • IPNs are non-calcified lung nodules (typically 7-20 mm in size) that require further diagnostic workup for risk of malignancy (Massion P, Walker R., Indeterminate Pulmonary Nodules: Risk for Having or for Developing Lung Cancer, Cancer Prev Res (Phila) 7, 12 (2014) 1173-1178).
  • IPN size itself correlates with cancer risk, with larger nodules carrying the greater risk of being cancerous: <5 mm: 0%-1%; 5-10 mm: 6%-28%; 11-20 mm: 37%-64%; >20 mm: 64%-82% (Massion P, Walker R., Indeterminate Pulmonary Nodules: Risk for Having or for Developing Lung Cancer, Cancer Prev Res (Phila) 7, 12 (2014) 1173-1178).
  • the present disclosure relates to a method of determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely malignant.
  • the method comprises the steps of:
  • CA125 Cancer Antigen 125
  • CEA Carcinoembryonic Antigen
  • HE4 Human Epididymis Protein 4
  • NSE Neuron-Specific Enolase
  • SCC Squamous Cell Carcinoma Antigen
  • ProGRP Pro Gastrin Releasing Peptide
  • processing system is configured to:
  • the subject smoking pack years value is the number of cigarette packs smoked per day by the subject multiplied by the number of years the subject smoked.
  • the method involves obtaining an assay value comprising a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
  • the method involves obtaining an assay value comprising a subject CA125 concentration, a subject CEA concentration, a subject HE4 concentration, a subject Cyfra 21-1 concentration, a subject NSE concentration, a subject SCC concentration, a subject ProGRP concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
  • the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm which generates an AIM score.
  • the machine learning algorithm is a random forest algorithm that generates a random forest score.
  • the machine learning algorithm is a logistic regression algorithm that generates a logistic regression score.
  • the machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
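As a rough illustration of how a logistic regression score of the kind mentioned above could be computed, the following minimal sketch maps a few input values to a score between 0 and 1. The feature names, the intercept, and every coefficient are hypothetical placeholders chosen for illustration; none of them comes from the disclosure, and a real model would be fitted to clinical data.

```python
import math

# Hypothetical intercept and coefficients, for illustration only.
# They are NOT taken from the disclosure and have no clinical meaning.
INTERCEPT = -4.0
COEFFICIENTS = {
    "pack_years": 0.02,    # subject smoking pack years value
    "ipn_size_mm": 0.15,   # measured IPN size in millimetres
    "cea_ng_ml": 0.10,     # one example assay value (CEA concentration)
}


def logistic_score(features: dict) -> float:
    """Return a logistic regression score in the open interval (0, 1)."""
    # Linear combination of the inputs, then the logistic (sigmoid) function.
    z = INTERCEPT + sum(
        COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-z))


example = {"pack_years": 40, "ipn_size_mm": 12, "cea_ng_ml": 3.0}
score = logistic_score(example)
```

With these placeholder weights, a larger nodule or a higher pack years value raises the score, which is the qualitative behaviour one would expect from such a model.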
  • the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
  • the biological sample is a whole blood sample, a serum sample, or a plasma sample.
  • the obtaining subject values, assay values, or subject values and assay values comprise receiving said subject values, assay values, or subject values and assay values from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
  • the obtaining subject values, assay values or subject values and assay values comprise electronically receiving said subject values.
  • the method further comprises manually or automatically inputting said subject values, assay values, or subject values and assay values into said processing system.
  • the processing system compares the machine learning score for the subject against the reference score.
  • the determination of whether the IPN is likely malignant or likely not malignant is displayed on a device.
  • the subject is a human.
  • the at least one of the subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof is determined using an immunoassay.
  • the at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof is determined using a clinical chemistry assay.
  • the present disclosure relates to a system comprising: a. subject values for a subject, wherein said subject values comprise:
  • a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and
  • a device comprising a processing system, wherein the processing system comprises a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm;
  • the at least one machine learning algorithm is configured to process the subject values and the assay values to produce a machine learning score for the subject, and further wherein the processing system is configured to:
  • iii) provide a reference score for comparison with the machine learning score, wherein the device displays that the IPN is (1) likely malignant if the machine learning score is higher than the reference score; or (2) not likely malignant if the machine learning score is the same as or below the reference score.
  • the subject smoking pack years value is the number of cigarette packs smoked per day by the subject multiplied by the number of years the subject smoked.
  • the assay values comprise a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
  • the method involves obtaining an assay value comprising a subject CA125 concentration, a subject CEA concentration, a subject HE4 concentration, a subject Cyfra 21-1 concentration, a subject NSE concentration, a subject SCC concentration, a subject ProGRP concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
  • the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm which generates an AIM score.
  • the machine learning algorithm is a random forest algorithm that generates a random forest score.
  • the machine learning algorithm is a logistic regression algorithm that generates a logistic regression score.
  • the machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
  • the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
  • the biological sample is a whole blood sample, a serum sample, or a plasma sample.
  • the subject values, assay values, or subject values and assay values are received from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
  • the subject values, assay values, or subject values and assay values are received electronically.
  • the subject values, assay values, or subject values and assay values are manually or automatically inputted into said processing system.
  • the subject is a human.
  • the at least one of the subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof is determined using an immunoassay.
  • At least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
  • the present disclosure relates to a method comprising the steps of:
  • a) providing one or more diagnostic assays configured to measure assay values comprising (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
  • the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database;
  • processing system is configured to:
  • the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm which generates an AIM score.
  • the machine learning algorithm is a random forest algorithm that generates a random forest score.
  • the machine learning algorithm is a logistic regression algorithm that generates a logistic regression score.
  • the at least one machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
  • the present disclosure relates to a system comprising: a) one or more diagnostic assays configured to measure assay values comprising (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration
  • a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm;
  • the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database;
  • processing system is configured to:
  • the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm which generates an AIM score.
  • the machine learning algorithm is a random forest algorithm that generates a random forest score.
  • the machine learning algorithm is a logistic regression algorithm that generates a logistic regression score.
  • the machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
  • FIG. 1 shows the relative classification of AIM models by IPN category as described in the Example.
  • the present disclosure relates to methods and systems for determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely to be malignant or likely not to be malignant.
  • the methods and systems described herein utilize certain subject values including: (i) a subject smoking pack years value; and (ii) a measurement of the size of the IPN in the subject; and certain assay values including (iii) (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject
  • the subject values and assay values are applied to at least one machine learning algorithm which is used to produce a score (a machine learning score) for the subject.
  • the machine learning score for the subject is compared to a reference score to determine whether an IPN is likely to be malignant or likely not to be malignant.
  • If the subject’s machine learning score is higher than the reference score, then the IPN is likely malignant. If the subject’s machine learning score is the same as or below (e.g., less than) the reference score, then the IPN is likely not to be malignant.
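The comparison against the reference score can be sketched in a few lines. This is a minimal illustration of the decision rule as stated in the text (strictly higher than the reference means likely malignant; equal to or below means likely not malignant). The function name and the returned strings are assumptions made for the example.

```python
def classify_ipn(ml_score: float, reference_score: float) -> str:
    """Apply the decision rule described in the text.

    A score strictly higher than the reference is reported as likely
    malignant; a score equal to or below it is reported as likely not
    malignant. The function name and labels are illustrative only.
    """
    if ml_score > reference_score:
        return "likely malignant"
    return "likely not malignant"


print(classify_ipn(62, 50))  # score above the reference
print(classify_ipn(50, 50))  # score equal to the reference
```

Note that a score exactly equal to the reference falls on the "likely not malignant" side, so the choice of reference score fixes how borderline cases are reported.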
  • For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated.
  • For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
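The range convention above can be made concrete with a short sketch that lists every value in a range at the precision in which the bounds are written. It assumes both bounds are given as strings with the same number of decimal places; the function name is hypothetical.

```python
from decimal import Decimal


def contemplated_values(low: str, high: str) -> list:
    """List each number from low to high at the precision of the bounds.

    Assumes both bounds are written with the same number of decimal
    places, e.g. "6" and "9", or "6.0" and "7.0".
    """
    lower, upper = Decimal(low), Decimal(high)
    # The exponent of the lower bound gives the step: 0 -> 1, -1 -> 0.1.
    step = Decimal(10) ** lower.as_tuple().exponent
    values = []
    current = lower
    while current <= upper:
        values.append(current)
        current += step
    return values


print([str(v) for v in contemplated_values("6", "9")])
print([str(v) for v in contemplated_values("6.0", "7.0")])
```

Using `Decimal` rather than binary floating point keeps the tenths exact, so the 6.0-7.0 range yields exactly the eleven values listed in the text.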
  • “Biological sample” or “sample” as used interchangeably herein, includes, but is not necessarily limited to, bodily fluids such as blood-related samples (e.g., whole blood (including, for example, capillary whole blood samples), serum, plasma, and other blood-derived samples), urine, cerebral spinal fluid, bronchoalveolar lavage, and the like.
  • Another example of a biological sample is a tissue sample.
  • a biological sample may be fresh or stored (e.g., blood or blood fraction stored in a blood bank).
  • the biological sample may be a bodily fluid expressly obtained for use
  • the biological sample is whole blood.
  • Whole blood may be obtained from the subject using standard clinical procedures.
  • the biological sample is plasma.
  • Plasma may be obtained from whole blood samples by centrifugation of anti-coagulated blood. Such a process provides a buffy coat of white cell components and a supernatant of the plasma.
  • the biological sample is serum. Serum may be obtained by centrifugation of whole blood samples that have been collected in tubes that are free of anti-coagulant. The blood is permitted to clot prior to centrifugation. The yellowish-reddish fluid that is obtained by centrifugation is the serum.
  • the sample is urine.
  • the sample may be pretreated as necessary by dilution in an appropriate buffer solution, heparinized, concentrated if desired, or fractionated by any number of methods including but not limited to ultracentrifugation, fractionation by fast performance liquid chromatography (FPLC), or precipitation of apolipoprotein B containing proteins with dextran sulfate or other methods.
  • the biological sample is a whole blood sample and the subject is a human.
  • the biological sample is a plasma sample and the subject is a human.
  • the biological sample is a serum sample and the subject is a human.
  • the biological sample is a capillary blood sample and the subject is a human.
  • “Decentralize”, “Decentralized”, or “Decentralization”, as used interchangeably herein, refers to, in the context of testing, the performance of one or more medical tests and/or assays outside of a traditional medical setting (e.g., a hospital, physician office, stand alone lab site, etc.) to one or more places such as urgent care clinics, retail clinics, pharmacies, grocery stores or convenience stores, residences (e.g., homes, apartments, etc.), workplaces, and/or government offices (e.g., U.S. Transportation and Safety Authority), etc.
  • “Hybrid-decentralization” or “hybrid-decentralized” refers to situations in which a subject or patient collects a sample at a residence and/or workplace and ships the sample to a laboratory, avoiding a professional collection site (such as a hospital, physician’s office, or stand alone sample collection or lab site).
  • “Higher throughput assay analyzer” or a “non-point-of-care device”, as used interchangeably herein, refers to a device that is not a point-of-care device or a single use device.
  • a higher throughput assay analyzer or non-point-of-care device refers to any device that does not meet any of the limitations of a point-of-care or a single use device as defined herein.
  • a “higher throughput assay analyzer” or “non-point-of-care device” refers to an instrument that: (a) may be a relatively large instrument compared to a hand-held point-of-care device, e.g., ranging in size from that of a tabletop instrument (e.g., typically considered low- or medium-throughput) to a large room-size or multiple-room-size instrument (e.g., typically considered high throughput); (b) is not a handheld instrument; (c) is capable of performing an assay on more than one clinical sample simultaneously; and (d) any combination of (a)-(c).
  • a higher throughput assay analyzer may be a clinical chemistry analyzer, an immunoassay analyzer, or a combination thereof.
  • Exemplary higher throughput assay analyzers or non-point-of-care devices include, for example, the ARCHITECT or Alinity platforms produced by Abbott Laboratories.
  • “Point-of-care device” refers to a device used to provide medical diagnostic testing at or near the point-of-care (namely, typically, outside of a laboratory), at the time and place of patient care (such as in a hospital, physician’s office, urgent or other medical care facility, a patient’s home, a nursing home and/or a long term care and/or hospice facility).
  • Examples of point-of-care devices include those produced by Abbott Laboratories (Abbott Park, IL) (e.g., i-STAT and i-STAT Alinity), Universal Biosensors (Rowville, Australia) (see US 2006/0134713), Axis-Shield PoC AS (Oslo, Norway), and Clinical Lab Products (Los Angeles, USA).
  • “Reference score” refers to a value that is used to assess diagnostic, prognostic, or therapeutic efficacy and that has been linked or is associated herein with various clinical parameters (e.g., presence of disease (such as malignant versus non-malignant), stage of disease, severity of disease, progression, non-progression, or improvement of disease, etc.).
  • a mammal e.g., cow, pig, camel, llama, horse, goat, rabbit, sheep, hamsters, guinea pig, cat, dog, rat, and mouse
  • a non-human primate for example, a monkey, such as a cynomolgus or rhesus monkey, chimpanzee, etc.
  • the subject may be a human or a non-human.
  • the subject is a human.
  • the subject is racially/ethnically identified as singularly or a combination of American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or other Pacific Islander, White or other.
  • the subject or patient may have tumor(s) that is benign, malignant or a combination thereof.
  • the subject or patient may be undergoing other forms of treatment.
  • “Treat,” “treating” or “treatment” are each used interchangeably herein to describe reversing, alleviating, or inhibiting the progress of a disease and/or injury, or one or more symptoms of such disease, to which such term applies.
  • the term also refers to preventing a disease, and includes preventing the onset of a disease, or preventing the symptoms associated with a disease.
  • a treatment may be either performed in an acute or chronic way.
  • the term also refers to reducing the severity of a disease or symptoms associated with such disease prior to affliction with the disease.
  • Such prevention or reduction of the severity of a disease prior to affliction refers to administration of a pharmaceutical composition to a subject that is not at the time of administration afflicted with the disease. “Preventing” also refers to preventing the recurrence of a disease or of one or more symptoms associated with such disease. “Treatment” and “therapeutically” refer to the act of treating, as “treating” is defined above.
  • the present disclosure relates to methods and systems for determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely to be malignant or not likely to be malignant.
  • Indeterminate pulmonary nodules (IPNs) can be identified in subjects, particularly high risk subjects (e.g., apparently healthy patients between 55 and 75 years, who have at least a 30-pack-year smoking history and who currently smoke or have quit smoking within the past 15 years (Smith R, Andrews K, Brooks D, et al., Cancer Screening in the United States, 2018: A Review of Current American Cancer Society Guidelines and Current Issues in Cancer Screening, CA Cancer J Clin 68,4 (2018) 297-316)), using routine techniques known in the art, such as low-dose helical computed tomography (LDCT) screening.
  • the methods and systems of the present disclosure involve obtaining certain subject values and assay values for the subject for whom the at least one IPN has been identified.
  • the subject values to be obtained include: (a) a subject smoking pack years value; and (b) a measurement of the size of the IPN in the subject.
  • the assay values to be obtained include: (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and (b) at least one of the subject’s total IgG concentration, IgA concentration, IgM concentration, IgE concentration, kappa free light chain concentration, lambda free light chain concentration, or any combination thereof, in a biological sample obtained from the subject.
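The two-group requirement on assay values (at least one tumor marker from group (a) and at least one immunoglobulin or free light chain value from group (b)) can be checked with a small sketch like the one below. The dictionary keys, the group names, and the function name are assumptions made for illustration; the disclosure does not prescribe a data format.

```python
# Marker names follow the text; the grouping constants and the dictionary
# layout are illustrative assumptions, not a format from the disclosure.
TUMOR_MARKERS = {"CA125", "CEA", "HE4", "Cyfra 21-1", "NSE", "SCC", "ProGRP"}
IMMUNOGLOBULIN_MARKERS = {
    "total IgG", "IgA", "IgM", "IgE",
    "kappa free light chain", "lambda free light chain",
}


def has_required_assay_values(assay_values: dict) -> bool:
    """Check that at least one value from each group is present."""
    # Treat a missing key or a None value as "not measured".
    measured = {name for name, value in assay_values.items() if value is not None}
    return bool(measured & TUMOR_MARKERS) and bool(measured & IMMUNOGLOBULIN_MARKERS)


panel = {"CA125": 21.0, "IgA": 2.1, "CEA": None}
print(has_required_assay_values(panel))
```

A panel with only tumor markers, or only immunoglobulin values, fails this check, which mirrors the "(a) ... and (b) ..." wording of the text.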
  • the biological samples obtained from the subject for determining these concentrations can be obtained using techniques known to those skilled in the art, and the sample may be used directly as obtained from the subject or following a pretreatment to modify the character of the sample.
  • Such pretreatment may include, for example, preparing plasma from blood, diluting viscous fluids, filtration, precipitation, dilution, distillation, mixing, concentration, inactivation of interfering components, the addition of reagents, lysing, and the like.
  • the biological sample is a whole blood sample, a serum sample, a plasma sample, or a capillary blood sample.
  • the same or different biological samples can be used to determine the concentration of (i) one or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof; and (ii) one or more of total IgG, IgA, IgM, IgE, kappa free light chain, lambda free light chain, or any combination thereof in the subject.
  • the source of the subject values and the assay values is not critical.
  • the subject values and assay values can be obtained from one or more of a physician’s office, a hospital or other medical facility, a testing lab, in a decentralized setting, from an analytical testing system, from a hand-held or point-of-care testing device, a high throughput analyzer or any combination thereof.
  • the smoking pack years value of the subject can be obtained by determining the number of cigarette packs smoked per day by the subject and then multiplying that number by the number of years the subject smoked. For example, if a subject smoked 2 packs a day for 20 years, the subject’s smoking pack years value would be 40.
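The pack years arithmetic can be written as a one-line function. This sketch follows the worked example in the text (2 packs a day for 20 years gives 40), which implies packs per day multiplied by years smoked; the function and parameter names are assumptions.

```python
def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Smoking pack years: packs smoked per day times years smoked."""
    if packs_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return packs_per_day * years_smoked


# The example from the text: 2 packs a day for 20 years.
print(pack_years(2, 20))  # 40
```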
  • the measurement of the size of the IPN in the subject can be determined using routine techniques known in the art.
  • LDCT can be used to detect and measure the size of an IPN identified in a subject. It is known in the art that larger IPN size correlates with a greater risk of the IPN being cancerous: IPN <5 mm: 0%-1%; 5-10 mm: 6%-28%; 11-20 mm: 37%-64%; >20 mm: 64%-82% (Massion P, Walker R., Indeterminate Pulmonary Nodules: Risk for Having or for Developing Lung Cancer, Cancer Prev Res (Phila) 7, 12 (2014) 1173-1178).
  • the methods and systems also require obtaining, providing and/or determining the concentration of one or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, or ProGRP, or any combination thereof, in a biological sample obtained from a subject.
  • the methods and systems also require obtaining, providing, and/or determining the concentration of two or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject.
  • the methods and systems also require obtaining, providing, and/or determining the concentration of three or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject.
  • the methods and systems also require obtaining, providing, and/or determining the concentration of four or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject.
  • the methods and systems also require obtaining, providing, and/or determining the concentration of five or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject.
  • the methods and systems also require obtaining, providing, and/or determining the concentration of six or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject.
  • the methods and systems also require obtaining, providing, and/or determining the concentration of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, and ProGRP, in a biological sample obtained from a subject.
  • any assay known in the art for determining the concentration of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof in a biological sample can be provided and/or used in the methods and systems of the present disclosure.
  • an immunoassay, a clinical chemistry assay, a radioimmunoassay, an immunoradiometric assay, etc. can be used or provided in the methods and systems of the present disclosure.
  • the concentration of at least one of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from the subject can be determined using a high throughput analyzer or a point-of-care device.
  • the Abbott Laboratories CA125 chemiluminescent assay for use on the ARCHITECT® i2000 automated immunoassay platform can be used in the methods and systems described herein.
  • the methods and systems further require determining the concentration of at least one of a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combinations thereof in a biological sample obtained from the subject. In some aspects, the methods and systems further require determining the concentration of at least two of a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combinations thereof.
  • the methods and systems further require determining the concentration of at least three of a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combinations thereof. In yet still further aspects, the methods and systems further require determining the concentration of at least four of a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combinations thereof.
  • the methods and systems of the present disclosure require determining a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s kappa free light chain (KFLC) concentration, and a subject’s lambda free light chain (LFLC) concentration.
  • for example, an immunoassay, a clinical chemistry assay, a radioimmunoassay, or an immunoradiometric assay can be used.
  • a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combination thereof, in a biological sample obtained from the subject can be determined using a high throughput analyzer or a point-of-care device.
  • the Abbott Laboratories ARCHITECT c8000 automated clinical chemistry platform utilizing a turbidimetric assay format can be used to determine the subject’s total IgG concentration, the subject’s IgA concentration, the subject’s IgM concentration, the subject’s IgE concentration, the subject’s LFLC concentration, or any combination thereof in the methods and systems described herein.
  • the processing system can comprise a computer processor and a non-transitory computer memory that contains one or more computer programs and a database.
  • the subject values, assay values or subject values and assay values are manually inputted into the processing system. In other aspects, the subject values, assay values or subject values and assay values are automatically inputted into the processing system. In yet further aspects, the subject values, assay values or subject values and assay values are received electronically, such as, for example, via e-mail. In further aspects, the processing system further comprises a hand-held or point-of-care testing device. In yet further aspects, the processing system further comprises a high throughput analyzer.
  • At least one of the computer programs contained in the processing system is one or more machine learning algorithms. Any machine learning algorithm can be used in the methods and systems of the present disclosure.
  • the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm.
  • the processing system can contain any AIM algorithm known in the art.
  • for example, the AIM R package, version 1.01, can be used.
  • the at least one machine learning algorithm is a random forest algorithm. Any random forest algorithm known in the art can be used.
  • the at least one machine learning algorithm is a logistic regression algorithm. Any logistic regression algorithm known in the art can be used. In some aspects, the at least one machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
  • the processing system is configured to apply the at least one machine learning algorithm to the assay values and the subject values to produce, generate or output a score for the subject (a machine learning score, such as, for example, an AIM score, a random forest algorithm score, a logistic regression algorithm score, or any combination thereof).
  • the processing system is further configured to communicate (e.g., report) the machine learning score for further analysis, interpretation, processing and/or display.
  • the machine learning score for the subject can be communicated (e.g., reported) by the processing system (e.g., a computer), in a document and/or spreadsheet, on a mobile device (e.g., a smart phone), on a website, in an e-mail, or any combination thereof.
  • the machine learning score for the subject is compared to a reference score.
  • a clinician or other medical personnel can compare the machine learning score for the subject with a reference score.
  • the reference score can be provided in a product insert or other publication, or on a website or on a mobile device (e.g., such as through an app).
  • the processing system is configured to provide a reference score for comparison with the machine learning algorithm score.
  • the reference score can be determined using routine techniques known in the art.
  • the reference score may be determined by the machine learning algorithm in the processing system.
  • the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
  • the reference score is 1.
  • the reference score is 2. In still further aspects, the reference score is 3.
  • the reference score is 4. In yet further aspects, the reference score is 5.
  • the reference score is 6. In yet further aspects, the reference score is 7.
  • the reference score is 8. In yet further aspects, the reference score is 9.
  • the reference score is 10. In yet further aspects, the reference score is 11.
  • the reference score is 12. In yet further aspects, the reference score is 13. In yet further aspects, the reference score is 14. In yet further aspects, the reference score is 15. In yet further aspects, the reference score is 16. In yet further aspects, the reference score is 17. In yet further aspects, the reference score is 18. In yet further aspects, the reference score is 19. In yet further aspects, the reference score is 20. In yet further aspects, the reference score is 21. In yet further aspects, the reference score is 22. In yet further aspects, the reference score is 23. In yet further aspects, the reference score is 24. In yet further aspects, the reference score is 25. In yet further aspects, the reference score is 26. In yet further aspects, the reference score is 27. In yet further aspects, the reference score is 28. In yet further aspects, the reference score is 29.
  • the reference score is 30. In yet further aspects, the reference score is 31. In yet further aspects, the reference score is 32. In yet further aspects, the reference score is 33. In yet further aspects, the reference score is 34. In yet further aspects, the reference score is 35. In yet further aspects, the reference score is 36. In yet further aspects, the reference score is 37. In yet further aspects, the reference score is 38. In yet further aspects, the reference score is 39. In yet further aspects, the reference score is 40. In yet further aspects, the reference score is 41. In yet further aspects, the reference score is 42. In yet further aspects, the reference score is 43. In yet further aspects, the reference score is 44. In yet further aspects, the reference score is 45. In yet further aspects, the reference score is 46.
  • the reference score is 47. In yet further aspects, the reference score is 48. In yet further aspects, the reference score is 49. In yet further aspects, the reference score is 50. In yet further aspects, the reference score is 51. In yet further aspects, the reference score is 52. In yet further aspects, the reference score is 53. In yet further aspects, the reference score is 54. In yet further aspects, the reference score is 55. In yet further aspects, the reference score is 56. In yet further aspects, the reference score is 57. In yet further aspects, the reference score is 58. In yet further aspects, the reference score is 59. In yet further aspects, the reference score is 60. In yet further aspects, the reference score is 61. In yet further aspects, the reference score is 62.
  • the reference score is 63. In yet further aspects, the reference score is 64. In yet further aspects, the reference score is 65. In yet further aspects, the reference score is 66. In yet further aspects, the reference score is 67. In yet further aspects, the reference score is 68. In yet further aspects, the reference score is 69. In yet further aspects, the reference score is 70. In yet further aspects, the reference score is 71. In yet further aspects, the reference score is 72. In yet further aspects, the reference score is 73. In yet further aspects, the reference score is 74. In yet further aspects, the reference score is 75. In yet further aspects, the reference score is 76. In yet further aspects, the reference score is 77. In yet further aspects, the reference score is 78.
  • the reference score is 79. In yet further aspects, the reference score is 80. In yet further aspects, the reference score is 81. In yet further aspects, the reference score is 82. In yet further aspects, the reference score is 83. In yet further aspects, the reference score is 84. In yet further aspects, the reference score is 85. In yet further aspects, the reference score is 86. In yet further aspects, the reference score is 87. In yet further aspects, the reference score is 88. In yet further aspects, the reference score is 89. In yet further aspects, the reference score is 90. In yet further aspects, the reference score is 91. In yet further aspects, the reference score is 92. In yet further aspects, the reference score is 93. In yet further aspects, the reference score is 94.
  • the reference score is 95. In yet further aspects, the reference score is 96. In yet further aspects, the reference score is 97. In yet further aspects, the reference score is 98. In yet further aspects, the reference score is 99. In yet further aspects, the reference score is 100.
  • this determination of whether a subject’s IPN is likely to be malignant or likely not to be malignant can be communicated (e.g., reported) by the processing system (e.g., a computer), in a document and/or spreadsheet, on a mobile device (e.g., a smart phone), on a website, in an e-mail, or any combination thereof.
  • a subject identified as having an IPN that is likely to be malignant based on the methods and systems described herein may be treated, monitored, or treated and monitored.
  • a surgical or non-surgical biopsy can be performed to further evaluate the IPN identified as likely to be malignant.
  • a portion of the lung containing the IPN may be surgically removed or resected from the subject using techniques known in the art such as, for example, video-assisted thoracic surgery or a thoracotomy.
  • the subject may receive one or more pharmaceutical or biopharmaceutical treatments.
  • the subject may be treated with chemotherapy, radiation, budesonide, fluticasone, or any combinations thereof.
  • the subject may also be monitored.
  • the IPN in the subject can be monitored using one or more of computed tomography (including LDCT) scans, positron emission tomography (PET) scans, bronchoscopy, or any combination thereof.
  • the subject can be monitored before, during and/or after any biopsy and/or treatment.
  • the patient sample cohort tested consisted of 141 patients in total with the following conditions: 36 patients with benign indeterminate pulmonary lung nodules (median nodule size 14.9 mm) and 105 patients with malignant indeterminate pulmonary lung nodules (median nodule size 19 mm). Approximately 74.5% of the population tested were classified as having Stage I NSCLC lung cancer. Clinical and pathological details for the malignant cases were obtained from the medical record system. Criteria for study inclusion in the malignant NSCLC cohort were broad (consisting of having a surgical resection with lymph node sampling and accompanying pathological examinations) and were not limited to any demographic or clinical factor.
  • the benign cohort with non-malignant pulmonary nodules consisted of patients with granulomas, pneumonitis, or pneumonia. These patients received an anatomic resection for a suspected malignancy.
  • the benign and malignant samples collected represent a real-world collection where no histological selection criteria were applied.
  • the demographic variable of smoking pack years was defined by the number of cigarette packs smoked per day multiplied by the number of years the individual smoked. For this pilot study, smoking pack years was calculated for any patient case receiving a CT scan based on risk assessment or as an incidental finding.
  • Specimens were obtained with full written informed consent under a protocol approved by the Rush UMC Institutional Review Board (IRB). Peripheral blood collected at Rush UMC was obtained from each patient immediately before treatment initiation using standard phlebotomy techniques. Treatment initiation for the IPN could be surgical removal, biopsy, or further radiography assessments. All specimens were handled and processed into EDTA plasma in an identical manner. The time interval from sample collection to processing was less than 90 minutes. All EDTA vacutainer collection tubes were centrifuged at 750 RCF for 20 minutes to isolate the plasma layer. The subsequent plasma layer was transferred to a second tube and re-centrifuged to remove particulates.
  • adaptive index modeling (AIM) was performed using R package version 1.01. This method uses the concept of an index predictor, defined as a binary rule based on the value of a predictor variable, for example, whether a patient’s age is >55 or ≤55.
  • AIM adaptively searches for individual cutoffs for each variable to build an overall model. Index predictors are selected by maximizing the score test statistic, up to a prespecified number of total predictors (Tian L, Tibshirani R, Adaptive index models for marker-based risk stratification, Biostatistics 12, 1 (2011), 68-86; See Table 2 for example).
  • the maximum number of index predictors was set to 8 to avoid having a model that would be unlikely to be implemented and clinically adopted based on algorithm cost and/or complexity.
  • 5-fold cross validation was used (due to small sample size) to select the model with the optimal number of index predictors.
  • each individual subject can be scored according to the values of their index predictors relative to the binary rule cutoffs.
  • Table 2 shows an example of the scoring process for AIM. Each subject will have a score from 0 to n, where n represents the total number of index predictors. Model performance can then be assessed by creating a binary outcome variable based on a score cutoff of >0, >1, . . ., > n-1. Across this range of possible cutoffs, performance metrics were assessed on the entire dataset, including AUC, accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. Positive predictive value and negative predictive value were calculated based on study prevalence (74.5% malignant). The final score cutoff for each AIM model was selected by choosing the maximum AUC, which on average provides a balance of sensitivity and specificity.
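The scoring process described above (one point per index predictor met, then a binary outcome from a score cutoff) can be sketched as follows. This is a minimal illustration and not the AIM package itself: the class and function names are assumptions, and the variables and cutoffs are hypothetical placeholders rather than the values in Table 2.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class IndexPredictor:
    """One binary rule: a single variable compared against a single cutoff."""
    variable: str
    cutoff: float
    above: bool = True  # True: met when value > cutoff; False: met when value < cutoff

    def is_met(self, value: float) -> bool:
        return value > self.cutoff if self.above else value < self.cutoff


def aim_score(subject: Dict[str, float], predictors: Sequence[IndexPredictor]) -> int:
    """Count how many index predictors the subject satisfies (a score from 0 to n)."""
    return sum(1 for p in predictors if p.is_met(subject[p.variable]))


def is_positive(score: int, score_cutoff: int) -> bool:
    """Binary outcome for a score cutoff of the form 'score > k'."""
    return score > score_cutoff


# Hypothetical variables and cutoffs, for illustration only.
predictors = [
    IndexPredictor("nodule_size_mm", 15.0),
    IndexPredictor("smoking_pack_years", 30.0),
    IndexPredictor("ca125", 20.0),
]
subject = {"nodule_size_mm": 19.0, "smoking_pack_years": 40.0, "ca125": 10.0}

score = aim_score(subject, predictors)
print(score, is_positive(score, score_cutoff=1))
```

With n index predictors, the candidate cutoffs are 0 through n-1, so each cutoff can be evaluated in turn with `is_positive` and the resulting binary labels compared against the known outcomes.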
  • Table 5 shows the results of performing AIM statistical methodology on four possible combinations of biomarkers and demographic variables. Of the models, the best performing was model (2), which included ARCHITECT immunoassay and clinical chemistry biomarkers as well as demographic variables.
  • This model consisted of IgG, IgM, IgE, IgA, Lambda free light chain, CA-125, smoking pack years, and nodule size and resulted in an AUC of 0.819 (95% CI 0.730-0.899) as well as a sensitivity of 0.971 and specificity of 0.667.
  • AIM model (2) correctly classified 102/105 malignant samples and 24/36 benign samples, whereas AIM model (1) correctly classified 68/105 malignant samples and 27/36 benign samples.
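The classification counts reported above map directly onto the stated sensitivity and specificity. The sketch below recomputes them from the 102/105 and 24/36 counts and derives predictive values at a given prevalence using the standard Bayes formulas. The function names are illustrative assumptions, and the disclosure does not state which exact formulas were used for the predictive values.

```python
from typing import Tuple


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of malignant samples classified as malignant."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of benign samples classified as benign."""
    return true_neg / (true_neg + false_pos)


def predictive_values(sens: float, spec: float, prevalence: float) -> Tuple[float, float]:
    """Positive and negative predictive value at a given prevalence."""
    ppv = sens * prevalence / (sens * prevalence + (1 - spec) * (1 - prevalence))
    npv = spec * (1 - prevalence) / (spec * (1 - prevalence) + (1 - sens) * prevalence)
    return ppv, npv


# Counts from the text: 102 of 105 malignant and 24 of 36 benign correctly classified.
sens = sensitivity(true_pos=102, false_neg=3)
spec = specificity(true_neg=24, false_pos=12)
ppv, npv = predictive_values(sens, spec, prevalence=105 / 141)

print(round(sens, 3), round(spec, 3))
print(round(ppv, 3), round(npv, 3))
```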
  • the suspect IPN could be further imaged with contrast MRI to assess for spiculation (spiky outgrowths from the nodule), which can be a high-risk indicator of cancer.
  • if the IPN is determined to be high risk based on the AIM score and MRI imaging, the nodule could potentially be biopsied and the resulting pathology may necessitate aggressive treatment such as surgical removal of the IPN, radiation, and/or chemotherapy.
  • an IPN presenting with low AIM score by the current algorithm and confirmed imaging that supports smooth calcified nodules would have a generally low risk of being cancer.
  • a method of determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely malignant comprising the steps of:
  • a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and
  • processing system is configured to:
  • Clause 2 The method of clause 1, wherein the subject smoking pack years value is the number of cigarette packs smoked per year by the subject multiplied by number of years the subject smoked.
  • Clause 3 The method of clause 1 or clause 2, wherein the method involves obtaining an assay value comprising a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, and a lambda free light chain concentration from a biological sample obtained from the subject.
  • Clause 4 The method of any of clauses 1-3, wherein the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
  • Clause 5 The method of any of clauses 1-4, wherein the biological sample is a whole blood sample, a serum sample, or a plasma sample.
  • Clause 6 The method of any of clauses 1-5, wherein said providing the subject values, the assay values or the subject values and the assay values comprises receiving said subject values, assay values or subject values and assay values from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
  • Clause 7 The method of any of clauses 1-6, wherein said providing the subject values, the assay values, or the subject values and the assay values comprises electronically receiving said subject values.
  • Clause 8 The method of any of clauses 1-7, further comprising manually or automatically inputting said subject values, assay values, or subject values and assay values into said processing system.
  • Clause 9 The method of any of clauses 1-8, wherein the processing system compares the machine learning score for the subject against the reference score.
  • Clause 10 The method of clause 9, wherein the determination of whether the IPN is likely malignant or likely not malignant is displayed on a device.
  • Clause 11 The method of any of clauses 1-10, wherein said subject is a human.
  • Clause 12 The method of any of clauses 1-11, wherein at least one of the subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof, is determined using an immunoassay.
  • Clause 13 The method of any of clauses 1-9, wherein at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
  • subject values for a subject comprising:
  • a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and
  • a device comprising a processing system, wherein the processing system comprises a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm;
  • the at least one machine learning algorithm is configured to process the subject values and the assay values to produce a machine learning score for the subject, and further wherein the processing system is configured to:
  • iii) provide a reference score for comparison with the machine learning score, wherein the device displays that the IPN is (1) likely malignant if the machine learning score is higher than the reference score; or (2) not likely malignant if the machine learning score is the same as or below the reference score.
  • Clause 15 The system of clause 14, wherein the subject smoking pack years value is the number of cigarette packs smoked per year by the subject multiplied by number of years the subject smoked.
  • Clause 16 The system of clause 14 or clause 15, wherein the assay values comprise a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
  • Clause 18 The system of clauses 14-17, wherein the biological sample is a whole blood sample, a serum sample, or a plasma sample.
  • Clause 19 The system of any of clauses 14-18, wherein said subject values, assay values, or subject values and assay values are received from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
  • Clause 20 The system of any of clauses 14-19, wherein said subject values, assay values or subject values and assay values are received electronically.
  • Clause 21 The system of any of clauses 14-20, wherein the subject values, assay values or subject values and assay values are manually or automatically inputted into said processing system.
  • Clause 22 The system of any of clauses 14-21, wherein said subject is a human.
  • Clause 23 The system of any of clauses 14-22, wherein at least one of the subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof, is determined using an immunoassay.
  • Clause 24 The method of any of clauses 14-23, wherein at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
  • a) providing one or more diagnostic assays configured to measure assay values comprising: (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
  • the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database;
  • processing system is configured to:
  • a system comprising: a) one or more diagnostic assays configured to measure assay values comprising: (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject
  • b) a processing system comprising a computer processor and a non-transitory computer memory comprising a database and a machine learning algorithm;
  • the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database;
  • processing system is configured to:

Abstract

Disclosed herein are methods and systems for determining whether at least one indeterminate pulmonary nodule identified in a subject is likely to be malignant or likely not to be malignant.

Description

METHODS AND SYSTEMS FOR MALIGNANCY PREDICTION OF INDETERMINATE PULMONARY NODULES
RELATED APPLICATION INFORMATION
[0001] This application claims priority to U.S. Application No. 63/389,077 filed on July 14, 2022, the contents of which are herein incorporated by reference.
TECHNICAL FIELD
[0002] The present disclosure relates to methods and systems for determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely to be malignant or likely not to be malignant. The methods and systems of the present disclosure utilize certain subject values including: (i) a subject smoking pack years value; and (ii) a measurement of the size of the IPN in the subject; and certain assay values including: (iii) (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a subject kappa free light chain concentration, a subject lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject. The subject values and the assay values are applied to at least one machine learning algorithm which is used to produce or output a score (a machine learning score) for the subject. The machine learning score for the subject is compared to a reference score to determine whether an IPN is likely to be malignant or likely not to be malignant.
BACKGROUND
[0003] Lung cancer is the second most common cancer worldwide. In 2020, there were an estimated 2.2 million cases of lung cancer and over 1.7 million deaths globally (see Sung H, Ferlay J, Siegel R, et al., Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries, CA Cancer J Clin 71 (2021) 209-249). In the United States, the most common form of lung cancer is non-small cell lung cancer (NSCLC) which accounts for approximately 80-85% of all lung cancers. Additionally, more than half of all NSCLC patients are also diagnosed with advanced and/or metastatic disease (NIH Surveillance, Epidemiology, and End Results (SEER program)).
Currently, 57% of all lung cancers are diagnosed late, greatly reducing the chance of survival. This data aligns with the present survival rate statistics for patients diagnosed with metastatic lung cancer as a recent epidemiological study has shown that the 5-year survival rate for metastatic lung cancer patients is less than 10% (NIH Surveillance, Epidemiology, and End Results). Cure rates are significantly higher for early localized lung cancers as compared to systemically dispersed cancers. Therefore, early diagnosis may play a key role in improving cancer outcomes by way of early treatment intervention and in turn increasing the cure and survival rates.
[0004] To identify and treat lung cancer patients in early stages of the disease, the National Cancer Institute’s National Lung Screening Trial (NLST) established that annual low-dose helical computed tomography (LDCT) screening in specific high-risk groups reduces lung cancer fatalities (Wender R, Fontham E, Barrera E, et al., American Cancer Society Lung Cancer Screening Guidelines, CA Cancer J Clin 63,2 (2013) 106-117). High risk lung cancer populations that could benefit from this screening have been defined by the American Cancer Society as apparently healthy patients between 55 and 75 years, who have at least a 20-pack-year smoking history and who currently smoke or have quit smoking within the past 15 years (Smith R, Andrews K, Brooks D, et al., Cancer Screening in the United States, 2018: A Review of Current American Cancer Society Guidelines and Current Issues in Cancer Screening, CA Cancer J Clin 68,4 (2018) 297-316). However, LDCT will not detect all lung cancers and can generate a high frequency of false positive findings. LDCT false positive findings can result in additional non-invasive and/or invasive procedures being performed to determine whether the image abnormality observed is truly cancerous (Wender R, Fontham E, Barrera E, et al., American Cancer Society Lung Cancer Screening Guidelines, CA Cancer J Clin 63,2 (2013) 106-117). Some image abnormalities found with increased LDCT screening of high-risk lung cancer populations are indeterminate pulmonary nodules (IPNs). IPNs are non-calcified lung nodules (typically 7-20 mm in size) that require further diagnostic workup for risk of malignancy (Massion P, Walker R., Indeterminate Pulmonary Nodules: Risk for Having or for Developing Lung Cancer, Cancer Prev Res (Phila) 7, 12 (2014) 1173-1178).
IPN size itself correlates with malignancy risk, with larger nodules carrying a greater risk of being cancerous: IPN <5 mm: 0%-1%; 5-10 mm: 6%-28%; 11-20 mm: 37%-64%; >20 mm: 64%-82% (Massion P, Walker R., Indeterminate Pulmonary Nodules: Risk for Having or for Developing Lung Cancer, Cancer Prev Res (Phila) 7, 12 (2014) 1173-1178). Since most of the high-risk lung cancer population is classified as having IPNs (55-76%), further tools are needed to distinguish malignancy from benign conditions beyond nodule size (Massion P, Walker R., Indeterminate Pulmonary Nodules: Risk for Having or for Developing Lung Cancer, Cancer Prev Res (Phila) 7, 12 (2014) 1173-1178).
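By way of non-limiting illustration, the size bands quoted above can be expressed as a simple lookup. The function name is hypothetical, and the handling of sizes between 10 mm and 11 mm (assigned here to the 11-20 mm band) is an assumption, since the cited bands are stated for whole millimeters only.

```python
from typing import Tuple


def malignancy_risk_range(size_mm: float) -> Tuple[int, int]:
    """Return the reported (low %, high %) malignancy risk for a nodule size.

    The percentages are the ranges quoted in the background text. Sizes
    between 10 and 11 mm fall in the 11-20 mm band by assumption.
    """
    if size_mm <= 0:
        raise ValueError("nodule size must be positive")
    if size_mm < 5:
        return (0, 1)
    if size_mm <= 10:
        return (6, 28)
    if size_mm <= 20:
        return (37, 64)
    return (64, 82)


if __name__ == "__main__":
    for size in (4, 8, 15, 25):
        low, high = malignancy_risk_range(size)
        print(f"{size} mm: {low}%-{high}%")
```

The wide overlap between adjacent bands is what motivates combining nodule size with other subject values and assay values.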
SUMMARY
[0005] In one embodiment, the present disclosure relates to a method of determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely malignant. The method comprises the steps of:
[0006] a) providing subject values for the subject, wherein said subject values comprise at least one of the following:
[0007] i) a subject smoking pack years value;
[0008] ii) an identification of biological sex;
[0009] iii) an identification of race;
[0010] iv) an identification of type of nodule; and
[0011] v) a measurement of the size of the IPN in the subject;
[0012] b) providing at least two assay values, wherein said at least two assay values comprise:
[0013] i) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and
[0014] ii) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
[0015] c) providing a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm,
[0016] wherein the at least one machine learning algorithm is configured to process the subject values and the assay values, and
[0017] further wherein the processing system is configured to:
[0018] i) apply the at least one machine learning algorithm to the assay values and the subject values to output a machine learning score for the subject;
[0019] ii) report the machine learning score for the subject; and
[0020] iii) provide a reference score for comparison with the machine learning score; and
[0021] d) determining that the IPN is likely malignant if the machine learning score is higher than the reference score and not likely malignant if the machine learning score is the same as or below the reference score.
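By way of non-limiting illustration, the comparison in step d) can be sketched as follows. The function name and the returned strings are hypothetical; only the comparison rule (strictly higher than the reference score versus the same as or below it) is taken from the text.

```python
def classify_ipn(ml_score: float, reference_score: float) -> str:
    """Apply the comparison of step d).

    A score strictly higher than the reference is reported as likely
    malignant; a score equal to or below it is reported as not likely
    malignant.
    """
    if ml_score > reference_score:
        return "likely malignant"
    return "not likely malignant"


if __name__ == "__main__":
    print(classify_ipn(51, 50))
    print(classify_ipn(50, 50))
```

A score equal to the reference score falls on the "not likely malignant" side, which is why the comparison is strict.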
[0022] In one aspect of the above method, the subject smoking pack years value is the number of cigarette packs smoked per year by the subject multiplied by number of years the subject smoked.
[0023] In another aspect of the above method, the method involves obtaining an assay value comprising a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
[0024] In another aspect of the above method, the method involves obtaining an assay value comprising a subject CA125 concentration, a subject CEA concentration, a subject HE4 concentration, a subject Cyfra 21-1 concentration, a subject NSE concentration, a subject SCC concentration, a subject ProGRP concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
[0025] In some aspects of the above method, the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm which generates an AIM score. In other aspects, the machine learning algorithm is a random forest algorithm that generates a random forest score. In yet other aspects, the machine learning algorithm is a logistic regression algorithm that generates a logistic regression score. In some aspects, the machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
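By way of non-limiting illustration, an adaptive index model is commonly described as scoring a subject by counting how many binary threshold rules the subject's values satisfy. The following is a minimal sketch of that general idea only, and is not the model of the present disclosure. All feature names, rule directions, and cutoffs are hypothetical placeholders with no clinical meaning; this section does not specify them.

```python
from typing import Dict, List, Tuple

# Each rule is (feature name, direction, cutoff).
# ASSUMPTION: every name and cutoff below is a made-up placeholder used
# only to show the mechanics; none comes from the disclosure.
Rule = Tuple[str, str, float]

HYPOTHETICAL_RULES: List[Rule] = [
    ("ipn_size_mm", ">", 10.0),
    ("pack_years", ">", 30.0),
    ("cea", ">", 5.0),
    ("igg", "<", 900.0),
]


def index_score(values: Dict[str, float], rules: List[Rule]) -> int:
    """Count how many threshold rules the subject's values satisfy."""
    score = 0
    for name, direction, cutoff in rules:
        x = values[name]
        if direction == ">" and x > cutoff:
            score += 1
        elif direction == "<" and x < cutoff:
            score += 1
    return score


if __name__ == "__main__":
    subject = {"ipn_size_mm": 14.0, "pack_years": 40.0, "cea": 2.0, "igg": 800.0}
    print(index_score(subject, HYPOTHETICAL_RULES))
```

In a fitted model the rules and cutoffs would be learned from training data; here they are fixed by hand so that the scoring step can be read in isolation.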
[0026] In another aspect of the above method, the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100.
[0027] In still a further aspect of the above method, the biological sample is a whole blood sample, a serum sample, or a plasma sample.
[0028] In yet another aspect of the above method, the obtaining of subject values, assay values, or subject values and assay values comprises receiving said subject values, assay values, or subject values and assay values from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
[0029] In still yet a further aspect of the above method, the obtaining of subject values, assay values, or subject values and assay values comprises electronically receiving said subject values.
[0030] In still yet a further aspect of the above method, the method further comprises manually or automatically inputting said subject values, assay values, or subject values and assay values into said processing system.
[0031] In still yet a further aspect of the above method, the processing system compares the machine learning score for the subject against the reference score.
[0032] In still yet a further aspect of the above method, the determination of whether the IPN is likely malignant or likely not malignant is displayed on a device.
[0033] In still yet a further aspect of the above method, the subject is a human.
[0034] In still yet a further aspect of the above method, the at least one of the subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof, is determined using an immunoassay.
[0035] In still yet a further aspect of the above method, the at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
[0036] In another embodiment, the present disclosure relates to a system comprising:
[0037] a. subject values for a subject, wherein said subject values comprise:
[0038] i) a subject smoking pack years value; and
[0039] ii) a measurement of the size of the IPN in the subject;
[0040] b. one or more assays for measuring:
[0041] i) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and
[0042] ii) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
[0043] c. a device comprising a processing system, wherein the processing system comprises a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm;
[0044] wherein the at least one machine learning algorithm is configured to process the subject values and the assay values to produce a machine learning score for the subject, and
[0045] further wherein the processing system is configured to:
[0046] i) apply the at least one machine learning algorithm to the subject values and the assay values to output a machine learning score for the subject;
[0047] ii) report the machine learning score for the subject; and
[0048] iii) provide a reference score for comparison with the machine learning score,
[0049] wherein the device displays that the IPN is (1) likely malignant if the machine learning score is higher than the reference score; or (2) not likely malignant if the machine learning score is the same as or below the reference score.
[0050] In one aspect of the above system, the subject smoking pack years value is the number of cigarette packs smoked per year by the subject multiplied by number of years the subject smoked.
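By way of non-limiting illustration, the smoking pack years value as defined in the preceding paragraph can be computed as shown below. The function and parameter names are hypothetical. The sketch follows the definition stated above (packs per year multiplied by years smoked) and does not assume the packs-per-day convention often used clinically.

```python
def smoking_pack_years(packs_per_year: float, years_smoked: float) -> float:
    """Compute the pack years value as defined in the text above.

    Both inputs must be non-negative; the result is their product.
    """
    if packs_per_year < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return packs_per_year * years_smoked


if __name__ == "__main__":
    print(smoking_pack_years(1.5, 20))
```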
[0051] In another aspect of the above system, the assay values comprise a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
[0052] In another aspect of the above method, the method involves obtaining an assay value comprising a subject CA125 concentration, a subject CEA concentration, a subject HE4 concentration, a subject Cyfra 21-1 concentration, a subject NSE concentration, a subject SCC concentration, a subject ProGRP concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
[0053] In some aspects of the above system, the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm which generates an AIM score. In other aspects, the machine learning algorithm is a random forest algorithm that generates a random forest score. In yet other aspects, the machine learning algorithm is a logistic regression algorithm that generates a logistic regression score. In some aspects, the machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
[0054] In still yet another aspect of the above system, the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100.
[0055] In still yet another aspect of the above system, the biological sample is a whole blood sample, a serum sample, or a plasma sample.
[0056] In still yet a further aspect of the above system, the subject values, assay values, or subject values and assay values are received from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
[0057] In still yet another aspect of the above system, the subject values, assay values, or subject values and assay values are received electronically.
[0058] In still yet another aspect of the above system, the subject values, assay values, or subject values and assay values are manually or automatically inputted into said processing system.
[0059] In still yet another aspect of the above system, the subject is a human.
[0060] In still yet another aspect of the above system, the at least one of the subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof, is determined using an immunoassay.
[0061] In still yet another aspect of the above system, at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
[0062] In yet another embodiment, the present disclosure relates to a method comprising the steps of:
[0063] a) providing one or more diagnostic assays configured to measure assay values comprising (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
[0064] b) providing a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm;
[0065] wherein the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database; and
[0066] wherein the processing system is configured to:
[0067] i) apply the at least one machine learning algorithm to the assay values and subject values to output a machine learning score for the subject;
[0068] ii) report the machine learning score for the subject; and
[0069] iii) provide a reference score for comparison with the machine learning score.
[0070] In some aspects of the above method, the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm which generates an AIM score. In other aspects, the machine learning algorithm is a random forest algorithm that generates a random forest score. In yet other aspects, the machine learning algorithm is a logistic regression algorithm that generates a logistic regression score. In some aspects, the at least one machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
[0071] In yet another embodiment, the present disclosure relates to a system comprising:
[0072] a) one or more diagnostic assays configured to measure assay values comprising (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
[0073] b) a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm;
[0074] wherein the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database;
[0075] wherein the processing system is configured to:
[0076] i) apply the at least one machine learning algorithm to the assay values and subject values to output a machine learning score for the subject;
[0077] ii) report the machine learning score for the subject; and
[0078] iii) provide a reference score for comparison with the machine learning score.
[0079] In some aspects of the above system, the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm which generates an AIM score. In other aspects, the machine learning algorithm is a random forest algorithm that generates a random forest score. In yet other aspects, the machine learning algorithm is a logistic regression algorithm that generates a logistic regression score. In some aspects, the machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
BRIEF DESCRIPTION OF THE DRAWINGS
[0080] FIG. 1 shows the relative classification of AIM models by IPN category as described in the Example.
DETAILED DESCRIPTION
[0081] The present disclosure relates to methods and systems for determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely to be malignant or likely not to be malignant. The methods and systems described herein utilize certain subject values including: (i) a subject smoking pack years value; and (ii) a measurement of the size of the IPN in the subject; and certain assay values including (iii) (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject. The subject values and assay values are applied to at least one machine learning algorithm which is used to produce a score (a machine learning score) for the subject. The machine learning score for the subject is compared to a reference score to determine whether an IPN is likely to be malignant or likely not to be malignant.
Specifically, if the subject’s machine learning score is higher than the reference score, then the IPN is likely malignant. If the subject’s machine learning score is the same as or below (e.g., less than) the reference score, then the IPN is likely not to be malignant.
1. Definitions
[0082] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In case of conflict, the present document, including definitions, will control. Preferred methods and materials are described below, although methods and materials similar or equivalent to those described herein can be used in practice or testing of the present disclosure. All publications, patent applications, patents and other references mentioned herein are incorporated by reference in their entirety. The materials, methods, and examples disclosed herein are illustrative only and not intended to be limiting.
[0083] The terms “comprise(s),” “include(s),” “having,” “has,” “can,” “contain(s),” and variants thereof, as used herein, are intended to be open-ended transitional phrases, terms, or words that do not preclude the possibility of additional acts or structures. The singular forms “a,” “an” and “the” include plural references unless the context clearly dictates otherwise. The present disclosure also contemplates other embodiments “comprising,” “consisting of” and “consisting essentially of” the embodiments or elements presented herein, whether explicitly set forth or not.
[0084] For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
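By way of non-limiting illustration, the range convention above can be sketched as a short enumeration in which the step is one unit in the last decimal place of the endpoints. The function name is hypothetical, and requiring both endpoints to be written with the same number of decimal places is an assumption made to keep the sketch simple.

```python
from decimal import Decimal
from typing import List


def contemplated_values(low: str, high: str) -> List[Decimal]:
    """List every value from low to high at the endpoints' precision.

    Endpoints are passed as strings so that "6.0" keeps its one decimal
    place. ASSUMPTION: both endpoints use the same number of decimals.
    """
    lo, hi = Decimal(low), Decimal(high)
    exp_lo = lo.as_tuple().exponent
    exp_hi = hi.as_tuple().exponent
    if exp_lo != exp_hi:
        raise ValueError("endpoints must have the same precision")
    if lo > hi:
        raise ValueError("low must not exceed high")
    step = Decimal(1).scaleb(exp_lo)  # e.g. 1 for "6", 0.1 for "6.0"
    values = []
    v = lo
    while v <= hi:
        values.append(v)
        v += step
    return values


if __name__ == "__main__":
    print([str(v) for v in contemplated_values("6", "9")])
    print([str(v) for v in contemplated_values("6.0", "7.0")])
```

Decimal arithmetic is used rather than binary floating point so that repeated addition of 0.1 lands exactly on 7.0.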
[0085] “Biological sample” or “sample” as used interchangeably herein, includes, but is not necessarily limited to, bodily fluids such as blood-related samples (e.g., whole blood (including, for example, capillary whole blood samples), serum, plasma, and other blood-derived samples), urine, cerebral spinal fluid, bronchoalveolar lavage, and the like. Another example of a biological sample is a tissue sample. A biological sample may be fresh or stored (e.g., blood or blood fraction stored in a blood bank). The biological sample may be a bodily fluid expressly obtained for use in the assays described herein or a bodily fluid obtained for another purpose which can be sub-sampled for use in the assays described herein. In certain aspects, the biological sample is whole blood. Whole blood may be obtained from the subject using standard clinical procedures. In other aspects, the biological sample is plasma. Plasma may be obtained from whole blood samples by centrifugation of anti-coagulated blood. Such process provides a buffy coat of white cell components and a supernatant of the plasma. In certain aspects, the biological sample is serum. Serum may be obtained by centrifugation of whole blood samples that have been collected in tubes that are free of anti-coagulant. The blood is permitted to clot prior to centrifugation. The yellowish-reddish fluid that is obtained by centrifugation is the serum. In another aspect, the sample is urine. The sample may be pretreated as necessary by dilution in an appropriate buffer solution, heparinized, concentrated if desired, or fractionated by any number of methods including but not limited to ultracentrifugation, fractionation by fast performance liquid chromatography (FPLC), or precipitation of apolipoprotein B containing proteins with dextran sulfate or other methods. Any of a number of standard aqueous buffer solutions at physiological pH, such as phosphate, Tris, or the like, can be used.
In still further aspects, the biological sample is a whole blood sample and the subject is a human. In yet further aspects, the biological sample is a plasma sample and the subject is a human. In yet further aspects, the biological sample is a serum sample and the subject is a human. In still yet further aspects, the biological sample is a capillary blood sample and the subject is a human.
[0086] “Decentralize”, “Decentralized”, or “Decentralization”, as used interchangeably herein, refers to, in the context of testing, the performance of one or more medical tests and/or assays outside of a traditional medical setting (e.g., a hospital, physician office, stand alone lab site, etc.) to one or more places such as urgent care clinics, retail clinics, pharmacies, grocery stores or convenience stores, residences (e.g., homes, apartments, etc.), workplaces, and/or government offices (e.g., U.S. Transportation and Safety Authority), etc. “Hybrid-decentralization” or “hybrid-decentralized” refers to situations in which a subject or patient collects a sample at a residence and/or workplace and ships the sample to a laboratory, avoiding a professional collection site (such as a hospital, physician’s office, or stand alone sample collection or lab site).
[0087] “Higher throughput assay analyzer” or a “non-point-of-care device”, as used interchangeably herein, refers to a device that is not a point-of-care device or a single use device. A higher throughput assay analyzer or non-point-of-care device refers to any device that does not meet any of the limitations of a point-of-care or a single use device as defined herein. In some embodiments, a “higher throughput assay analyzer” or “non-point-of-care device” refers to an instrument that: (a) may be a relatively large instrument compared to a hand-held point-of-care device, e.g., such as ranging in size from that of a tabletop instrument (e.g., typically considered low- or medium-throughput) to a large room-size or multiple-room-size instrument (e.g., typically considered high throughput); (b) is not a handheld instrument; (c) is capable of performing an assay on more than one clinical sample simultaneously; and (d) any combination of (a)-(c). A higher throughput assay analyzer may be a clinical chemistry analyzer, an immunoassay analyzer, or a combination thereof. Exemplary higher throughput assay analyzers or non-point-of-care devices include, for example, the ARCHITECT or Alinity platforms produced by Abbott Laboratories.
[0088] “Point-of-care device” refers to a device used to provide medical diagnostic testing at or near the point-of-care (namely, typically, outside of a laboratory), at the time and place of patient care (such as in a hospital, physician’s office, urgent or other medical care facility, a patient’s home, a nursing home and/or a long term care and/or hospice facility). Examples of point-of-care devices include those produced by Abbott Laboratories (Abbott Park, IL) (e.g., i-STAT and i-STAT Alinity), Universal Biosensors (Rowville, Australia) (see US 2006/0134713), Axis-Shield PoC AS (Oslo, Norway), and Clinical Lab Products (Los Angeles, USA).
[0089] “Reference score” as used herein refers to a value that is used to assess diagnostic, prognostic, or therapeutic efficacy and that has been linked or is associated herein with various clinical parameters (e.g., presence of disease (such as malignant versus non-malignant), stage of disease, severity of disease, progression, non-progression, or improvement of disease, etc.).
[0090] “Subject” and “patient” as used herein interchangeably refer to any vertebrate, including, but not limited to, a mammal (e.g., cow, pig, camel, llama, horse, goat, rabbit, sheep, hamster, guinea pig, cat, dog, rat, mouse, a non-human primate (for example, a monkey, such as a cynomolgus or rhesus monkey, chimpanzee, etc.), and a human). In some embodiments, the subject may be a human or a non-human. In some embodiments, the subject is a human. In some embodiments, the subject is biologically male, female, or other. In some embodiments, the subject is racially/ethnically identified as singularly or a combination of American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or other Pacific Islander, White or other. The subject or patient may have one or more tumors that are benign, malignant, or a combination thereof. The subject or patient may be undergoing other forms of treatment.
[0091] “Treat,” “treating” or “treatment” are each used interchangeably herein to describe reversing, alleviating, or inhibiting the progress of a disease and/or injury, or one or more symptoms of such disease, to which such term applies. Depending on the condition of the subject, the term also refers to preventing a disease, and includes preventing the onset of a disease, or preventing the symptoms associated with a disease. A treatment may be either performed in an acute or chronic way. The term also refers to reducing the severity of a disease or symptoms associated with such disease prior to affliction with the disease. Such prevention or reduction of the severity of a disease prior to affliction refers to administration of a pharmaceutical composition to a subject that is not at the time of administration afflicted with the disease. "Preventing" also refers to preventing the recurrence of a disease or of one or more symptoms associated with such disease. "Treatment" and "therapeutically," refer to the act of treating, as "treating" is defined above.
[0092] Unless otherwise defined herein, scientific and technical terms used in connection with the present disclosure shall have the meanings that are commonly understood by those of ordinary skill in the art. For example, any nomenclatures used in connection with, and techniques of, cell and tissue culture, molecular biology, immunology, microbiology, genetics and protein and nucleic acid chemistry and hybridization described herein are those that are well known and commonly used in the art. The meaning and scope of the terms should be clear; in the event, however, of any latent ambiguity, definitions provided herein take precedence over any dictionary or extrinsic definition. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular.
2. Methods and Apparatus for Determining Whether at Least One Indeterminate Pulmonary Nodule Identified in a Subject is Malignant
[0093] In one embodiment, the present disclosure relates to methods and systems for determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely to be malignant or not likely to be malignant. Indeterminate pulmonary nodules can be identified in subjects, particularly high risk subjects (e.g., apparently healthy patients between 55 and 75 years, who have at least a 30-pack-year smoking history and who currently smoke or have quit smoking within the past 15 years (Smith R, Andrews K, Brooks D, et al., Cancer Screening in the United States, 2018: A Review of Current American Cancer Society Guidelines and Current Issues in Cancer Screening, CA Cancer J Clin 68,4 (2018) 297-316)), using routine techniques known in the art, such as low-dose helical computed tomography (LDCT) screening.
[0094] In one aspect, the methods and systems of the present disclosure involve obtaining certain subject values and assay values for the subject for whom the at least one IPN has been identified. The subject values to be obtained include: (a) a subject smoking pack years value; and (b) a measurement of the size of the IPN in the subject. The assay values to be obtained include: (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from a subject; and (b) at least one of the subject’s total IgG concentration, IgA concentration, IgM concentration, IgE concentration, kappa free light chain concentration, lambda free light chain concentration, or any combination thereof, in a biological sample obtained from the subject.
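As a minimal sketch only, the subject values and assay values listed above could be held in a simple record that checks that at least one tumor marker concentration and at least one immunoglobulin or free light chain concentration is present. The class name, field names, and marker keys below are illustrative assumptions and are not part of the disclosure; no units are implied.

```python
from dataclasses import dataclass, field
from typing import Dict

# Marker names follow the lists in the paragraph above; the dictionary keys
# themselves are an illustrative convention, not defined by the disclosure.
TUMOR_MARKERS = {"CA125", "CEA", "HE4", "Cyfra 21-1", "NSE", "SCC", "ProGRP"}
IMMUNE_MARKERS = {"IgG", "IgA", "IgM", "IgE", "KFLC", "LFLC"}


@dataclass
class IpnInputs:
    """Hypothetical container for the subject values and assay values."""
    pack_years: float                 # subject value (a)
    nodule_size_mm: float             # subject value (b)
    tumor_markers: Dict[str, float] = field(default_factory=dict)   # assay values (a)
    immune_markers: Dict[str, float] = field(default_factory=dict)  # assay values (b)

    def validate(self) -> None:
        """Raise ValueError if a required group is empty or a key is unknown."""
        unknown = (set(self.tumor_markers) - TUMOR_MARKERS) | (
            set(self.immune_markers) - IMMUNE_MARKERS
        )
        if unknown:
            raise ValueError(f"unknown marker(s): {sorted(unknown)}")
        if not self.tumor_markers:
            raise ValueError("at least one tumor marker concentration is required")
        if not self.immune_markers:
            raise ValueError("at least one immunoglobulin/FLC concentration is required")


inputs = IpnInputs(40.0, 14.9, {"CEA": 2.1}, {"IgG": 1000.0})
inputs.validate()  # passes silently for a complete record
```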
[0095] The biological samples obtained from the subject for determining the concentration of (i) one or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof; and (ii) one or more of total IgG, IgA, IgM, IgE, kappa free light chain, lambda free light chain, or any combination thereof, can be obtained using techniques known to those skilled in the art, and the sample may be used directly as obtained from the subject or following a pretreatment to modify the character of the sample. Such pretreatment may include, for example, preparing plasma from blood, diluting viscous fluids, filtration, precipitation, dilution, distillation, mixing, concentration, inactivation of interfering components, the addition of reagents, lysing, and the like. In some aspects, the biological sample is a whole blood sample, a serum sample, a plasma sample, or a capillary blood sample. In other aspects, the same or different biological samples can be used to determine the concentration of (i) one or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof; and (ii) one or more of total IgG, IgA, IgM, IgE, kappa free light chain, lambda free light chain, or any combination thereof in the subject.
[0096] The source of the subject values and the assay values is not critical. For example, the subject values and assay values can be obtained from one or more of a physician’s office, a hospital or other medical facility, a testing lab, in a decentralized setting, from an analytical testing system, from a hand-held or point-of-care testing device, a high throughput analyzer or any combination thereof.
[0097] The smoking pack years value of the subject can be obtained by determining the number of cigarette packs smoked per day by the subject and then multiplying that number by the number of years the subject smoked. For example, if a subject smoked 2 packs a day for 20 years, the subject’s smoking pack years value would be 40.
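The worked example above (2 packs a day for 20 years giving a value of 40) can be sketched as a small helper. The function and parameter names are illustrative assumptions.

```python
def smoking_pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Return pack years: average packs smoked per day times years smoked."""
    if packs_per_day < 0 or years_smoked < 0:
        raise ValueError("packs_per_day and years_smoked must be non-negative")
    return packs_per_day * years_smoked


# Worked example from the text: 2 packs a day for 20 years.
print(smoking_pack_years(2, 20))  # 40
```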
[0098] The measurement of the size of the IPN in the subject can be determined using routine techniques known in the art. For example, LDCT can be used to detect and measure the size of an IPN identified in a subject. It is known in the art that larger IPN size correlates with a greater risk of the IPN being cancerous: IPN <5 mm: 0%-1%; 5-10 mm: 6%-28%; 11-20 mm: 37%-64%; >20 mm: 64%-82% (Massion P, Walker R., Indeterminate Pulmonary Nodules: Risk for Having or for Developing Lung Cancer, Cancer Prev Res (Phila) 7, 12 (2014) 1173-1178).
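The size bands quoted above can be expressed as a simple lookup. This is a sketch under one stated assumption: the quoted bands leave sizes between 10 mm and 11 mm unassigned, and the code below places them in the 11-20 mm band. The function name is illustrative.

```python
from typing import Tuple


def malignancy_risk_band(size_mm: float) -> Tuple[str, Tuple[int, int]]:
    """Map an IPN size in mm to the quoted band and its risk range in percent."""
    if size_mm <= 0:
        raise ValueError("size_mm must be positive")
    if size_mm < 5:
        return ("<5 mm", (0, 1))
    if size_mm <= 10:
        return ("5-10 mm", (6, 28))
    if size_mm <= 20:
        # Assumption: sizes in the unstated 10-11 mm gap fall in this band.
        return ("11-20 mm", (37, 64))
    return (">20 mm", (64, 82))


band, (low, high) = malignancy_risk_band(14.9)
print(f"{band}: {low}%-{high}%")  # 11-20 mm: 37%-64%
```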
[0099] As mentioned previously, the methods and systems also require obtaining, providing and/or determining the concentration of one or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject. In another aspect, the methods and systems also require obtaining, providing, and/or determining the concentration of two or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject. In another aspect, the methods and systems also require obtaining, providing, and/or determining the concentration of three or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject. In another aspect, the methods and systems also require obtaining, providing, and/or determining the concentration of four or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject. In another aspect, the methods and systems also require obtaining, providing, and/or determining the concentration of five or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject. In another aspect, the methods and systems also require obtaining, providing, and/or determining the concentration of six or more of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from a subject. In another aspect, the methods and systems also require obtaining, providing, and/or determining the concentration of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, and ProGRP, in a biological sample obtained from a subject.
[0100] Any assay known in the art for determining the concentration of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof in a biological sample can be provided and/or used in the methods and systems of the present disclosure. For example, an immunoassay, a clinical chemistry assay, a radioimmunoassay, an immunoradiometric assay, etc. can be used or provided in the methods and systems of the present disclosure. In some aspects, the concentration of at least one of CA125, CEA, HE4, Cyfra 21-1, NSE, SCC, ProGRP, or any combination thereof, in a biological sample obtained from the subject can be determined using a high throughput analyzer or a point-of-care device. For example, the Abbott Laboratories CA125 chemiluminescent assay for use on the ARCHITECT® i2000 automated immunoassay platform can be used in the methods and systems described herein.
[0101] The methods and systems further require determining at least one of a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s lambda free light chain (LFLC) concentration, or any combinations thereof in a biological sample obtained from the subject. In some aspects, the methods and systems further require determining at least two of a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combinations thereof. In still further aspects, the methods and systems further require determining at least three of a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combinations thereof. In yet still further aspects, the methods and systems further require determining at least four of a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combinations thereof. In yet still further aspects, the methods and systems of the present disclosure require determining a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s kappa free light chain (KFLC) concentration, and a subject’s LFLC concentration. Any assay known in the art for determining a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combinations thereof can be provided and/or used in the methods and systems of the present disclosure. For example, an immunoassay, a clinical chemistry assay, a radioimmunoassay, or an immunoradiometric assay can be used. In some aspects, a subject’s total IgG concentration, a subject’s IgA concentration, a subject’s IgM concentration, a subject’s IgE concentration, a subject’s LFLC concentration, or any combination thereof, in a biological sample obtained from the subject can be determined using a high throughput analyzer or a point-of-care device. For example, the Abbott Laboratories ARCHITECT c8000 automated clinical chemistry platform utilizing a turbidimetric assay format (either utilizing antisera or latex enhanced antibody coated particles) can be used to determine the subject’s total IgG concentration, the subject’s IgA concentration, the subject’s IgM concentration, the subject’s IgE concentration, the subject’s LFLC concentration, or any combination thereof in the methods and systems described herein.
[0102] Once the subject values and assay values are obtained, the values are inputted into and processed by a processing system. The processing system can comprise a computer processor and a non-transitory computer memory that contains one or more computer programs and a database. In some aspects, the subject values, assay values, or subject values and assay values are manually inputted into the processing system. In other aspects, the subject values, assay values, or subject values and assay values are automatically inputted into the processing system. In yet further aspects, the subject values, assay values, or subject values and assay values are received electronically, such as, for example, via e-mail. In further aspects, the processing system further comprises a hand-held or point-of-care testing device. In yet further aspects, the processing system further comprises a high throughput analyzer.
[0103] In other aspects, at least one of the computer programs contained in the processing system is one or more machine learning algorithms. Any machine learning algorithm can be used in the methods and systems of the present disclosure. In some aspects, the at least one machine learning algorithm is an adaptive index modeling (AIM) algorithm. The processing system can contain any AIM algorithm known in the art. For example, the AIM algorithm described in Tian L, Tibshirani R, Adaptive index models for marker-based risk stratification, Biostatistics 12, 1 (2011), 68-86 and AIM: AIM: adaptive index model. R package version 1.01 can be used. In other aspects, the at least one machine learning algorithm is a random forest algorithm. Any random forest algorithm known in the art can be used. In yet other aspects, the at least one machine learning algorithm is a logistic regression algorithm. Any logistic regression algorithm known in the art can be used. In some aspects, the at least one machine learning algorithm uses any combination of an AIM algorithm, a random forest algorithm, or a logistic regression algorithm.
[0104] In some aspects, the processing system is configured to apply the at least one machine learning algorithm to the assay values and the subject values to produce, generate or output a score for the subject (a machine learning score, such as, for example, an AIM score, a random forest algorithm score, a logistic regression algorithm score, or any combination thereof). In some aspects, the processing system is further configured to communicate (e.g., report) the machine learning score for further analysis, interpretation, processing and/or display. The machine learning score for the subject can be communicated (e.g., reported) by the processing system (e.g., a computer), in a document and/or spreadsheet, on a mobile device (e.g., a smart phone), on a website, in an e-mail, or any combination thereof.
[0105] In still other aspects, the machine learning score for the subject is compared to a reference score. In some aspects, a clinician or other medical personnel can compare the machine learning score for the subject with a reference score. The reference score can be provided in a product insert or other publication, or on a website or on a mobile device (e.g., such as through an app). In other aspects, the processing system is configured to provide a reference score for comparison with the machine learning algorithm score.
The reference score can be determined using routine techniques known in the art. For example, the reference score may be determined by the machine learning algorithm in the processing system. In some aspects, the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62,
63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87,
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100. In some aspects, the reference score is
1. In other aspects, the reference score is 2. In still further aspects, the reference score is 3.
In yet further aspects, the reference score is 4. In yet further aspects, the reference score is 5.
In yet further aspects, the reference score is 6. In yet further aspects, the reference score is 7.
In yet further aspects, the reference score is 8. In yet further aspects, the reference score is 9.
In yet further aspects, the reference score is 10. In yet further aspects, the reference score is
11. In yet further aspects, the reference score is 12. In yet further aspects, the reference score is 13. In yet further aspects, the reference score is 14. In yet further aspects, the reference score is 15. In yet further aspects, the reference score is 16. In yet further aspects, the reference score is 17. In yet further aspects, the reference score is 18. In yet further aspects, the reference score is 19. In yet further aspects, the reference score is 20. In yet further aspects, the reference score is 21. In yet further aspects, the reference score is 22. In yet further aspects, the reference score is 23. In yet further aspects, the reference score is 24. In yet further aspects, the reference score is 25. In yet further aspects, the reference score is 26. In yet further aspects, the reference score is 27. In yet further aspects, the reference score is 28. In yet further aspects, the reference score is 29. In yet further aspects, the reference score is 30. In yet further aspects, the reference score is 31. In yet further aspects, the reference score is 32. In yet further aspects, the reference score is 33. In yet further aspects, the reference score is 34. In yet further aspects, the reference score is 35. In yet further aspects, the reference score is 36. In yet further aspects, the reference score is 37. In yet further aspects, the reference score is 38. In yet further aspects, the reference score is 39. In yet further aspects, the reference score is 40. In yet further aspects, the reference score is 41. In yet further aspects, the reference score is 42. In yet further aspects, the reference score is 43. In yet further aspects, the reference score is 44. In yet further aspects, the reference score is 45. In yet further aspects, the reference score is 46. In yet further aspects, the reference score is 47. In yet further aspects, the reference score is 48. In yet further aspects, the reference score is 49. In yet further aspects, the reference score is 50. 
In yet further aspects, the reference score is 51. In yet further aspects, the reference score is 52. In yet further aspects, the reference score is 53. In yet further aspects, the reference score is 54. In yet further aspects, the reference score is 55. In yet further aspects, the reference score is 56. In yet further aspects, the reference score is 57. In yet further aspects, the reference score is 58. In yet further aspects, the reference score is 59. In yet further aspects, the reference score is 60. In yet further aspects, the reference score is 61. In yet further aspects, the reference score is 62. In yet further aspects, the reference score is 63. In yet further aspects, the reference score is 64. In yet further aspects, the reference score is 65. In yet further aspects, the reference score is 66. In yet further aspects, the reference score is 67. In yet further aspects, the reference score is 68. In yet further aspects, the reference score is 69. In yet further aspects, the reference score is 70. In yet further aspects, the reference score is 71. In yet further aspects, the reference score is 72. In yet further aspects, the reference score is 73. In yet further aspects, the reference score is 74. In yet further aspects, the reference score is 75. In yet further aspects, the reference score is 76. In yet further aspects, the reference score is 77. In yet further aspects, the reference score is 78. In yet further aspects, the reference score is 79. In yet further aspects, the reference score is 80. In yet further aspects, the reference score is 81. In yet further aspects, the reference score is 82. In yet further aspects, the reference score is 83. In yet further aspects, the reference score is 84. In yet further aspects, the reference score is 85. In yet further aspects, the reference score is 86. In yet further aspects, the reference score is 87. In yet further aspects, the reference score is 88. In yet further aspects, the reference score is 89. 
In yet further aspects, the reference score is 90. In yet further aspects, the reference score is 91. In yet further aspects, the reference score is 92. In yet further aspects, the reference score is 93. In yet further aspects, the reference score is 94. In yet further aspects, the reference score is 95. In yet further aspects, the reference score is 96. In yet further aspects, the reference score is 97. In yet further aspects, the reference score is 98. In yet further aspects, the reference score is 99. In yet further aspects, the reference score is 100.
[0106] Based on the comparison of the subject’s machine learning score with the reference score, a determination is made whether the IPN is likely malignant or likely not malignant. Specifically, if the subject’s machine learning score is higher than the reference score, then a determination is made that the IPN in the subject is likely to be malignant. If the subject’s machine learning score is the same as or below the reference score, then a determination is made that the IPN in the subject is not likely to be malignant. In some aspects, the determination of whether a subject’s IPN is likely to be malignant or not likely to be malignant can be communicated (e.g., reported) for further display. Specifically, this determination of whether a subject’s IPN is likely to be malignant or likely not to be malignant can be communicated (e.g., reported) by the processing system (e.g., a computer), in a document and/or spreadsheet, on a mobile device (e.g., a smart phone), on a website, in an e-mail, or any combination thereof.
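The comparison rule in the paragraph above can be sketched as follows. The function name and the returned strings are illustrative assumptions; only the rule itself (strictly higher than the reference score versus the same or below) comes from the text.

```python
def classify_ipn(ml_score: float, reference_score: float) -> str:
    """Apply the stated rule: a score above the reference is 'likely malignant'."""
    if ml_score > reference_score:
        return "likely malignant"
    # A score equal to or below the reference score is not likely malignant.
    return "not likely malignant"


print(classify_ipn(4, 3))  # likely malignant
print(classify_ipn(3, 3))  # not likely malignant
```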
[0107] A subject identified as having an IPN that is likely to be malignant based on the methods and systems described herein may be treated, monitored, or treated and monitored. In some aspects, a surgical or non-surgical biopsy can be performed to further evaluate the IPN identified as likely to be malignant. In other aspects, a portion of the lung containing the IPN may be surgically removed or resected from the subject using techniques known in the art such as, for example, video-assisted thoracic surgery or a thoracotomy. In yet other aspects, the subject may receive one or more pharmaceutical or biopharmaceutical treatments. For example, in some aspects, the subject may be treated with chemotherapy, radiation, budesonide, fluticasone, or any combinations thereof. In further aspects, the subject may also be monitored. For example, the IPN in the subject can be monitored using one or more of computed tomography (including LDCT) scans, positron emission tomography (PET) scans, bronchoscopy, or any combination thereof. In some aspects, the subject can be monitored before, during and/or after any biopsy and/or treatment.
[0108] It will be readily apparent to those skilled in the art that other suitable modifications and adaptations of the methods of the present disclosure described herein are readily applicable and appreciable, and may be made using suitable equivalents without departing from the scope of the present disclosure or the aspects and embodiments disclosed herein. Having now described the present disclosure in detail, the same will be more clearly understood by reference to the following examples, which are intended only to illustrate some aspects and embodiments of the disclosure, and should not be viewed as limiting the scope of the disclosure. The disclosures of all journal references, U.S. patents, and publications referred to herein are hereby incorporated by reference in their entireties.
[0109] EXAMPLE
[0110] Materials and Methods
[0111] 1. Study population
[0112] The patient sample cohort tested consisted of 141 patients in total with the following conditions: 36 patients with benign indeterminate pulmonary lung nodules (median nodule size 14.9 mm) and 105 patients with malignant indeterminate pulmonary lung nodules (median nodule size 19 mm). Approximately 74.5% of the population tested were classified as having Stage I NSCLC lung cancer. Clinical and pathological details for the malignant cases were obtained from the medical record system. Criteria for study inclusion in the malignant NSCLC cohort were broad (consisting of having a surgical resection with lymph node sampling and accompanying pathological examinations) and were not limited to any demographic or clinical factor. The benign cohort with indeterminate pulmonary nodules consisted of patients with granulomas, pneumonitis, or pneumonia. These patients received an anatomic resection for a suspected malignancy. The benign and malignant samples collected represent a real-world collection where no histological selection criteria were applied.
[0113] The demographic variable of smoking pack years was defined as the number of cigarette packs smoked per day multiplied by the number of years the individual smoked. For this pilot study, smoking pack years was calculated for any patient case receiving a CT scan based on risk assessment or as an incidental finding.
[0114] 2. Sample Collection, Handling, and Storage
[0115] Specimens were obtained with full written informed consent under a protocol approved by the Rush UMC Institutional Review Board (IRB). Peripheral blood collected at Rush UMC was obtained from each patient immediately before treatment initiation using standard phlebotomy techniques. Treatment initiation for the IPN could be surgical removal, biopsy, or further radiography assessments. All specimens were handled and processed into EDTA plasma in an identical manner. The time interval from sample collection to processing was less than 90 minutes. All EDTA vacutainer collection tubes were centrifuged at 750 RCF for 20 minutes to isolate the plasma layer. The subsequent plasma layer was transferred to a second tube and re-centrifuged to remove particulates. The specimens after re-centrifugation were dispensed into 0.75 mL aliquots and immediately archived and frozen at -80 degrees Celsius. No specimens were subjected to more than two thaw cycles for this study. Samples were coded with basic demographic and clinical parameters provided to the study personnel for the purposes of this study.
[0116] After all samples were collected, EDTA plasma levels of CA-125, SCC, CEA, HE4, ProGRP, NSE, Cyfra 21-1, and Ferritin were determined using the Abbott ARCHITECT® i2000 automated immunoassay platform utilizing a two-step dual monoclonal chemiluminescent immunoassay (See Table 1; Quinn, F. A., The Immunoassay Handbook, 3rd edition, Wild, D; Elsevier Ltd.: United Kingdom, 2005. Chapter 34: ARCHITECT i2000 and i2000SR Analyzers). In addition, hs-CRP, total IgG, IgG1, IgG2, IgG3, IgG4, IgE, IgM, IgA, Kappa Free Light Chain, and Lambda Free Light Chain were determined using the Abbott ARCHITECT® c8000 automated clinical chemistry platform utilizing a turbidimetric assay format (either utilizing antisera or latex enhanced antibody coated microparticles) (See Table 1; Clinical Chemistry Learning Guide Series 2020. https://www.corelaboratory.abbott/sal/learningGuide/ADD-00061345 ClinChem Learning Guide.pdf (corelaboratory.abbott), 2020). The immunoassay (IA) and clinical chemistry (CC) assay testing determinations were performed at Abbott Laboratories (IL, USA). Testing results were included in a database with the demographic and other clinical parameter data.
[0117] Table 1
Figure imgf000025_0001
[0118] Clinical Chemistry research use only (RUO) assays IgG1, IgG2, IgG3, and IgG4 were independently verified against the on-market Abbott Total IgG test. The summation of the IgG1-IgG4 test results is comparable to the total IgG results with a Passing-Bablok slope of 0.97 and a correlation coefficient of 0.96.
[0119] 3. Statistical Methods
[0120] Patient samples were stratified based on IPN nodule category. Descriptive analysis was performed to show distributions of demographic and clinical variables by IPN nodule category. Mean and standard deviation are reported for approximately normally distributed variables, and median with minimum and maximum for non-normally distributed variables. The significance level for this study was set to α = 0.05. Individual profiles of biomarkers were examined using distribution plots, receiver operating characteristic (ROC) curves, and area under the curve (AUC) for IPN nodule category prediction. The Wilcoxon rank sum test was also used to compare whether there was a difference in distribution for biomarkers between benign and malignant nodule categories. For subsequent multivariable analyses, a single imputation of the median was performed for variables containing missing values: CEA (9.93% missing of total population), smoking pack years (3.55%), Lambda free light chain (2.13%), IgE (1.42%), IgG (0.71%), and IgA (0.71%).
[0121] To explore multivariable relationships between biomarkers and IPN nodule category, adaptive index modeling (AIM) was applied (Tian L, Tibshirani R, Adaptive index models for marker-based risk stratification, Biostatistics 12, 1 (2011), 68-86; AIM: AIM: adaptive index model. R package version 1.01). This method uses the concept of an index predictor, defined as a binary rule based on the value of a predictor variable - for example, whether a patient’s age is either >55 or <55. After providing a set of variables that are potential predictors, AIM adaptively searches for individual cutoffs for each variable to build an overall model. Index predictors are selected by maximizing the score test statistic, up to a prespecified number of total predictors (Tian L, Tibshirani R, Adaptive index models for marker-based risk stratification, Biostatistics 12, 1 (2011), 68-86; See Table 2 for example). In this study, the maximum number of index predictors was set to 8 to avoid having a model that would be unlikely to be implemented and clinically adopted based on algorithm cost and/or complexity. After the adaptive selection process, 5-fold cross validation was used (due to small sample size) to select the model with the optimal number of index predictors.
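The single median imputation mentioned in the statistical methods above can be sketched as follows. This is a plain-Python illustration, not the R code used in the study; the function name and the sample values are hypothetical.

```python
from statistics import median
from typing import List, Optional


def impute_median(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical measurements for one variable with one missing value.
print(impute_median([1.2, None, 3.4, 2.0]))  # [1.2, 2.0, 3.4, 2.0]
```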
[0122] Once the optimal AIM model is selected, each individual subject can be scored according to the values of their index predictors relative to the binary rule cutoffs. Table 2 shows an example of the scoring process for AIM. Each subject will have a score from 0 to n, where n represents the total number of index predictors. Model performance can then be assessed by creating a binary outcome variable based on a score cutoff of >0, >1, ..., >n-1. Across this range of possible cutoffs, performance metrics were assessed on the entire dataset, including AUC, accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. Positive predictive value and negative predictive value were calculated based on study prevalence (74.5% malignant). The final score cutoff for each AIM model was selected by choosing the maximum AUC, which on average provides a balance of sensitivity and specificity.
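A minimal sketch of the scoring and cutoff evaluation described above follows. The index predictors and cutoffs in `RULES` are hypothetical placeholders, not values from the study. Predictive values are derived from sensitivity, specificity, and a supplied prevalence using Bayes' rule, and the AUC of a score binarized at a single cutoff is taken as the mean of sensitivity and specificity, which is one way to read the statement that the maximum AUC balances the two.

```python
import math
from typing import Dict, Mapping, Sequence, Tuple

Rule = Tuple[str, str, float]  # (variable name, ">" or "<", cutoff)

# Hypothetical index predictors for illustration only; not study values.
RULES: Sequence[Rule] = [
    ("nodule_size_mm", ">", 15.0),
    ("pack_years", ">", 30.0),
    ("marker_x", ">", 3.0),
]


def aim_score(subject: Mapping[str, float], rules: Sequence[Rule]) -> int:
    """Count how many binary index rules the subject meets (0 to n)."""
    score = 0
    for name, direction, cutoff in rules:
        value = subject[name]
        met = value > cutoff if direction == ">" else value < cutoff
        score += int(met)
    return score


def _ratio(num: float, den: float) -> float:
    return num / den if den else math.nan


def metrics_at_cutoff(scores: Sequence[int], is_malignant: Sequence[bool],
                      cutoff: int, prevalence: float) -> Dict[str, float]:
    """Treat score > cutoff as a positive call and summarize performance."""
    pairs = list(zip(scores, is_malignant))
    tp = sum(1 for s, y in pairs if s > cutoff and y)
    fn = sum(1 for s, y in pairs if s <= cutoff and y)
    tn = sum(1 for s, y in pairs if s <= cutoff and not y)
    fp = sum(1 for s, y in pairs if s > cutoff and not y)
    sens = _ratio(tp, tp + fn)
    spec = _ratio(tn, tn + fp)
    # Bayes' rule with the supplied prevalence.
    ppv = _ratio(sens * prevalence,
                 sens * prevalence + (1 - spec) * (1 - prevalence))
    npv = _ratio(spec * (1 - prevalence),
                 spec * (1 - prevalence) + (1 - sens) * prevalence)
    return {"sensitivity": sens, "specificity": spec, "ppv": ppv, "npv": npv,
            "auc": (sens + spec) / 2}  # AUC of a single-threshold classifier


subject = {"nodule_size_mm": 19.0, "pack_years": 40.0, "marker_x": 2.0}
print(aim_score(subject, RULES))  # 2
```

To pick a cutoff in the manner described, `metrics_at_cutoff` would be evaluated for each cutoff from 0 to n-1 and the cutoff with the largest `auc` retained.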
[0123] In this study, four possible models were developed using AIM:
[0124] 1. Demographic variables only: age + gender + smoking pack years + nodule size
[0125] 2. ARCHITECT immunoassay biomarkers
[0126] 3. ARCHITECT clinical chemistry biomarkers
[0127] 4. ARCHITECT immunoassay biomarkers + clinical chemistry biomarkers + demographic variables
[0128] In the absence of a baseline model with clinical characteristics such as the Mayo clinical model for IPN prediction, the model using demographic variables only was treated as the baseline model to determine if biomarkers provided additional predictive value (The R Project for Statistical Computing. https://www.R-project.org/ ). Relative classification of patients by IPN nodule status was shown using a bar graph to compare predictions of the optimal AIM model versus the baseline model. All statistical analyses were performed using R version 4.0.5 (the R Project for Statistical Computing; Yang B, Jhun BW, Shin SH, Jeong B-H, Um S-W, Zo JI, et al., 2018. Comparison of four models predicting the malignancy of pulmonary nodules: A single-center study of Korean adults, PLoS ONE 13, 7, e0201242).
[0129] In Table 2, the Adaptive Index Model (AIM) score is equivalent to the number of criteria met; in the example below, the score would equal 3. The higher the AIM score, the higher the predicted risk of a disease state such as lung cancer.
[0130] Table 2. AIM Schematic for Method Illustration
Figure imgf000028_0001
[0131] Legend:
[0132] Bold Text Meets AIM cutoff Criteria
[0133] Not Bold Text Does Not Meet AIM cutoff Criteria
[0134] Results
[0135] For this study, patient population inclusion criteria were not restricted to ‘high risk’ populations as defined by the NLST study to better appreciate the clinical factors and biomarker distributions across the spectrum of individuals with pulmonary nodules requiring an anatomic resection. There were a few cases included where the IPN was identified as an incidental finding from radiography. None of the cases had any annotations of being symptomatic in their electronic records at the time of sample collection. Based on p-values <0.05, the demographic variables of age, nodule size, race, and smoking pack years differ significantly between benign and malignant IPN nodules (See Table 3).
[0136] Table 3: Demographic Variables by Nodule Category
Figure imgf000028_0002
Figure imgf000029_0001
[0137] * For continuous variables, a t-test was used to compare groups if normally distributed and a Mann-Whitney test was used if non-normally distributed. For categorical variables, Pearson’s chi-squared test without continuity correction was used if all expected cell counts were >5 and Fisher’s exact test was used otherwise.
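The rule for choosing between the two categorical tests can be sketched in Python. The sketch only computes expected cell counts under independence and returns the name of the test that the footnote's rule would select; it does not run either test. The function names and the contingency tables are hypothetical.

```python
from typing import List, Sequence


def expected_counts(table: Sequence[Sequence[int]]) -> List[List[float]]:
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def choose_categorical_test(table: Sequence[Sequence[int]]) -> str:
    """Apply the footnote's rule: chi-squared if every expected count is >5, else Fisher."""
    if all(e > 5 for row in expected_counts(table) for e in row):
        return "pearson_chi_squared"
    return "fisher_exact"


# Hypothetical 2x2 tables: rows = benign / malignant, columns = two categories.
print(choose_categorical_test([[20, 16], [50, 55]]))  # every expected count is above 5
print(choose_categorical_test([[2, 34], [3, 102]]))   # one column is too sparse
```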
[0138] All blood-based biomarkers were individually evaluated for their efficacy in risk stratification of this indeterminate risk pulmonary nodule population. Plasma values for CA-125, SCC, CEA, HE4, ProGRP, NSE, Cyfra 21-1, hs-CRP, Ferritin, total IgG, IgG1, IgG2, IgG3, IgG4, IgE, IgM, IgA, Kappa Free Light Chain, and Lambda Free Light Chain are summarized in Table 4 along with their respective calculated AUCs. Individually, none of the biomarkers alone makes a compelling case for risk stratification, as estimated by their AUC values (range 0.433 - 0.594). As a result, the biomarkers were combined with and without the demographic/clinical data using the AIM methodology to determine if any modeled combinations could be helpful in a risk stratification exercise.
[0139] Table 4. Individual Biomarker Results
Figure imgf000030_0001
Figure imgf000031_0001
Figure imgf000032_0001
Figure imgf000033_0001
[0140] Table 5 shows the results of performing AIM statistical methodology on four possible combinations of biomarkers and demographic variables. Of the models, the best performing was model (4), which included ARCHITECT immunoassay and clinical chemistry biomarkers as well as demographic variables. This model consisted of IgG, IgM, IgE, IgA, Lambda free light chain, CA-125, smoking pack years, and nodule size and resulted in an AUC of 0.819 (95% CI 0.730-0.899) as well as a sensitivity of 0.971 and specificity of 0.667. AIM model (4) showed a statistically significant improvement over individual biomarker predictions for IPN nodule category (CI range 0.318-0.710). AIM model (4) also had a relative improvement in AUC compared to AIM model (1) of demographic variables alone (AUC 0.699, 95% CI 0.614-0.784). FIG. 1 shows the relative classification of AIM models (1) and (4) by IPN nodule category. AIM model (4) correctly classified 102/105 malignant samples and 24/36 benign samples, whereas AIM model (1) correctly classified 68/105 malignant samples and 27/36 benign samples.
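The classification counts in this paragraph can be turned into performance metrics with a short Python sketch. The counts are taken from the text; the function name `metrics` is hypothetical. The mean of sensitivity and specificity reproduces the AUC values quoted for the combined model (0.819) and the demographic-only model (0.699), which is consistent with those AUCs having been computed on the dichotomized AIM score.

```python
from typing import Dict


def metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard confusion-matrix metrics for a binary classifier."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # AUC of a single operating point: mean of sensitivity and specificity.
        "auc_binary": (sens + spec) / 2,
    }


# Combined model: 102/105 malignant and 24/36 benign samples correctly classified.
combined = metrics(tp=102, fn=3, tn=24, fp=12)
# Demographic-only model: 68/105 malignant and 27/36 benign samples correctly classified.
baseline = metrics(tp=68, fn=37, tn=27, fp=9)

for name, m in (("combined", combined), ("demographic only", baseline)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

The predictive values computed here apply at the study prevalence of 105/141, approximately 74.5% malignant; they would differ in a population with a different prevalence.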
[0141] Table 5 - AIM Modeling Results
Figure imgf000034_0002
Figure imgf000034_0001
* Prevalence approximately 74.5%
[0142] Results
[0143] The addition of exploratory blood-based biomarkers described in this Example, narrowed down to key benign vs. malignancy discriminating biomarkers CA-125, IgG, IgM, IgA, IgE, Lambda Free Light Chain coupled with smoking pack years and nodule size, demonstrated improvement over demographic/nodule size data alone in aiding in malignancy risk assessment for IPN nodules (See Table 5 and FIG. 1). The developed AIM algorithm could potentially be used to stratify IPN risk of malignancy as follows: an AIM score above the cutoff of 4 has a high likelihood of the IPN being malignant, whereas an AIM score at or below the cutoff has a low likelihood that the IPN is malignant. For example, to confirm a high AIM score result, the suspect IPN could be further imaged with contrast MRI to assess for spiculation (spiky outgrowths from the nodule), which can be a high-risk indicator of cancer. If the IPN is determined to be high risk based on the AIM score and MRI imaging, the nodule could potentially be biopsied and the resulting pathology may necessitate aggressive treatment such as surgical removal of the IPN, radiation, and/or chemotherapy. Alternatively, an IPN presenting with low AIM score by the current algorithm and confirmed imaging that supports smooth calcified nodules would have a generally low risk of being cancer.
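The stratification rule stated above can be written as a small Python function. The cutoff of 4 and the 8-predictor model size come from this Example; the function name `stratify_ipn` and the returned labels are hypothetical.

```python
def stratify_ipn(aim_score: int, cutoff: int = 4, n_predictors: int = 8) -> str:
    """Map an AIM score to a risk category.

    A score above the cutoff indicates a high likelihood of malignancy;
    a score at or below the cutoff indicates a low likelihood.
    """
    if not 0 <= aim_score <= n_predictors:
        raise ValueError(f"AIM score must be between 0 and {n_predictors}")
    if aim_score > cutoff:
        return "high likelihood of malignancy"
    return "low likelihood of malignancy"


for score in (3, 4, 5):
    print(score, "->", stratify_ipn(score))
```

Such an output would only be one input to the follow-up described above (imaging, and possibly biopsy), not a diagnosis on its own.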
[0144] This study also raised some further items of interest with the immune-related clinical chemistry assays. Although the immune biomarkers tested are not specific for lung cancer, in this sample set, they appeared to be sensitive to biological changes associated with cancer. IgG, IgG1, IgG2, IgG4, and IgE concentrations appear to be suppressed in cancer patients versus benign nodule patients; this potentially indicates a downregulation of the immune system (See Table 4). This opens a possibility for the immune system profile to aid in predicting benign disease.
[0145] It is understood that the foregoing detailed description and accompanying examples are merely illustrative and are not to be taken as limitations upon the scope of the disclosure, which is defined solely by the appended claims and their equivalents.
[0146] Various changes and modifications to the disclosed embodiments will be apparent to those skilled in the art. Such changes and modifications, including without limitation those relating to the chemical structures, substituents, derivatives, intermediates, syntheses, compositions, formulations, or methods of use of the disclosure, may be made without departing from the spirit and scope thereof.
[0147] For reasons of completeness, various aspects of the disclosure are set out in the following numbered clauses:
[0148] Clause 1. A method of determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely malignant, the method comprising the steps of:
[0149] a) providing subject values for the subject, wherein said subject values comprise at least one of the following:
[0150] i) a subject smoking pack years value;
[0151] ii) an identification of biological sex;
[0152] iii) an identification of race;
[0153] iv) an identification of type of nodule; and
[0154] v) a measurement of the size of the IPN in the subject;
[0155] b) providing at least two assay values, wherein said at least two assay values comprise:
[0156] i) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and
[0157] ii) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
[0158] c) providing a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm,
[0159] wherein the at least one machine learning algorithm is configured to process the subject values and the assay values, and
[0160] further wherein the processing system is configured to:
[0161] i) apply the at least one machine learning algorithm to the assay values and the subject values to output a machine learning score for the subject;
[0162] ii) report the machine learning score for the subject; and
[0163] iii) provide a reference score for comparison with the machine learning score; and
[0164] d) determining that the IPN is likely malignant if the machine learning score is higher than the reference score and not likely malignant if the machine learning score is the same as or below the reference score.
[0165] Clause 2. The method of clause 1, wherein the subject smoking pack years value is the number of cigarette packs smoked per year by the subject multiplied by number of years the subject smoked.
[0166] Clause 3. The method of clause 1 or clause 2, wherein the method involves obtaining an assay value comprising a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
[0167] Clause 4. The method of any of clauses 1-3, wherein the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100.
[0168] Clause 5. The method of any of clauses 1-4, wherein the biological sample is a whole blood sample, a serum sample, or a plasma sample.
[0169] Clause 6. The method of any of clauses 1-5, wherein said providing the subject values, the assay values or the subject values and the assay values comprises receiving said subject values, assay values or subject values and assay values from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
[0170] Clause 7. The method of any of clauses 1-6, wherein said providing the subject values, the assay values, or subject values and the assay values comprise electronically receiving said subject values.
[0171] Clause 8. The method of any of clauses 1-7, further comprising manually or automatically inputting said subject values, assay values, or subject values and assay values into said processing system.
[0172] Clause 9. The method of any of clauses 1-8, wherein the processing system compares the machine learning score for the subject against the reference score.
[0173] Clause 10. The method of clause 9, wherein the determination of whether the IPN is likely malignant or likely not malignant is displayed on a device.
[0174] Clause 11. The method of any of clauses 1-10, wherein said subject is a human.
[0175] Clause 12. The method of any of clauses 1-11, wherein at least one of the subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof, is determined using an immunoassay.
[0176] Clause 13. The method of any of clauses 1-9, wherein at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
[0177] Clause 14. A system comprising:
[0178] a. subject values for a subject, wherein said subject values comprise:
[0179] i) a subject smoking pack years value; and
[0180] ii) a measurement of the size of the IPN in the subject;
[0181] b. one or more assays for measuring:
[0182] i) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and
[0183] ii) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
[0184] c. a device comprising a processing system, wherein the processing system comprises a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm;
[0185] wherein the at least one machine learning algorithm is configured to process the subject values and the assay values to produce a machine learning score for the subject, and
[0186] further wherein the processing system is configured to:
[0187] i) apply the at least one machine learning algorithm to the subject values and the assay values to output a machine learning score for the subject;
[0188] ii) report the machine learning score for the subject; and
[0189] iii) provide a reference score for comparison with the machine learning score,
[0190] wherein the device displays that the IPN is (1) likely malignant if the machine learning score is higher than the reference score; or (2) not likely malignant if the machine learning score is the same as or below the reference score.
[0191] Clause 15. The system of clause 14, wherein the subject smoking pack years value is the number of cigarette packs smoked per year by the subject multiplied by number of years the subject smoked.
[0192] Clause 16. The system of clause 14 or clause 15, wherein the assay values comprise a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
[0193] Clause 17. The system of any of clauses 14-16, wherein the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54,
55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100.
[0194] Clause 18. The system of clauses 14-17, wherein the biological sample is a whole blood sample, a serum sample, or a plasma sample.
[0195] Clause 19. The system of any of clauses 14-18, wherein said subject values, assay values, or subject values and assay values are received from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
[0196] Clause 20. The system of any of clauses 14-19, wherein said subject values, assay values or subject values and assay values are received electronically.
[0197] Clause 21. The system of any of clauses 14-20, wherein the subject values, assay values or subject values and assay values are manually or automatically inputted into said processing system.
[0198] Clause 22. The system of any of clauses 14-21, wherein said subject is a human.
[0199] Clause 23. The system of any of clauses 14-22, wherein the at least one subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof, is determined using an immunoassay.
[0200] Clause 24. The method of any of clauses 14-23, wherein at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
[0201] Clause 25. A method comprising the steps of:
[0202] a) providing one or more diagnostic assays configured to measure assay values comprising: (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
[0203] b) providing a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm;
[0204] wherein the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database; and
[0205] wherein the processing system is configured to:
[0206] i) apply the at least one machine learning algorithm to the assay values and subject values to output a machine learning score for the subject;
[0207] ii) report the machine learning score for the subject; and
[0208] iii) provide a reference score for comparison with the machine learning score.
[0209] Clause 26. A system comprising:
[0210] a) one or more diagnostic assays configured to measure assay values comprising: (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject;
[0211] b) a processing system comprising a computer processor and a non-transitory computer memory comprising a database and a machine learning algorithm;
[0212] wherein the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database;
[0213] wherein the processing system is configured to:
[0214] i) apply the at least one machine learning algorithm to the assay values and subject values to output a machine learning score for the subject;
[0215] ii) report the machine learning score for the subject; and
[0216] iii) provide a reference score for comparison with the machine learning score.

Claims

CLAIMS What is claimed is:
1. A method of determining whether at least one indeterminate pulmonary nodule (IPN) identified in a subject is likely malignant, the method comprising the steps of: a) providing subject values for the subject, wherein said subject values comprise at least one of the following: i) a subject smoking pack years value; ii) an identification of biological sex; iii) an identification of race; iv) an identification of type of nodule; and v) a measurement of the size of the IPN in the subject; b) providing at least two assay values, wherein said at least two assay values comprise: i) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and ii) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject; c) providing a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm, wherein the at least one machine learning algorithm is configured to process the subject values and the assay values, and further wherein the processing system is configured to: i) apply the at least one machine learning algorithm to the assay values and the subject values to output a machine learning score for the subject; ii) report the machine learning score for the subject; and iii) provide a reference score for comparison with the machine learning score; and d) determining that the IPN is likely malignant if the machine learning score is higher than the reference score and not likely malignant if the machine learning score is the same as or below the reference score.
2. The method of claim 1, wherein the subject smoking pack years value is the number of cigarette packs smoked per year by the subject multiplied by number of years the subject smoked.
3. The method of claim 1 or claim 2, wherein the method involves obtaining an assay value comprising a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
4. The method of any of claims 1-3, wherein the reference score is 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100.
5. The method of any of claims 1-4, wherein the biological sample is a whole blood sample, a serum sample, or a plasma sample.
6. The method of any of claims 1-5, wherein said providing the subject values, the assay values or the subject values and the assay values comprises receiving said subject values, assay values or subject values and assay values from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
7. The method of any of claims 1-6, wherein said providing the subject values, the assay values, or subject values and the assay values comprise electronically receiving said subject values.
8. The method of any of claims 1-7, further comprising manually or automatically inputting said subject values, assay values, or subject values and assay values into said processing system.
9. The method of any of claims 1-8, wherein the processing system compares the machine learning score for the subject against the reference score.
10. The method of claim 9, wherein the determination of whether the IPN is likely malignant or likely not malignant is displayed on a device.
11. The method of any of claims 1-10, wherein said subject is a human.
12. The method of any of claims 1-11, wherein the at least one subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof, is determined using an immunoassay.
13. The method of any of claims 1-9, wherein at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
14. A system comprising: a. subject values for a subject, wherein said subject values comprise: i) a subject smoking pack years value; and ii) a measurement of the size of the IPN in the subject; b. one or more assays for measuring: i) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and ii) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject; c. a device comprising a processing system, wherein the processing system comprises a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm; wherein the at least one machine learning algorithm is configured to process the subject values and the assay values to produce a machine learning score for the subject, and further wherein the processing system is configured to: i) apply the at least one machine learning algorithm to the subject values and the assay values to output a machine learning score for the subject; ii) report the machine learning score for the subject; and iii) provide a reference score for comparison with the machine learning score, wherein the device displays that the IPN is (1) likely malignant if the machine learning score is higher than the reference score; or (2) not likely malignant if the machine learning score is the same as or below the reference score.
15. The system of claim 14, wherein the subject smoking pack years value is the number of cigarette packs smoked per year by the subject multiplied by number of years the subject smoked.
16. The system of claim 14 or claim 15, wherein the assay values comprise a subject CA125 concentration, a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration and a lambda free light chain concentration from a biological sample obtained from the subject.
17. The system of any of claims 14-16, wherein the reference score is 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100.
18. The system of claims 14-17, wherein the biological sample is a whole blood sample, a serum sample, or a plasma sample.
19. The system of any of claims 14-18, wherein said subject values, assay values, or subject values and assay values are received from a testing lab, from said subject, from an analytical testing system, from a hand-held or point of care testing device, or any combination thereof.
20. The system of any of claims 14-19, wherein said subject values, assay values or subject values and assay values are received electronically.
21. The system of any of claims 14-20, wherein the subject values, assay values or subject values and assay values are manually or automatically inputted into said processing system.
22. The system of any of claims 14-21, wherein said subject is a human.
23. The system of any of claims 14-22, wherein at least one of the subject CA125 concentration, the subject CEA concentration, the subject HE4 concentration, the subject Cyfra 21-1 concentration, the subject NSE concentration, the subject SCC concentration, the subject ProGRP concentration, or any combination thereof, is determined using an immunoassay.
24. The system of any of claims 14-23, wherein at least one of the subject total IgG concentration, the subject IgA concentration, the subject IgM concentration, the subject IgE concentration, the subject kappa free light chain concentration, the subject lambda free light chain concentration, or any combination thereof, is determined using a clinical chemistry assay.
25. A method comprising the steps of: a) providing one or more diagnostic assays configured to measure assay values comprising: (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject; b) providing a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm; wherein the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database; and wherein the processing system is configured to: i) apply the at least one machine learning algorithm to the assay values and subject values to output a machine learning score for the subject; ii) report the machine learning score for the subject; and iii) provide a reference score for comparison with the machine learning score.
26. A system comprising: a) one or more diagnostic assays configured to measure assay values comprising: (a) at least one of a subject Cancer Antigen 125 (CA125) concentration, a subject Carcinoembryonic Antigen (CEA) concentration, a subject Human Epididymis Protein 4 (HE4) concentration, a subject cytokeratin fragment 21-1 (Cyfra 21-1) concentration, a subject Neuron-Specific Enolase (NSE) concentration, a subject Squamous Cell Carcinoma Antigen (SCC) concentration, a subject Pro Gastrin Releasing Peptide (ProGRP) concentration, or any combination thereof, from a biological sample obtained from the subject; and (b) at least one of a subject total IgG concentration, a subject IgA concentration, a subject IgM concentration, a subject IgE concentration, a kappa free light chain concentration, a lambda free light chain concentration, or any combination thereof, from a biological sample obtained from the subject; b) a processing system comprising a computer processor and a non-transitory computer memory comprising a database and at least one machine learning algorithm; wherein the at least one machine learning algorithm is configured to process the assay values and subject values for the subject comprising a subject smoking pack years value and a measurement of the size of the IPN in the subject, wherein the subject smoking pack years value and measurement of the size of the IPN in the subject are previously inputted into the database; wherein the processing system is configured to: i) apply the at least one machine learning algorithm to the assay values and subject values to output a machine learning score for the subject; ii) report the machine learning score for the subject; and iii) provide a reference score for comparison with the machine learning score.
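
The comparison step recited in claim 14 (likely malignant if the machine learning score is higher than the reference score, not likely malignant if it is the same or lower) can be illustrated with a minimal sketch. The names IpnReport and interpret_score are hypothetical and do not appear in the claims, and the machine learning algorithm that produces the score is not modeled here; the score is simply taken as an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IpnReport:
    """Hypothetical report record; field names are illustrative only."""
    ml_score: float
    reference_score: float
    interpretation: str


def interpret_score(ml_score: float, reference_score: float) -> IpnReport:
    """Compare a machine learning score with a reference score.

    Follows the rule stated in claim 14: a score higher than the
    reference is reported as likely malignant; a score equal to or
    below the reference is reported as not likely malignant.
    """
    if ml_score > reference_score:
        interpretation = "likely malignant"
    else:
        interpretation = "not likely malignant"
    return IpnReport(ml_score, reference_score, interpretation)


if __name__ == "__main__":
    # Example values are made up for illustration, not taken from the patent.
    report = interpret_score(ml_score=72.0, reference_score=50.0)
    print(report.interpretation)
```

Note that a score exactly equal to the reference falls on the "not likely malignant" side, which matches the "same as or below" wording of the claim.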
PCT/US2023/027597 2022-07-14 2023-07-13 Methods and systems for malignancy prediction of indeterminate pulmonary nodules WO2024015493A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263389077P 2022-07-14 2022-07-14
US63/389,077 2022-07-14

Publications (1)

Publication Number Publication Date
WO2024015493A1 (en) 2024-01-18

Family

ID=87554746

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/027597 WO2024015493A1 (en) 2022-07-14 2023-07-13 Methods and systems for malignancy prediction of indeterminate pulmonary nodules

Country Status (1)

Country Link
WO (1) WO2024015493A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060134713A1 (en) 2002-03-21 2006-06-22 Lifescan, Inc. Biosensor apparatus and methods of use
US20140274772A1 (en) * 2013-03-15 2014-09-18 Rush University Medical Center Biomarker panel for detecting lung cancer

Non-Patent Citations (15)

* Cited by examiner, † Cited by third party
Title
INGRID BROODMAN: "Early Detection of Lung Cancer "A Role for Serum Biomarkers?"", 1 January 2016 (2016-01-01), XP055384307, Retrieved from the Internet <URL:https://repub.eur.nl/pub/93516/161012_Broodman-Ingrid.pdf> [retrieved on 20170622] *
MAISSION P, WALKER R.: "Indeterminate Pulmonary Nodules: Risk for Having or for Developing Lung Cancer", CANCER PREV RES (PHILA), vol. 7, no. 12, 2014, pages 1173 - 1178, XP093048119, DOI: 10.1158/1940-6207.CAPR-14-0364
QUINN, F. A.: "The Immunoassay Handbook", 2005, ELSEVIER LTD.
SMITH R, ANDREWS K, BROOKS D ET AL.: "Cancer Screening in the United States, 2018: A Review of Current American Cancer Society Guidelines and Current Issues in Cancer Screening", CA CANCER J CLIN, vol. 68, no. 4, 2018, pages 297 - 316
SUNG H, FERLAY J, SIEGEL R ET AL.: "Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries", CA CANCER J CLIN, vol. 71, 2021, pages 209 - 249
TIAN L. ET AL: "Adaptive index models for marker-based risk stratification", BIOSTATISTICS, vol. 12, no. 1, 1 January 2011 (2011-01-01), GB, pages 68 - 86, XP093091898, ISSN: 1465-4644, Retrieved from the Internet <URL:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3006126/pdf/kxq047.pdf> DOI: 10.1093/biostatistics/kxq047 *
TIAN L, TIBSHIRANI R: "Adaptive index models for marker-based risk stratification", BIOSTATISTICS, vol. 12, no. 1, 2011, pages 68 - 86
WENDLER R, FONTHAM E, BARRERA E ET AL.: "American Cancer Society Lung Cancer Screening Guidelines", CA CANCER J CLIN, vol. 63, no. 2, 2013, pages 106 - 117
YANG B, JHUN BW, SHIN SH, JEONG B-H, UM S-W, ZO JI ET AL.: "Comparison of four models predicting the malignancy of pulmonary nodules: A single-center study of Korean adults", PLOS ONE, vol. 13, no. 7, 2018, pages e0201242
ZHONG L ET AL: "Using protein microarray as a diagnostic assay for non-small cell lung cancer", AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE, AMERICAN THORACIC SOCIETY, US, vol. 172, no. 10, 18 August 2005 (2005-08-18), pages 1308 - 1314, XP002516030, ISSN: 1073-449X, [retrieved on 20050818], DOI: 10.1164/RCCM.200505-830OC *

Similar Documents

Publication Publication Date Title
US20240112811A1 (en) Methods and machine learning systems for predicting the likelihood or risk of having cancer
Kyle et al. Management of monoclonal gammopathy of undetermined significance (MGUS) and smoldering multiple myeloma (SMM)
US20190257835A1 (en) Protein biomarker panels for detecting colorectal cancer and advanced adenoma
Carlsson et al. Circulating tumor microemboli diagnostics for patients with non–small-cell lung cancer
US20230314436A1 (en) Methods for the detection and treatment of lung cancer
Pan et al. Nomogram prediction for the survival of the patients with small cell lung cancer
Li et al. Driverless artificial intelligence framework for the identification of malignant pleural effusion
US20180100858A1 (en) Protein biomarker panels for detecting colorectal cancer and advanced adenoma
US20150031065A1 (en) Compositions, methods and kits for diagnosis of lung cancer
Ferreiro et al. Predictive models of malignant transudative pleural effusions
Tecles et al. Serum acute phase protein concentrations in female dogs with mammary tumors
JP7470268B2 (en) Biomarkers and methods for assessing risk of myocardial infarction and serious infections in patients with rheumatoid arthritis
US20170168058A1 (en) Compositions, methods and kits for diagnosis of lung cancer
Nishiyama et al. Human epididymis protein 4 is a new biomarker to predict the prognosis of progressive fibrosing interstitial lung disease
EP3155439A1 (en) Biomarkers and methods for measuring and monitoring axial spondyloarthritis disease activity
KR20230080442A (en) Methods for Detection and Treatment of Lung Cancer
Watine Prognostic evaluation of primary non-small cell lung carcinoma patients using biological fluid variables. A systematic review
Taheriyan et al. Prediction of COVID-19 patients’ survival by deep learning approaches
He et al. High serum lactate dehydrogenase adds prognostic value to cardiac biomarker staging system for light chain amyloidosis
WO2024015493A1 (en) Methods and systems for malignancy prediction of indeterminate pulmonary nodules
WO2021247577A1 (en) Methods and software systems to optimize and personalize the frequency of cancer screening blood tests
Jeanblanc et al. Development of exploratory algorithms to aid in risk of malignancy prediction of indeterminate pulmonary nodules
Ioannou et al. Prognostic models to predict survival in patients with pancreatic cancer: a systematic review
CN111263965A (en) System and method for improving disease diagnosis using measurement of analytes
Bu et al. Evaluation of C–reactive protein and fibrinogen in comparison to CEA and CA72–4 as diagnostic biomarkers for colorectal cancer

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23750854

Country of ref document: EP

Kind code of ref document: A1