CN101061480A - Methods and computer program products for analysis and optimization of marker candidates for cancer prognosis - Google Patents

Methods and computer program products for analysis and optimization of marker candidates for cancer prognosis

Publication number
CN101061480A
Authority
CN
China
Prior art keywords
prognosis
marker
executable portion
rule
quantifiable feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA200580039170XA
Other languages
Chinese (zh)
Inventor
Raphael Marcelpoil
Clark Merrill Whitehead
Timothy J. Fischer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TriPath Imaging Inc
Original Assignee
TriPath Imaging Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TriPath Imaging Inc
Publication of CN101061480A

Landscapes

  • Investigating Or Analysing Biological Materials (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Methods and computer program products for evaluating and optimizing one or more markers for use in establishing a prognosis for a patient suffering from a disease are provided. More particularly, the methods include steps for systematically evaluating a number of features that may be extracted from an image of a body sample, such as a histological slide, that has been exposed to one or more biomarkers so as to establish a prognostic decision rule based on one or more of the extracted features such that the decision rule yields a prognosis that is optimally predictive of actual patient outcome. Thus, the methods and computer program products provided yield optimally predictive prognoses to assist clinicians in developing strategies for effective patient care management.

Description

Methods and computer program products for analysis and optimization of marker candidates for cancer prognosis
Technical field
The present invention relates to methods for selecting, analyzing, and optimizing biomarkers that may be candidates for use in establishing the prognosis of a patient suffering from cancer.
Background art
Gene amplification, gene deletion, and gene mutation are known to play a significant role in abnormal cell behavior through abnormal protein expression. The range of cell behaviors of concern includes behaviors as diverse as, for example, the regulation of proliferation or differentiation. Effective detection and quantification in gene amplification, deletion, and mutation; mRNA quantification; or protein expression analysis is therefore necessary to facilitate useful research, diagnostic, and prognostic tools for complex diseases such as the various forms of cancer.
Numerous laboratory techniques are directed at detection and quantification in gene amplification, deletion, and mutation; mRNA quantification; or protein expression analysis. Such techniques include, for example, Western, Northern, and Southern blots; polymerase chain reaction ("PCR"); enzyme-linked immunosorbent assay ("ELISA"); and comparative genomic hybridization ("CGH"). Microscopy, however, is routinely used because it is an informative technique that allows rapid investigation at the cellular and subcellular levels and can be implemented expediently at relatively low cost.
When microscopy is the chosen laboratory technique, the biological sample must first undergo specific detection and revelation preparations. Once the sample is prepared, a human expert typically analyzes it with a microscope alone in a qualitative study, or with a microscope coupled to a camera and a computer in a quantitative and generally standardized study. In some instances the microscope may be configured for fully automatic analysis, in which case it is automated with a motorized stage and focus, motorized objective changers, automatic light intensity controls, and the like.
Preparing the sample for detection may involve different types of preparation techniques suited to microscopic imaging analysis, such as hybridization-based and immunolabeling-based techniques. Such detection techniques may be coupled with appropriate revelation techniques, such as fluorescence-based and visible color reaction-based techniques.
In situ hybridization ("ISH") and fluorescent in situ hybridization ("FISH") are detection and revelation techniques used, for example, for detection and quantification in genetic information amplification and mutation analyses. Both ISH and FISH can be applied to histological or cytological samples. These techniques use specific complementary probes to recognize corresponding precise sequences. Depending on the technique used, the specific probe may include a colorimetric (cISH) marker or a fluorescent (FISH) marker, and the samples are then analyzed using a transmission microscope or a fluorescence microscope, respectively. Whether a colorimetric or a fluorescent marker is used depends on the user's goal, each type of marker having advantages over the other in particular instances.
Imaging and microscopy techniques have been developed to optimize and standardize the reading of colorimetric markers or stains that may be used to detect and/or quantify gene amplification, gene deletion, gene mutation, and abnormal protein expression. These may become visible when a tissue section slide treated with a suitably chosen marker is analyzed to highlight abnormal cellular activity, which can aid the diagnosis and/or prognosis of a disease such as cancer.
Such methods are useful for obtaining a quantitative measurement of a target molecular species within a given tissue sample. However, if additional molecular species are highlighted by additional biomarkers within the same tissue sample, they may not be immediately discernible, and there exists a need to identify and quantify such features so that the tissue sample can be analyzed more systematically, allowing a clinician to provide a more accurate prognosis for patients suffering from complex diseases such as cancer. For example, in many types of cancer, a small percentage of patients diagnosed at an early stage still ultimately have a poor ten-year outcome, such as disease recurrence, metastasis, or death within that ten-year period. The majority of cancer patients diagnosed at an early stage, however, have a good ten-year prognosis and are unlikely to need, or benefit from, additional aggressive adjuvant therapy (for example, chemotherapy). For example, the current clinical consensus is that at least some early-stage, node-negative breast cancer patients should receive adjuvant chemotherapy, but there are presently no FDA-approved assays to risk-stratify patients for more aggressive treatment. Because the majority of these early-stage breast cancer patients enjoy long-term survival following surgery and/or radiation therapy without further treatment, it may be inappropriate to recommend aggressive adjuvant therapy for all of them, particularly in view of the significant side effects associated with cancer chemotherapy. Compositions and methods that allow these populations of early-stage breast cancer patients to be differentiated into good and poor prognosis groups at the time of initial diagnosis would help clinicians select appropriate courses of treatment. Thus, methods for evaluating the prognosis of breast cancer patients, particularly early-stage breast cancer patients, are needed.
Although current prognostic criteria and quantitative video-microscopy analysis of markers provide some guidance in predicting patient outcome and selecting an appropriate course of treatment, there remains a significant need for a systematic method that uses clinical video-microscopy data to provide an optimally specific and sensitive cancer prognosis, particularly in early-stage patients. There is also a need for methods of identifying and evaluating candidate markers, and the features of those markers identified through video-microscopy, to aid in assessing cancer prognosis.
Summary of the invention
Methods and computer program products are provided for analyzing and/or evaluating at least one marker suited to determining the prognosis of a cancer patient. The method for analyzing at least one marker to determine the prognosis of a cancer patient comprises the steps of: exposing a body sample (taken from the cancer patient) to the at least one marker; extracting at least one quantifiable feature from an image taken of at least one slide using an image processing system, where the at least one slide is prepared from the body sample; and applying a decision rule to the at least one quantifiable feature, so as to determine the prognosis of the cancer patient based on a relationship between the at least one quantifiable feature and the decision rule. In some embodiments of the method for analyzing the at least one marker, the applying step further comprises applying a threshold to the at least one quantifiable feature, so as to determine the prognosis of the cancer patient based on the relationship between the at least one quantifiable feature and the threshold. In another embodiment of the method for analyzing the at least one marker, the applying step further comprises applying a disease rule for the threshold; the disease rule establishes either a good prognosis or a poor prognosis according to the value of the at least one quantifiable feature in relation to the threshold.
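By way of illustration only, the application of a decision rule made up of a threshold and a disease rule might be sketched as follows. The class name `DecisionRule`, the field `poor_if_above`, and the numeric values are hypothetical and do not appear in the specification; the sketch assumes a single scalar quantifiable feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRule:
    """Hypothetical decision rule: a threshold plus a disease rule."""
    threshold: float
    # Disease rule (illustrative encoding): True means that feature values
    # above the threshold are read as a poor prognosis.
    poor_if_above: bool

    def prognosis(self, feature_value: float) -> str:
        above = feature_value > self.threshold
        return "poor" if above == self.poor_if_above else "good"


# Illustrative values only; no threshold is taken from the specification.
rule = DecisionRule(threshold=0.35, poor_if_above=True)
print(rule.prognosis(0.50))  # poor
print(rule.prognosis(0.10))  # good
```

Encoding the disease rule as a direction flag lets the same threshold serve both markers whose overexpression indicates a bad outcome and markers whose overexpression indicates a good outcome.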
The method for evaluating at least one marker comprises the step of exposing a plurality of body samples to the at least one marker, the plurality of body samples being taken from a corresponding plurality of patients, each of whom has a known outcome. The method further comprises extracting at least one quantifiable feature from an image taken of each of a plurality of slides using an image processing system. The plurality of slides may be prepared from the plurality of body samples corresponding to each patient. In addition, the method comprises the steps of: applying a plurality of candidate decision rules to the at least one quantifiable feature of each of the plurality of slides, so as to provide a corresponding candidate prognosis for each slide; and selecting an optimal decision rule for the at least one quantifiable feature, the optimal decision rule being selected from the candidate decision rules. The optimal decision rule is the one for which the candidate prognosis for each of the plurality of slides best corresponds to the known outcome for each of the plurality of patients. For example, the optimal decision rule may be selected by determining the specificity and sensitivity of each candidate decision rule and choosing the rule whose specificity and sensitivity pair lies closest to the optimum of (1, 1).
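The selection criterion described above might be sketched as follows. This is a minimal sketch under stated assumptions: a poor outcome is treated as the positive class, closeness to (1, 1) is measured by Euclidean distance, and the function names, rule names, and data are hypothetical rather than taken from the specification.

```python
import math


def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity); 'poor' is treated as positive."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == "poor" and a == "poor" for p, a in pairs)
    fn = sum(p == "good" and a == "poor" for p, a in pairs)
    tn = sum(p == "good" and a == "good" for p, a in pairs)
    fp = sum(p == "poor" and a == "good" for p, a in pairs)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def select_best_rule(candidate_prognoses, actual):
    """Pick the rule whose (sensitivity, specificity) is nearest (1, 1).

    candidate_prognoses maps a rule name to its list of candidate prognoses.
    Euclidean distance is an assumption of this sketch.
    """
    def distance(name):
        sens, spec = sensitivity_specificity(candidate_prognoses[name], actual)
        return math.hypot(1.0 - sens, 1.0 - spec)

    return min(candidate_prognoses, key=distance)


# Made-up outcomes and candidate prognoses for four patients.
actual = ["poor", "poor", "good", "good"]
candidates = {
    "rule_a": ["poor", "good", "good", "good"],
    "rule_b": ["poor", "poor", "poor", "good"],
    "rule_c": ["poor", "poor", "good", "good"],
}
print(select_best_rule(candidates, actual))  # rule_c
```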
Some embodiments of the methods and computer program products of the invention include the step of evaluating the statistical independence of the at least one marker, so as to ensure that the at least one marker can provide a prognosis that is substantially statistically independent of at least one complementary marker. More particularly, in certain embodiments the evaluating step may comprise: first, comparing a frequency distribution of observed outcomes to a frequency distribution of theoretical prognoses for a first plurality of body samples exposed to the at least one marker and the at least one complementary marker, the first plurality of body samples corresponding to patients having a known good outcome; second, comparing a frequency distribution of observed outcomes to a frequency distribution of theoretical prognoses for a second plurality of body samples exposed to the at least one marker and the at least one complementary marker, the second plurality of body samples corresponding to patients having a known poor outcome; and finally, assessing the independence of the at least one marker with respect to the at least one complementary marker (in some cases, using a chi-square analysis).
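As an illustration of the chi-square analysis mentioned above, one possible reading is sketched below: within one outcome group, the prognoses given by the marker and by the complementary marker are tabulated in a 2x2 table, the theoretical frequencies are those expected if the two markers were independent, and Pearson's chi-square statistic compares the two. The table layout, function name, and counts are assumptions of this sketch; the specification does not fix them.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of observed counts.

    table[i][j] counts samples for which the marker gave prognosis i and
    the complementary marker gave prognosis j (hypothetical layout).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Theoretical frequency expected under independence.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                statistic += (table[i][j] - expected) ** 2 / expected
    return statistic


# With one degree of freedom, the 5% critical value is about 3.841.
CRITICAL_VALUE = 3.841

# Made-up counts for one outcome group (for example, known good outcomes).
good_outcome_table = [[10, 10], [10, 10]]
stat = chi_square_2x2(good_outcome_table)
print(stat, "independent" if stat < CRITICAL_VALUE else "dependent")
```

The same calculation would be repeated for the group of patients with known poor outcomes before drawing a conclusion about independence.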
According to some embodiments, the applying step of the evaluation method may further comprise applying a plurality of candidate thresholds to each quantifiable feature, so as to generate, for each of the plurality of body samples, a plurality of candidate prognoses corresponding to each of the candidate thresholds. In addition, the selecting step may further comprise selecting an optimal threshold from the plurality of candidate thresholds, such that the candidate prognosis for each of the plurality of slides best corresponds to the known outcome of each of the plurality of patients. Such an optimal threshold may, for example, give a computerized image processing system a tool for classifying a given value determined for a particular quantifiable feature of a marker after the marker has been applied to a body sample (such as a histological slide). Once classified as above or below the optimal threshold, the given value may then be converted into the result of applying the decision rule, which in turn is used to establish a prognosis for the patient from whom the body sample was taken.
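A threshold sweep of this kind might look like the following minimal sketch. It assumes one scalar feature value per sample, tries both directions of the disease rule at each candidate threshold, and ranks candidates by Euclidean distance to (1, 1) in sensitivity and specificity; the function name, the ranking criterion, and the data are hypothetical.

```python
import math


def sweep_thresholds(values, outcomes, candidate_thresholds):
    """Return (threshold, poor_if_above, sensitivity, specificity) for the
    candidate closest to (1, 1). Requires at least one candidate threshold."""
    best = None
    for t in candidate_thresholds:
        for poor_if_above in (True, False):  # both disease rules
            tp = fn = tn = fp = 0
            for v, outcome in zip(values, outcomes):
                predicted_poor = (v > t) == poor_if_above
                if outcome == "poor":
                    tp += predicted_poor
                    fn += not predicted_poor
                else:
                    fp += predicted_poor
                    tn += not predicted_poor
            sens = tp / (tp + fn) if tp + fn else 0.0
            spec = tn / (tn + fp) if tn + fp else 0.0
            dist = math.hypot(1.0 - sens, 1.0 - spec)
            if best is None or dist < best[0]:
                best = (dist, t, poor_if_above, sens, spec)
    if best is None:
        raise ValueError("no candidate thresholds supplied")
    return best[1:]


# Made-up feature values and known outcomes for four patients.
values = [0.1, 0.2, 0.6, 0.8]
outcomes = ["good", "good", "poor", "poor"]
print(sweep_thresholds(values, outcomes, [0.15, 0.4, 0.7]))
# (0.4, True, 1.0, 1.0)
```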
In other embodiments, the applying step may further comprise determining a disease rule for each of the plurality of candidate thresholds, the disease rule establishing either a good prognosis or a poor prognosis according to the value of the at least one quantifiable feature in relation to each of the candidate thresholds.
According to various embodiments of the invention, the method may comprise exposing a plurality of body samples to at least one marker, where the marker may be selected from: colorimetric biomarkers, SLPI, PSMB9, NDRG-1, Muc-1, phospho-p27, src, E2F1, p21ras, p53, and combinations thereof. In addition, in some embodiments, the method may comprise extracting at least one quantifiable feature from the image taken of each of the plurality of slides, where the quantifiable feature is detectable and quantifiable by the image processing system. Such quantifiable features may include: transmittance; optical density; cell morphology; percentage of cell types; and combinations thereof.
The method steps outlined above may also be embodied in one or more suitable computer program products executable on a computer device (such as a computer device in communication with a microscopy system and/or image analysis system suited to capturing images of stained histological slides) and capable of accomplishing the various functions associated with the method embodiments described above. For example, according to one embodiment, a computer program product is provided that is capable of controlling an image processing system to determine the prognosis of a cancer patient, the computer program product comprising: (1) an executable portion for extracting a feature from an image taken of each of a plurality of slides using the image processing system, the plurality of slides being prepared from a plurality of body samples taken from a plurality of patients, where each patient has a known outcome and the plurality of body samples have been exposed to at least one marker; (2) an executable portion for applying a plurality of candidate decision rules to the feature of each of the plurality of slides, so as to provide a candidate prognosis for each possible combination of candidate decision rule and feature; and (3) an executable portion for selecting an optimal decision rule corresponding to an optimal prognosis, the optimal decision rule being selected from the plurality of candidate decision rules such that, for the feature, the optimal prognosis for each of the plurality of slides best corresponds to the known outcome for each of the patients.
Thus, based on the known outcomes of a plurality of patients, the optimal decision rule can provide a prognosis grounded in a comprehensive analysis of at least one marker having at least one quantifiable feature, such that the prognosis yields the minimum number of false positive and false negative prognoses when compared with the patients' known outcomes. Once selected, the optimal decision rule may therefore be used to optimize the analysis of one or more colorimetric markers having one or more features that can be quantified (by analysis in an image processing system, for example), so as to provide patient prognoses that more accurately predict good or bad outcomes. The methods and computer program products of the invention may thus allow clinicians to make better use of a given marker (or set of markers) to predict the occurrence of a bad outcome even in patients presenting only the early manifestations of a particular disease.
Brief description of the drawings
Having thus described the invention in general terms, reference will now be made to the accompanying drawings, which are not necessarily drawn to scale, and in which:
Fig. 1 is a block diagram of a method and computer program product for evaluating at least one marker according to one embodiment of the invention;
Fig. 2 is a graphical representation of the four possible quadrants into which a candidate prognosis may fall when compared with the corresponding actual outcome; the quadrants depicted may be used to generate the sensitivity and specificity pair for the candidate prognosis;
Fig. 3 shows an example of an ROC curve plotting sensitivity and specificity pairs according to one embodiment of the invention; the pairs may be used to select the optimal combination of marker features and/or thresholds, so that both the sensitivity and the specificity of the prognosis established by a marker or combination of markers are maximized;
Fig. 4 is a block diagram of a method and computer program product for evaluating at least one marker and assessing the independence of the at least one marker with respect to at least one complementary marker, according to one embodiment of the invention; and
Fig. 5 is a visual representation of determining the optimal threshold for a given feature in a single-marker analysis by plotting the distribution of good and bad outcomes along a scale of candidate thresholds.
Detailed description of embodiments
The present invention provides methods for evaluating and optimizing marker candidates for use in establishing the prognosis of a cancer patient. Although the markers described below (and their particular features) are especially useful for establishing the prognosis of breast cancer patients (and more particularly early-stage breast cancer patients), the methods disclosed herein may be used to evaluate and optimize marker candidates for establishing the prognosis of a patient suffering from any disease that can be linked (through, for example, clinical data) to the overexpression of a particular protein or other target molecule amenable to staining by, for example, a colorimetric biomarker (marker). Thus, one skilled in the art will appreciate that the methods disclosed herein are applicable to the analysis and optimization of markers used in establishing the prognosis of patients who have other forms of cancer or other diseases linked to the expression of proteins or target molecules that can be marked and subsequently analyzed by microscopy.
The methods disclosed herein also apply to evaluating markers that may be useful in predicting a breast cancer patient's response to a selected treatment. "Predicting a breast cancer patient's response to a selected treatment" is intended to mean assessing the likelihood that the patient will experience a positive or negative outcome with a particular treatment. As used herein, "indicative of a positive treatment outcome" refers to an increased likelihood that the patient will experience beneficial results from the selected treatment (for example, complete or partial remission, reduced tumor size, and so on). "Indicative of a negative treatment outcome" means an increased likelihood that the patient will not benefit from the selected treatment with respect to the progression of the underlying breast cancer. In some aspects of the invention, the selected treatment is chemotherapy.
The methods disclosed herein also apply to evaluating and/or optimizing markers useful in identifying or diagnosing cancer, particularly breast cancer. "Diagnosing breast cancer" is intended to include, for example, diagnosing or detecting the presence of breast cancer, monitoring the progression of the disease, and identifying or detecting cells or samples indicative of breast cancer. The terms diagnosing, detecting, and identifying cancer are used interchangeably herein. In particular embodiments, the methods of the invention may aid the detection of early-stage breast cancer by optimizing the marker and/or combination of markers that is most effective in diagnosing breast cancer or another disease that can be characterized and/or diagnosed by detecting a given marker, because that marker is either overexpressed or exhibits a loss of expression in a body sample (such as a stained histological or cytological slide).
The methods described herein relate to applying a plurality of thresholds to selected features of a given marker (a biomarker or colorimetric biomarker) whose overexpression may indicate either a good or a bad outcome for a given patient. One skilled in the art will appreciate that the methods of the invention may also be applied to markers that exhibit a loss of expression, such as melastatin, which shows a loss of expression in melanoma. In addition, the methods of the invention permit patients who are more likely to experience disease recurrence (that is, a poor prognosis) to be differentiated from those who are more likely to remain cancer-free (that is, a good prognosis), based on the systematic analysis of quantifiable features (and of a plurality of thresholds applied to them) that can be highlighted by colorimetric analysis of tissue samples (such as prepared histological slides) exposed to one or more biomarkers. More particularly, the methods of the invention involve a systematic process of evaluating features of a given tissue sample exposed to a marker (such as a colorimetric biomarker) and selecting the optimal threshold for each feature, so that the marker can be analyzed in terms of the features and their corresponding optimal thresholds and the marker/threshold combination yields the most accurate prognosis when compared with known actual patient outcomes. Thus, the methods of the invention may also be used to select the optimal combination of markers, their features, and the threshold for each particular feature, thereby providing a more accurate prognosis for early-stage cancer patients.
Biomarkers evaluated by the invention include genes and proteins. Such biomarkers include DNA comprising the entire or a partial sequence of the nucleic acid sequence encoding the biomarker, or the complement of such a sequence. Biomarker nucleic acids also include RNA comprising the entire or a partial sequence of any of the nucleic acid sequences of interest. A biomarker protein is a protein encoded by or corresponding to a DNA biomarker of the invention. A biomarker protein comprises the entire or a partial amino acid sequence of any of the biomarker proteins or polypeptides.
A "biomarker" is any gene or protein whose level of expression in a tissue or cell is altered compared with that of a normal or healthy cell or tissue. According to one embodiment of the invention, biomarkers are genes and proteins whose overexpression correlates with cancer prognosis and, in the examples presented here specifically, with breast cancer prognosis. In some cases, selective overexpression of a biomarker or combination of biomarkers of interest in a patient sample is indicative of a poor cancer prognosis. "Indicative of a poor prognosis" means that overexpression of the particular biomarker is associated with an increased likelihood of relapse or recurrence of the underlying cancer or tumor, metastasis, or death within less than five years. Biomarkers indicative of a poor prognosis may be referred to herein as "bad outcome biomarkers." In other aspects of the invention, selective overexpression of a biomarker or combination of biomarkers of interest is indicative of a good prognosis. As used herein, "indicative of a good prognosis" refers to an increased likelihood that the patient will remain cancer-free for at least five years. Such biomarkers may be referred to as "good outcome biomarkers."
Biomarkers that can be evaluated by the methods of the invention include any gene or protein whose overexpression correlates with cancer prognosis, as described above. Biomarkers include genes and proteins indicative of a poor cancer prognosis (bad outcome biomarkers) as well as those indicative of a good prognosis (good outcome biomarkers). Biomarkers of particular interest include genes and proteins involved in the regulation of cell growth and proliferation, cell cycle control, DNA replication and transcription, apoptosis, signal transduction, angiogenesis/lymphangiogenesis, or metastasis. In some embodiments, the biomarkers regulate protease systems involved in tissue remodeling, extracellular matrix degradation, and invasion of adjacent tissue. Although any biomarker whose overexpression is indicative of cancer prognosis may be analyzed and/or used in the methods of the invention, in particular embodiments evaluating breast cancer prognosis the biomarkers are selected from the group consisting of SLPI, p21ras, MUC-1, DARPP-32, phospho-p27, src, MGC 14832, myc, TGF β-3, SERHL, E2F1, PDGFR α, NDRG-1, MCM2, PSMB9, MCM6, and p53. More preferably, the biomarkers of interest in establishing breast cancer prognosis comprise SLPI, PSMB9, NDRG-1, Muc-1, phospho-p27, src, E2F1, p21ras, or p53. In one aspect of the invention, as illustrated in the experimental examples included herein, the method for evaluating breast cancer prognosis comprises detecting the overexpression of E2F1 and of at least one other biomarker selected from the group consisting of SLPI, src, phospho-p27, p21ras, and PSMB9.
The term "feature" as discussed herein refers to a perceptible and/or quantifiable change produced in a body sample by exposure to a given marker and/or biomarker. Features may include changes in transmittance or optical density values produced by the staining properties of colorimetric markers (including the markers discussed above), which can be detected using, for example, microscopy techniques and image processing systems. Such microscopy techniques and/or image processing systems are used to provide an image of the biological sample after it has been stained to visually indicate the presence of a particular biomarker of interest (and thus the presence of a corresponding particular protein and/or target molecule of interest). Some of these methods and associated systems, such as those disclosed in U.S. Patent Application 09/957,446 to Marcelpoil et al. (the '446 application) and U.S. Patent Application 10/057,729 to Marcelpoil et al. (the '729 application), incorporated herein by reference, disclose the use of image processing systems, methods, and associated computer program products to determine the relative amount of each molecular species present in a given image based on the presence of representative color dye markers, as indicated by the optical density or transmittance values of those dye markers, as determined by the imaging system and associated software. These techniques can also provide quantitative determinations of the relative amount of each target molecule or protein whose overexpression is revealed by a colorimetric biomarker applied to a tissue sample slide. For example, the expression of a feature of a given marker may be revealed using a digital image of the marked tissue sample slide, in which the marker is separated from background stains and/or other markers using chromogen separation into its component red, green, and blue (RGB) color parts, so that the relative contribution of the marker (relative to the background stain and/or staining from other markers) can be determined within a cell or region of interest in the body sample taken from the patient.
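As a point of reference for the transmittance and optical density values mentioned above, the standard relationship OD = -log10(I / I0) can be computed per RGB channel as in the sketch below. This is not the chromogen separation method of the '446 or '729 applications, whose details are not reproduced here; the function names, the 8-bit blank level of 255, and the clipping of dark pixels are assumptions made for illustration.

```python
import math


def optical_density(intensity, blank=255.0):
    """Optical density from transmitted intensity: OD = -log10(I / I0).

    `blank` is the intensity measured with no stain in the light path
    (assumed here to be the 8-bit maximum). Intensities below 1 are
    clipped to avoid taking the logarithm of zero.
    """
    clipped = max(float(intensity), 1.0)
    return -math.log10(clipped / blank)


def rgb_optical_density(pixel, blank=(255.0, 255.0, 255.0)):
    """Per-channel optical density of one (R, G, B) pixel."""
    return tuple(optical_density(c, b) for c, b in zip(pixel, blank))


# A made-up pixel value, for illustration only.
print(rgb_optical_density((255, 128, 64)))
```

Because optical density is additive across overlapping absorbing dyes under the Beer-Lambert law, per-channel OD values are a common starting point for estimating the contribution of each stain.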
According to various embodiments of the invention, a variety of features (both quantifiable and non-quantifiable) may be extracted from an image of a marked tissue sample (such as a prepared histological slide stained with a colorimetric biomarker) using an image processing system capable of capturing an image of a region of interest (ROI), various fields of view (FOV), or the entire histological slide, and of determining the morphological boundaries defined within it, such as the various regions of a cell, including the nucleus, cytoplasm, and cell membrane. This image processing step of determining morphological boundaries within a slide and/or body sample is known as segmentation. According to various embodiments, a region of interest (ROI) may span the entire slide, a portion of the slide, a discrete selected portion of the slide, and/or an entire FOV. Accurate segmentation of morphological boundaries (through microscopy and/or image analysis) is required for determining many features, because different biomarker types exhibit different subcellular localizations within the cells of a given body sample. For example, some biomarkers reveal overexpression of a target molecule only within the nucleus of a cell. Other markers may reveal overexpression of a target molecule within the cytoplasm or the cell membrane. For example, Table 1 lists some markers used in establishing a prognosis and/or diagnosis for breast cancer together with their corresponding regions of subcellular localization.
As described in the appendix of exemplary features, certain cell descriptor features, such as CELL, CYTO, MEMB, and NUCL (referring to the cell, cytoplasm, cell membrane, and nucleus, respectively), serve as location identifiers within the cells of a body sample, where the features exhibited by a particular marker may be detected and/or quantified using, for example, chromogen separation of a dye or stain.
Also shown in the appendix are a number of other exemplary features of various biomarkers that may be extracted, examined, and/or quantified by the methods of the present invention so as to optimize the prognostic value of a given biomarker or combination of biomarkers. The features are generally categorized as follows: shape descriptor features; texture and/or histogram descriptor features (which refer mainly to statistical determinations of the amount and variation of overexpression of a target molecule that may be highlighted by a particular biomarker); spectral descriptor features (such as the transmittance or optical density (OD) of the various chromogenic biomarkers and/or counterstains that may be used to reveal overexpression of a target molecule); hierarchy descriptor features (which are used to compute quantifiable features relative to the different hierarchical objects captured by the imaging system); and cell descriptor features (including CELL, CYTO, MEMB, and NUCL, as described above and detailed in the appendix of exemplary features). The list of features described briefly below, and in more detail in the attached appendix of exemplary features, is not meant to be exhaustive and serves only as an example. The methods of the present invention may utilize a variety of quantifiable features (and various combinations thereof) so as to optimize the prognostic value of a given marker or combination of markers. According to computer program product embodiments of the present invention, the features described herein may be detected in an automated manner by, for example, a controller (such as a computer device) configured to control an image processing system that is capable of: marking regions of interest (ROI); segmenting the various compartments and components of a cell or tissue sample; and/or decomposing stains or dyes into their RGB parts so as to determine transmittance, luminance, optical density (OD), and/or other spectral features.
In some embodiments of the present invention, the features above and others may be combined to create summary features comprising several types of underlying features, so as to create quantifiable features that may be useful for providing a diagnosis and/or prognosis for a given patient. To build such a summary feature, other more specific features may be quantified and examined; the resulting summary feature may, in some cases, be more meaningful to a clinician seeking prognostic and/or diagnostic value from the features highlighted by a biomarker and/or set of biomarkers. For example, in the experimental examples described herein, the features utilized include the numerical percentages of the various grades of cancer cells considered to be present in a given collection of cells (which may be highlighted) within a particular region of interest (ROI) identified in a body sample (such as a histological slide). One skilled in the art will appreciate that a pathologist viewing cells stained with a marker under a microscope may, for example, "grade" them by determining the degree to which the marker is present in a region of interest (ROI) (such as an area of the histological slide that appears to stain more darkly than the surrounding areas). Although visual grading by a pathologist helps to determine the relative level of marker present in the cells, such grading is quite subjective and may vary from clinician to clinician and across contexts. Thus, in building the summary features of the present invention, suspect cancer cells may be graded more objectively as, for example, 0 (indicating a complete absence of the marker in the target cell compartment), 1 (indicating a small amount of the marker in the target cell compartment), 2 (indicating a medium level of the marker in the target cell compartment), or 3 (indicating a high level of the marker in the target cell compartment). Such grading may be accomplished in an automated manner using a video-microscopy system and/or image processing system, such as those disclosed in the '446 application and the '729 application.
As summarized in Table 2 below, according to one example of the present invention, the features denoted by NUCL, CYTO, MEMB, DYE2, OD, and MEAN may be combined to produce optical transmittance values, and these values may be divided into ranges to determine the level of a given chromogenic biomarker (or, in some instances, of its chromatic component, denoted for example by "DYE2") within a given cell. The same dye may be used to render the given biomarkers chromogenic (for example, a commonly used dye stain such as DAB, or others well known to one skilled in the art); however, the different markers evaluated by the present invention may reveal the presence of target molecules in various cell compartments (such as the nucleus, cell membrane, and/or cytoplasm). Exemplary thresholds (corresponding to transmittance values) are shown in Table 2, so that in this case each observed cell may be assigned to one of the following categories: 0, 1, 2, or 3. An evaluation of the expected number of category 0 cells (i.e., non-stained cells that do not exhibit overexpression of the target molecule when exposed to the marker) may be performed using the image processing system and/or microscopy. In this particular example, the approximate number of 0 (non-stained) cells may also be computed from the average tumor cell area (for example, 1100 pixels, as estimated by the feature called CELL AREA (see the appendix of exemplary features)) obtained through computation of the 1, 2, and 3 cell areas, as follows:
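$$N_1 = N_{\text{Neg Ref}} \tag{1}$$
$$N_2 = N_{\text{Test}} \tag{2}$$
$$N_3 = N_{\text{Pos Ref}} \tag{3}$$
$$N_{\text{Total}} = \max\left(N_1 + N_2 + N_3,\ \frac{\text{FOCUS\_AREA}}{1100}\right) \tag{4}$$
$$N_0 = \max\left(0,\ N_{\text{Total}} - N_1 - N_2 - N_3\right) \tag{5}$$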
In other embodiments, the number of cells may be computed using methods other than determining cell area (for example, by counting nuclei in an FOV stained with a nuclear-localized marker). Once the numbers of 0, 1, 2, and 3 cell types ($N_0$, $N_1$, $N_2$, and $N_3$, respectively) have been determined (using, for example, the various thresholds given in Table 2), the percentages of 0, 1, 2, and 3 cells may be computed. Table 3 presents the names of these new summary features, which use the prefix CELL_PERCENT followed by a numeric identifier for the cell type that the given percentage reflects. These exemplary summary features may be computed as simple percentages. For example, CELL_PERCENT_0 may be computed as follows:
$$\text{CELL\_PERCENT\_0} = \frac{N_0}{N_{\text{Total}}} \times 100 \tag{6}$$
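Equations (1) through (6) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `cell_percentages` and its argument names are hypothetical, the 1100-pixel average cell area is the example value quoted in the text, and the assumption that the other CELL_PERCENT features are computed the same way as CELL_PERCENT_0 follows from the statement that they are simple percentages.

```python
def cell_percentages(n_neg_ref, n_test, n_pos_ref, focus_area, mean_cell_area=1100):
    """Return CELL_PERCENT_0..3 for one region, following equations (1)-(6).

    All names are illustrative; mean_cell_area is the example CELL AREA value (pixels).
    """
    n1, n2, n3 = n_neg_ref, n_test, n_pos_ref                  # equations (1)-(3)
    n_total = max(n1 + n2 + n3, focus_area / mean_cell_area)   # equation (4)
    n0 = max(0, n_total - n1 - n2 - n3)                        # equation (5)
    if n_total == 0:
        raise ValueError("no cells counted and empty focus area")
    # Equation (6), applied to each cell type in turn.
    return {
        f"CELL_PERCENT_{grade}": 100.0 * count / n_total
        for grade, count in enumerate((n0, n1, n2, n3))
    }


if __name__ == "__main__":
    # 60 graded cells in an area large enough for about 100 cells.
    print(cell_percentages(10, 20, 30, focus_area=110_000))
```

When the focus area implies fewer cells than were actually graded, equation (4) falls back to the graded count and the number of type 0 cells becomes zero.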
Although the CELL_PERCENT summary features described above are used in the experimental examples described herein, any number of possible quantifiable features may be evaluated as part of the method and computer program product embodiments of the present invention. For example, one or more of the chromatic features disclosed in the appendix of exemplary features (relating to the analysis of a stained histological slide using, for example, an image analysis system) may be combined to form another type of summary feature, or each feature described in the appendix may be used and analyzed independently.
The various features and summary features described above are applicable to the analysis of one or more markers that may be used to stain a body sample (or a slide prepared therefrom, such as a histological slide) so as to establish (or aid in establishing) a prognosis for a cancer patient (such as an early-stage breast cancer patient). According to embodiments of the present invention, various combinations of the markers and their features may be evaluated so as to establish an optimal combination of features, feature thresholds (such as a given CELL_PERCENT of type 2 cancer cells within a given region of interest (ROI)), and marker types, such that the sensitivity and specificity of a given marker or marker combination may be optimized. In addition, other types of patient-based characteristics may be combined with the features disclosed herein, such as (but not limited to): patient age; patient medical history; and other factors indicative of a possible prognosis and/or diagnosis for a cancer patient. For example, lymph node involvement, tumor size, histologic grade, estrogen and progesterone receptor levels, Her2/neu status, tumor ploidy, and family history may all be prognostic and/or diagnostic factors that aid in establishing a prognosis for an early-stage breast cancer patient.
Using the methods and computer program products of the present invention, features, thresholds, and marker combinations may be analyzed and evaluated efficiently and systematically to determine the best specificity and sensitivity in establishing a prognosis for any given cancer patient. In the methods and computer program products of the present invention, the endpoint used to evaluate specificity and sensitivity is a comparison of the prognosis (i.e., the outcome predicted using a particular candidate marker and/or corresponding candidate feature) with the actual clinical outcome (i.e., whether the patient remained cancer-free or suffered a recurrence within five years). As shown in Fig. 2, the candidate prognoses produced by a plurality of candidate feature/threshold combinations may be plotted in the four-quadrant matrix shown, based on the known outcomes of the body samples used in the methods of the present invention, to determine the numbers of true positive 210, true negative 240, false positive 220, and false negative 230 prognoses produced by a given marker/feature (and/or decision rule) combination, as described in more detail below. After the relative numbers of true positive 210, true negative 240, false positive 220, and false negative 230 prognoses have been computed, a sensitivity and specificity pair may be computed to evaluate the effectiveness of the marker/feature/decision rule combination as a prognostic tool (as described in more detail below).
As used herein, "specificity" refers to the level at which a method of the present invention can accurately identify true negatives. In a clinical study, specificity is calculated by dividing the number of true negatives by the sum of true negatives and false positives (as determined by plotting the candidate prognoses in the quadrants of Fig. 2). "Sensitivity" refers to the level at which a method of the present invention can accurately identify samples that are true positives. In a clinical study, sensitivity is calculated by dividing the number of true positives by the sum of true positives and false negatives (also determined by plotting the candidate prognoses in the quadrants of Fig. 2). In some embodiments, the sensitivity of a given combination of markers, features, and thresholds revealed by the disclosed methods is at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more. In addition, the specificity attainable by the evaluation methods is preferably at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more.
As used herein, the definitions of "true" and "false" positives and negatives depend on whether the marker or marker combination under consideration is a good outcome marker or a bad outcome marker. That is, in the case of good outcome markers (i.e., those indicative of a good prognosis), "true positive" refers to those samples that exhibit overexpression of the biomarker of interest, as determined by the methods of the present invention (e.g., positive staining by immunohistochemistry), and that have a confirmed good actual clinical outcome. In contrast, "false positives" exhibit overexpression of the good outcome biomarker but have a confirmed bad actual clinical outcome. "True negatives" and "false negatives" with respect to good outcome markers do not exhibit marker overexpression (e.g., they do not stain positive in immunohistochemistry methods) and have confirmed bad and good actual clinical outcomes, respectively.
Similarly, in the case of bad outcome markers, "true positive" refers to those samples that exhibit overexpression of the marker or marker combination of interest and that have a confirmed bad actual clinical outcome. In short, "true positive" with respect to both good and bad outcome biomarkers refers to samples in which the actual clinical outcome (i.e., good or bad) is accurately predicted. "False positives" exhibit overexpression of the bad outcome biomarker but have a confirmed good actual clinical outcome. "True negatives" and "false negatives" with respect to bad outcome biomarkers do not exhibit biomarker overexpression and have confirmed good and bad actual clinical outcomes, respectively. The methods and computer program products of the present invention systematically compare the prognoses produced using a plurality of markers, features of the markers, and thresholds for given features with the actual clinical outcomes, so as to determine which combination of markers, features, and thresholds is most likely to provide the most accurate prognosis, as defined by the actual clinical outcome.
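The dependence of these four labels on the marker type can be made concrete with a short Python sketch. The function `classify` and its arguments are hypothetical names chosen for illustration; the logic simply restates the definitions above for good outcome and bad outcome markers.

```python
def classify(overexpressed: bool, good_outcome: bool, marker_kind: str) -> str:
    """Label one sample as a true/false positive/negative.

    marker_kind is "good" for a good outcome marker and "bad" for a bad
    outcome marker (illustrative encoding, not taken from the source).
    """
    if marker_kind not in ("good", "bad"):
        raise ValueError("marker_kind must be 'good' or 'bad'")
    # Overexpression is the "positive" call; the outcome it is expected to
    # accompany depends on the kind of marker.
    outcome_expected_when_positive = (marker_kind == "good")
    if overexpressed:
        correct = good_outcome == outcome_expected_when_positive
        return "true positive" if correct else "false positive"
    correct = good_outcome != outcome_expected_when_positive
    return "true negative" if correct else "false negative"


if __name__ == "__main__":
    print(classify(overexpressed=True, good_outcome=True, marker_kind="good"))
    print(classify(overexpressed=True, good_outcome=True, marker_kind="bad"))
```

The same staining result therefore yields opposite labels for the two marker types whenever the clinical outcome is held fixed.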
Fig. 1 shows a schematic flow diagram of a method, according to one embodiment of the present invention, for evaluating at least one marker that may be used to determine a prognosis for a cancer patient. Step 110 represents an exposing step, which includes exposing a plurality of body samples to a marker (or, in some cases, a plurality of markers). The body samples are taken, for example, from a corresponding plurality of patients, each of whom has a known clinical outcome. As described in more detail above, the markers may include various chromogenic biomarkers that can be used to detect the various types of target molecules (e.g., proteins) that may be overexpressed in a given cell. The body samples may include biopsy samples taken from patients having the disease for which the methods of the present invention are being used to evaluate the marker.
Step 120 represents the next step of the method according to the present invention, which includes extracting at least one quantifiable feature from an image taken of each of a plurality of slides using an image processing system, wherein the slides are prepared from the plurality of body samples corresponding to each patient having a known outcome. The slides may hold sequential cross-sections of a biopsy core sample or other tissue sample and may be exposed to one or more of the markers under evaluation as part of the methods of the present invention. The slides may include histological slides that have been dyed and/or stained to facilitate the extraction of quantifiable features (such as perceivable variations in color, shade, luminance, transmittance (TRANS), optical density (OD), or other features described in more detail above). For example, the slides may be treated with a stain to highlight the marker (or plurality of markers) to which the body sample has been exposed. In addition, the slides may be treated with a counterstain whose color and/or staining properties tend to accentuate the staining of the one or more markers of interest. One skilled in the art will appreciate that such chromogenic stains may include DAB (which tends to stain the marker brown), and that the counterstain may include hematoxylin (which tends to stain the normal morphology of the cells blue). In addition, any of the stained slides may be analyzed using, for example, the chromogen separation techniques disclosed in the '446 application and the '729 application.
As described above, the extracting step may involve extracting features from a video-microscopy image of a slide using, for example, an image analysis system and an associated controller (such as a computer device) configured to analyze a given image (such as an entire slide, a camera field of view (FOV), or a selected region of interest (ROI)). As detailed in the attached appendix of exemplary features, a number of different features related to an image of a slide exposed to a given marker (or set of markers) may be extracted and analyzed. In some embodiments, a clinician such as a pathologist may use the image analysis system to select an ROI (corresponding, for example, to an area of the microscopy image that is stained darkly with DAB, indicating the presence of a large amount of a given marker). Within the ROI, the image analysis system (and a controller in communication therewith) may be used to isolate and extract a number of the features described in the attached appendix. For example, the number of cells within the ROI may be counted, and the percentage of type 1 cells may also be computed (after determining the optical density of light transmitted through the different cell compartments (depending on type) contained in the ROI, by applying, for example, the threshold settings outlined in Table 2). To allow the use of thresholds or objective decision rules (see step 130, detailed below), the features are in most cases quantifiable features, such as percentages, cell counts, areas, luminance, transmittance, and/or optical density. For example, in the appended experimental examples, the summary features extracted from the various ROIs include the percentages of type 1, type 2, and type 3 cancer cells (and combinations of these percentages), wherein the percentages are computed by combining more specific features (such as the transmittance and/or optical density of the stained areas, which are used to assign a given cell a particular type designation (e.g., type 1, 2, or 3)).
Step 130 of one embodiment of the method of the present invention includes applying a plurality of candidate decision rules to each quantifiable feature extracted from the plurality of slides, so as to provide a corresponding candidate prognosis for each of the slides. A "decision rule" may be made up of several components, including a disease rule (which determines whether a quantifiable feature greater than a given threshold indicates a good or a bad prognosis) and a threshold for the given quantifiable feature. In a single-marker analysis with a single feature, the decision rule may be a binary decision on that particular feature. According to various embodiments of the present invention, the overall decision rule yields a good or bad candidate prognosis (depending on the candidate threshold and the corresponding disease rule). For example, according to one embodiment, a good prognosis may be denoted as zero (0) and a bad prognosis as one (1). For each possible threshold, however, there are two possible choices for the disease rule (i.e., a good prognosis (0) may refer to values less than the threshold or, alternatively, a bad prognosis (1) may refer to values less than the threshold). Thus, each disease rule (for each possible threshold) may be evaluated for each body sample (corresponding to a patient having a known outcome) and placed in one of the four quadrants shown in Fig. 2, corresponding to one of the following categories: true positive (quadrant a, 210), false positive (quadrant b, 220), false negative (quadrant c, 230), and true negative (quadrant d, 240). A possible prognosis may then be generated for each possible threshold/disease rule combination for each body sample (corresponding to a patient having a known outcome), so that the best disease rule for each threshold may be determined by selecting the disease rule, for each quadrant of Fig. 2, based on the occurrence of good and bad outcomes in that quadrant.
For example, given a threshold (T) for a particular quantifiable feature (F) of a marker, two disease rules are possible for determining the prognosis. The first possible rule is: if F is greater than T, then the prognosis is bad (1). The second possible rule is: if F is greater than T, then the prognosis is good (0). For each of these possible disease rules, the predicted prognosis either accurately predicts the actual patient outcome (i.e., yields a true positive or true negative) or fails to predict the actual patient outcome (i.e., yields a false positive or false negative). It is possible to determine which quadrant of Fig. 2 contains the most possible prognoses in order to determine which disease rule best suits the given quantifiable feature. For example, referring to Fig. 2, the possible prognoses for the first possible rule may be plotted to determine where the results fall. Likewise, the possible prognoses for the second possible rule may be plotted to determine where the results fall within the quadrants depicted in Fig. 2. After both possible disease rules have been plotted in the appropriate quadrants, the best disease rule may be determined by computing the ratio of well-predicted to badly predicted outcomes, normalized to the total numbers of good and bad outcomes. For example, for a given feature and threshold, if the first possible disease rule is used (if F > T, then prognosis = bad (1)), most of the plotted points may fall in the true positive quadrant. In this case, the following candidate decision rule may result: patients exhibiting the quantifiable feature above the threshold are considered to have a bad prognosis (positive for disease). In another example, if the first possible rule is applied in the form "if F < T, then prognosis = good (0)", most of the plotted points may fall in the false negative quadrant. In this case, the candidate decision rule may read: patients exhibiting the quantifiable feature above the threshold are considered to have a good prognosis (negative for disease). One skilled in the art will appreciate that other statistical methods may also be used to derive effective decision rules. For example, linear discrimination, quadratic discrimination, generalized linear models, logistic regression, penalized discrimination, flexible discrimination, mixture discrimination, and/or other statistical methods may be used to derive such decision rules as part of step 130 of the present invention.
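The evaluation of the two possible disease rules for a single threshold can be sketched as follows. This is an illustrative reading, not the patented procedure itself: the function names are hypothetical, "positive" is taken to mean a predicted bad prognosis (1), and the score used to pick the better rule (correct calls normalized by the numbers of bad and good outcomes) is only one plausible interpretation of the normalization described above.

```python
def quadrant_counts(features, bad_outcomes, threshold, above_is_bad):
    """Count the four Fig. 2-style quadrants for one threshold and one disease rule.

    features     -- quantifiable feature value F for each sample
    bad_outcomes -- True where the known clinical outcome is bad
    above_is_bad -- True: F > T means bad (1); False: F > T means good (0)
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for f, actual_bad in zip(features, bad_outcomes):
        predicted_bad = (f > threshold) == above_is_bad
        if predicted_bad and actual_bad:
            counts["TP"] += 1      # quadrant a
        elif predicted_bad:
            counts["FP"] += 1      # quadrant b
        elif actual_bad:
            counts["FN"] += 1      # quadrant c
        else:
            counts["TN"] += 1      # quadrant d
    return counts


def best_disease_rule(features, bad_outcomes, threshold):
    """Return True if 'F > T means bad' scores at least as well as the opposite rule."""
    def score(above_is_bad):
        q = quadrant_counts(features, bad_outcomes, threshold, above_is_bad)
        n_bad = q["TP"] + q["FN"]
        n_good = q["FP"] + q["TN"]
        # Assumed scoring: correct calls normalized per outcome group.
        return (q["TP"] / n_bad if n_bad else 0.0) + (q["TN"] / n_good if n_good else 0.0)
    return score(True) >= score(False)


if __name__ == "__main__":
    f = [10, 20, 30, 40, 50, 60]
    bad = [False, False, False, True, True, True]
    print(quadrant_counts(f, bad, threshold=35, above_is_bad=True))
    print(best_disease_rule(f, bad, threshold=35))
```

Repeating this for every candidate threshold yields the set of candidate decision rules from which an optimal rule is later chosen.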
As shown in Fig. 1, step 140 includes selecting an optimal decision rule for the at least one quantifiable feature, the optimal decision rule being selected from the plurality of candidate decision rules. The optimal decision rule is selected such that the candidate prognosis for each of the plurality of slides best corresponds to the known outcome for each of the plurality of patients. For example, the decision rule is selected from the plurality of candidate decision rules so as to provide the most predictive prognostic tool, one that yields the minimum number of false negatives and false positives when compared with the clinical outcomes of the patients from whom the body samples were taken (see step 110). As described above, a candidate decision rule has a threshold component and a disease rule component. The optimal threshold may be selected by systematically evaluating a plurality of candidate thresholds (and disease rules), such that the prognosis generated by the optimal threshold for each of the plurality of slides corresponds most closely to the known outcome of each of the plurality of patients (from whom the slides were produced). In addition, the effectiveness of a given decision rule may be tested using specificity and sensitivity, as expressed below in equations 7 and 8.
According to some embodiments of the present invention, selecting the optimal decision rule further includes determining a plurality of specificity and sensitivity pairs corresponding to each of the plurality of candidate decision rules. In such embodiments, the specificity and sensitivity pair for each candidate decision rule (and for each of the plurality of candidate thresholds and corresponding disease rules) may be computed by comparing the candidate prognoses from each candidate decision rule with the actual known outcome of each patient from whom a body sample was taken. In making this comparison, the relative numbers of true positives (quadrant a), false positives (quadrant b), false negatives (quadrant c), and true negatives (quadrant d) may each be determined using a quadrant system, such as the one depicted in Fig. 2. Using the relative numbers in each quadrant, a sensitivity and specificity pair (sens, spec) may be computed for each candidate decision rule and each of the plurality of candidate thresholds using the following equations:
$$\text{sens} = \frac{a}{a + c} \tag{7}$$
$$\text{spec} = \frac{d}{b + d} \tag{8}$$
Thus, as generally described above, sensitivity refers to the probability that a bad outcome patient is evaluated as positive with respect to the marker (i.e., counted as a true positive). Similarly, specificity refers to the probability that a good outcome patient is evaluated as negative with respect to the marker (i.e., counted as a true negative).
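The sensitivity and specificity definitions given above translate directly into code. The sketch below assumes the quadrant labelling of Fig. 2 (a = true positives, b = false positives, c = false negatives, d = true negatives); the function name is hypothetical.

```python
def sensitivity_specificity(a, b, c, d):
    """Return (sens, spec) from quadrant counts.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    if a + c == 0 or b + d == 0:
        raise ValueError("need at least one bad-outcome and one good-outcome patient")
    sens = a / (a + c)   # true positives over all bad-outcome patients
    spec = d / (b + d)   # true negatives over all good-outcome patients
    return sens, spec


if __name__ == "__main__":
    print(sensitivity_specificity(a=30, b=10, c=20, d=40))
```

A rule that flags every patient as positive reaches a sensitivity of 1 but a specificity of 0, which is why the two values are always considered as a pair.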
Each of the specificity and sensitivity pairs may then be plotted on a two-dimensional sensitivity and specificity chart, as shown in Fig. 3, wherein each point denotes the specificity and sensitivity values computed for each of the plurality of candidate decision rules (and for each of the plurality of candidate thresholds). The chart shown in Fig. 3, also known as a receiver operating characteristic (ROC) curve, plots the sensitivity values 310 against the corresponding specificity values 300 for the set of candidate decision rules, as compared with the corresponding data set of actual clinical outcomes. An ideal prognostic test has an ideal sensitivity and specificity pair 320 plotted at the point (1,1), which indicates that all prognoses are either true positives or true negatives (see quadrants a 210 and d 240 in Fig. 2). For each sensitivity and specificity pair plotted on the ROC curve, the Euclidean distance between the plotted pair and the ideal pair 320 at (1,1) may be computed from the specificity difference 350 and the sensitivity difference 340 between the plotted pair and the ideal pair. After the ROC curve shown in Fig. 3 has been plotted, the specificity and sensitivity pair having the minimum Euclidean distance to the ideal pair 320 may be identified, so that the optimal decision rule (and the corresponding optimal threshold and/or disease rule) may be selected to target a particular sensitivity and specificity pair for the marker and feature combination under evaluation. In addition, in some embodiments, the optimal decision rule may be selected so as to maximize the sensitivity and specificity of the marker and feature combination under evaluation (i.e., to approach the ideal (1,1) sensitivity and specificity pair).
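Selecting the candidate closest to the ideal (1,1) point can be sketched in Python as shown below. The function names and the tuple layout `(label, sensitivity, specificity)` are illustrative assumptions; the distance itself is the ordinary Euclidean distance built from the sensitivity and specificity differences described above.

```python
import math


def distance_to_ideal(sens, spec):
    """Euclidean distance from a (sens, spec) pair to the ideal pair (1, 1)."""
    return math.hypot(1.0 - sens, 1.0 - spec)


def best_decision_rule(candidates):
    """Pick the candidate with the smallest distance to (1, 1).

    candidates -- iterable of (label, sensitivity, specificity) tuples, where
                  label identifies a threshold/disease-rule combination.
    """
    return min(candidates, key=lambda c: distance_to_ideal(c[1], c[2]))


if __name__ == "__main__":
    rules = [
        ("T=10, above is bad", 0.90, 0.50),
        ("T=20, above is bad", 0.80, 0.80),
        ("T=30, above is bad", 0.40, 0.95),
    ]
    print(best_decision_rule(rules)[0])
```

Minimizing this distance weights sensitivity and specificity equally; targeting a particular pair, as the text also allows, would replace (1,1) with the desired target point.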
As shown in Fig. 4, some methods of the present invention further include an additional step, shown schematically in block 150, of evaluating the statistical independence of the at least one marker, so as to ensure that the marker is capable of providing a prognosis that is substantially statistically independent of at least one complementary marker. This embodiment may thus ensure that, for a given pair of markers applied to a body sample, the prognoses they produce are substantially statistically independent, so that one marker does not provide substantially redundant information with respect to the complementary marker. This may ensure, for example, that a complementary marker is not used together with a first marker when the two are not substantially statistically independent. Dependence between two markers may indicate that they are redundant and that adding the second marker adds no value to the prognostic power of the given pair of markers. To optimize the prognostic power of a given panel of markers, it is also desirable to reduce signal "noise" by minimizing the use of markers that provide redundant prognostic information when compared with another marker in the panel.
Evaluating the statistical independence of two markers may, in some embodiments, involve the following additional steps: (1) comparing the frequency distribution of observed outcomes with the frequency distribution of theoretical prognoses for a first set of body samples exposed to a first marker and a complementary marker, wherein the first set of body samples corresponds to patients having a known good outcome; (2) comparing the frequency distribution of observed outcomes with the frequency distribution of theoretical prognoses for a second set of body samples exposed to the first marker and the complementary marker, wherein the second set of body samples corresponds to patients having a known bad outcome; and (3) evaluating the independence of the at least one marker with respect to the at least one complementary marker using a chi-square ($\chi^2$) analysis.
For example, a $\chi^2$ analysis may be performed to evaluate marker independence when two markers are considered at a time and the outcomes of the patients (corresponding to the body samples) are taken into account. Table 7 details how the $\chi^2$ values are obtained for particular markers for the good outcome and bad outcome patient sub-populations. According to one example, the $\chi^2$ threshold value is computed to be 7.81 for an error probability (p) of 0.05. The following results are therefore possible: (1) if $\chi^2_{\text{Good}} < 7.81$, then $H_0^{\text{Good}}$ cannot be rejected; (2) if $\chi^2_{\text{Bad}} < 7.81$, then $H_0^{\text{Bad}}$ cannot be rejected; and thus (3) if ($\chi^2_{\text{Good}} < 7.81$ and $\chi^2_{\text{Bad}} < 7.81$), then $H_0$ cannot be rejected and the markers may be considered independent.
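The independence check can be illustrated with a small pure-Python sketch. Several points are assumptions rather than details taken from the source: the function names are hypothetical, the theoretical frequencies are derived as the product of the two markers' marginal positive rates (Table 7 is not reproduced here), and the 7.81 threshold is simply the value quoted in the text, which corresponds to the 0.05 critical value of a $\chi^2$ distribution with three degrees of freedom.

```python
def chi_square_statistic(counts):
    """Chi-square statistic of observed vs. theoretical (independent) frequencies.

    counts -- dict mapping (marker1_positive, marker2_positive) to the observed
              number of patients in one outcome sub-population.
    """
    n = sum(counts.values())
    p1 = (counts[(True, True)] + counts[(True, False)]) / n   # marker 1 positive rate
    p2 = (counts[(True, True)] + counts[(False, True)]) / n   # marker 2 positive rate
    chi2 = 0.0
    for (m1, m2), observed in counts.items():
        # Theoretical count if the two markers were independent (assumed model).
        expected = n * (p1 if m1 else 1.0 - p1) * (p2 if m2 else 1.0 - p2)
        if expected == 0:
            raise ValueError("a marker is constant in this sub-population")
        chi2 += (observed - expected) ** 2 / expected
    return chi2


def markers_independent(good_counts, bad_counts, critical_value=7.81):
    """Independence is not rejected only if both sub-populations stay below the threshold."""
    return (chi_square_statistic(good_counts) < critical_value
            and chi_square_statistic(bad_counts) < critical_value)


if __name__ == "__main__":
    uniform = {(True, True): 25, (True, False): 25, (False, True): 25, (False, False): 25}
    coupled = {(True, True): 50, (True, False): 0, (False, True): 0, (False, False): 50}
    print(markers_independent(uniform, uniform))
    print(markers_independent(uniform, coupled))
```

Requiring both sub-populations to pass mirrors the three-part condition above: dependence in either the good outcome or the bad outcome group is enough to treat the pair as redundant.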
The methods disclosed herein may also be embodied in one or more suitable computer program products that are executable on a computer device (such as a computer device in communication with a microscopy system and/or image analysis system suitable for capturing images of stained histological or cytological slides) and that are capable of accomplishing the various functions associated with the methods and associated systems described herein. More particularly, steps 120, 130, 140, and 150 of the method embodiments shown in Figs. 1 and 4 may be accomplished by a computer program product having one or more executable portions for accomplishing or otherwise directing the method steps. For example, in such computer program product embodiments, an executable portion may accomplish step 120 shown in Figs. 1 and 4 by facilitating communication between a computer device (or other controller device) and one or more microscopy systems or image analysis systems suitable for extracting the features described and detailed in the appendix of exemplary features included herein. For example, the executable portion shown schematically in step 120 may extract a statistic (or another quantifiable feature) corresponding to the staining properties of a particular marker from a digital image (obtained via an image analysis system) of a stained histological slide.
An executable portion of the computer program product of the present invention may also accomplish step 130 of Figs. 1 and 4 by exhaustively and systematically applying a plurality of candidate decision rules to the at least one quantifiable feature extracted from each of the plurality of slides, thereby producing a series of candidate prognoses, each corresponding to one of the exhaustive combinations of the decision rules (which, in some cases, includes a systematic evaluation of the possible thresholds and/or disease rules for a plurality of marker combinations and their features).
According to some embodiments, an executable portion of the computer program product of the present invention may also perform or facilitate step 140 shown in Figs. 1 and 4 by calculating a specificity and sensitivity pair for each candidate prognosis, using the known outcome of each patient corresponding to the plurality of slides under investigation. Thus, the executable portion shown schematically in step 140 can determine the decision rule corresponding to a targeted and/or optimal specificity and sensitivity pair.
Finally, as shown in step 150 of Fig. 4, an executable portion of the computer program product of the present invention may also direct and/or facilitate the determination of the marker independence of two or more markers, using the chi-square analysis or other techniques described above with respect to the method embodiments of the invention. Such a determination may also take into account the prevalence of certain outcomes in the patient population from which the plurality of slides (and their images) were obtained.
Thus, one skilled in the art will appreciate that the computer program products of the embodiments of the present invention may be used to systematically evaluate the complex combinations of thresholds, disease rules, and corresponding order-based decision rules that may be generated when evaluating a set of markers, so as to determine the marker combination, and the decision rule corresponding to it, that will approach and/or reach a targeted and/or optimal specificity and sensitivity level.
One skilled in the art will appreciate that any or all steps in the methods of the invention may be performed by personnel or, alternatively, in an automated fashion. Thus, the steps of body sample preparation (see, for example, step 110), sample staining (see, for example, step 110), and detection of biomarker expression (see, for example, step 120) may be automated. Moreover, in some embodiments, the immunohistochemical methods of the invention are used in conjunction with computerized imaging equipment and/or software to facilitate the identification of positive-staining cells by a pathologist. The methods disclosed herein may also be combined with other prognostic methods or analyses (for example, tumor size, lymph node status, and expression levels of other biomarkers, including, for example, Her2/neu, Ki67, estrogen receptor (ER), progesterone receptor (PR), and p53). In this manner, the optimization and evaluation of biomarkers using the methods described herein can facilitate the detection of overexpression of the various biomarkers evaluated by the invention, thereby permitting a more accurate determination of the prognosis of a patient suffering from a disease that may be linked to overexpression of one or more of the various biomarkers.
In addition, many modifications and other embodiments of the invention will come to mind to one skilled in the art to which the invention pertains, having the benefit of the teachings presented in the foregoing description and the associated drawings, appendix, and examples. Therefore, it is to be understood that the invention is not limited to the specific embodiments disclosed, and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
The following experimental example describes the use of an embodiment of the invention in evaluating a panel of 4 candidate biomarkers, and their quantifiable summary features, that can be used to establish a prognosis for breast cancer patients. It is offered by way of illustration and not by way of limitation.
Experimental example: evaluation of a biomarker panel (SLPI, p21ras, E2F1, and src) for establishing breast cancer prognosis
Introduction:
According to the experimental example included herein, embodiments of the invention can be used to evaluate combinations of biomarkers whose overexpression may be useful for establishing a diagnosis and prognosis for patients with various types of breast cancer. In the appended experimental example, and in other embodiments of the invention, a panel of markers can be evaluated to determine an optimal order-based decision rule. By "breast cancer" is intended, for example, those conditions classified by biopsy as malignant pathology. The clinical delineation of breast cancer diagnoses is well known in the medical arts. One skilled in the art will appreciate that breast cancer refers to any malignancy of the breast tissue, including, for example, carcinomas and sarcomas. In particular embodiments, the breast cancer is ductal carcinoma in situ (DCIS), lobular carcinoma in situ (LCIS), or mucinous carcinoma. Breast cancer also refers to infiltrating ductal carcinoma (IDC) or infiltrating lobular carcinoma (ILC). In most embodiments of the invention, the subject of interest is a patient suspected of having, or actually diagnosed with, breast cancer.
The American Joint Committee on Cancer (AJCC) has developed a standardized system for breast cancer staging using a "TNM" classification scheme. Patients are assessed for primary tumor size (T), regional lymph node status (N), and the presence or absence of distant metastasis (M), and are then classified into stages 0-IV based on this combination of factors. In this system, primary tumor size is categorized on a scale of 0-4 (T0 = no evidence of primary tumor; T1 = ≤2 cm; T2 = >2 cm to ≤5 cm; T3 = >5 cm; T4 = tumor of any size with direct extension to the chest wall or skin). Lymph node status is classified as N0-N3 (N0 = no metastasis to regional lymph nodes; N1 = metastasis to movable, ipsilateral axillary lymph nodes; N2 = metastasis to ipsilateral lymph nodes fixed to one another or to other structures; N3 = metastasis to ipsilateral lymph nodes beneath the breastbone). Metastasis is categorized by the absence (M0) or presence of distant metastases. Although the evaluation of markers for establishing the prognosis of breast cancer patients at any clinical stage is encompassed by the invention, the evaluation and optimization of markers for establishing the prognosis of patients with early-stage breast cancer is of particular interest. By "early-stage breast cancer" is intended stages 0 (in situ breast cancer), I (T1, N0, M0), IIA (T0-1, N1, M0 or T2, N0, M0), and IIB (T2, N1, M0 or T3, N0, M0). Early-stage breast cancer patients exhibit little or no lymph node involvement. As used herein, "lymph node involvement" or "lymph node status" refers to whether the cancer has metastasized to the lymph nodes. Breast cancer patients are classified as "lymph node-positive" or "lymph node-negative" on this basis. Methods of identifying breast cancer patients and staging the disease are well known and may include manual examination, biopsy, review of the patient's and/or family history, and imaging techniques such as mammography, magnetic resonance imaging (MRI), and positron emission tomography.
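The definition of "early-stage breast cancer" given above can be written as a small lookup. This is only a sketch of the T/N/M combinations quoted in the text: the names `EARLY_STAGE_TNM` and `early_stage` are hypothetical, and stage 0 (in situ disease) is left out because the text gives no T/N/M triple for it.

```python
# T/N/M combinations listed in the text for stages I, IIA, and IIB.
EARLY_STAGE_TNM = {
    "I": {("T1", "N0", "M0")},
    "IIA": {("T0", "N1", "M0"), ("T1", "N1", "M0"), ("T2", "N0", "M0")},
    "IIB": {("T2", "N1", "M0"), ("T3", "N0", "M0")},
}


def early_stage(t, n, m):
    """Return 'I', 'IIA', or 'IIB' for an early-stage combination, else None."""
    for stage, combinations in EARLY_STAGE_TNM.items():
        if (t, n, m) in combinations:
            return stage
    return None
```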
The term "prognosis" is recognized in the art and encompasses predictions about the likely course of breast cancer or breast cancer progression, particularly with respect to the likelihood of disease remission, disease relapse, tumor recurrence, metastasis, and death. For the purposes of the example described herein, "good prognosis" refers to the likelihood that a patient afflicted with breast cancer will remain disease-free (i.e., cancer-free) for at least five years, while "poor prognosis" means the likelihood of a relapse or recurrence of the underlying cancer or tumor, metastasis, or death within less than five years. Cancer patients classified as having a "good outcome" remain free of the underlying cancer or tumor for at least five years. In contrast, "bad outcome" cancer patients experience disease relapse, tumor recurrence, metastasis, or death within five years. As used herein, the relevant time for assessing prognosis or disease-free survival begins with the surgical removal of the tumor or with the suppression, mitigation, or inhibition of tumor growth.
As described herein above, a number of clinical and prognostic breast cancer factors are known in the art and are used to predict treatment outcome and the likelihood of disease recurrence. Such factors include lymph node involvement, tumor size, histologic grade, estrogen and progesterone hormone receptor status (ER/PR), Her2/neu levels, and tumor ploidy. Using the methods of the present invention, the evaluation of markers, and of combinations of their features, for use in establishing the prognosis of an early-stage breast cancer patient can be accomplished in a systematic manner, independently of or in combination with the assessment of these or other clinical and prognostic factors.
The methods of the invention permit the systematic evaluation of candidate biomarkers (and their features) so as to provide a superior assessment of breast cancer prognosis in comparison with the analysis of other known prognostic indicators (for example, lymph node involvement, tumor size, histologic grade, estrogen and progesterone receptor levels, Her2/neu status, tumor ploidy, and family history).
Breast cancer is managed by several alternative strategies, including, for example, surgery, radiation therapy, hormone therapy, chemotherapy, or some combination thereof. As is known in the art, treatment decisions for individual breast cancer patients can be based on the number of lymph nodes involved, estrogen and progesterone receptor status, the size of the primary tumor, and the stage of the disease at diagnosis. Stratifying patients into poor-prognosis or good-prognosis risk groups at the time of diagnosis using the methods disclosed herein may provide an additional or alternative factor in treatment decisions. The methods of the invention permit the analysis and evaluation of candidate biomarkers for distinguishing those breast cancer patients with a good prognosis from those more likely to suffer a recurrence (i.e., patients who might need, or benefit from, additional aggressive treatment at the time of diagnosis). The methods of the invention are particularly useful in selecting appropriate biomarkers, their features, and, in particular, feature thresholds, so as to maximize the prognostic value of a candidate biomarker (or panel of biomarkers) in establishing a more accurate prognosis for an early-stage breast cancer patient. As discussed above, the majority of breast cancer patients diagnosed at an early stage of the disease enjoy long-term survival following surgery and/or radiation therapy without further adjuvant therapy. However, a significant percentage (approximately 20%) of these patients will suffer disease recurrence or death, leading to clinical recommendations that some or all early-stage breast cancer patients should receive adjuvant therapy (for example, chemotherapy). The methods of the present invention are used to evaluate biomarkers and their features that can better identify this high-risk, poor-prognosis population of early-stage breast cancer patients, and thereby determine which patients would benefit from continued and/or more aggressive therapy and close monitoring following treatment.
In this experimental example, the methods of the invention are used to evaluate a panel of 4 candidate biomarkers (SLPI, p21ras, E2F1, and src) and a single summary feature (extracted using an image processing system) corresponding to each biomarker. This example illustrates the determination of an optimal order-based decision rule according to one embodiment of the invention. The features used in this example relate to the percentages of 1+, 2+, and 3+ cells in the breast cancer tumor areas identified as regions of interest (ROI) by a pathologist. Based on these features, the order-based decision rule (including thresholds and disease rules) that is optimal for the selected marker/feature combination is used to maximize the sensitivity and specificity pair.
Materials and methods:
In this experimental example, more than 200 patients were analyzed in order to evaluate and optimize different marker and feature combinations for establishing breast cancer prognosis. As summarized in Table 4, this patient group was highly heterogeneous, presenting tumors at stages ranging from T1N0 to T3N0. The target characteristic of the patients was their good-outcome or bad-outcome status. Good-outcome patients were those still disease-free after 5 years; bad-outcome patients were defined as those who suffered recurrence or death within 5 years. Body samples, and the slides prepared from them, were obtained from each patient so as to provide body samples with known outcomes, such that a specificity and sensitivity pair could be determined, as described above, for each possible marker/feature/threshold combination.
The body samples from the study (from the same patient group summarized in Table 4) were then exposed to the panel of 4 biomarkers (see Table 5), and corresponding slides were produced so that the marked slides could be subjected to the methods of the invention. The following steps highlight the method of the invention as applied in this experimental example: (1) chromogen separation was optimized for each marker so as to represent the best-quality stain (according to the chromogen separation methods of the '446 application and the '729 application); (2) segmentation was set up for each marker, customized according to its subcellular localization (nucleus, cytoplasm, or membrane; see Table 1, and see also the NUCL, CYTO, and MEMB features highlighted in the attached appendix of exemplary features); and (3) features were extracted within the defined ROI at the cell, field of view (FOV), and focus levels, and exported to an output file (XML format).
A specific computer program product according to one embodiment of the invention (named the "Multi Marker Analyzer" in this example) was then used to carry out the evaluation and optimization of marker combinations. According to one embodiment, the computer program product is configured to load all or part of the tissue micro-array (TMA) or tissue section XML files produced using the microscopy system; to merge the data contained in these files with an XML file describing the TMA key (in the case of TMA analysis) or with an Excel file providing the patients' clinical status and patient evaluation (in the case of tissue section analysis); and to perform all further analyses. For each body sample (corresponding to each patient), this merging process associates the features extracted via the microscopy system with the information held about the patient in the TMA key (or Excel file): identification number, medical status (including good or bad outcome), and the pathologist's evaluation, if it is not already included in the XML-formatted file.
Table 5 lists the markers evaluated in this example (SLPI, p21ras, E2F1, and src) and the corresponding CELL_PERCENT summary feature extracted for each (this example illustrates the establishment of an order-based decision rule for four markers, in which a single-marker/single-feature threshold is analyzed to determine the optimal order-based decision rule). The decision rule was created using the method outlined in Fig. 1 of the invention, in which a prognosis is predicted for each possible on/off sequence of the markers, each marker being either "on" (1) or "off" (0). To determine the threshold for the feature evaluated for each specific marker (see Table 5), every possible threshold value (from 0 to 100%) was analyzed and compared with the outcomes of the various patients in the study (from whom the body samples used in the example were obtained). For example, Fig. 5 shows the distribution curves of CELL_PERCENT_2 for the E2F1 marker. The graph shows the distribution 520 of bad-outcome patients and the distribution 510 of good-outcome patients as a function of the CELL_PERCENT_2 value. As shown in Fig. 5, above the 2-3 percent limit, bad-outcome patients (520) are significantly more frequent than good-outcome patients (510). Using a threshold 550 of 2.46%, the E2F1 marker used alone as a prognostic indicator gives a sensitivity and specificity of 0.54 and 0.75, respectively. Column 3 of Table 5 shows the resulting decision rule determined for the E2F1 marker from the data of Fig. 5 (it comprises the 2.46% threshold for E2F1 and the disease rule: "on" if CELL_PERCENT_2 is greater than 2.46%).
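The threshold search described above can be sketched as follows. This is a minimal illustration rather than the program used in the example: the function names, the 0.5% default step, and the choice to pick the threshold that maximizes the sum of sensitivity and specificity are all assumptions. The disease rule follows the text: the marker is "on" when the feature exceeds the threshold, and "on" predicts a bad outcome.

```python
def sens_spec(features, bad_outcome, threshold):
    """Sensitivity and specificity of the rule 'ON if feature > threshold'."""
    tp = fp = fn = tn = 0
    for value, is_bad in zip(features, bad_outcome):
        marker_on = value > threshold
        if marker_on and is_bad:
            tp += 1
        elif marker_on:
            fp += 1
        elif is_bad:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def best_threshold(features, bad_outcome, step=0.5):
    """Scan thresholds from 0 to 100% and keep the first one with the best sum.

    Maximizing sensitivity + specificity is an assumed selection criterion.
    """
    candidates = [i * step for i in range(int(round(100 / step)) + 1)]
    return max(candidates, key=lambda t: sum(sens_spec(features, bad_outcome, t)))
```

With real data, `features` would hold one CELL_PERCENT value per patient and `bad_outcome` the matching known outcomes; a target pair (for example, a minimum specificity) could replace the sum as the selection criterion.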
Candidate prognoses (one for each possible sequence) were produced and then compared with each actual outcome in order to determine the numbers of true positives 210, false positives 220, false negatives 230, and true negatives 240 for the body samples evaluated, using the quadrant system of Fig. 2. As described in detail above, once the samples are properly plotted in the quadrants, the specificity and sensitivity values corresponding to each possible decision rule can be calculated (the results of such calculations are shown in Table 6). The order-based decision rule determined from the data of Table 6 reads as follows: if E2F1 is ON (i.e., 1) and E2F1 is not the only marker that is ON, the patient is considered a bad outcome; otherwise, the patient is considered a good outcome.
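The decision rule and the quadrant counts described above can be sketched as follows. The names are hypothetical and the sketch assumes that "bad outcome" is the positive class; it enumerates the 16 on/off sequences of the four markers and applies the rule stated in the text (E2F1 ON together with at least one other marker ON).

```python
from itertools import product

MARKERS = ("SLPI", "p21ras", "E2F1", "src")


def order_rule(states):
    """Bad outcome (True) if E2F1 is ON and at least one other marker is ON."""
    others_on = any(on for name, on in states.items() if name != "E2F1")
    return bool(states["E2F1"]) and others_on


def quadrant_counts(predicted_bad, actual_bad):
    """Count the four quadrants, treating 'bad outcome' as the positive class."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for predicted, actual in zip(predicted_bad, actual_bad):
        if predicted and actual:
            counts["TP"] += 1
        elif predicted:
            counts["FP"] += 1
        elif actual:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def sensitivity_specificity(counts):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])
    specificity = counts["TN"] / (counts["TN"] + counts["FP"])
    return sensitivity, specificity


# Enumerate all 16 on/off sequences and keep those the rule maps to a bad outcome.
sequences = [dict(zip(MARKERS, bits)) for bits in product((0, 1), repeat=4)]
bad_sequences = [s for s in sequences if order_rule(s)]
```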
Results:
Using the thresholds and decision rules defined in Table 5 for SLPI, p21ras, E2F1, and src, with only one percentage feature per marker, a simple order-based decision rule reached 60% sensitivity and 80% specificity on this sample set: if E2F1 is ON (i.e., 1) and E2F1 is not the only marker that is ON, the best prognosis for the patient is a bad outcome; otherwise, the prognosis for the patient is a good outcome.
As mentioned above, a prognostic decision rule based on E2F1 alone gives a sensitivity and specificity of 54% and 75%, respectively. However, the marker combination described above, applied when E2F1 is ON and SLPI, p21ras, or src is also ON, yields 60% sensitivity and 80% specificity (using the order-based decision algorithm defined from the results of Table 6).
Appendix: exemplary features
The following features indicate the types of quantifiable features that can be extracted from an image of a body sample (such as a stained histological or cytological slide) using an imaging system or video-microscopy system in communication with a controller such as, for example, a computer device. In addition, the following features can be extracted and/or computed using embodiments of the computer program product described herein. In some embodiments, the following features can be compounded and/or combined so as to build summary features that may be more easily used by a clinician to quantify a value that may correspond to a prognostic indicator for a specific disease, which may be linked to the overexpression (and resulting staining) of a specific target molecule.
It should be understood that the following appendix of features is offered by way of illustration and not by way of limitation. One skilled in the art will appreciate that other features may be useful and may be extracted and analyzed so as to evaluate one or several markers using the method and computer program product embodiments of the present invention.
A. Shape descriptor features:
1. Area
This is the number of foreground pixels in a blob (holes are not counted) whose mask (binary representation) is M. When the pixel-to-micron correspondence (k) is available, it represents the physical area (square microns) of the blob (M) on the slide. If the pixel-to-micron correspondence (k) is unavailable, AREA is the measured number of pixels (k = 1).
$$\mathrm{AREA} = k^{2} \times \sum_{p \in E} p \qquad (9)$$
where $E = \{\, p \mid p \in M \,\}$.
The range is [0, ∞).
2. Perimeter
This is the total length of the edges in a blob (including the edges of any holes) whose mask (binary representation) is M, with an allowance made for the staircase effect produced when diagonal edges are digitized (interior corners are counted as √2, rather than 2). A single-pixel blob (area = 1) has a perimeter of 4.0. When the pixel-to-micron correspondence (k) is available, it represents the physical perimeter (microns) of the blob (M) on the slide. If the pixel-to-micron correspondence is unavailable, k = 1.
$$\mathrm{PERIMETER} = k \times \sum_{p \in E} p \times q(n_p) \qquad (10)$$
where $E = \{\, p \mid p \in M \,\}$ and $n_p$ is the neighborhood of $p$ formed by its top (t), left (l), right (r), and bottom (b) neighbors:
$$n_p = \begin{pmatrix} & t & \\ l & p & r \\ & b & \end{pmatrix}$$
If $p$ is interior and $p$ is a corner, then $q(n_p) = \sqrt{2}$; otherwise, $q(n_p) = 4 - \sum (t, l, r, b)$.
The range is [0, ∞].
3. MINFERET
This is the minimum Feret diameter (the smallest bounding diameter of a rectangular box fitting the object, obtained after checking a certain number of angles). When the pixel-to-micron correspondence (k) is available, it represents the physical minimum Feret diameter (microns) of the blob (M) on the slide. If the pixel-to-micron correspondence is unavailable, k = 1.
The range is [0, ∞).
4. MAXFERET
This is the maximum Feret diameter (the largest bounding diameter of a rectangular box fitting the object, obtained after checking a certain number of angles). When the pixel-to-micron correspondence (k) is available, it represents the physical maximum Feret diameter (microns) of the blob (M) on the slide. If the pixel-to-micron correspondence is unavailable, k = 1.
The range is [0, ∞].
5. Compactness
This value is at its minimum (1.0) for a circle and is derived from the perimeter (P) and the area (A). The more convoluted the shape, the greater the value.
The range is [0, ∞].
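The formula for this feature did not survive in the text. One common definition that fits the description (derived from P and A, equal to 1.0 for a circle, larger for convoluted shapes) is given below as an assumption, not as the source's own equation:

```latex
\mathrm{Compactness} = \frac{P^{2}}{4 \pi A},
\qquad\text{so that for a circle of radius } r:\quad
\frac{(2 \pi r)^{2}}{4 \pi \cdot \pi r^{2}} = 1 .
```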
6. Roughness
This is a measure of how rough a blob is, and it equals the perimeter (P) divided by the convex perimeter (P_c).
A smooth convex object will have the minimum roughness of 1.0.
$$\mathrm{Roughness} = \frac{P}{P_c}$$
The range is [0, 1].
7. Elongation
This value equals the true length divided by the width. It is intended for elongated objects.
The range is [0, ∞].
B. Histogram descriptor features
1. SUM
SUM is the sum of all the individual pixel scores.
Figure A20058003917000382
For transmittance and for optical density (OD), the range is [0, ∞].
2. Mean
The arithmetic mean is what is commonly called the average: when the word "mean" is used without a modifier, it can be assumed to refer to the arithmetic mean. The mean is the sum of all the scores divided by the number of scores. The mean is a good measure of central tendency for roughly symmetric distributions, but it can be misleading in skewed distributions, since it can be greatly influenced by extreme scores. Therefore, other statistics, such as the median, may be more informative for distributions such as reaction time or family income, which are frequently very skewed.
The sum of the squared deviations of the scores from their mean is lower than the sum of their squared deviations from any other number.
For normal distributions, the mean is the most efficient, and therefore the least subject to sample fluctuations, of all measures of central tendency.
Figure A20058003917000383
where $N = \sum_{i=0}^{255} h(i)$.
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
3. MIN
The minimum is the smallest value of the distribution.
Figure A20058003917000391
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
4. Q1
Q1 is the 25th percentile of the distribution. 25% of the scores are below Q1, and 75% of the scores are above Q1.
$$Q_1 = i \;\Big|\; \Big\{ \sum_{j=0}^{j<i} h(j) < \frac{N}{4},\ \ \sum_{j=0}^{j \le i} h(j) \ge \frac{N}{4} \Big\} \qquad (16)$$
where $N = \sum_{i=0}^{255} h(i)$.
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
5. Median
The median is the middle of a distribution: half the scores are above the median, and half are below it. The median is less sensitive to extreme scores than the mean, which makes it a better measure than the mean for highly skewed distributions.
The sum of the absolute deviations of each number from the median is lower than the sum of the absolute deviations from any other number.
The mean, median, and mode are equal in symmetric distributions. The mean is higher than the median in positively skewed distributions and lower than the median in negatively skewed distributions.
Figure A20058003917000394
where $N = \sum_{i=0}^{255} h(i)$.
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
6. Q3
Q3 is the 75th percentile of the distribution. 75% of the scores are below Q3, and 25% of the scores are above Q3.
$$Q_3 = i \;\Big|\; \Big\{ \sum_{j=0}^{j<i} h(j) < N \times \frac{3}{4},\ \ \sum_{j=0}^{j \le i} h(j) \ge N \times \frac{3}{4} \Big\} \qquad (18)$$
where $N = \sum_{i=0}^{255} h(i)$.
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
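The percentile definitions of equations (16) and (18) amount to finding the first histogram bin at which the cumulative count reaches a given fraction of N. A minimal sketch follows; the function name and the toy histogram are hypothetical, and the result is a bin index in [0, 255] rather than a value rescaled to transmittance or OD units.

```python
def histogram_percentile(hist, fraction):
    """Smallest bin index i whose cumulative count reaches fraction * N.

    This is the bin for which the sum over j < i is below the target
    while the sum over j <= i is at or above it.
    """
    target = sum(hist) * fraction
    cumulative = 0
    for i, count in enumerate(hist):
        cumulative += count
        if cumulative >= target:
            return i
    return len(hist) - 1


# Toy 256-bin histogram with 8 pixels spread over three grey levels.
hist = [0] * 256
hist[10], hist[20], hist[30] = 2, 4, 2
q1 = histogram_percentile(hist, 0.25)
median = histogram_percentile(hist, 0.50)
q3 = histogram_percentile(hist, 0.75)
```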
7. MAX
The maximum is the largest value of the distribution.
Figure A20058003917000402
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
8. Mode
The mode is the most frequently occurring score in a distribution and is used as a measure of central tendency. The advantage of the mode as a measure of central tendency is that its meaning is obvious. Moreover, it is the only measure of central tendency that can be used with nominal data.
The mode is greatly subject to sample fluctuations and is therefore not recommended as the only measure of central tendency. A further disadvantage of the mode is that many distributions have more than one mode. These distributions are called "multimodal."
In a normal distribution, the mean, median, and mode are identical.
Figure A20058003917000403
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
9. Trimean
The trimean is computed by adding the 25th percentile, twice the 50th percentile (the median), and the 75th percentile, and dividing the sum by four.
The trimean is almost as resistant to extreme scores as the median and is less subject to sampling fluctuations than the arithmetic mean in skewed distributions. It is less efficient than the mean for normal distributions.
$$\mathrm{TriMean} = \frac{Q_1 + 2 \times \mathrm{Median} + Q_3}{4}$$
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
10. Trimmed mean 50
A trimmed mean is calculated by discarding a certain percentage of the lowest and highest scores and then computing the mean of the remaining scores. A mean trimmed 50% is computed by discarding the lower and upper 25% of the scores and taking the mean of the remaining scores. The median is the mean trimmed 100%, and the arithmetic mean is the mean trimmed 0%.
A trimmed mean is obviously less sensitive to extreme scores than the arithmetic mean. It is therefore less subject to sampling fluctuation than the mean for skewed distributions. It is less efficient than the mean for normal distributions.
Figure A20058003917000411
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
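The trimmed mean described above can be sketched on a plain list of scores. The function name is hypothetical, and the way fractional counts are truncated at the tails is an assumption, since the source's own formula is not reproduced.

```python
def trimmed_mean(scores, percent=50):
    """Mean after discarding percent/2 of the scores from each tail.

    percent=0 gives the arithmetic mean; percent=100 gives the median.
    """
    ordered = sorted(scores)
    n = len(ordered)
    cut = int(n * percent / 200)  # number of scores dropped from each tail
    kept = ordered[cut:n - cut]
    if not kept:  # everything trimmed: fall back to the middle score(s)
        kept = ordered[(n - 1) // 2:n // 2 + 1]
    return sum(kept) / len(kept)
```

For the scores `[1, 2, 3, 4, 5, 6, 7, 100]`, the extreme value 100 pulls the arithmetic mean to 16.0, while the mean trimmed 50% is 4.5.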
11. Range
The range is the simplest measure of spread or dispersion: it is equal to the difference between the largest and the smallest values. The range can be a useful measure of spread because it is so easily understood. However, it is very sensitive to extreme scores, since it is based on only two values. The range should almost never be used as the only measure of spread, but it can be informative if used as a supplement to other measures of spread, such as the standard deviation or the semi-interquartile range.
$$\mathrm{RANGE} = \mathrm{MAX} - \mathrm{MIN} \qquad (23)$$
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
12. Semi-interquartile range
The semi-interquartile range is a measure of spread or dispersion. It is computed as one half of the difference between the 75th percentile (often called Q3) and the 25th percentile (Q1).
Since half the scores in a distribution lie between Q3 and Q1, the semi-interquartile range is 1/2 the distance needed to cover 1/2 of the scores. In a symmetric distribution, an interval stretching from one semi-interquartile range below the median to one semi-interquartile range above the median will contain 1/2 of the scores. This is not true for a skewed distribution, however.
The semi-interquartile range is little affected by extreme scores, so it is a good measure of spread for skewed distributions. However, it is more subject to sampling fluctuation in normal distributions than the standard deviation and is therefore not often used for data that are approximately normally distributed.
$$\mathrm{SemiIQR} = \frac{Q_3 - Q_1}{2}$$
For transmittance, the range is [0, 1].
For optical density (OD), the range is [0.0000, 2.4065].
13. Variance
The variance is a measure of how spread out a distribution is. It is computed as the average squared deviation of each number from its mean.
Figure A20058003917000422
The range is [0, ∞].
14. STDEV
This feature estimates the standard deviation based on a sample. The standard deviation is a measure of how widely values are dispersed from the mean (average). The standard deviation is the square root of the variance. It is the most commonly used measure of spread.
Although less sensitive to extreme scores than the range, the standard deviation is more sensitive than the semi-interquartile range. Thus, when the possibility of extreme scores is present, the standard deviation should be supplemented by the semi-interquartile range.
$$\mathrm{STDEV} = \sqrt{\frac{n \sum x^{2} - \left(\sum x\right)^{2}}{n\,(n-1)}} \qquad (26)$$
The range is [0, ∞].
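Equation (26) is the "computational" form of the sample standard deviation, which needs only the count, the sum, and the sum of squares. A minimal sketch follows; the function name is hypothetical.

```python
import math


def sample_stdev(values):
    """Sample standard deviation from n, the sum of x, and the sum of x squared."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return math.sqrt((n * total_sq - total * total) / (n * (n - 1)))
```

For `[2, 4, 4, 4, 5, 5, 7, 9]` the sums are 40 and 232, so the expression under the root is (8 × 232 − 40²) / (8 × 7) = 256 / 56, giving a standard deviation of about 2.138.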
15. Skewness (SKEW)
This feature returns the skewness of a distribution. Skewness characterizes the degree of asymmetry of a distribution around its mean. A distribution is skewed if one of its tails is longer than the other. Positive skewness indicates a distribution with an asymmetric tail extending toward more positive values. Negative skewness indicates a distribution with an asymmetric tail extending toward more negative values.
Figure A20058003917000424
The range is [−∞, +∞],
where S is the sample standard deviation.
16. Kurtosis (KURTOSIS)
This feature returns the kurtosis of a data set. Kurtosis characterizes the relative peakedness or flatness of a distribution compared with the normal distribution. Positive kurtosis indicates a relatively peaked distribution. Negative kurtosis indicates a relatively flat distribution. Kurtosis is based on the size of a distribution's tails.
Figure A20058003917000431
The range is [−∞, +∞],
where S is the sample standard deviation.
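The skewness and kurtosis formulas are not reproduced in the text. The sketch below uses the widely used sample-based estimators built on the sample standard deviation S; treating these as the intended formulas is an assumption, and the function names are hypothetical.

```python
import math


def _mean_and_sample_std(values):
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean, std


def sample_skewness(values):
    """n / ((n-1)(n-2)) * sum(((x - mean) / S)^3); needs at least 3 values."""
    n = len(values)
    mean, std = _mean_and_sample_std(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / std) ** 3 for v in values)


def sample_excess_kurtosis(values):
    """Sample kurtosis relative to the normal distribution; needs at least 4 values."""
    n = len(values)
    mean, std = _mean_and_sample_std(values)
    fourth = sum(((v - mean) / std) ** 4 for v in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return scale * fourth - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
```

A symmetric sample such as `[1, 2, 3, 4, 5]` has zero skewness and negative kurtosis (a flat distribution), while `[1, 1, 1, 10]` has positive skewness because its tail extends toward larger values.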
C. Transmittance and optical density (OD) features (TRANS, OD, and others)
1. TRANS: transmittance
Transmittance is the ratio of the total radiant or luminous flux transmitted by a transparent object to the incident flux, usually given for normal incidence.
$$\mathrm{Trans} = \frac{I}{I_0} \qquad (29)$$
The range is [0, 1].
In images, transmittance is discretized on 8 bits, yielding 256 values in the range [0, 255]. Even when the underlying computation is based on such discrete values, the computed features are nonetheless expressed in the range [0, 1], from 0% to 100% transmittance.
$$\mathrm{Trans}_{255} = 255 \times \frac{I_{255}}{I_{0|255}}, \quad [0, 255] \qquad (30)$$
2. OD - optical density
Optical density is related to transmittance as the negative of its logarithm. Within images, transmittance is discretized on 8 bits, giving 256 values in the range [0, 255].
$$\mathrm{OD} = -\log_{10}(\mathrm{Trans}) = \log_{10}\!\left(\frac{I_0}{I}\right) \qquad (31)$$
Because of the 8-bit discretization of transmittance, the range is [0.0000, 2.4065].
The temporary OD image buffers are also discrete buffers.
$$\mathrm{OD}_{255} = k \times \log_{10}\!\left(\frac{I_{0|255}}{I_{255}}\right) = k \times \log_{10}\!\left(\frac{255}{I_{255}}\right), \quad [0, 255] \qquad (32)$$
where $k = \dfrac{255}{\log_{10}(255)}$ and $\mathrm{OD}_{255}(\mathrm{Trans}_{255}(0)) = \mathrm{OD}_{255}(\mathrm{Trans}_{255}(1))$.
Even when the underlying computations are based on such discrete values, the computed features are nevertheless expressed as true OD values, ranging from 0 to infinity in theory and, in practice, bounded above by 2.4065 because of the 8-bit constraint.
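The relationship between the 8-bit transmittance, the true OD of formula (31), and the discrete OD of formula (32) can be sketched as follows. The function names are illustrative, and rounding the discrete OD to the nearest integer is an assumption, since the text does not state how the value is quantized.

```python
import math

K = 255 / math.log10(255)  # scale factor k defined for formula (32)

def true_od(trans255):
    """True optical density of an 8-bit transmittance value, per formula (31)."""
    # A transmittance of 0 is treated like 1, as stated for formula (32),
    # which also avoids taking the logarithm of infinity.
    t = max(trans255, 1)
    return math.log10(255 / t)

def od255(trans255):
    """Discrete OD on the [0, 255] scale, per formula (32).

    ASSUMPTION: the result is rounded to the nearest integer.
    """
    return round(K * true_od(trans255))

for t in (255, 128, 1, 0):
    print(t, round(true_od(t), 4), od255(t))
```

The largest true OD reachable from 8-bit data is log10(255), which is where the upper limit of 2.4065 comes from.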
3. Luminance and dye features (LUMIN, DYE1, DYE2, DYE3)
The histogram features computed on the transmittance or optical density histograms reflect either the luminance ("LUMIN") or the dye-of-interest image calculated after solving the chromogen model for the pixel (R, G, B) values. The RGB chromogen separation model is described, for example, in the '446 application and the '729 application.
Conventional floating-point formula:
$$\mathrm{LUMIN}(Y) = 0.299R + 0.587G + 0.114B \qquad (33)$$
Formula used by the code:
$$\mathrm{LUMIN}(Y) = \left[\frac{9798R + 19235G + 3736B}{32768}\right] \qquad (34)$$
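The two luminance formulas can be compared with a short sketch. It assumes that the square brackets in formula (34) denote the integer part (integer division by 32768 = 2^15); the function names are illustrative.

```python
def lumin_float(r, g, b):
    """Conventional floating-point luminance, formula (33)."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def lumin_fixed(r, g, b):
    """Fixed-point luminance, formula (34).

    ASSUMPTION: the brackets mean integer part, i.e. floor division by 2**15.
    The integer weights are the floating-point weights scaled by 32768.
    """
    return (9798 * r + 19235 * g + 3736 * b) // 32768

for pixel in [(0, 0, 0), (255, 255, 255), (12, 200, 77)]:
    print(pixel, lumin_float(*pixel), lumin_fixed(*pixel))
```

Integer arithmetic of this kind avoids floating-point operations per pixel while staying within one gray level of the floating-point result for 8-bit inputs.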
Note: chromogen error and dye confidence
While solving, the RGB chromogen separation model evaluates a reconstruction error, which is the Euclidean distance in RGB space between the input RGB value of the pixel and the RGB value recomputed from the reconstructed contribution of each dye. This error is evaluated for each and every pixel of the reported object of interest, using the methods and apparatus of the RGB chromogen separation model mentioned above.
Based on the chromogen error measured when acquiring the white reference image used for shading correction and image normalization, for each RGB value recorded in the optical system, and on the noise level (NOISE), a confidence can be computed statistically for each dye from the probability that the transmittance evaluated for the pixel would not vary by more than the ability of the human eye to discriminate between different transmittances.
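The reconstruction error described above is a plain Euclidean distance in RGB space. The sketch below takes the reconstructed RGB value as an input, because the chromogen separation model that produces it is described in other applications and is not reproduced here; the function name is illustrative.

```python
import math

def reconstruction_error(rgb_in, rgb_reconstructed):
    """Euclidean distance in RGB space between the input pixel and the pixel
    recomputed from the dye contributions (the latter is supplied by the caller)."""
    return math.dist(rgb_in, rgb_reconstructed)

print(reconstruction_error((10, 20, 30), (13, 24, 30)))
```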
D. Hierarchical descriptor features
When computing features that relate to an image of a slide (such as a histological slide) containing objects at different levels (such as cells, cell membranes, nuclei, and/or other objects), the features may be evaluated with respect to a reference field at one of the following levels of the hierarchy: the slide (SLIDE), focus (FOCUS), field of view (FOV), or cell (CELL) associated with the object.
Slide: "SLIDE", and the related "FOCUS", "FOV", "CELL"
Focus: "FOCUS", and the related "FOV", "CELL"
Field of view: "FOV", and the related "CELL"
Cell: "CELL"
E. Cell descriptor features
When computing cell features, the features are reflected in one or more of the following cellular or subcellular locations: the whole cell (CELL), nucleus (NUCL), cytoplasm (CYTO), or cell membrane (MEMB).
Whole cell: "CELL"
Nucleus: "NUCL"
Cytoplasm: "CYTO"
Cell membrane: "MEMB"
Appendix of tables:
Table 1: List of exemplary markers and their respective subcellular localizations.

| Marker name | Localization |
| --- | --- |
| E2F1 | Nucleus |
| MUC-1 (IF3.9) | Membrane |
| NDRG-1 (ZYMED CAP43) | Cytoplasm (nucleus plus membrane) |
| p21 ras | Cytoplasm |
| p53 | Nucleus |
| Phospho p27 | Cytoplasm (nucleus) |
| PSMB9 (3A2.4) | Cytoplasm |
| SLPI (5G6.24) | Cytoplasm |
| src | Cytoplasm |
Table 2: Dispatcher settings used to assign the selected cells to category 1, 2 or 3.

| Marker localization | Dispatcher step | If feature | Is | Value (transmittance) | Cell | Is |
| --- | --- | --- | --- | --- | --- | --- |
| Nucleus | 1 | NUCL_DYE_OD_MEAN | | 0.161151 (69%) | All | (2 or 3), otherwise 1 |
| | 2 | NUCL_DYE_OD_MEAN | | 0.29243 (51%) | 2 and 3 | 3, otherwise 2 |
| Cytoplasm | 1 | CYTO_DYE_OD_MEAN | | 0.173925 (67%) | All | (2 or 3), otherwise 1 |
| | 2 | CYTO_DYE_OD_MEAN | | 0.29243 (51%) | 2 and 3 | 3, otherwise 2 |
| Membrane | 1 | CYTO_DYE_OD_MEAN | | 0.06048 (87%) | All | (2 or 3), otherwise 1 |
| | | MEMB_DYE_OD_MEAN | | 0.200659 (63%) | | |
| | | MEMB_AREA | | 150 pix | | |
| | 2 | CYTO_DYE_OD_MEAN | | 0.173925 (67%) | 2 and 3 | 3, otherwise 2 |
| | | MEMB_DYE_OD_MEAN | | 0.29243 (51%) | | |
| | | MEMB_AREA | | 150 pix | | |
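
The two-step dispatch of Table 2 can be sketched for the nucleus and cytoplasm rows. The comparison operator did not survive in this text, so the sketch assumes "greater than" (a higher mean dye OD, meaning stronger staining, gives a higher category); that operator, the function name, and the dictionary layout are all assumptions. The membrane rows, which combine three conditions, are left out.

```python
# Thresholds on the mean dye optical density, copied from Table 2.
THRESHOLDS = {
    "NUCL": (0.161151, 0.29243),  # (step 1, step 2) for NUCL_DYE_OD_MEAN
    "CYTO": (0.173925, 0.29243),  # (step 1, step 2) for CYTO_DYE_OD_MEAN
}

def dispatch_category(localization, dye_od_mean):
    """Assign a cell to category 1, 2 or 3 from its mean dye OD.

    ASSUMPTION: the missing operator in Table 2 is '>'.
    """
    step1, step2 = THRESHOLDS[localization]
    if not dye_od_mean > step1:
        return 1  # step 1: all cells are (2 or 3), otherwise 1
    return 3 if dye_od_mean > step2 else 2  # step 2: cells 2 and 3 are 3, otherwise 2

def od_to_transmittance_percent(od):
    """Transmittance percentage equivalent to an OD value (Trans = 10 ** -OD)."""
    return 100 * 10 ** (-od)

print(dispatch_category("NUCL", 0.20), round(od_to_transmittance_percent(0.161151)))
```

The helper `od_to_transmittance_percent` reproduces the percentages listed next to each OD threshold in the table.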
Table 3: Percentage summary features.

| Percentage of cells from category | Feature name |
| --- | --- |
| 0 | CELL_PERCENT_0 |
| 1 | CELL_PERCENT_1 |
| 2 | CELL_PERCENT_2 |
| 3 | CELL_PERCENT_3 |
| 0 and 1 | CELL_PERCENT_01 |
| 2 and 3 | CELL_PERCENT_23 |
| 0, 1 and 2 | CELL_PERCENT_012 |
| 1, 2 and 3 | CELL_PERCENT_123 |
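
The percentage summary features of Table 3 can be derived from a list of per-cell categories, as in the minimal sketch below. The feature names come from the table; the function name and the input layout (one integer category per cell) are assumptions made for illustration.

```python
from collections import Counter

# Feature name -> categories pooled together, as listed in Table 3.
GROUPS = {
    "CELL_PERCENT_0": (0,),
    "CELL_PERCENT_1": (1,),
    "CELL_PERCENT_2": (2,),
    "CELL_PERCENT_3": (3,),
    "CELL_PERCENT_01": (0, 1),
    "CELL_PERCENT_23": (2, 3),
    "CELL_PERCENT_012": (0, 1, 2),
    "CELL_PERCENT_123": (1, 2, 3),
}

def percent_summary(categories):
    """Return every Table 3 feature as a percentage of all cells."""
    total = len(categories)
    if total == 0:
        raise ValueError("no cells to summarize")
    counts = Counter(categories)
    return {
        name: 100.0 * sum(counts[c] for c in group) / total
        for name, group in GROUPS.items()
    }

features = percent_summary([0, 0, 1, 2, 3, 3, 3, 1])
print(features["CELL_PERCENT_01"], features["CELL_PERCENT_123"])
```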
Table 4: Description and outcomes of the patients from whom the body samples were obtained (experimental example).

| Stage | Good | Bad | All |
| --- | --- | --- | --- |
| T1N0 | 60 | 20 | 80 |
| T1N1 | 6 | 7 | 13 |
| T2N0 | 59 | 39 | 98 |
| T3N0 | 6 | 10 | 16 |
| Total | 131 | 76 | 207 |
Table 5: Percentage summary features used in the experimental example (showing the thresholds determined for the sequence-based decision rule).

| Marker | Feature | Threshold | Rule (1 if) |
| --- | --- | --- | --- |
| SLPI | CELL_PERCENT_01 | 99.887874 | < |
| p21ras | CELL_PERCENT_0 | 35.642851 | < |
| E2F1 | CELL_PERCENT_2 | 2.463659 | < |
| src | CELL_PERCENT_1 | 37.624326 | > |
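
The thresholds of Table 5 can be turned into a marker sequence such as S0110 (the notation used in Table 6, in the order SLPI, p21ras, E2F1, src). The sketch assumes that each rule reads "the marker is 1 (ON) if the feature value compares to the threshold with the listed operator"; that reading, the function name, and the input layout are assumptions.

```python
import operator

# (marker, feature, threshold, comparison) taken from Table 5.
RULES = [
    ("SLPI", "CELL_PERCENT_01", 99.887874, operator.lt),
    ("p21ras", "CELL_PERCENT_0", 35.642851, operator.lt),
    ("E2F1", "CELL_PERCENT_2", 2.463659, operator.lt),
    ("src", "CELL_PERCENT_1", 37.624326, operator.gt),
]

def marker_sequence(features):
    """Build a sequence string from per-marker feature values.

    `features` maps marker name -> {feature name: value}.
    ASSUMPTION: a marker is ON ("1") when `value <operator> threshold` holds.
    """
    bits = ""
    for marker, feature, threshold, compare in RULES:
        bits += "1" if compare(features[marker][feature], threshold) else "0"
    return "S" + bits

patient = {
    "SLPI": {"CELL_PERCENT_01": 100.0},
    "p21ras": {"CELL_PERCENT_0": 10.0},
    "E2F1": {"CELL_PERCENT_2": 1.0},
    "src": {"CELL_PERCENT_1": 20.0},
}
print(marker_sequence(patient))
```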
Table 6: Sensitivity and specificity couples from the experimental example, using sequence interpretation for the combination of SLPI, p21ras, E2F1 and src (the sequence S0110 must be read as follows: SLPI=OFF / p21ras=ON / E2F1=ON / src=OFF).

SLPI-p21ras-E2F1-src

| Sequence | CumulBad | CumulGood | Sensitivity | Specificity |
| --- | --- | --- | --- | --- |
| S1111 | 4 | 0 | 0.069 | 1 |
| S1011 | 7 | 0 | 0.1207 | 1 |
| S1110 | 12 | 0 | 0.2069 | 1 |
| S0111 | 14 | 8 | 0.2414 | 0.9184 |
| S1010 | 22 | 12 | 0.3793 | 0.8776 |
| S1101 | 26 | 14 | 0.4483 | 0.8571 |
| S0011 | 31 | 16 | 0.5345 | 0.8367 |
| S0110 | 35 | 19 | 0.6034 | 0.8061 |
| S1001 | 37 | 24 | 0.6379 | 0.7551 |
| S1100 | 37 | 26 | 0.6379 | 0.7347 |
| S0010 | 39 | 37 | 0.6724 | 0.6224 |
| S0101 | 41 | 40 | 0.7069 | 0.5918 |
| S1000 | 46 | 56 | 0.7931 | 0.4286 |
| S0001 | 49 | 63 | 0.8448 | 0.3571 |
| S0100 | 52 | 71 | 0.8966 | 0.2755 |
| S0000 | 58 | 98 | 1 | 0 |
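
The sensitivity and specificity columns of Table 6 follow from the cumulative counts, and the claims describe picking the decision rule whose couple lies closest, in Euclidean distance, to an optimal couple on the ROC plot. The sketch below uses the Table 6 counts and assumes the optimal couple is (sensitivity 1, specificity 1); that choice and the function names are assumptions.

```python
import math

# (sequence, cumulative bad, cumulative good) copied from Table 6.
ROWS = [
    ("S1111", 4, 0), ("S1011", 7, 0), ("S1110", 12, 0), ("S0111", 14, 8),
    ("S1010", 22, 12), ("S1101", 26, 14), ("S0011", 31, 16), ("S0110", 35, 19),
    ("S1001", 37, 24), ("S1100", 37, 26), ("S0010", 39, 37), ("S0101", 41, 40),
    ("S1000", 46, 56), ("S0001", 49, 63), ("S0100", 52, 71), ("S0000", 58, 98),
]
TOTAL_BAD, TOTAL_GOOD = 58, 98  # totals from the last row (S0000)

def sens_spec(cum_bad, cum_good):
    """Sensitivity = bad outcomes caught; specificity = good outcomes not flagged."""
    return cum_bad / TOTAL_BAD, 1 - cum_good / TOTAL_GOOD

def closest_to_ideal(rows, ideal=(1.0, 1.0)):
    """Return the sequence whose (sensitivity, specificity) is nearest to `ideal`."""
    best = min(rows, key=lambda row: math.dist(sens_spec(row[1], row[2]), ideal))
    return best[0]

print(closest_to_ideal(ROWS))
```

With these counts and the assumed ideal couple, the sketch selects S1001, at a distance of about 0.44 from (1, 1).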
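Table 7: Details of the $\chi^2$ analysis formulas generating the $\chi^2$ value for good-outcome patients ($\chi^2_{\mathrm{Good}}$) and the $\chi^2$ value for bad-outcome patients ($\chi^2_{\mathrm{Bad}}$).

Good outcome

| Sequence | Observed | Theoretical | Weighted deviation |
| --- | --- | --- | --- |
| 00 | $a$ | $P(00 \mid \mathrm{Good}) \times S$ | $(a - P(00 \mid \mathrm{Good}) \times S)^2 / (P(00 \mid \mathrm{Good}) \times S)$ |
| 01 | $b$ | $P(01 \mid \mathrm{Good}) \times S$ | $(b - P(01 \mid \mathrm{Good}) \times S)^2 / (P(01 \mid \mathrm{Good}) \times S)$ |
| 10 | $c$ | $P(10 \mid \mathrm{Good}) \times S$ | $(c - P(10 \mid \mathrm{Good}) \times S)^2 / (P(10 \mid \mathrm{Good}) \times S)$ |
| 11 | $d$ | $P(11 \mid \mathrm{Good}) \times S$ | $(d - P(11 \mid \mathrm{Good}) \times S)^2 / (P(11 \mid \mathrm{Good}) \times S)$ |
| Sum | $S = a + b + c + d$ | $S$ | $\chi^2_{\mathrm{Good}}$ |

$H_{0,\mathrm{Good}}$: the markers are independent for patients with a good outcome.

Bad outcome

| Sequence | Observed | Theoretical | Weighted deviation |
| --- | --- | --- | --- |
| 00 | $a$ | $P(00 \mid \mathrm{Bad}) \times S$ | $(a - P(00 \mid \mathrm{Bad}) \times S)^2 / (P(00 \mid \mathrm{Bad}) \times S)$ |
| 01 | $b$ | $P(01 \mid \mathrm{Bad}) \times S$ | $(b - P(01 \mid \mathrm{Bad}) \times S)^2 / (P(01 \mid \mathrm{Bad}) \times S)$ |
| 10 | $c$ | $P(10 \mid \mathrm{Bad}) \times S$ | $(c - P(10 \mid \mathrm{Bad}) \times S)^2 / (P(10 \mid \mathrm{Bad}) \times S)$ |
| 11 | $d$ | $P(11 \mid \mathrm{Bad}) \times S$ | $(d - P(11 \mid \mathrm{Bad}) \times S)^2 / (P(11 \mid \mathrm{Bad}) \times S)$ |
| Sum | $S = a + b + c + d$ | $S$ | $\chi^2_{\mathrm{Bad}}$ |

$H_{0,\mathrm{Bad}}$: the markers are independent for patients with a bad outcome.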
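
The weighted deviations of Table 7 sum to a chi-square statistic for one outcome group. The sketch below assumes that the theoretical probabilities under independence are the products of each marker's marginal ON/OFF frequencies estimated from the same four counts; the text does not state how they are obtained, so that estimate and the function name are assumptions.

```python
def chi_square_independence(a, b, c, d):
    """Chi-square statistic for two markers within one outcome group.

    a, b, c, d are the observed counts of sequences 00, 01, 10 and 11.
    ASSUMPTION: theoretical probabilities are products of marginal frequencies.
    """
    s = a + b + c + d
    p_first_on = (c + d) / s   # first marker ON in sequences 10 and 11
    p_second_on = (b + d) / s  # second marker ON in sequences 01 and 11
    theoretical = {
        "00": (1 - p_first_on) * (1 - p_second_on) * s,
        "01": (1 - p_first_on) * p_second_on * s,
        "10": p_first_on * (1 - p_second_on) * s,
        "11": p_first_on * p_second_on * s,
    }
    if any(value == 0 for value in theoretical.values()):
        raise ValueError("a marker is always ON or always OFF; statistic undefined")
    observed = {"00": a, "01": b, "10": c, "11": d}
    return sum(
        (observed[k] - theoretical[k]) ** 2 / theoretical[k] for k in observed
    )

print(chi_square_independence(25, 25, 25, 25), chi_square_independence(50, 0, 0, 50))
```

A value near zero is consistent with the null hypothesis of independent markers, while a large value, as in the second call where the two markers always agree, argues against it.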
All publications and patent applications mentioned in the specification are indicative of the level of those skilled in the art to which this invention pertains. All publications and patent applications are incorporated herein by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference.
Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended embodiments.

Claims (27)

1. A method for analyzing at least one marker to determine the prognosis of a cancer patient, the method comprising:
exposing a body sample to the at least one marker, the body sample being taken from the cancer patient;
extracting at least one quantifiable feature from an image taken of at least one slide using an image processing system, the at least one slide being prepared from the body sample;
applying a decision rule to the at least one quantifiable feature so as to determine the prognosis of the cancer patient based on a relationship between the at least one quantifiable feature and the decision rule.
2. The method according to claim 1, wherein the applying step further comprises applying a threshold to the at least one quantifiable feature so as to determine the prognosis of the cancer patient based on a relationship between the at least one quantifiable feature and the threshold.
3. The method according to claim 2, wherein the applying step further comprises applying an affectation rule for the threshold, the affectation rule being capable of establishing either a good prognosis or a bad prognosis corresponding to a value of the at least one quantifiable feature in relation to the threshold.
4. The method according to claim 1, wherein the extracting step further comprises identifying a region of interest from which the at least one quantifiable feature is to be extracted, the region of interest being within the image taken of the at least one slide using the image processing system.
5. The method according to claim 1, wherein the at least one marker is selected from the group comprising:
colorimetric biomarkers;
SLPI;
PSMB9;
NDRG-1;
Muc-1;
phospho-p27;
src;
E2F1;
p21ras;
p53; and
combinations thereof.
6. The method according to claim 1, wherein the at least one quantifiable feature is selected from the group comprising:
transmittance;
optical density;
cell morphology;
percentage of cell types characterized by marker intensity and cell morphology; and
combinations thereof.
7. A computer program product capable of controlling an image processing system to analyze at least one marker to determine the prognosis of a cancer patient, the computer program product comprising a computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
an executable portion for extracting at least one quantifiable feature from an image taken of at least one slide using the image processing system, the at least one slide being prepared from a body sample taken from the cancer patient, the body sample having been exposed to the at least one marker; and
an executable portion for applying a decision rule to the at least one quantifiable feature so as to determine the prognosis of the cancer patient based on a relationship between the at least one quantifiable feature and the decision rule.
8. The computer program product according to claim 7, wherein the executable portion for applying further comprises an executable portion for applying a threshold to the at least one quantifiable feature so as to determine the prognosis of the cancer patient based on a relationship between the at least one quantifiable feature and the threshold.
9. The computer program product according to claim 8, wherein the executable portion for applying further comprises an executable portion for applying an affectation rule for the threshold, the affectation rule being capable of establishing either a good prognosis or a bad prognosis corresponding to a value of the at least one quantifiable feature in relation to the threshold.
10. A method for evaluating at least one marker adapted to determine the prognosis of a cancer patient, the method comprising:
exposing a plurality of body samples to the at least one marker, the plurality of body samples being taken from a corresponding plurality of patients, each patient having a known outcome;
extracting at least one quantifiable feature from an image taken of each of a plurality of slides using an image processing system, the plurality of slides being prepared from the plurality of body samples corresponding to each patient;
applying a plurality of candidate decision rules to the at least one quantifiable feature of each of the plurality of slides so as to provide a candidate prognosis for each of a plurality of combinations of the plurality of candidate decision rules and the at least one quantifiable feature; and
selecting, for the at least one quantifiable feature, an optimal decision rule corresponding to an optimal prognosis, the optimal decision rule being selected from the candidate decision rules, the optimal decision rule providing an optimal prognosis for each of the plurality of slides that optimally corresponds to the known outcome of each of the plurality of patients.
11. The method according to claim 10, wherein the applying step further comprises applying a plurality of candidate thresholds to the at least one quantifiable feature so as to generate a plurality of candidate prognoses corresponding to each of the plurality of candidate thresholds for each of the plurality of body samples, and wherein the selecting step further comprises selecting an optimal threshold from the plurality of candidate thresholds such that the optimal prognosis for each of the plurality of slides optimally corresponds to the known outcome of each of the plurality of patients.
12. The method according to claim 11, wherein the applying step further comprises determining an affectation rule for each of the plurality of candidate thresholds, the affectation rule being capable of establishing either a good prognosis or a bad prognosis corresponding to a value of the at least one quantifiable feature in relation to each of the plurality of candidate thresholds.
13. The method according to claim 10, wherein the selecting step further comprises:
determining a plurality of specificity and sensitivity couples corresponding to each of the plurality of candidate decision rules;
plotting the plurality of specificity and sensitivity couples on a receiver operating characteristic curve;
computing a plurality of Euclidean distances between each of the plurality of specificity and sensitivity couples and an optimal specificity and sensitivity couple; and
selecting the optimal decision rule corresponding to the specificity and sensitivity couple having the minimum Euclidean distance to the optimal specificity and sensitivity couple.
14. The method according to claim 10, wherein the extracting step further comprises identifying a region of interest from which the at least one quantifiable feature is to be extracted, the region of interest being within the image of each of the plurality of slides taken using the image processing system.
15. The method according to claim 10, further comprising evaluating the statistical independence of the at least one marker so as to ensure that the at least one marker is capable of providing a prognosis that is substantially statistically independent of at least one complementary marker.
16. The method according to claim 15, wherein the evaluating step further comprises:
comparing a frequency distribution of observed outcomes to a frequency distribution of theoretical prognoses, the theoretical prognoses being computed on the assumption that the at least one marker is independent of the at least one complementary marker, for a first plurality of body samples exposed to the at least one marker and the at least one complementary marker, the first plurality of body samples corresponding to patients having a known good outcome;
comparing a frequency distribution of observed outcomes to a frequency distribution of theoretical prognoses, the theoretical prognoses being computed on the assumption that the at least one marker is independent of the at least one complementary marker, for a second plurality of body samples exposed to the at least one marker and the at least one complementary marker, the second plurality of body samples corresponding to patients having a known bad outcome;
assessing the independence of the at least one marker with respect to the at least one complementary marker.
17. The method according to claim 16, wherein the assessing step further comprises using a chi-square analysis to assess the independence of the at least one marker with respect to the at least one complementary marker.
18. The method according to claim 10, wherein the at least one marker is selected from the group comprising:
colorimetric biomarkers;
SLPI;
PSMB9;
NDRG-1;
Muc-1;
phospho-p27;
src;
E2F1;
p21ras;
p53; and
combinations thereof.
19. The method according to claim 10, wherein the at least one quantifiable feature is selected from the group comprising:
transmittance;
optical density;
cell morphology;
percentage of cell types characterized by marker intensity and cell morphology; and
combinations thereof.
20. A computer program product capable of controlling an image processing system to evaluate at least one marker adapted to determine the prognosis of a cancer patient, the computer program product comprising a computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
an executable portion for extracting at least one quantifiable feature from an image taken of each of a plurality of slides using the image processing system, the plurality of slides being prepared from a plurality of body samples taken from a corresponding plurality of patients, each patient having a known outcome, the plurality of body samples having been exposed to the at least one marker;
an executable portion for applying a plurality of candidate decision rules to the at least one quantifiable feature of each of the plurality of slides so as to provide a candidate prognosis for each of a plurality of combinations of the plurality of candidate decision rules and the at least one quantifiable feature; and
an executable portion for selecting, for the at least one quantifiable feature, an optimal decision rule corresponding to an optimal prognosis, the optimal decision rule being selected from the candidate decision rules, the optimal decision rule providing an optimal prognosis for each of the plurality of slides that optimally corresponds to the known outcome of each of the plurality of patients.
21. The computer program product according to claim 20, wherein the executable portion for applying further comprises an executable portion for applying a plurality of candidate thresholds to the at least one quantifiable feature so as to generate a plurality of candidate prognoses corresponding to each of the plurality of candidate thresholds for each of the plurality of body samples, and wherein the executable portion for selecting further comprises an executable portion for selecting an optimal threshold from the plurality of candidate thresholds such that the optimal prognosis for each of the plurality of slides optimally corresponds to the known outcome of each of the plurality of patients.
22. The computer program product according to claim 21, wherein the executable portion for applying the plurality of candidate thresholds further comprises an executable portion for determining an affectation rule for each of the plurality of candidate thresholds, the affectation rule being capable of establishing either a good prognosis or a bad prognosis corresponding to a value of the at least one quantifiable feature in relation to each of the plurality of candidate thresholds.
23. The computer program product according to claim 20, wherein the executable portion for selecting further comprises:
an executable portion for determining a plurality of specificity and sensitivity couples corresponding to each of the plurality of candidate decision rules;
an executable portion for plotting the plurality of specificity and sensitivity couples on a receiver operating characteristic curve;
an executable portion for computing a plurality of Euclidean distances between each of the plurality of specificity and sensitivity couples and an optimal specificity and sensitivity couple; and
an executable portion for selecting the optimal decision rule corresponding to the specificity and sensitivity couple having the minimum Euclidean distance to the optimal specificity and sensitivity couple.
24. The computer program product according to claim 20, wherein the executable portion for extracting further comprises an executable portion for identifying a region of interest from which the at least one quantifiable feature is to be extracted, the region of interest being within the image of each of the plurality of slides taken using the image processing system.
25. The computer program product according to claim 20, further comprising an executable portion for evaluating the statistical independence of the at least one marker so as to ensure that the at least one marker is capable of providing a prognosis that is substantially statistically independent of at least one complementary marker.
26. The computer program product according to claim 25, wherein the executable portion for evaluating further comprises:
an executable portion for comparing a frequency distribution of observed outcomes to a frequency distribution of theoretical prognoses for a first plurality of body samples exposed to the at least one marker and the at least one complementary marker, the first plurality of body samples corresponding to patients having a known good outcome;
an executable portion for comparing a frequency distribution of observed outcomes to a frequency distribution of theoretical prognoses for a second plurality of body samples exposed to the at least one marker and the at least one complementary marker, the second plurality of body samples corresponding to patients having a known bad outcome;
an executable portion for assessing the independence of the at least one marker with respect to the at least one complementary marker.
27. The computer program product according to claim 26, wherein the executable portion for assessing further comprises an executable portion for using a chi-square analysis to assess the independence of the at least one marker with respect to the at least one complementary marker.
CNA200580039170XA 2004-09-22 2005-09-22 Methods and computer program products for analysis and optimization of marker candidates for cancer prognosis Pending CN101061480A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US61196504P 2004-09-22 2004-09-22
US60/611,965 2004-09-22
US60/612,073 2004-09-22

Publications (1)

Publication Number Publication Date
CN101061480A true CN101061480A (en) 2007-10-24

Family

ID=38796163

Family Applications (2)

Application Number Title Priority Date Filing Date
CNA200580039170XA Pending CN101061480A (en) 2004-09-22 2005-09-22 Methods and computer program products for analysis and optimization of marker candidates for cancer prognosis
CNA2005800389803A Pending CN101057144A (en) 2004-09-22 2005-09-22 Methods and compositions for evaluating breast cancer prognosis

Family Applications After (1)

Application Number Title Priority Date Filing Date
CNA2005800389803A Pending CN101057144A (en) 2004-09-22 2005-09-22 Methods and compositions for evaluating breast cancer prognosis

Country Status (1)

Country Link
CN (2) CN101061480A (en)

Also Published As

Publication number Publication date
CN101057144A (en) 2007-10-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Open date: 20071024