TWI838648B - Uses and methods of compositions for the detection of colorectal cancer - Google Patents

Info

Publication number
TWI838648B
Authority
TW
Taiwan
Prior art keywords
seq
colorectal cancer
blood sample
concentration
purified
Prior art date
Application number
TW110136054A
Other languages
Chinese (zh)
Other versions
TW202314245A (en
Inventor
張君照
蔡奕戎
林景堉
Original Assignee
臺北醫學大學
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 臺北醫學大學
Priority to TW110136054A
Publication of TW202314245A
Application granted
Publication of TWI838648B

Landscapes

  • Investigating Or Analysing Biological Materials (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention provides uses and methods of compositions for the detection of colorectal cancer, wherein the peptide composition for detecting colorectal cancer is selected from the group consisting of any two or more of the amino acid sequences of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. The detection method comprises the steps of: (a) using a detection method to measure the amount of the peptide composition obtained after a test blood sample of a test individual is subjected to a purification method; and (b) comparing the amount of the peptide composition measured in (a) with the amount of the peptide composition obtained from a blood sample of a non-colorectal cancer individual subjected to the same purification method, to determine whether the test individual suffers from colorectal cancer.

Description

Uses and methods of compositions for detecting colorectal cancer

The present invention relates to a biomarker composition and a detection method for detecting cancer, and in particular to a detection method and a peptide composition for detecting colorectal cancer.

According to GLOBOCAN 2018 (Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018 Nov;68(6):394-424. doi:10.3322/caac.21492. Epub 2018 Sep 12. Erratum in: CA Cancer J Clin. 2020 Jul;70(4):313. PMID: 30207593), colorectal cancer is one of the most prevalent cancers in the world and ranks third among malignant tumors in global mortality, with an estimated 1.8 million new cases and 881,000 deaths each year.

According to the 2017 Cancer Registry Report of the Health Promotion Administration, colorectal cancer currently ranks second in incidence and third in mortality among the ten most common cancers in Taiwan. In Taiwan, colorectal cancer most often occurs between the ages of 40 and 50; it accounts for 15% of all cancers in men and 8% in women. According to cancer registry statistics from the Ministry of Health and Welfare, the standardized incidence rate was 22.9 per 100,000 population in 1995. In 2006, the number of colorectal cancer cases surpassed that of liver cancer for the first time, making colorectal cancer the most commonly diagnosed cancer in Taiwan, with more than 15,000 cases. In 2015, the standardized incidence rate was 43.0 per 100,000 population, an increase of 87.8%.

Risk factors for colorectal cancer include age, sex, genetic factors, having a first-degree relative with colorectal cancer, lack of exercise, smoking, obesity, and unhealthy eating habits.

Colorectal cancer is staged according to the seventh edition of the TNM staging system issued in 2009 by the Union for International Cancer Control and the American Joint Committee on Cancer (AJCC): depth of tumor invasion (T), number of lymph nodes involved (N), and presence of distant metastasis (M).

Early-stage colorectal cancer has no obvious symptoms, so by the time it is detected it has usually reached a late stage. Common symptoms include abdominal pain, diarrhea, constipation, rectal bleeding, occult blood in the stool, and weight loss.

Current colorectal cancer detection methods include the following. The fecal occult blood test (FOBT) detects hemoglobin by an immunological or chemical method, but its sensitivity is only about 50% to 70%. Digital rectal examination (palpation) can only diagnose lesions near the rectum. Sigmoidoscopy is an invasive examination that requires the patient to fast and take medication to clear feces from the intestine beforehand, and it carries other risks, such as perforation. Barium enema can examine the entire colon, but its sensitivity is poor and it may miss lesions such as polyps. Colonoscopy can be used to evaluate the condition of the colon directly, but it is an invasive examination and carries other risks, such as perforation.

The blood tumor markers currently used clinically for colorectal cancer are carcinoembryonic antigen (CEA) and carbohydrate antigen 19-9 (CA 19-9), with sensitivities of 33.3% to 64.5% and 33.3% to 47.8%, and specificities of 89.2% to 90.9% and 90.5% to 95.9%, respectively.

Hermunend et al. reported on the use of CEA and CA 19-9 for the prognosis of colorectal cancer (Acta Oncol, 2020. 59(12): p. 1416-1423), with estimated sensitivities of 97.0% and 89% and specificities of 31.0% and 16%, respectively. When the two markers were combined, CEA (≤5 μg/l) with CA 19-9 (≤26 kU/l) had a sensitivity of 88.0% and a specificity of 9.0%, whereas CEA (>5 μg/l) with CA 19-9 (≤26 kU/l) had a sensitivity of 100.0% and a specificity of 88.0%. Therefore, CEA and CA 19-9 are currently not used clinically as biomarkers for colorectal cancer screening, but rather as biomarkers for assessing the prognosis of colorectal cancer.

Although the prognostic value of the clinically used CEA and CA 19-9 for postoperative follow-up of colorectal cancer meets clinical needs, their sensitivity for colorectal cancer screening is low and falls short of clinical expectations. Clinical practice therefore still lacks a non-invasive colorectal cancer screening biomarker that is both highly sensitive and highly specific.

One object of the present invention is to solve the problem of insufficient performance of non-invasive biomarker detection of colorectal cancer.

According to the object of the present invention, a peptide composition for detecting colorectal cancer is provided, wherein the peptide composition is selected from the group consisting of any two or more of the amino acid sequences of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, and wherein the peptide composition is present in a blood sample.

Wherein, the content of the peptide composition for detecting colorectal cancer is measured by mass spectrometry.

Wherein, the blood sample is a serum sample, a plasma sample, or a whole blood sample.

Wherein, the blood sample is purified with wheat germ agglutinin.

According to the object of the present invention, a detection method for detecting colorectal cancer is provided, comprising the steps of: (a) using a detection method to measure the content of a peptide composition obtained after a test blood sample of a test individual is subjected to a purification method, wherein the peptide composition is selected from the group consisting of any two or more of the amino acid sequences of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3; and (b) comparing the content of the peptide composition measured in (a) with the content of the peptide composition obtained after a comparison blood sample of a non-colorectal cancer individual is subjected to the same purification method, to determine whether the test individual suffers from colorectal cancer.

Wherein, the comparison method is selected from the group consisting of multivariate logistic regression, decision tree, random forest, and support vector machine.
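As an illustration of how one of the listed comparison methods could combine the contents of two peptides into a single score, the following is a minimal sketch of logistic regression fitted by batch gradient descent. It is not the implementation used in the invention: the peptide levels, labels, learning rate, and function names are all made-up assumptions for demonstration only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_logistic(X, y, lr=0.1, epochs=10000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted probability that a sample belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical peptide levels (arbitrary units) for two peptides per sample.
# Label 0 = non-colorectal cancer, 1 = colorectal cancer. All values invented.
X = [[1.0, 0.9], [1.2, 1.1], [0.8, 1.0], [2.1, 2.4], [2.5, 2.0], [1.9, 2.2]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)

high = predict_proba(w, b, [2.3, 2.3])  # resembles the invented class-1 samples
low = predict_proba(w, b, [0.9, 1.0])   # resembles the invented class-0 samples
```

In practice a validated statistics library would normally be used for this step; the hand-written fit above only makes the calculation explicit.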

Wherein, the detection method is selected from mass spectrometry or immunoassay.

Wherein, the test blood sample and the comparison blood sample are both serum samples, or both plasma samples, or both whole blood samples.

Wherein, the purification method is a wheat germ agglutinin purification method.

Wherein, the mass spectrometry is selected from the group consisting of matrix-assisted laser desorption/ionization mass spectrometry, liquid chromatography-mass spectrometry, liquid chromatography-tandem mass spectrometry, and gas chromatography-mass spectrometry.

The present research method confirms that detecting the peptide composition can overcome the insufficient sensitivity and specificity of conventional biomarkers for detecting colorectal cancer.

Figure 1 shows the TNM staging system for colorectal cancer.

Figure 2 shows the grouping information of the plasma samples of the test individuals.

Figure 3 shows the multiple reaction monitoring ion pairs and mass spectrometry parameters of the screened peptide fragments.

Figure 4 is a dot plot of the mass spectrometry signal intensity of the peptide fragment of SEQ ID NO: 1 in the non-colorectal cancer group, the early colorectal cancer group, and the late colorectal cancer group.

Figure 5 is a dot plot of the mass spectrometry signal intensity of the peptide fragment of SEQ ID NO: 2 in the non-colorectal cancer group, the early colorectal cancer group, and the late colorectal cancer group.

Figure 6 is a dot plot of the mass spectrometry signal intensity of the peptide fragment of SEQ ID NO: 3 in the non-colorectal cancer group, the early colorectal cancer group, and the late colorectal cancer group.

Figure 7 shows the multiple reaction monitoring ion pairs and mass spectrometry analysis parameters of the screened peptide fragments.

Figure 8 shows the retention time and signal intensity of the three ion pairs of SEQ ID NO: 1.

Figure 9 shows the retention time and signal intensity of the three ion pairs of SEQ ID NO: 2.

Figure 10 shows the retention time and signal intensity of the three ion pairs of SEQ ID NO: 3.

Figure 11 is the calibration curve of SEQ ID NO: 1.

Figure 12 is the calibration curve of SEQ ID NO: 2.

Figure 13 is the calibration curve of SEQ ID NO: 3.

Figure 14 is a dot plot of the peptide concentration of SEQ ID NO: 1 in the individual samples of each group.

Figure 15 is a dot plot of the peptide concentration of SEQ ID NO: 2 in the individual samples of each group.

Figure 16 is a dot plot of the peptide concentration of SEQ ID NO: 3 in the individual samples of each group.

Figure 17 shows the performance of the peptide composition in distinguishing the non-colorectal cancer group from the early colorectal cancer group, evaluated by multivariate logistic regression and random forest and presented as receiver operating characteristic curves, wherein the peptide composition is selected from the group consisting of any one, or any two or more, of the amino acid sequences of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3.

Figure 18 shows the performance of the peptide composition in distinguishing the non-colorectal cancer group from the late colorectal cancer group, evaluated by multivariate logistic regression and random forest and presented as receiver operating characteristic curves, wherein the peptide composition is selected from the group consisting of any one, or any two or more, of the amino acid sequences of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3.

Figure 19 shows the performance of the peptide composition in distinguishing the non-colorectal cancer group from the all-stage colorectal cancer group, evaluated by multivariate logistic regression and random forest and presented as receiver operating characteristic curves, wherein the peptide composition is selected from the group consisting of any one, or any two or more, of the amino acid sequences of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3.

Figure 20 shows the performance of the individual contents of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3 for detecting colorectal cancer, evaluated by area under the curve, sensitivity, specificity, accuracy, Student's t-test, and statistical power.

Figure 21 shows the performance of groups formed from the contents of any two of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3 for detecting colorectal cancer, evaluated by area under the curve, sensitivity, specificity, accuracy, Student's t-test, and statistical power.

Figure 22 shows the performance of the group formed by combining the contents of all three of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3 for detecting colorectal cancer, evaluated by area under the curve, sensitivity, specificity, accuracy, Student's t-test, and statistical power.
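The evaluation metrics named in the figure descriptions above (sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve) can be computed as in the following sketch. The labels, scores, and the 0.5 decision threshold are invented for illustration and are not data from the invention; the function names are likewise assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, where 1 = cancer."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Invented example: four non-cancer (0) and four cancer (1) samples.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.3, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
acc = accuracy(tp, tn, fp, fn)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the curve summarizes performance over all thresholds.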

For a clearer understanding of the present invention, specific embodiments are described in detail below with reference to the accompanying drawings. In this specification, "a" or "an" means one kind, at least one kind, one, or at least one.

Referring to the sequence listing, the present invention provides a peptide composition for detecting colorectal cancer. The peptide composition is selected from the group consisting of any two or more of the amino acid sequences of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3, and the peptide composition is present in a blood sample of a test individual. The peptide fragment consisting of the amino acid sequence of SEQ ID NO: 1 corresponds to amino acids 54 to 62 of the protein platelet factor 4 (PF4). The peptide fragment consisting of the amino acid sequence of SEQ ID NO: 2 corresponds to amino acids 429 to 438 of the protein inter-alpha-trypsin inhibitor heavy chain 4 (ITIH4). The peptide fragment consisting of the amino acid sequence of SEQ ID NO: 3 corresponds to amino acids 198 to 207 of the protein apolipoprotein E (APOE).

In one embodiment, peptide fragments showing signal intensity differences in pooled plasma are first identified and screened. Next, the signal intensity differences of each screened peptide fragment are measured in individual plasma samples. A method for detecting colorectal cancer based on peptide fragment content is then established. Finally, machine learning methods are used to combine the contents of different peptide fragments and evaluate their performance in detecting colorectal cancer.

In the step of identifying and screening peptide fragments showing signal intensity differences in pooled plasma: the use of the plasma samples in the present invention was reviewed and approved by the human research ethics committee (Institutional Review Board) of Taipei Medical University (case numbers: N20130822 and N202007061). Plasma samples from 286 individuals with colorectal cancer and 120 individuals without colorectal cancer were obtained on application from the Taipei Medical University Joint Biobank.

In the present invention, colorectal cancer is staged according to the seventh edition of the TNM staging system issued in 2009 by the Union for International Cancer Control and the American Joint Committee on Cancer (AJCC), where T, N, and M denote depth of tumor invasion (T), number of lymph nodes involved (N), and presence of distant metastasis (M).

The basic grouping information of the plasma samples is shown in Figure 2. From the discovery set, 10 cases each of stage I, stage II, stage III, and stage IV colorectal cancer and 20 non-colorectal cancer cases were selected and matched by sex and age. Equal amounts of the plasma samples within each group were pooled to obtain a pooled plasma sample for each group, which was used to identify and screen peptide fragments whose signal intensity differs between non-colorectal cancer and each stage of colorectal cancer.

The pooled plasma sample of each group was first subjected to protein purification with agarose-bound wheat germ agglutinin (WGA, Vector Laboratories, Burlingame, CA, USA) to obtain a purified protein sample of the pooled plasma of each group. The purified protein samples were then subjected to in-solution digestion to break the proteins down into peptide fragments, which were analyzed by nanoscale liquid chromatography-tandem mass spectrometry (Orbitrap Elite hybrid mass spectrometer, Thermo Electron, Bremen, Germany). Next, the data collected by nanoscale liquid chromatography-tandem mass spectrometry were searched against the Universal Protein Knowledgebase (UniProt, 18 January 2020) using proteomics mass spectrometry analysis software (PEAKS 7 software, Bioinformatics Solutions, Waterloo, ON, Canada) to identify the protein to which each peptide fragment belongs and the post-translational modifications on the peptide fragments, and peptide fragments of different signal intensities were quantified by label-free quantification. Peptide fragments showing signal intensity differences were selected based on the PEAKS 7 analysis data. The screening results are shown in Figure 3. Three peptide fragments without post-translational modifications were finally selected: SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3.

For SEQ ID NO: 1, the retention time is 5.9 minutes; three ion pairs were selected, each with a precursor ion mass-to-charge ratio (m/z) of 520.3 and with product ion m/z values of 902.5, 789.4, and 251.1; the collision energy is 20.1 eV and the accelerating voltage is 130 V. For SEQ ID NO: 2, the retention time is 6.2 minutes; three ion pairs were selected, each with a precursor ion m/z of 500.2 and with product ion m/z values of 815.4, 702.3, and 587.3; the collision energy is 16.5 eV and the accelerating voltage is 130 V. For SEQ ID NO: 3, the retention time is 4.4 minutes; three ion pairs were selected, each with a precursor ion m/z of 484.7 and with product ion m/z values of 588.3, 489.2, and 360.1; the collision energy is 22 eV and the accelerating voltage is 130 V.
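The multiple reaction monitoring settings listed above can be kept in a small lookup structure, for example when setting up or checking an acquisition method. The following sketch transcribes the values given in the text; the dictionary layout, key names, and helper function are illustrative assumptions and not part of any instrument software.

```python
# Multiple reaction monitoring (MRM) settings transcribed from the text.
# The dictionary layout and key names are illustrative assumptions.
MRM_SETTINGS = {
    "SEQ ID NO: 1": {
        "retention_min": 5.9,
        "precursor_mz": 520.3,
        "product_mz": [902.5, 789.4, 251.1],
        "collision_energy_ev": 20.1,
        "accel_voltage_v": 130,
    },
    "SEQ ID NO: 2": {
        "retention_min": 6.2,
        "precursor_mz": 500.2,
        "product_mz": [815.4, 702.3, 587.3],
        "collision_energy_ev": 16.5,
        "accel_voltage_v": 130,
    },
    "SEQ ID NO: 3": {
        "retention_min": 4.4,
        "precursor_mz": 484.7,
        "product_mz": [588.3, 489.2, 360.1],
        "collision_energy_ev": 22.0,
        "accel_voltage_v": 130,
    },
}

def ion_pairs(settings):
    """Return (peptide, precursor m/z, product m/z) for every transition."""
    return [
        (name, entry["precursor_mz"], product)
        for name, entry in settings.items()
        for product in entry["product_mz"]
    ]

pairs = ion_pairs(MRM_SETTINGS)
```

Each peptide contributes three transitions that share one precursor ion, giving nine ion pairs in total.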

Wherein, an ion pair is a pairing of one precursor ion and one product ion, both expressed as mass-to-charge ratios (m/z), where the mass-to-charge ratio is the mass of a charged particle divided by the number of charges it carries.
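To make the m/z definition concrete, the following sketch converts between the neutral mass of a peptide and the m/z of its protonated ion. The charge state of the precursor ions is not stated in the text, so the doubly charged state used in the example is purely an assumption; the function names are illustrative.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons

def mz_from_mass(neutral_mass, charge):
    """m/z of a positive ion formed by adding `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def mass_from_mz(mz, charge):
    """Neutral mass implied by an observed m/z at a given charge state."""
    return charge * (mz - PROTON_MASS)

# Assumption for illustration only: the precursor at m/z 520.3 carries 2 charges.
implied_mass = mass_from_mz(520.3, 2)
round_trip = mz_from_mass(implied_mass, 2)
```

Under that assumption the precursor at m/z 520.3 would correspond to a neutral mass of about 1038.59 Da; a different charge state would imply a different mass.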

Wherein, wheat germ agglutinin is a lectin extracted from wheat that binds with high specificity to N-acetyl-D-glucosamine (GlcNAc) structures on glycoproteins. In this embodiment, the pooled plasma samples were purified with agarose-bound wheat germ agglutinin as follows. First, 100 μL of agarose-bound wheat germ agglutinin was added to a 600 μL microcentrifuge tube, followed by 20 μL of pooled plasma sample, and the contents were thoroughly mixed at room temperature on a rocking shaker. The microcentrifuge tube was then centrifuged, and the beads were washed three times with 400 μL of phosphate buffered saline (PBS) to remove proteins not bound to the agarose-bound wheat germ agglutinin. The beads were then washed twice with 50 μL of elution buffer (0.5 M N-acetylglucosamine dissolved in 1 mM acetic acid) to elute the bound proteins, which were collected in another microcentrifuge tube and quantified by the Bradford assay. From each sample, 20 μg of eluted protein was transferred to a new microcentrifuge tube and dried in a vacuum concentrator, yielding the purified protein sample of the pooled plasma purified with agarose-bound wheat germ agglutinin.

Wherein, the in-solution digestion was carried out as follows. First, 10 μg of the purified protein sample of the individual plasma of each group was redissolved in 20 μL of an aqueous solution containing 0.1% formic acid. Next, 1 μL of 550 mM dithiothreitol (DTT) was added and the mixture was reacted at 56°C for 45 minutes. Then, 2 μL of 450 mM iodoacetamide (IAM) was added and the mixture was reacted at room temperature in the dark for 45 minutes. Next, 0.5 μg of trypsin was added and the mixture was reacted at 37°C for 16 hours. Finally, an appropriate amount of 1% aqueous formic acid was added to bring the final formic acid concentration of the solution to 0.1%, terminating the enzymatic reaction.
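The working concentrations of the reducing and alkylating reagents in the steps above follow from simple dilution arithmetic. The sketch below assumes that volumes are additive and that the sample volume is exactly 20 μL before DTT is added; neither assumption is stated in the text, and the function name is illustrative.

```python
def diluted_concentration(stock_mm, added_ul, existing_ul):
    """Concentration (mM) after adding `added_ul` of stock to `existing_ul`,
    assuming the volumes are additive."""
    return stock_mm * added_ul / (existing_ul + added_ul)

# 1 uL of 550 mM DTT added to an assumed 20 uL of redissolved sample.
dtt_mm = diluted_concentration(550.0, 1.0, 20.0)

# 2 uL of 450 mM iodoacetamide added to the resulting 21 uL.
iam_mm = diluted_concentration(450.0, 2.0, 21.0)
```

Under these assumptions DTT is at about 26.2 mM during reduction and iodoacetamide at about 39.1 mM during alkylation, so the alkylating reagent is present in excess over the reducing reagent.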

Wherein, nanoscale liquid chromatography-tandem mass spectrometry was used to analyze the samples that had been digested into peptide fragments by protease. The peptide mixture was separated on a chromatography column (inner diameter: 75 μm; length: 25 cm; C18 BEH column; pore size: 130 Å; particle size: 1.7 μm; Waters) with mobile phase A (0.1% formic acid in water) and mobile phase B (0.1% formic acid in acetonitrile). The separation gradient increased mobile phase B from 5% to 35% over 60 minutes, with mobile phase A decreasing correspondingly from 95% to 65%, at a flow rate of 300 nL/min and a column temperature of 35°C. Full-scan MS spectra were acquired by nanoscale liquid chromatography-tandem mass spectrometry with the following settings: an m/z range of 350 to 1600 and an automatic gain control (AGC) target value of 10^6. The resolution of the mass spectrometer was set to 120K (specified at m/z 400). The 20 most intense ions were sequentially isolated in the linear ion trap and fragmented by collision-induced dissociation (CID) to obtain product ion spectra, with an AGC target value of 10^4 and a dynamic exclusion time of 60 seconds.
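The separation gradient described above is linear, so the mobile phase composition at any time point can be obtained by interpolation. The following sketch assumes a strictly linear ramp and holds the end values outside the 0 to 60 minute window; that clamping behavior is an assumption, since the text does not describe what precedes or follows the gradient.

```python
def percent_b(t_min, start=5.0, end=35.0, duration=60.0):
    """Percentage of mobile phase B at time t_min on a linear gradient."""
    t = min(max(t_min, 0.0), duration)  # hold end values outside the ramp
    return start + (end - start) * t / duration

def percent_a(t_min):
    """Mobile phase A makes up the remainder of the mixture."""
    return 100.0 - percent_b(t_min)

midpoint_b = percent_b(30.0)
final_a = percent_a(60.0)
```

Halfway through the run the mixture is 20% B and 80% A, and at 60 minutes it reaches the stated 35% B and 65% A.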

Wherein, the analysis with PEAKS 7 software (Bioinformatics Solutions, Waterloo, ON, Canada) used the following parameters: MS tolerance: 10 ppm; MS/MS tolerance: 0.6 Da; false detection rate: 1%. In addition, the protein quantification add-on of PEAKS 7 (PEAKS Q) was used to analyze the database search results, quantifying the peptide sequences of proteins with different signal intensities by total ion current (TIC) normalization and label-free quantification, with the following parameters: MS tolerance: 10 ppm; retention time shift tolerance: 60 minutes. The post-translational modification identification module of PEAKS 7 (PEAKS PTM) was set to identify post-translational modifications on the peptide sequences, including glycosylation and methylation, with the following parameters: carbamidomethylation (C)/+57.0215 Da was set as fixed, and oxidation (M)/+15.9949 Da, glycosylation-related post-translational modifications, and methylation-related post-translational modifications were set as variable. The glycosylation-related post-translational modifications included hexose modified CRKTW (+162.0528 Da), fucose modified TS (+146.0579 Da), O-GlcNAc modified STN (+203.195 Da), Hex1HexNAc1 modified N (+511.1901 Da), Hex1HexNAc1NeuAc1 modified NTS (+656.2276 Da), and Hex1HexNAc1NeuAc2 modified NTS (+947.3231 Da). The methylation-related post-translational modifications included methylation modified CDEHIKLNQRST (+14.0156 Da), dimethylation modified KNR (+28.0313 Da), K (+32.0564 Da), K (+34.0631 Da), and trimethylation modified KAR (+43.0058 Da).

Peptide fragments with differential signal intensity were then screened using the following parameters. The statistical significance score (-10lgP) had to be greater than 13, which is equivalent to a p-value less than 0.05. The signal intensity of an individual peptide fragment in the plasma samples of any stage of the colorectal cancer groups had to be greater than 1.5 times, or less than 0.8 times, its signal intensity in the non-colorectal cancer plasma samples. The signal-to-noise ratio (S/N) had to be greater than 5. Finally, mis-cleaved peptide fragments and peptide fragments longer than 10 amino acids were excluded.
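The screening criteria can be sketched as a simple filter. Because p = 10^(-score/10), a -10lgP score of about 13.01 corresponds to p = 0.05, so only scores above that value give p < 0.05. The function and argument names below are hypothetical and are not taken from PEAKS; the thresholds are the ones listed in this paragraph.

```python
def p_value_from_score(score):
    """Convert a -10*log10(P) significance score back to a p-value."""
    return 10 ** (-score / 10)


def passes_screen(score, crc_intensity, control_intensity,
                  signal_to_noise, sequence, missed_cleavages):
    """Apply the screening thresholds to one peptide fragment."""
    fold = crc_intensity / control_intensity
    return (
        p_value_from_score(score) < 0.05      # statistically significant
        and (fold > 1.5 or fold < 0.8)        # up- or down-regulated
        and signal_to_noise > 5
        and missed_cleavages == 0             # no mis-cleaved peptides
        and len(sequence) <= 10               # at most 10 amino acids
    )


# Hypothetical peptide: threefold higher in the colorectal cancer samples.
print(passes_screen(20.0, 300.0, 100.0, 8.0, "AAAAAAAAK", 0))
```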

Determining the signal intensity differences of each screened peptide fragment in individual plasma samples: In this step, the 6 peptide fragments showing signal intensity differences were verified in individual plasma samples, including 3 peptide fragments without post-translational modification: SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. The basic grouping data of the plasma samples are shown in Figure 2. From the discovery group, 10 cases each of stage I, stage II, stage III and stage IV colorectal cancer, and 20 non-colorectal cancer cases, were selected and matched by sex and age. Stage I and stage II colorectal cancer were classified as the early colorectal cancer group, and stage III and stage IV colorectal cancer as the late colorectal cancer group.

For the individual plasma samples of each group, proteins were first purified with wheat germ agglutinin-agarose beads to obtain a purified protein sample for each plasma sample. Each purified protein sample was then subjected to in-solution digestion to break the proteins down into peptide fragments and analyzed by ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS). The 3 peptide fragments in Figure 3 were analyzed by label-free quantification, and the groups were compared by one-way analysis of variance (one-way ANOVA).

The results are shown in Figures 4 to 6. Figure 4 shows that, for SEQ ID NO: 1, the signal intensity was 953.92±1257.19 in the non-colorectal cancer group, 2245.91±2046.9 in the early colorectal cancer group, and 2920.63±3134.69 in the late colorectal cancer group. Figure 5 shows that, for SEQ ID NO: 2, the signal intensity was 3144.23±3285 in the non-colorectal cancer group, 1536.31±1373.44 in the early colorectal cancer group, and 1427.26±1386.54 in the late colorectal cancer group. Figure 6 shows that, for SEQ ID NO: 3, the signal intensity was 3937±2082.73 in the non-colorectal cancer group, 2164.65±863.8 in the early colorectal cancer group, and 1550.17±1405.8 in the late colorectal cancer group.

Figure 4 shows that the signal intensity of SEQ ID NO: 1 in both the early and late colorectal cancer groups was higher than in the non-colorectal cancer group: 2.35 times higher in the early group and 3.06 times higher in the late group. Figure 5 shows that the signal intensity of SEQ ID NO: 2 in both the early and late colorectal cancer groups was lower than in the non-colorectal cancer group: 0.48 times that of the non-colorectal cancer group in the early group and 0.45 times in the late group. Figure 6 shows that the signal intensity of SEQ ID NO: 3 in both the early and late colorectal cancer groups was lower than in the non-colorectal cancer group: 0.54 times that of the non-colorectal cancer group in the early group and 0.39 times in the late group.

Establishing a method for diagnosing colorectal cancer based on peptide fragment content: The three peptide fragments with signal intensity differences identified in the present invention were analyzed by a stable isotope-labeled multiple reaction monitoring (MRM) assay. The three peptide fragments are SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3.

The basic grouping data of the plasma samples are shown in Figure 2. From the validation group, 47 cases of stage I, 53 cases of stage II, 50 cases of stage III and 56 cases of stage IV colorectal cancer, and 80 non-colorectal cancer cases, were selected and matched by sex and age. Stage I and stage II colorectal cancer were classified as the early colorectal cancer group, and stage III and stage IV colorectal cancer as the late colorectal cancer group. In addition, the combined results of stages I, II, III and IV were classified as the all-stage colorectal cancer group.

For the individual plasma samples of each group, proteins were first purified with wheat germ agglutinin-agarose beads to obtain a purified protein sample for each plasma sample. The extended peptide sequence was then added to each purified protein sample, followed by in-solution digestion to break the proteins down into peptide fragments, and the digest was analyzed by triple quadrupole mass spectrometry (1260 Infinity II Quaternary Pump LC system). SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 were quantified using internal standards, and the groups were compared by one-way analysis of variance (one-way ANOVA).

In the stable isotope-labeled MRM assay, the steps of adding the extended peptide sequence and then performing in-solution digestion were as follows. First, 10 μg of the purified protein sample from each individual plasma sample was redissolved in 20 μL of an aqueous solution containing 0.1% formic acid. Next, 1 μL of internal standard was added, followed by 1 μL of 550 mM dithiothreitol (DTT), and the mixture was reacted at 56°C for 45 minutes. Then 2 μL of 450 mM iodoacetamide (IAM) was added and reacted at room temperature in the dark for 45 minutes, after which 0.5 μg of trypsin was added and reacted at 37°C for 16 hours. Finally, an appropriate amount of 1% aqueous formic acid was added to bring the final formic acid concentration of the solution to 0.1%, terminating the enzymatic reaction.

The quantitative analysis using internal standards was performed as follows. Unlabeled extended peptide sequences at 1.95 ng/mL to 1000 ng/mL were analyzed, and the peak area of each was divided by the peak area of the isotope-labeled extended peptide sequence; the relationship between this unlabeled/labeled ratio and the standard concentration served as the calibration curve. Likewise, the peak areas of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 detected in the plasma samples were divided by the peak area of the isotope-labeled extended peptide sequence, and the resulting ratios were converted through the calibration curve into the concentrations of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 in the plasma samples.
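The internal-standard calibration can be sketched as a straight-line fit of the peak-area ratio against standard concentration, followed by inversion of that line for the plasma samples. This is only a minimal sketch: the description does not state the regression model, so an unweighted linear fit is assumed, the two-fold dilution series is an assumption that merely spans the stated 1.95 to 1000 ng/mL range, and the ratio values are synthetic.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_ratio(ratio, slope, intercept):
    """Read a concentration off the calibration line."""
    return (ratio - intercept) / slope


# Assumed two-fold dilution series from about 1.95 to 1000 ng/mL.
standards = [1000 / 2 ** k for k in range(9, -1, -1)]
# Synthetic unlabeled/labeled peak-area ratios (illustrative only).
ratios = [0.002 * c + 0.001 for c in standards]

slope, intercept = fit_line(standards, ratios)
print(concentration_from_ratio(0.051, slope, intercept))
```

Because the same amount of labeled peptide is added to every standard and sample, dividing by its peak area cancels run-to-run variation in injection and ionization, which is why the ratio rather than the raw peak area is calibrated.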

In the triple quadrupole mass spectrometry analysis, the peptide mixture was separated on a chromatography column (C18 column, Phenomenex, Kinetex; packing particle size: 2.6 μm; pore size: 100 Å; length: 50 cm; inner diameter: 2.1 mm). Mobile phase A was an aqueous solution containing 0.1% formic acid, and mobile phase B was an acetonitrile solution containing 0.1% formic acid. The total chromatography time was 12 minutes: 0-1 minute, 95% mobile phase A and 5% mobile phase B; 1-8 minutes, 45% mobile phase A and 55% mobile phase B; 8-9 minutes, 100% mobile phase B; followed by a post-run time of 3 minutes. The injection volume was 5 μL per sample, and the mobile phase flow rate was 0.4 mL/min.

In the stable isotope-labeled MRM assay, the amino acid sequences of peptide fragments SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 served as the base sequences. Each base sequence was extended according to the amino acid sequence of its parent protein by restoring 4 amino acids at both the head and the tail of the base sequence, so as to simulate enzymatic cleavage; the result is hereinafter referred to as the extended peptide fragment. The extended peptide sequences synthesized from the screened SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 include unlabeled extended peptide sequences and stable isotope-labeled extended peptide sequences. The extended peptide sequences are shown in Figure 7 and are hereinafter referred to as internal standards. Standards at concentrations of 1.95 ng/mL to 1000 ng/mL were prepared from the unlabeled extended peptide sequences for the calibration curve, and an equal amount of the stable isotope-labeled extended peptide sequence was then added to the standards and to the plasma samples for subsequent analysis.
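The construction of an extended peptide fragment can be sketched as a substring lookup in the parent protein. The protein and peptide strings below are made-up placeholders, not the sequences of SEQ ID NO: 1 to 3 or their parent proteins, and the handling of peptides closer than 4 residues to a terminus is an assumption.

```python
def extend_peptide(protein, peptide, flank=4):
    """Return the peptide plus up to `flank` parent-protein residues per side."""
    start = protein.find(peptide)
    if start < 0:
        raise ValueError("peptide not found in the protein sequence")
    end = start + len(peptide)
    # Clamp at the termini so fewer residues are added when none remain.
    return protein[max(0, start - flank):min(len(protein), end + flank)]


# Hypothetical parent protein and tryptic peptide (placeholders only).
parent = "MKTAYIAKQRQISFVKSHFSRQ"
print(extend_peptide(parent, "QISFVK"))
```

The flanking residues give trypsin real cleavage sites to cut, so the standard passes through the same digestion step as the endogenous protein.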

The isotope labeling was performed with a carbon isotope (¹³C) and a nitrogen isotope (¹⁵N). The results are shown in Figures 8 to 16. Figures 8, 9 and 10 show the retention times and signal intensities of the three ion pairs of the peptide fragments SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, respectively.

Figures 11, 12 and 13 show the calibration curves of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, respectively. The lower limit of quantification (LLOQ) and the upper limit of quantification (ULOQ) were determined in chicken serum: the quantification range was 3.90-1000 ng/mL for SEQ ID NO: 1 (Figure 11), 1.95-250 ng/mL for SEQ ID NO: 2 (Figure 12), and 1.95-250 ng/mL for SEQ ID NO: 3 (Figure 13). Figures 14, 15 and 16 are dot plots of the concentration of each peptide in the individual samples of each group. For SEQ ID NO: 1, the concentration was 9.59±27.14 ng/mL in the non-colorectal cancer group, 25.41±33.36 ng/mL in the early colorectal cancer group, 27.81±11.55 ng/mL in the late colorectal cancer group, and 26.61±33.82 ng/mL in the all-stage colorectal cancer group.

For SEQ ID NO: 2, the concentration was 3.05±3.44 ng/mL in the non-colorectal cancer group, 2.07±0.27 ng/mL in the early colorectal cancer group, 2.10±1.56 ng/mL in the late colorectal cancer group, and 2.08±0.39 ng/mL in the all-stage colorectal cancer group. For SEQ ID NO: 3, the concentration was 24.79±13.58 ng/mL in the non-colorectal cancer group, 14.37±4.23 ng/mL in the early colorectal cancer group, 16.85±12.87 ng/mL in the late colorectal cancer group, and 15.61±3.75 ng/mL in the all-stage colorectal cancer group.

The quantitative results are shown in Figures 14, 15 and 16. Figure 14 shows the quantitative results of SEQ ID NO: 1 in each group: the concentration was 9.59±27.14 ng/mL in the non-colorectal cancer group, 25.41±33.36 ng/mL in the early colorectal cancer group, 27.81±11.55 ng/mL in the late colorectal cancer group, and 26.61±33.82 ng/mL in the all-stage colorectal cancer group; the content of SEQ ID NO: 1 in the early colorectal cancer group was 2.65 times that of the non-colorectal cancer group, and its content in the late colorectal cancer group was 2.90 times that of the non-colorectal cancer group. Figure 15 shows the quantitative results of SEQ ID NO: 2 in each group: the concentration was 3.05±3.44 ng/mL in the non-colorectal cancer group, 2.07±0.27 ng/mL in the early colorectal cancer group, 2.10±1.56 ng/mL in the late colorectal cancer group, and 2.08±0.39 ng/mL in the all-stage colorectal cancer group; the content of SEQ ID NO: 2 in the early colorectal cancer group was 0.68 times that of the non-colorectal cancer group, and its content in the late colorectal cancer group was 0.69 times that of the non-colorectal cancer group.

Figure 16 shows the quantitative results of SEQ ID NO: 3 in each group: the concentration was 24.79±13.58 ng/mL in the non-colorectal cancer group, 14.37±4.23 ng/mL in the early colorectal cancer group, 16.85±12.87 ng/mL in the late colorectal cancer group, and 15.61±3.75 ng/mL in the all-stage colorectal cancer group; the content of SEQ ID NO: 3 in the early colorectal cancer group was 0.58 times that of the non-colorectal cancer group, and its content in the late colorectal cancer group was 0.68 times that of the non-colorectal cancer group.
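The fold changes follow directly from the reported group means. The short check below recomputes them from the mean concentrations given for Figures 14 to 16; the dictionary layout and function name are illustrative. It gives about 2.65 and 2.90 for SEQ ID NO: 1 (higher in colorectal cancer), 0.68 and 0.69 for SEQ ID NO: 2, and 0.58 and 0.68 for SEQ ID NO: 3 (both lower in colorectal cancer).

```python
# Group mean concentrations (ng/mL) as reported for Figures 14 to 16.
MEANS = {
    "SEQ ID NO: 1": {"non_crc": 9.59, "early": 25.41, "late": 27.81},
    "SEQ ID NO: 2": {"non_crc": 3.05, "early": 2.07, "late": 2.10},
    "SEQ ID NO: 3": {"non_crc": 24.79, "early": 14.37, "late": 16.85},
}


def fold_changes(means):
    """Return (early, late) ratios to the non-colorectal cancer mean."""
    return {
        peptide: (
            round(groups["early"] / groups["non_crc"], 2),
            round(groups["late"] / groups["non_crc"], 2),
        )
        for peptide, groups in means.items()
    }


for peptide, (early, late) in fold_changes(MEANS).items():
    print(f"{peptide}: early {early:.2f}x, late {late:.2f}x")
```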

Evaluating the performance of combining different peptide fragment contents by machine learning for detecting colorectal cancer: In this step, GraphPad Prism (vers. 5.0; GraphPad Software, San Diego, CA, USA) was used to plot receiver operating characteristic (ROC) curves and calculate the area under the curve (AUC). The cutoff value for judging diagnostic performance was chosen according to the Youden index, and the sensitivity, specificity, accuracy, p-value and statistical power were reported, where the p-value was calculated with Student's t-test to determine whether the groups differed significantly.
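The ROC quantities named above can be sketched in a few lines. This is a generic illustration, not the GraphPad Prism procedure: the AUC is computed as the probability that a cancer sample scores higher than a non-cancer sample, and the cutoff is the threshold maximizing the Youden index J = sensitivity + specificity - 1. The scores and labels are made up, and for a peptide that decreases in cancer the scores would be negated first.

```python
def roc_points(scores, labels):
    """(threshold, sensitivity, specificity) for each candidate cutoff.

    A sample is called positive when its score is >= the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < thr)
        points.append((thr, tp / pos, tn / neg))
    return points


def auc(scores, labels):
    """Probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Cutoff with the largest Youden index."""
    return max(roc_points(scores, labels), key=lambda pt: pt[1] + pt[2] - 1)


# Made-up concentrations; label 1 = colorectal cancer, 0 = non-cancer.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
print(auc(scores, labels), youden_cutoff(scores, labels))
```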

In addition, to evaluate the performance of any one of the 3 individual peptides, or of a group of any two or more of them, in diagnosing colorectal cancer, the results quantified with internal standards were integrated by machine learning methods, including decision tree (DT), random forest (RF), support vector machine (SVM) and multivariate logistic regression (LR). Pairwise comparisons of any 2 different ROC curves were performed with biomedical statistical software (MedCalc Statistical Software, vers. 15.4; MedCalc Software, Ostend, Belgium).

The machine learning methods were run with scikit-learn (vers. 0.21.3) using 10-fold cross validation: the data were split into ten equal parts, nine-tenths of the data were used to train the machine learning model, and the remaining one-tenth was used to validate the training result. Parameters were tuned according to the AUC values obtained on the training set and the validation set in each round of 10-fold cross validation. The final parameters of each machine learning model were as follows. Decision tree: tree depth 10, with the model criterion set to Gini. Random forest: number of trees 200, with the model criterion set to Gini. Support vector machine: gamma 1×10⁻¹⁰, C 1×10⁻⁷, with the kernel set to the radial basis function (RBF). Multivariate logistic regression used the default settings. Finally, the performance of each machine learning model was evaluated by AUC, sensitivity, specificity, accuracy, Student's t-test and statistical power; the results are shown in Figures 17 to 22.
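The 10-fold split can be sketched without any library. The generator below shuffles the sample indices, deals them into ten folds, and yields each fold once as the validation set while the other nine form the training set. The sample count of 286 is the sum of the validation cohort sizes given earlier (206 colorectal cancer and 80 non-colorectal cancer cases); the seed, the unstratified shuffle, and the mapping of the stated settings onto scikit-learn argument names (`max_depth`, `criterion`, `n_estimators`, `gamma`, `C`, `kernel`) are assumptions, since the description does not give the actual code.

```python
import random

# Assumed mapping of the stated settings onto scikit-learn arguments.
MODEL_SETTINGS = {
    "decision_tree": {"max_depth": 10, "criterion": "gini"},
    "random_forest": {"n_estimators": 200, "criterion": "gini"},
    "svm": {"gamma": 1e-10, "C": 1e-7, "kernel": "rbf"},
    "logistic_regression": {},  # library defaults
}


def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, validation_indices) for each fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        validation = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, validation


for train, validation in ten_fold_indices(286):
    print(len(train), len(validation))
```

Each sample is used for validation exactly once, so the ten validation AUC values together cover the whole cohort.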

Figure 17 shows the performance of the peptide composition, evaluated by multivariate logistic regression and random forest, in distinguishing the non-colorectal cancer group from the early colorectal cancer group. The AUC of multivariate logistic regression was 0.67 using the content of SEQ ID NO: 1, 0.80 using the content of SEQ ID NO: 2, and 0.82 using the content of SEQ ID NO: 3. Combining SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, the AUC was 0.90 for multivariate logistic regression and 0.88 for random forest.

Figure 18 shows the performance of the peptide composition, evaluated by multivariate logistic regression and random forest, in distinguishing the non-colorectal cancer group from the late colorectal cancer group. The AUC of multivariate logistic regression was 0.63 using the content of SEQ ID NO: 1, 0.72 using the content of SEQ ID NO: 2, and 0.70 using the content of SEQ ID NO: 3. Combining SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, the AUC was 0.85 for multivariate logistic regression and 0.88 for random forest. The results are presented as ROC curves, with the AUC calculated for each.

Figure 19 shows the performance of the peptide composition, evaluated by multivariate logistic regression and random forest, in distinguishing the non-colorectal cancer group from the all-stage colorectal cancer group. The AUC was 0.66 using the content of SEQ ID NO: 1, 0.77 using the content of SEQ ID NO: 2, and 0.79 using the content of SEQ ID NO: 3. Combining SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3, the AUC was 0.88 for multivariate logistic regression and 0.96 for random forest.

Figure 20 shows the performance of the individual contents of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 in detecting colorectal cancer. For SEQ ID NO: 1, the AUC was 0.67 (0.58-0.74), the sensitivity 75.7% (67.1%-87.5%), the specificity 56.5% (43.3%-69.0%), and the accuracy 63.2% (55.9%-80.2%). For SEQ ID NO: 2, the AUC was 0.80 (0.77-0.83), the sensitivity 64.3% (51.9%-75.4%), the specificity 87.1% (76.1%-95.3%), and the accuracy 76.3% (72.7%-82.3%). For SEQ ID NO: 3, the AUC was 0.82 (0.74-0.88), the sensitivity 78.6% (67.1%-87.5%), the specificity 74.2% (61.5%-84.5%), and the accuracy 77.5% (73.3%-83.9%).

Figure 21 shows the performance in detecting colorectal cancer of groups formed from any two of the contents of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. Using the contents of SEQ ID NO: 1 and SEQ ID NO: 2 with decision tree, random forest, support vector machine and multivariate logistic regression, the AUC ranged from 0.76 to 0.85, the sensitivity from 66.8% to 83.2%, the specificity from 62.3% to 84.2%, and the accuracy from 72.9% to 78.7%. Using the contents of SEQ ID NO: 1 and SEQ ID NO: 3 with the same four methods, the AUC ranged from 0.76 to 0.86, the sensitivity from 79.7% to 86.1%, the specificity from 66.9% to 82.0%, and the accuracy from 74.5% to 80.2%. Using the contents of SEQ ID NO: 2 and SEQ ID NO: 3 with the same four methods, the AUC ranged from 0.77 to 0.87, the sensitivity from 72.9% to 85.9%, the specificity from 75.0% to 85.6%, and the accuracy from 75.6% to 82.5%.

Figure 22 shows the performance in detecting colorectal cancer of the group formed by combining the contents of all three of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. With decision tree, random forest, support vector machine and multivariate logistic regression, the AUC ranged from 0.78 to 0.97, the sensitivity from 73.9% to 89.2%, the specificity from 77.8% to 97.9%, and the accuracy from 76.9% to 92.3%.

The above experimental results show that a peptide composition consisting of any two or more of SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 detects colorectal cancer better than a single peptide: specificity is comparable, while sensitivity and accuracy are better in most combinations.

In summary, the peptide composition proposed in the present invention can provide detection with high sensitivity and high specificity, meaning that the method of the present invention has practical value for clinical detection.

惟以上所述者,僅為本發明的較佳實施例,但不能以此限定本發明的實施範圍;故,凡依本發明申請專利範圍及發明說明書內容所作之簡單的等效改變與修飾,皆仍屬本發明專利涵蓋的範圍內。 However, the above is only a preferred embodiment of the present invention, but it cannot limit the scope of implementation of the present invention; therefore, all simple equivalent changes and modifications made according to the scope of the patent application of the present invention and the content of the invention specification are still within the scope of the patent of the present invention.

<110> Taipei Medical University

<120> Uses and methods of compositions for the detection of colorectal cancer

<160> 3

<210> 1

<211> 9

<212> PRT

<213> Homo sapiens

<400> Sequence (three letters represent one amino acid)

Figure 110136054-A0305-02-0023-1

<210> 2

<211> 10

<212> PRT

<213> Homo sapiens

<400> Sequence (three letters represent one amino acid)

Figure 110136054-A0305-02-0023-2

<210> 3

<211> 9

<212> PRT

<213> Homo sapiens

<400> Sequence (three letters represent one amino acid)

Figure 110136054-A0305-02-0024-3

Claims (8)

1. A use of a peptide composition for detecting colorectal cancer, for preparing a composition for detecting colorectal cancer, wherein the peptide composition consists of three peptide fragments whose amino acid sequences are shown in SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3, respectively; wherein the peptide composition is obtained by purifying a blood sample with wheat germ agglutinin to obtain a vacuum-dried purified protein sample, redissolving the purified protein sample in an aqueous formic acid solution, and performing a trypsin digestion; wherein none of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3 bears a post-translational modification; wherein whether a test individual suffers from colorectal cancer is determined as follows: when the concentration of SEQ ID NO: 1 obtained from a comparison blood sample after the purification method is lower than the concentration of SEQ ID NO: 1 obtained from the test blood sample after the purification method, the concentration of SEQ ID NO: 2 obtained from the comparison blood sample after the purification method is higher than the concentration of SEQ ID NO: 2 obtained from the test blood sample after the purification method, and the concentration of SEQ ID NO: 3 obtained from the comparison blood sample after the purification method is higher than the concentration of SEQ ID NO: 3 obtained from the test blood sample after the purification method, the test individual is determined to possibly suffer from colorectal cancer; wherein the comparison blood sample is derived from an individual without colorectal cancer.

2. The use of the peptide composition for detecting colorectal cancer as described in claim 1, wherein the content of the peptide composition is detected by a mass spectrometry analysis or an immunoassay.

3. The use of the peptide composition for detecting colorectal cancer as described in claim 1, wherein the blood sample is a serum sample, a plasma sample, or a whole blood sample.

4. A method for detecting colorectal cancer, comprising: (a) detecting, by a detection method, the content of a peptide composition obtained from a test blood sample of a test individual after a purification method, wherein the peptide composition consists of three peptide fragments whose amino acid sequences are shown in SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3, respectively; (b) comparing the content of the peptide composition measured in (a) with the content of the peptide composition obtained from a comparison blood sample of an individual without colorectal cancer after the purification method, to determine whether the test individual suffers from colorectal cancer; wherein none of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3 bears a post-translational modification; wherein whether the test individual suffers from colorectal cancer is determined as follows: when the concentration of SEQ ID NO: 1 obtained from the comparison blood sample after the purification method is lower than the concentration of SEQ ID NO: 1 obtained from the test blood sample after the purification method, the concentration of SEQ ID NO: 2 obtained from the comparison blood sample after the purification method is higher than the concentration of SEQ ID NO: 2 obtained from the test blood sample after the purification method, and the concentration of SEQ ID NO: 3 obtained from the comparison blood sample after the purification method is higher than the concentration of SEQ ID NO: 3 obtained from the test blood sample after the purification method, the test individual is determined to possibly suffer from colorectal cancer; wherein the purification method is a wheat germ agglutinin purification method.

5. The method for detecting colorectal cancer as described in claim 4, wherein the comparison is performed by a method selected from the group consisting of multivariate logistic regression, decision tree, random forest, and support vector machine.

6. The method for detecting colorectal cancer as described in claim 4, wherein the detection method is selected from a mass spectrometry analysis or an immunoassay.

7. The method for detecting colorectal cancer as described in claim 4, wherein the test blood sample and the comparison blood sample are both serum samples, or are both plasma samples, or are both whole blood samples.

8. The method for detecting colorectal cancer as described in claim 6, wherein the mass spectrometry analysis is selected from the group consisting of matrix-assisted laser desorption/ionization mass spectrometry, liquid chromatography-mass spectrometry, liquid chromatography-tandem mass spectrometry, and gas chromatography-mass spectrometry.
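The three-way concentration comparison recited in the independent claims can be sketched as a simple rule, shown below for illustration only. The function name `may_have_colorectal_cancer`, the dictionary layout, and the example concentrations are assumptions introduced here; the claims specify only the direction of each comparison, not any implementation or units.

```python
from typing import Mapping

# Direction in which the test sample must differ from the comparison
# (non-colorectal-cancer) sample, as recited in the claims.
EXPECTED_DIRECTION = {
    "SEQ ID NO: 1": "higher",  # comparison concentration < test concentration
    "SEQ ID NO: 2": "lower",   # comparison concentration > test concentration
    "SEQ ID NO: 3": "lower",   # comparison concentration > test concentration
}


def may_have_colorectal_cancer(
    test: Mapping[str, float], comparison: Mapping[str, float]
) -> bool:
    """Return True only if all three peptides differ in the claimed direction."""
    for peptide, direction in EXPECTED_DIRECTION.items():
        test_conc = test[peptide]
        comparison_conc = comparison[peptide]
        if direction == "higher" and not test_conc > comparison_conc:
            return False
        if direction == "lower" and not test_conc < comparison_conc:
            return False
    return True


# Hypothetical concentrations in arbitrary units (not data from the patent).
comparison_sample = {"SEQ ID NO: 1": 1.0, "SEQ ID NO: 2": 5.0, "SEQ ID NO: 3": 4.0}
test_sample = {"SEQ ID NO: 1": 2.5, "SEQ ID NO: 2": 3.0, "SEQ ID NO: 3": 1.5}
flagged = may_have_colorectal_cancer(test_sample, comparison_sample)
```

The rule requires all three conditions to hold at once, so a sample that differs in the claimed direction for only one or two peptides is not flagged by this sketch.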
TW110136054A 2021-09-28 2021-09-28 Uses and methods of compositions for the detection of colorectal cancer TWI838648B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110136054A TWI838648B (en) 2021-09-28 2021-09-28 Uses and methods of compositions for the detection of colorectal cancer

Publications (2)

Publication Number Publication Date
TW202314245A TW202314245A (en) 2023-04-01
TWI838648B true TWI838648B (en) 2024-04-11

Family

ID=86943365

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110136054A TWI838648B (en) 2021-09-28 2021-09-28 Uses and methods of compositions for the detection of colorectal cancer

Country Status (1)

Country Link
TW (1) TWI838648B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7575876B2 (en) * 2005-10-27 2009-08-18 The University Of Washington Biomarkers for neurodegenerative disorders
US8586006B2 (en) * 2006-08-09 2013-11-19 Institute For Systems Biology Organ-specific proteins and methods of their use
TWI518326B (en) * 2013-11-29 2016-01-21 國立陽明大學 Biomarker, antibody, kit and detecting method for detecting colorectal cancer

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Journal: Chunxia Wang, et al., "Wheat germ agglutinin-conjugated PLGA nanoparticles for enhanced intracellular delivery of paclitaxel to colon cancer cells", Int J Pharm., Volume 400, Issues 1–2, 15 November 2010, pages 201-210 *
Journal: I-Jung Tsai, et al., "Clinical Assay for the Early Detection of Colorectal Cancer Using Mass Spectrometric Wheat Germ Agglutinin Multiple Reaction Monitoring", Cancers, Volume 13, Issue 9, 02 May 2021, https://doi.org/10.3390/cancers13092190 *
