TW202022890A - Computer-aided recognition system, its method and its computer program product thereof

Publication number
TW202022890A
TW202022890A TW107144007A TW107144007A TW202022890A TW 202022890 A TW202022890 A TW 202022890A TW 107144007 A TW107144007 A TW 107144007A TW 107144007 A TW107144007 A TW 107144007A TW 202022890 A TW202022890 A TW 202022890A
Authority
TW
Taiwan
Prior art keywords
dnn
neuron
computer
hidden
basic
Prior art date
Application number
TW107144007A
Other languages
Chinese (zh)
Other versions
TWI681407B (en)
Inventor
謝孟軒
謝孟儒
高嘉鴻
Original Assignee
謝孟軒
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 謝孟軒
Priority to TW107144007A
Application granted
Publication of TWI681407B
Publication of TW202022890A

Abstract

A computer-aided prediction system is provided for predicting the likelihood that a patient with diabetes mellitus will develop CRC. The system includes a deep neural network (DNN) model that analyzes a plurality of baseline characteristics of the patient through a deep neural path. The DNN model includes a plurality of neurons and an output layer. At least some of the neurons correspond to the baseline characteristics. The output layer outputs an output result related to the likelihood of developing CRC.

Description

Computer-aided prediction system, method and computer program product

The invention belongs to the technical field of computer-aided prediction, and in particular to computer-aided prediction of the likelihood of developing rectal cancer.

Diabetes is one of the most common diseases, and the number of patients worldwide is increasing year by year. For diabetic patients, good medical care improves survival. However, recent studies have found that diabetic patients have a higher risk of rectal cancer than the general population. Once the cancer develops, it affects the quality of medical care for the diabetic patient and significantly reduces the survival rate. If the cancer risk of a diabetic patient can be predicted early, appropriate medical care can be given, which in turn can improve the patient's survival rate. Some existing techniques can predict complications in diabetic patients, such as the Adapted Diabetes Complication Severity Index (aDCSI), but their accuracy still falls short of expectations. There is therefore an urgent need for a technique that can accurately predict the likelihood of a diabetic patient developing rectal cancer.

The present invention proposes a computer-aided prediction technique based on a deep neural network. A basic model of the deep neural network is trained on pathological factor data from a large number of diabetic patients, both with and without concurrent rectal cancer. Once training is complete, the deep neural network can accurately predict the likelihood of a diabetic patient developing rectal cancer.

According to one aspect of the present invention, a computer-aided prediction system is provided for predicting the likelihood of a diabetic patient developing rectal cancer. The system includes a deep neural network (DNN) model that performs a feature analysis on a plurality of pathological factor data of the diabetic patient through a deep neuron path. The DNN model includes a plurality of neurons and an output layer. At least some of the neurons correspond to the pathological factor data. The output layer outputs, according to the feature analysis, an output result related to the likelihood of developing the cancer. The DNN model determines the weight value corresponding to each neuron through a plurality of training runs, and thereby establishes the deep neuron path.

According to another aspect of the present invention, a computer-aided prediction method is provided for predicting the likelihood of a diabetic patient developing rectal cancer. The method is executed by a computer-aided prediction system, wherein the computer-aided prediction system includes a DNN model having a plurality of neurons and an output layer. The method includes the steps of: obtaining a plurality of pathological factor data of the diabetic patient; performing, by the DNN model, a feature analysis on the pathological factor data through a deep neuron path; and outputting, by the output layer and according to the feature analysis, an output result related to the likelihood of cancer. The DNN model determines the weight value corresponding to each neuron through a plurality of training runs, and thereby establishes the deep neuron path.

According to yet another aspect of the present invention, a computer program product is provided, which is stored in a non-transitory computer-readable medium and causes a computer-aided prediction system to operate, wherein the computer-aided prediction system is used to predict the likelihood of a diabetic patient developing rectal cancer and includes a DNN model having a plurality of neurons and an output layer. The computer program product includes: an instruction for obtaining a plurality of pathological factor data of the diabetic patient; an instruction for causing the DNN model to perform a feature analysis on the pathological factor data through a deep neuron path; and an instruction for causing the output layer to output, according to the feature analysis, an output result related to the likelihood of cancer. The DNN model determines the weight value corresponding to each neuron through a plurality of training runs, and thereby establishes the deep neuron path.

The following description provides multiple embodiments of the invention. It should be understood that these embodiments are not limiting. The features of each embodiment of the present invention can be modified, substituted, combined, separated, and designed for application to other embodiments.

FIG. 1(A) is a system architecture diagram of a computer-aided prediction system 1 according to an embodiment of the present invention. As shown in FIG. 1(A), the computer-aided prediction system 1 includes a deep neural network model 10 (hereinafter, DNN model 10) for predicting the probability of a diabetic patient developing rectal cancer. In one embodiment, the computer-aided prediction system 1 may further include a data acquisition interface 20. The data acquisition interface 20 acquires data from outside the system; that is, a user (such as a physician) can enter a patient's data into the computer-aided prediction system 1 through the data acquisition interface 20. In addition, in one embodiment, the computer-aided prediction system 1 may further include a test module 40 for testing the predictive ability of the DNN model 10.

FIG. 1(B) is an architecture diagram of the DNN model 10 according to an embodiment of the present invention; please also refer to FIG. 1(A). The DNN model 10 includes an input layer 12, a plurality of hidden neural layers 14 and an output layer 16. The input layer 12 has a plurality of basic neurons 13, and each basic neuron 13 corresponds to one kind of pathological factor data and one weight value. The hidden neural layers 14 each have a plurality of hidden neurons 15, wherein the hidden neurons 15 are connected to the basic neurons 13 and each also corresponds to a weight value. The output layer 16 includes two output neurons 17 for generating two output results, which correspond respectively to the diabetic patient's probability of developing cancer and probability of not developing cancer. In one embodiment, the basic neurons 13, hidden neurons 15 and output neurons 17 form a deep neuron path 18. The DNN model 10 performs a feature analysis on a plurality of pathological factor data of the diabetic patient through the deep neuron path 18, and the output layer 16 outputs the output results (cancer probability and non-cancer probability) according to the feature analysis.

In more detail, when the DNN model 10 obtains the plurality of pathological factor data of the diabetic patient, the pathological factors are fed into the deep neuron path 18, and each basic neuron 13, hidden neuron 15 and output neuron 17 on the deep neuron path 18 performs the feature analysis on them. In one embodiment, "feature analysis" can be regarded as the operation that each neuron performs on the pathological factors, and the "operation" may include a summation operation, an activation operation, a weighting operation, or a combination of at least two of these, without being limited thereto. In this way, the DNN model 10 of the present invention can accurately predict the likelihood of cancer in the diabetic patient. The details of each element are explained next.

The computer-aided prediction system 1 may be a data processing device, implemented by any device having a microprocessor, such as a desktop computer, a notebook computer, a smart mobile device, a server, a cloud host, or a similar device. In one embodiment, the computer-aided prediction system 1 may have a network communication function for transmitting data over a network, which may be wired or wireless, so the computer-aided prediction system 1 can also obtain data over the network. In one embodiment, the functions of the computer-aided prediction system 1 are realized by a microprocessor executing a computer program product 30, wherein the computer program product 30 has a plurality of instructions that cause the processor to perform specific operations and thereby implement functions such as the DNN model 10 or the test module 40. In one embodiment, the computer program product 30 may be stored in a non-transitory computer-readable medium (such as a memory), but is not limited thereto. In one embodiment, the computer program product 30 may also be stored in advance on a network server for users to download.

In one embodiment, the data acquisition interface 20 may be a physical port for acquiring external data. For example, when the computer-aided prediction system 1 is implemented by a computer, the data acquisition interface 20 may be a USB interface on the computer, a connector for any of various transmission cables, and so on, without limitation. In addition, the data acquisition interface 20 may be integrated with a wireless communication chip, so that data can be received wirelessly.

The DNN model 10 of the present invention is an artificial intelligence model for data analysis. It uses a plurality of computing nodes as the neurons of a neural network, and the operation of each neuron can be regarded as a feature analysis of the pathological factors. After training on a large amount of data, the DNN model 10 establishes the weight value corresponding to each neuron. In one embodiment, before training, a basic model of the DNN model 10 (that is, the untrained basic architecture) is established in advance, for example by presetting the number of neurons, the number of hidden neural layers 14, and the connections between neurons. The system 1 then uses instructions in the computer program product 30 to train the untrained DNN model 10, determining the weight value of each basic neuron 13 and hidden neuron 15 and thereby establishing the deep neuron path 18. In one embodiment, the basic model may undergo multiple training runs that produce multiple neuron paths, and the accuracy of each neuron path may be tested by the test module 40. Note that, to distinguish the DNN model 10 before and after training, the untrained DNN model 10 is referred to below as the basic model, and the model after training is referred to as the DNN model 10.

FIG. 2 is a schematic diagram of the detailed architecture of the (trained) DNN model 10 according to an embodiment of the present invention; please also refer to FIGS. 1(A) and 1(B). To predict accurately the likelihood of a diabetic patient developing rectal cancer, the number of hidden neural layers 14, the number of basic neurons 13, and the number of hidden neurons 15 of the DNN model 10 (or basic model) can all be treated as variable parameters. In the embodiment of FIG. 2, the input layer 12 has 37 basic neurons 13; that is, the DNN model 10 bases its feature analysis on 37 pathological factors. In addition, the DNN model 10 has three hidden neural layers 14, each containing 30 hidden neurons 15. As shown in FIG. 2, the input layer 12 is connected to the first hidden neural layer 141, the first hidden neural layer 141 is connected to the second hidden neural layer 142, the second hidden neural layer 142 is connected to the third hidden neural layer 143, and the third hidden neural layer 143 is connected to the output layer 16. Therefore, when a patient's 37 pathological factor data are input to the DNN model 10, they are first analyzed in the input layer 12, then pass in sequence through the hidden neural layers 141 to 143 for analysis, after which the output layer 16 produces the output results according to the analysis.

The "analysis" above refers to the "operation" that each neuron performs on the data it receives. In one embodiment, each time data passes through a neuron, one operation is considered to have been executed.

In one embodiment, each basic neuron 13 in the input layer 12 is connected to every hidden neuron 15 in the first hidden neural layer 141; that is, the operation result of each basic neuron 13 is sent to every hidden neuron 15 of the first hidden neural layer 141. Each hidden neuron 15 in the first hidden neural layer 141 is connected to every hidden neuron 15 in the second hidden neural layer 142, so the operation result of each hidden neuron 15 of the first hidden neural layer 141 is sent to every hidden neuron 15 of the second hidden neural layer 142. Likewise, each hidden neuron 15 in the second hidden neural layer 142 is connected to every hidden neuron 15 in the third hidden neural layer 143, and its operation result is sent to every hidden neuron 15 of the third hidden neural layer 143. Each hidden neuron 15 in the third hidden neural layer 143 is connected to every output neuron 17 in the output layer 16, and its operation result is sent to every output neuron 17. After the operations of the output neurons 17, the output layer 16 produces the patient's cancer probability and non-cancer probability.
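As a worked count for the FIG. 2 embodiment (37 basic neurons, three hidden layers of 30 hidden neurons each, and two output neurons), full connectivity between adjacent layers implies the following number of neuron-to-neuron links. This is simple arithmetic on the stated layer sizes, not a figure given in the source.

$$
37 \times 30 + 30 \times 30 + 30 \times 30 + 30 \times 2 = 1110 + 900 + 900 + 60 = 2970
$$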

The details of the operation process are explained next. As shown in FIG. 2, in one embodiment, after the 37 pathological factor data enter the input layer 12, each pathological factor datum is weighted by its corresponding weight value (that is, multiplied by the weight value), and all the weighted data are then sent together to every hidden neuron 15 in the first hidden neural layer 141. In one embodiment, each hidden neuron 15 of each hidden neural layer 141 to 143 first performs a summation operation on the data it receives, then performs a first-type activation operation on the result of the summation, and then weights the result of the first-type activation operation by the weight value corresponding to that hidden neuron 15. The weighted results of all hidden neurons 15 then enter the next hidden neural layer 14 or the output layer 16 together. In one embodiment, each output neuron 17 of the output layer 16 first sums the data it receives and then performs a second-type activation operation, the result of which forms a probability value.
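The sum, activate and weight sequence described above can be sketched as a conventional feed-forward pass. This is a minimal illustration rather than the patented implementation: it assumes one weight per connection (the usual dense-layer form, whereas the text speaks of one weight value per neuron), omits bias terms, uses random untrained weights, and arbitrarily treats the first output as the cancer probability. The names `LAYER_SIZES`, `init_weights` and `forward` are hypothetical.

```python
import math
import random

# Layer sizes from the FIG. 2 embodiment: 37 inputs, three hidden layers of 30, 2 outputs.
LAYER_SIZES = [37, 30, 30, 30, 2]


def relu(x):
    """First-type activation: rectified linear unit."""
    return max(0.0, x)


def softmax(values):
    """Second-type activation: turns the output sums into probabilities."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_weights(sizes, seed=0):
    """Random placeholder weights (assumption); a trained model would supply these."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(scores, weights):
    """Propagate one patient's factor scores through every layer in turn."""
    activations = list(scores)
    last = len(weights) - 1
    for i, layer in enumerate(weights):
        # Weighted sum of everything received from the previous layer.
        sums = [sum(w * a for w, a in zip(row, activations)) for row in layer]
        # ReLU in the hidden layers, softmax at the output layer.
        activations = softmax(sums) if i == last else [relu(s) for s in sums]
    return activations


weights = init_weights(LAYER_SIZES)
patient_scores = [0.5] * 37  # placeholder scores, not real patient data
p_cancer, p_no_cancer = forward(patient_scores, weights)
```

Because the last step is a softmax over two values, the two outputs always sum to 1, matching the description of the output layer.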

In one embodiment, the first-type activation operation and the second-type activation operation may differ. In one embodiment, the first-type activation operation is defined as an operation that uses the rectified linear unit (ReLU) as the activation function. In one embodiment, the second-type activation operation is defined as an operation that uses the Softmax function as the activation function. Because the output range of the ReLU function is 0 to infinity, it is suitable as the activation for neurons in the middle part of a neural network; because the output range of the Softmax function is 0 to 1, it is suitable as the activation for neurons at the output end of a neural network, where it allows the output results to form probabilities. Note that the present invention is not limited to this; other activation functions may also be used for the activation operations.
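For reference, the standard definitions of the two activation functions named above are given below. The symbols $x$ (a neuron's summed input) and $z$ (the vector of summed inputs to the output neurons) are notation introduced here, not taken from the source.

$$
\mathrm{ReLU}(x) = \max(0, x), \qquad \mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j} e^{z_j}}
$$

With two output neurons, $\mathrm{softmax}(z)_1 + \mathrm{softmax}(z)_2 = 1$, which is why the two outputs can be read as complementary probabilities.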

In this way, once the DNN model 10 has been trained, entering a patient's 37 pathological factors into the DNN model 10 is enough for it to predict the likelihood of that patient developing rectal cancer. In one embodiment, the pathological factor data may first undergo a normalization or standardization procedure that places them on a common scale. For example, each pathological factor may be normalized or standardized into a score; these scores are then weighted by the neurons and finally yield the cancer probability and the non-cancer probability.

In addition, the source of the data fed to the neurons may affect the predictive ability of the DNN model 10. In one embodiment, the patient's 37 pathological factor data may include physiological data (Biographical), comorbidity data (Comorbidities), diabetes complication data (Diabetes Complications), medication data (Medications) and index data (Scoring System), but are not limited thereto.

In one embodiment, the "physiological data" may include age, sex, lowest urbanization, medium urbanization, high urbanization, highest urbanization, white-collar occupation, blue-collar occupation and other occupation, but are not limited thereto. In one embodiment, the "comorbidity data" may include hypertension, hyperlipidemia, stroke, congestive heart failure, colorectal polyps, obesity, COPD, CAD, asthma, smoking, inflammatory bowel disease, irritable bowel syndrome, CKD and alcohol-related diseases, but are not limited thereto. In one embodiment, the "diabetes complication data" may include retinopathy, nephropathy, neuropathy, cerebrovascular, cardiovascular and metabolic complications, but are not limited thereto. In one embodiment, the "medication data" may include metformin, statins, insulin, sulfonylureas, other antidiabetic drugs, TZD and PVD, but are not limited thereto. In one embodiment, the "index data" may include the Adapted Diabetes Complication Severity Index (aDCSI), but are not limited thereto.

In one embodiment, each pathological factor datum can be converted into a corresponding numerical score, and the conversion may differ according to the nature of the data. For example, some features map to different scores according to the presence or absence of the feature (different sexes correspond to different scores, use or non-use of a drug corresponds to different scores, and so on), while other features are divided into multiple bands and map to different scores by band (for example, age 25 may correspond to one score and age 30 to another). The above is only an example, and the present invention is not limited thereto.
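The two kinds of conversion described above (presence or absence, and banding) could look like the following sketch. The band boundaries, the 0-to-1 score range, the field names and the four-factor record are all assumptions made for illustration; the source does not specify how scores are computed.

```python
# Hypothetical age band boundaries (years); not taken from the source.
AGE_BANDS = (30, 40, 50, 60, 70)


def encode_presence(present):
    """Presence/absence features map to one of two scores."""
    return 1.0 if present else 0.0


def encode_age(age):
    """Banded features map to a score that depends on the band the value falls in."""
    band = sum(1 for lower in AGE_BANDS if age >= lower)
    return band / len(AGE_BANDS)


def encode_patient(record):
    """Turn a (hypothetical, shortened) patient record into a list of scores."""
    return [
        encode_age(record["age"]),
        encode_presence(record["sex"] == "male"),
        encode_presence(record["hypertension"]),
        encode_presence(record["metformin"]),
    ]


example = {"age": 30, "sex": "female", "hypertension": True, "metformin": False}
scores = encode_patient(example)
```

Under these assumed bands, a 25-year-old and a 30-year-old fall into different bands and so receive different scores, as in the example given in the text. A full encoder would produce 37 scores, one per pathological factor.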

The basic operation of the computer-aided prediction system 1 is explained next. FIG. 3 is a flowchart of the basic steps of a computer-aided prediction method according to an embodiment of the present invention. The method is executed by the computer-aided prediction system 1 of FIG. 1(A), with the DNN model 10 already trained; please also refer to FIGS. 1(A) to 3. As shown in FIG. 3, step S31 is executed first: the data acquisition interface 20 obtains the pathological factor data of a diabetic patient. Then step S32 is executed: the input layer 12 of the DNN model 10 receives the pathological factor data and sends the weighted pathological factor data to the hidden neural layers 14. Then step S33 is executed: each hidden neuron 15 of each hidden neural layer 14 performs its operation on the data it receives, and the last hidden neural layer 14 sends its results to the output layer 16. Then step S34 is executed: the output layer 16 performs its operation on the data it receives and outputs the patient's cancer probability and non-cancer probability. Then step S35 is executed: the system 1 predicts the patient's likelihood of cancer according to the output results of the output layer 16.

Regarding step S31, in one embodiment, the pathological factor data may be the aforementioned 37 pathological factors, each already converted into a score by a normalization or standard-deviation operation.

Regarding step S32, in one embodiment, the weight value corresponding to each pathological factor (the weight value of each basic neuron 13) has already been determined during training of the DNN model 10 (basic model). In other words, one purpose of training the DNN model 10 is to determine the weight value corresponding to each pathological factor. In one embodiment, the score of each pathological factor is weighted in its basic neuron 13 and then sent to every hidden neuron 15 in the first hidden neural layer 14.

Regarding step S33, in one embodiment, each hidden neuron 15 of each hidden neural layer 14 performs a summation operation, an activation operation (the first-type activation operation) and a weighting operation on the data it receives. The weight value corresponding to each hidden neuron 15 is likewise determined during training of the DNN model 10 (basic model); that is, another purpose of training the DNN model 10 is to determine the weight value corresponding to each hidden neuron. In one embodiment, the weighted result of each hidden neuron 15 in the last hidden neural layer 14 is sent to every output neuron 17 in the output layer 16.

Regarding step S34, in one embodiment, each output neuron 17 of the output layer 16 performs a summation and an activation operation (the second-type activation operation) on the data it receives. In one embodiment, the two output results of the output layer 16 sum to 1 (that is, together they account for 100% of the predicted probability).

Regarding step S35, in one embodiment, the system 1 compares the two output results of the output layer 16 and takes the one with the higher probability as the prediction. For example, when the output result corresponding to the non-cancer probability is 0.75 (that is, 75%) and the output result corresponding to the cancer probability is 0.25 (that is, 25%), the system 1 predicts that the patient's likelihood of cancer is low. The present invention is not limited to this.
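The comparison in step S35 can be sketched in a few lines. The function name, the returned labels and the handling of an exact tie (treated here as the lower-likelihood case) are assumptions; the source only states that the higher of the two probabilities is taken as the prediction.

```python
def predict_from_outputs(p_cancer, p_no_cancer):
    """Return a label for whichever of the two output probabilities is higher."""
    # A tie is reported as the lower-likelihood case (assumption, not from the source).
    if p_cancer > p_no_cancer:
        return "higher likelihood of cancer"
    return "lower likelihood of cancer"


# The example from the text: 25% cancer probability, 75% non-cancer probability.
result = predict_from_outputs(0.25, 0.75)
```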

It follows that, once the DNN model 10 has been established, entering a patient's pathological factor data into the computer-aided prediction system 1 is enough for the DNN model 10 to predict that patient's likelihood of cancer. The patient can then take preventive measures early, which can greatly improve the chance of survival.

In addition, for the DNN model 10 to be able to perform steps S31 to S35, the weight value of each neuron must first be established through training. The process of establishing the DNN model 10 is described in detail below.

FIG. 4 is a flowchart of the steps for establishing the DNN model 10 according to an embodiment of the present invention. These steps can be implemented by the processor of the computer-aided prediction system 1 executing instructions in the computer program product 30; please also refer to FIGS. 1(A) to 4.

First, step S41 is executed: the basic model of the DNN model 10 is configured. Then step S42 is executed: the input layer 12 obtains a mini-batch of data from the full set of training data. Then step S43 is executed: the basic model is trained on the obtained data to determine the weight value of each neuron in the basic model, thereby establishing a candidate neuron path. Then step S44 is executed: the input layer 12 of the basic model obtains another mini-batch of training data, and step S43 is executed again. Then step S45 is executed: step S44 is repeated until a preset condition is reached. Then step S46 is executed: steps S42 to S45 are executed again a plurality of times (the iteration procedure). Then step S47 is executed: the test module 40 evaluates the predictive ability of each candidate neuron path. Then step S48 is executed: the system 1 sets the candidate neuron path with the best predictive ability as the deep neuron path 18 actually used by the DNN model 10. The above steps can be implemented at least by the processor of the system 1 executing instructions of the computer program product 30 or of other computer program products.
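The flow of steps S42 to S48 can be illustrated with a deliberately small stand-in. The sketch below is not the patented training procedure: it swaps the DNN for a single-layer logistic model trained by mini-batch gradient descent, uses synthetic data, takes a fixed number of passes as the "preset condition", and produces the several candidates by varying the random seed. All names and numeric settings are hypothetical.

```python
import math
import random


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def predict_prob(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def train_candidate(train, n_features, batch_size, epochs, lr, seed):
    """S42 to S45: mini-batch training that yields one candidate set of weights."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    for _ in range(epochs):  # preset condition (assumption): fixed number of passes
        data = train[:]
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):  # S42/S44: next mini-batch
            batch = data[start:start + batch_size]
            grad = [0.0] * n_features
            for x, y in batch:  # S43: accumulate the gradient over this batch
                err = predict_prob(w, x) - y
                for j, xj in enumerate(x):
                    grad[j] += err * xj
            w = [wi - lr * g / len(batch) for wi, g in zip(w, grad)]
    return w


def accuracy(w, data):
    """S47: fraction of held-out examples classified correctly."""
    hits = sum((predict_prob(w, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


# Synthetic, linearly separable data; the last feature is a constant bias input.
data_rng = random.Random(42)
dataset = []
for _ in range(400):
    x = [data_rng.uniform(-1, 1) for _ in range(3)] + [1.0]
    dataset.append((x, 1 if x[0] + x[1] > 0 else 0))
train_set, test_set = dataset[:300], dataset[300:]

# S46: repeat the whole training several times to get several candidates.
candidates = [
    train_candidate(train_set, 4, batch_size=20, epochs=30, lr=0.5, seed=s)
    for s in range(5)
]
scores = [accuracy(w, test_set) for w in candidates]  # S47
best = candidates[scores.index(max(scores))]          # S48: keep the best candidate
```

The point of the sketch is the control flow: several independently trained candidates are scored on data held out from training, and only the best-scoring one is kept for use.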

Regarding step S41, this step finds the best variable parameters for the DNN model 10 (the basic model). The variable parameters may be, for example, the number of hidden neural layers or the choice of activation function, and are not limited thereto. This step can be implemented by the system 1 receiving instructions input by the user and configuring the basic model accordingly. In one embodiment, a small portion of the training data is first used to build a plurality of simplified basic models with different parameters; K-fold cross validation is then used to find the best-performing simplified basic model, and that model is set as the basic model of the DNN model 10 (that is, the optimal parameter values are found). Here, "a small portion of the training data" may be, for example, 1/100 of all training data, but is not limited thereto. In one embodiment, K-fold cross validation performs K validations on each simplified basic model, and each validation comprises a training process and a testing process: the training process determines the weight value of each neuron of the simplified basic model, and the testing process tests the predictive ability of the simplified basic model. For a given simplified basic model, each validation divides the aforementioned small portion of training data into a training group and a test group at a ratio of (K-1):1, where the training group is used for the training process and the test group for the testing process. After the K validations are complete, the system 1 averages the accuracy obtained in each validation and takes that average as the accuracy of the simplified basic model. In one embodiment, K is 10, that is, each simplified basic model undergoes 10 validations, each of which divides the training data into a training group and a test group at a ratio of 9:1, but the present invention is not limited thereto. In addition, if the DNN model 10 requires best hyperparameters or an optimizer, these can also be set in step S41. In this way, step S41 finds the simplified basic model with the best parameters, which serves as the basic model for the subsequent deep training.
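The K-fold procedure described for step S41 can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation used in the invention: the function names `k_fold_splits` and `cross_validated_accuracy`, the shuffling seed, and the `evaluate` callback (which stands in for training and testing one simplified basic model) are all introduced here for illustration.

```python
import random

def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding gives k folds whose sizes differ by at most one record.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # The other K-1 folds form the training group, giving a (K-1):1 split.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validated_accuracy(evaluate, n_samples, k=10):
    """Average the accuracy returned by evaluate(train, test) over k folds.

    `evaluate` is a hypothetical callback: it should train one simplified
    basic model on `train` and return its accuracy on `test`.
    """
    scores = [evaluate(train, test) for train, test in k_fold_splits(n_samples, k)]
    return sum(scores) / len(scores)
```

With K = 10 and 100 records, each validation trains on 90 records and tests on the remaining 10. The configuration with the highest averaged accuracy would then be kept as the basic model.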

Regarding step S42, the system 1 may first obtain all training data, extract one mini-batch of records from it, and input the mini-batch into the basic model, so that the basic model performs deep training (the first deep training) on that mini-batch. In one embodiment, the full training set is defined as containing at least one million records, and a mini-batch is defined as containing at least 100 records. In one embodiment, the full training set contains 1,315,899 records and the mini-batch size is 128, so the system 1 randomly selects 128 of the 1,315,899 training records and inputs them into the basic model, but the present invention is not limited thereto. Note that each training record contains data on 37 pathological factors of one diabetic patient together with information on whether that patient actually developed cancer.
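The random selection in step S42 might look like the sketch below. It is only an illustration: the function name `sample_mini_batch` and the seed are assumptions, and the records are represented by their indices rather than by actual patient data.

```python
import random

TOTAL_RECORDS = 1_315_899  # size of the full training set in the embodiment
BATCH_SIZE = 128           # mini-batch size in the embodiment

def sample_mini_batch(n_records, batch_size, rng):
    """Return `batch_size` distinct record indices chosen uniformly at random."""
    return rng.sample(range(n_records), batch_size)

rng = random.Random(42)    # fixed seed only so the sketch is repeatable
batch = sample_mini_batch(TOTAL_RECORDS, BATCH_SIZE, rng)
```

Each call draws independently, so a record may appear in more than one mini-batch, but never twice within the same mini-batch.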

Regarding step S43, the basic model is trained on the data obtained in step S42. Because the training data include whether each diabetic patient actually developed cancer, the basic model can analyze the characteristics of the pathological factors likely to be present when cancer develops and when it does not, and thereby determine the weight value of each neuron. In one embodiment, the basic model is trained by executing a gradient descent algorithm to determine the weight value of each neuron. In one embodiment, the gradient descent algorithm may be at least one of stochastic gradient descent and Adam with Nesterov's accelerated gradient descent, and is not limited thereto. One purpose of using stochastic gradient descent is to reduce the difference (loss) between the basic model's predictions and the true results, for example so that this difference reaches a local minimum; one purpose of using Adam with Nesterov's accelerated gradient descent is to raise the probability that this difference reaches the absolute minimum. Moreover, one key point of the present invention lies in using the properties of stochastic gradient descent and Adam with Nesterov's accelerated gradient descent to improve the accuracy of the basic model's predictions, whereas the execution details of these algorithms are not the focus and are therefore not described in detail here. After step S43 is completed, the basic model has completed one round of training, and one candidate neural network is established.
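Although the text leaves the optimizer details out, the idea of a gradient descent update on one mini-batch can be illustrated with a deliberately small stand-in. The sketch below trains a single sigmoid neuron with plain mini-batch gradient descent; the toy data, learning rate, and function names are assumptions made for illustration, and the Adam-with-Nesterov variant (commonly called Nadam) is not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(weights, bias, batch):
    """Mean binary cross-entropy of a single sigmoid neuron over the batch."""
    total = 0.0
    for x, y in batch:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(batch)

def sgd_step(weights, bias, batch, lr=0.1):
    """One gradient descent update computed from a single mini-batch."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in batch:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        err = p - y  # derivative of the loss with respect to the pre-activation
        for i, xi in enumerate(x):
            grad_w[i] += err * xi
        grad_b += err
    n = len(batch)
    new_weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b / n

# Toy mini-batch (hypothetical): the label simply equals the first feature.
batch = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1), ([0.0, 0.0], 0)]
w, b = [0.0, 0.0], 0.0
before = loss(w, b, batch)
for _ in range(50):
    w, b = sgd_step(w, b, batch)
after = loss(w, b, batch)  # lower than `before`: the updates reduced the loss
```

Each update moves the weights a small step against the gradient of the loss, which is the sense in which training "determines the weight value of each neuron".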

Regarding step S44, the system 1 inputs another mini-batch of data (that is, another 128 records) into the basic model, and the basic model repeats the training of step S43 on that mini-batch, thereby producing another candidate neural network. In one embodiment, every mini-batch is selected at random, so the same record may be selected in more than one mini-batch, but this is not a limitation.

Regarding step S45, the system 1 repeats step S44, thereby producing a plurality of neural networks of the basic model, until a preset condition is met. In one embodiment, the "preset condition" means that all training data have been input into the basic model and the basic model has finished training on them; in another embodiment, the "preset condition" may instead be that a specified number of neural networks have been established. These descriptions of the "preset condition" are only examples, and the present invention is not limited thereto.

Regarding step S46, this step applies an iteration procedure to the training of the basic model: steps S42 to S45 are executed again until a specified number of repetitions is reached, thereby further improving the predictive ability of the neural networks. In one embodiment, the "specified number" is set to at least 1000, but this is not a limitation.
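The nesting of steps S42 to S46 can be summarized as two loops. The sketch below is one possible reading, with assumptions labeled: `train_step` is a hypothetical callback standing in for step S43, and the preset condition of step S45 is approximated as "enough mini-batches to equal one pass over the data", which is only one of the conditions the text allows.

```python
import random

def batches_needed(n_records, batch_size):
    """Number of mini-batches whose combined size covers n_records (ceiling division)."""
    return -(-n_records // batch_size)

def run_training(n_records, batch_size, n_iterations, train_step, seed=0):
    """Run the S42-S46 loop structure and return the number of updates performed."""
    rng = random.Random(seed)
    updates = 0
    for _ in range(n_iterations):                                 # S46: repeat S42-S45
        for _ in range(batches_needed(n_records, batch_size)):    # S45: repeat S44
            batch = rng.sample(range(n_records), batch_size)      # S42 / S44
            train_step(batch)                                     # S43
            updates += 1
    return updates
```

Under this reading, the embodiment's 1,315,899 records and mini-batch size of 128 give 10,281 mini-batch updates per iteration.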

Regarding step S47, this step evaluates the performance of each candidate neural network through the test module 40. In one embodiment, the testing performed by the test module 40 may include weighted average recall analysis, which measures the sensitivity of the candidate neural networks. In one embodiment, the testing may include positive predictive value analysis, which measures the precision of the candidate neural networks. In one embodiment, the testing may include F1 analysis, which yields the F1 value (the harmonic mean of sensitivity and precision) of each candidate neural network. In one embodiment, at least two of the weighted average recall, positive predictive value, and F1 analyses are performed together. In this way, the predictive performance of each candidate neural network can be evaluated.
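The three measures named for step S47 can be computed directly from predicted and true labels. The following sketch uses the standard definitions; the function names and the small example labels are assumptions made for illustration and do not come from the test module 40.

```python
def _counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn

def recall(y_true, y_pred, label=1):
    """Sensitivity: share of actual `label` cases that were predicted as `label`."""
    tp, _, fn = _counts(y_true, y_pred, label)
    return tp / (tp + fn) if tp + fn else 0.0

def precision(y_true, y_pred, label=1):
    """Positive predictive value: share of `label` predictions that were correct."""
    tp, fp, _ = _counts(y_true, y_pred, label)
    return tp / (tp + fp) if tp + fp else 0.0

def f1_score(y_true, y_pred, label=1):
    """Harmonic mean of precision and recall."""
    p, r = precision(y_true, y_pred, label), recall(y_true, y_pred, label)
    return 2 * p * r / (p + r) if p + r else 0.0

def weighted_average_recall(y_true, y_pred):
    """Per-class recall averaged with weights equal to each class's share of samples."""
    n = len(y_true)
    return sum(recall(y_true, y_pred, c) * y_true.count(c) / n for c in set(y_true))

# Hypothetical labels: 1 = developed cancer, 0 = did not.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
```

For these example labels, recall, precision, and F1 for class 1 are each 2/3, and the weighted average recall is 0.75.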

Regarding step S48, this step selects the candidate neural network with the best predictive performance as the deep neural network 18 of the DNN model 10 for actual use. Once step S48 is complete, training of the DNN model 10 is complete; thereafter the user (a physician) need only input a patient's pathological factor data into the DNN model 10, and the DNN model 10 can analyze the patient's likelihood of developing cancer.
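The selection in step S48 amounts to taking the candidate with the highest evaluation score. A minimal sketch, assuming each candidate has already been reduced to a single score such as its F1 value; the candidate names and scores below are made up purely for illustration.

```python
def select_best(candidate_scores):
    """Return the name of the candidate with the highest score."""
    if not candidate_scores:
        raise ValueError("no candidates to choose from")
    return max(candidate_scores, key=candidate_scores.get)

# Hypothetical scores, not results from the invention.
candidate_scores = {"candidate_1": 0.61, "candidate_2": 0.68, "candidate_3": 0.65}
best = select_best(candidate_scores)
```

When several measures are evaluated together, they would first need to be combined into one ranking criterion; how that is done is not specified here.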

In addition, in one embodiment, dropout (a regularization technique for avoiding overfitting) may be applied to each hidden neural layer 14 and to the output layer 16 during training. In one embodiment, the output layer 16 may use the categorical cross-entropy function as its loss function. In one embodiment, the weight value of each neuron may be initialized using normalized He initialization. The present invention is not limited thereto.
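The three techniques mentioned above can each be written in a few lines. The sketch below is illustrative only: it assumes the common "He normal" form of He initialization (zero-mean Gaussian with variance 2 / fan_in) and the "inverted" form of dropout, neither of which is confirmed by the text, and all function names are introduced here.

```python
import math
import random

def he_normal(fan_in, fan_out, rng):
    """Weights drawn from N(0, 2 / fan_in), one row per output neuron."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def categorical_cross_entropy(one_hot_target, probabilities):
    """Loss for one sample: -sum_i y_i * log(p_i)."""
    return -sum(y * math.log(p) for y, p in zip(one_hot_target, probabilities) if y > 0)
```

Dropout is applied only during training; at prediction time the activations are passed through unchanged.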

FIG. 5 is a schematic diagram of experimental data according to an embodiment of the present invention. It uses ROC curves to compare how accurately the DNN model 10 of the present invention and the conventional aDCSI model estimate the probability that a diabetic patient develops cancer. The Y axis is the true positive rate (labeled "True positive rate") and the X axis is the false positive rate (labeled "False positive rate"); both models were tested on the same data. As shown in FIG. 5, the area under the ROC curve (AUC) is about 0.738 for the DNN model 10 and about 0.492 for the aDCSI model, which shows that the DNN model 10 of the present invention has better predictive ability than the conventional aDCSI model.
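For readers unfamiliar with AUC, it can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below uses that pairwise definition; the function name and the four example scores are assumptions for illustration and are unrelated to the data behind FIG. 5.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical scores: three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which gives a scale for reading the reported values of about 0.738 and about 0.492.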

In this way, the DNN model used in the present invention can be established. In other words, once a patient's pathological factor data are input into the DNN model, the DNN model can automatically predict the likelihood that the patient will develop rectal cancer. Through deep learning training, the computer-aided prediction system of the present invention can accurately predict a patient's probability of developing cancer and can help the patient seek the most suitable medical care.

In addition, in one embodiment, the computer-aided prediction system, method, and computer program product of the present invention may be implemented according to the content described in the paper "Development of a Prediction Model for Colorectal Cancer among Patients with Type 2 Diabetes Mellitus Using a Deep Neural Network" by Meng-Hsuen Hsieh, Li-Min Sun, Cheng-Li Lin, Meng-Ju Hsieh, Kyle Sun, Chung-Y. Hsu, An-Kuo Chou, and Chia-Hung Kao, but are not limited thereto.

Although the present invention has been described through the above embodiments, it should be understood that many modifications and variations are possible in keeping with the spirit of the present invention and the scope of the claims.

1: Computer-aided prediction system
10: Deep neural network (DNN) model
12: Input layer
13: Basic neuron
14: Hidden neural layer
141: First hidden neural layer
142: Second hidden neural layer
143: Third hidden neural layer
15: Hidden neuron
16: Output layer
17: Output neuron
18: Deep neural path
20: Data acquisition interface
30: Computer program product
40: Test module
S31~S35: Steps
S41~S48: Steps

FIG. 1(A) is a system architecture diagram of a computer-aided prediction system for rectal cancer according to an embodiment of the present invention;
FIG. 1(B) is an architecture diagram of a DNN model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the detailed architecture of a (trained) DNN model according to an embodiment of the present invention;
FIG. 3 is a flowchart of the basic steps of a computer-aided prediction method according to an embodiment of the present invention;
FIG. 4 is a flowchart of the steps for building a DNN model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of experimental data according to an embodiment of the present invention.


Claims (10)

1. A computer-aided prediction system for predicting the possibility that a diabetic patient develops rectal cancer, comprising: a deep neural network (DNN) model that performs a feature analysis on a plurality of pathological factor data of the diabetic patient through a deep neuron path, wherein the DNN model comprises: a plurality of neurons, at least some of which correspond to the pathological factor data; and an output layer that outputs, according to the feature analysis, at least one output result related to the possibility of developing cancer; wherein the DNN model determines a weight value corresponding to each neuron through a plurality of training runs, thereby establishing the deep neuron path.

2. The computer-aided system according to claim 1, wherein the DNN model further comprises an input layer and a plurality of hidden neural layers, the input layer comprises a plurality of basic neurons corresponding to the pathological factor data, each of the hidden neural layers comprises a plurality of hidden neurons, and each hidden neuron of one of the hidden neural layers is connected to all of the basic neurons.

3. The computer-aided system according to claim 2, wherein the number of the hidden neural layers is determined by executing a k-fold cross-validation method, where k is set to 10.

4. The computer-aided system according to claim 2, wherein the DNN model performs each training run on a plurality of randomly selected training records and thereby establishes a plurality of candidate deep neural networks, each training run being defined as executing at least one gradient descent operation on the training records to determine the weight value corresponding to each basic neuron and each hidden neuron, and each training record comprising data on 37 pathological factors of a diabetic patient and information on whether that diabetic patient has rectal cancer.

5. The computer-aided system according to claim 4, wherein the deep neural network actually used by the DNN model is the one of the candidate deep neural networks having the best predictive ability, and the predictive ability of the deep neural networks is determined by at least one of weighted average recall, positive predictive value, and F1 analysis.

6. A computer-aided prediction method for predicting the possibility that a diabetic patient develops rectal cancer, the method being executed by a computer-aided prediction system, wherein the computer-aided prediction system comprises a DNN model and the DNN model comprises a plurality of neurons and an output layer, the method comprising the steps of: obtaining a plurality of pathological factor data of the diabetic patient; performing, by the DNN model, a feature analysis on the pathological factor data through a deep neuron path; and outputting, by the output layer and according to the feature analysis, at least one output result related to the possibility of developing cancer; wherein the DNN model determines a weight value of each basic neuron through a plurality of training runs, thereby establishing the deep neuron path.

7. The computer-aided prediction method according to claim 6, wherein the DNN model further comprises an input layer and a plurality of hidden neural layers, the input layer comprises a plurality of basic neurons corresponding to the pathological factor data, each of the hidden neural layers comprises a plurality of hidden neurons, and each hidden neuron of one of the hidden neural layers is connected to all of the basic neurons.

8. The computer-aided prediction method according to claim 7, wherein the DNN model performs each training run on a plurality of randomly selected training records and thereby establishes a plurality of candidate deep neural networks, each training run being defined as executing at least one gradient descent operation on the training records to determine the weight value corresponding to each basic neuron and each hidden neuron, and each training record comprising data on 37 pathological factors of a diabetic patient and information on whether that diabetic patient has rectal cancer.

9. The computer-aided system according to claim 8, wherein the deep neural network actually used by the DNN model is the one of the candidate deep neural networks having the best predictive ability, and the predictive ability of the deep neural networks is determined by at least one of weighted average recall analysis, positive predictive value analysis, and F1 analysis.

10. A computer program product stored in a non-transitory computer-readable medium for operating a computer-aided prediction system, wherein the computer-aided prediction system is used to predict the possibility that a diabetic patient develops rectal cancer and comprises a DNN model having a plurality of basic neurons and an output layer, the computer program product comprising: an instruction for obtaining a plurality of pathological factor data of the diabetic patient; an instruction for causing the DNN model to perform a feature analysis on the pathological factor data through a deep neuron path; and an instruction for causing the output layer to output, according to the feature analysis, at least one output result related to the possibility of developing cancer; wherein the DNN model determines a weight value of each basic neuron through a plurality of training runs, thereby establishing the deep neuron path.
TW107144007A 2018-12-07 2018-12-07 Computer-aided recognition system, its method and its computer program product thereof TWI681407B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW107144007A TWI681407B (en) 2018-12-07 2018-12-07 Computer-aided recognition system, its method and its computer program product thereof

Publications (2)

Publication Number Publication Date
TWI681407B TWI681407B (en) 2020-01-01
TW202022890A true TW202022890A (en) 2020-06-16

Family

ID=69942404

Country Status (1)

Country Link
TW (1) TWI681407B (en)
