TWI681407B

TWI681407B - Computer-aided recognition system, its method and its computer program product thereof

Info

Publication number: TWI681407B
Application number: TW107144007A
Authority: TW
Inventors: 謝孟軒; 謝孟儒; 高嘉鴻
Original assignee: 謝孟軒
Priority date: 2018-12-07
Filing date: 2018-12-07
Publication date: 2020-01-01
Also published as: TW202022890A

Abstract

A computer-aided prediction system for predicting a diabetes mellitus patient’s onset possibility of CRC is provided. The system includes a deep neural network (DNN) model used to analyze a plurality of baseline characteristics from the diabetes mellitus patient by a deep neural path. The DNN model includes a plurality of neurons and an output layer. At least a part of the neurons corresponds to the baseline characteristics. The output layer outputs an output result related to the onset possibility of CRC.

Description

Computer-aided prediction system, method and computer program product

The present invention belongs to the technical field of computer-aided prediction, and in particular to computer-aided prediction of the likelihood of developing rectal cancer.

Diabetes is one of the most common diseases, and the number of patients worldwide is increasing year by year. For diabetic patients, good medical care can improve survival. However, recent studies have found that diabetic patients have a higher risk of developing rectal cancer than the general population. Once the cancer develops, it affects the quality of medical care for the diabetic patient and greatly reduces the survival rate. If a diabetic patient's likelihood of developing cancer could be predicted early, appropriate medical care could be given and the patient's survival rate improved. Some existing techniques can predict complications in diabetic patients, such as the Adapted Diabetes Complication Severity Index (aDCSI), but their accuracy still falls short of expectations. There is therefore an urgent need for a technique that can accurately predict the likelihood that a diabetic patient will develop rectal cancer.

The present invention proposes a computer-aided prediction technique based on a deep neural network. The basic model of the deep neural network is trained on pathological factor data from a large number of diabetic patients with and without concurrent rectal cancer. Once training is complete, the deep neural network can accurately predict the likelihood that a diabetic patient will develop rectal cancer.

According to one aspect of the present invention, a computer-aided prediction system is provided for predicting the likelihood that a diabetic patient will develop rectal cancer. The system includes a deep neural network (DNN) model for performing a feature analysis on a plurality of pathological factor data of the diabetic patient through a deep neuron path. The DNN model includes a plurality of neurons and an output layer. At least some of the neurons correspond to the pathological factor data. The output layer outputs, based on the feature analysis, an output result related to the likelihood of developing the cancer. The DNN model determines the weight value corresponding to each neuron through a plurality of training runs, thereby establishing the deep neuron path.

According to another aspect of the present invention, a computer-aided prediction method is provided for predicting the likelihood that a diabetic patient will develop rectal cancer. The method is executed by a computer-aided prediction system that includes a DNN model having a plurality of neurons and an output layer. The method includes the steps of: obtaining a plurality of pathological factor data of the diabetic patient; performing, by the DNN model, a feature analysis on the pathological factor data through a deep neuron path; and outputting, by the output layer and based on the feature analysis, an output result related to the likelihood of developing the cancer. The DNN model determines the weight value corresponding to each neuron through a plurality of training runs, thereby establishing the deep neuron path.

According to yet another aspect of the present invention, a computer program product is provided. It is stored in a non-transitory computer-readable medium and causes a computer-aided prediction system to operate, wherein the computer-aided prediction system is used to predict the likelihood that a diabetic patient will develop rectal cancer and includes a DNN model having a plurality of neurons and an output layer. The computer program product includes: an instruction for obtaining a plurality of pathological factor data of the diabetic patient; an instruction for causing the DNN model to perform a feature analysis on the pathological factor data through a deep neuron path; and an instruction for causing the output layer to output, based on the feature analysis, an output result related to the likelihood of developing the cancer. The DNN model determines the weight value corresponding to each neuron through a plurality of training runs, thereby establishing the deep neuron path.

1‧‧‧Computer-aided prediction system

10‧‧‧Deep neural network (DNN) model

12‧‧‧Input layer

13‧‧‧Basic neuron

14‧‧‧Hidden neural layer

141‧‧‧First hidden neural layer

142‧‧‧Second hidden neural layer

143‧‧‧Third hidden neural layer

15‧‧‧Hidden neuron

16‧‧‧Output layer

17‧‧‧Output neuron

18‧‧‧Deep neuron path

20‧‧‧Data acquisition interface

30‧‧‧Computer program product

40‧‧‧Test module

S31~S35‧‧‧Steps

S41~S48‧‧‧Steps

FIG. 1(A) is a system architecture diagram of a computer-aided prediction system for rectal cancer according to an embodiment of the present invention; FIG. 1(B) is an architecture diagram of a DNN model according to an embodiment of the present invention; FIG. 2 is a detailed architecture diagram of a DNN model (training completed) according to an embodiment of the present invention; FIG. 3 is a flowchart of the basic steps of a computer-aided prediction method according to an embodiment of the present invention; FIG. 4 is a flowchart of the steps for establishing a DNN model according to an embodiment of the present invention; FIG. 5 is a schematic diagram of experimental data according to an embodiment of the present invention.

The following description provides several embodiments of the present invention. It should be understood that these embodiments are not limiting. The features of each embodiment may be modified, substituted, combined, separated, and designed for application in other embodiments.

FIG. 1(A) is a system architecture diagram of a computer-aided prediction system 1 according to an embodiment of the present invention. As shown in FIG. 1(A), the computer-aided prediction system 1 includes a deep neural network model 10 (hereinafter, DNN model 10) for predicting the probability that a diabetic patient will develop rectal cancer. In one embodiment, the computer-aided prediction system 1 may further include a data acquisition interface 20. The data acquisition interface 20 obtains data from outside the system; that is, a user (for example, a physician) can enter a patient's data into the computer-aided prediction system 1 through the data acquisition interface 20. In addition, in one embodiment, the computer-aided prediction system 1 may further include a test module 40 for testing the predictive ability of the DNN model 10.

FIG. 1(B) is an architecture diagram of the DNN model 10 according to an embodiment of the present invention; please also refer to FIG. 1(A). The DNN model 10 includes an input layer 12, a plurality of hidden neural layers 14, and an output layer 16. The input layer 12 has a plurality of basic neurons 13, each of which corresponds to one type of pathological factor data and one weight value. Each hidden neural layer 14 has a plurality of hidden neurons 15, which are connected to the basic neurons 13 and each of which also corresponds to a weight value. The output layer 16 includes two output neurons 17 that generate two output results, corresponding respectively to the diabetic patient's probability of developing cancer and probability of not developing cancer. In one embodiment, the basic neurons 13, the hidden neurons 15, and the output neurons 17 form a deep neuron path 18. The DNN model 10 performs a feature analysis on a plurality of pathological factor data of the diabetic patient through the deep neuron path 18, and the output layer 16 outputs the output results (the cancer probability and the non-cancer probability) based on that feature analysis.
In more detail, when the DNN model 10 obtains the plurality of pathological factor data of a diabetic patient, the pathological factors are input into the deep neuron path 18, and each basic neuron 13, hidden neuron 15, and output neuron 17 on the deep neuron path 18 performs the feature analysis on them. In one embodiment, "feature analysis" can be regarded as the operation that each neuron performs on the pathological factors, where the "operation" may include a summation operation, an activation operation, a weighting operation, or a combination of at least two of these, without being limited thereto. In this way, the DNN model 10 of the present invention can accurately predict the likelihood that the diabetic patient will develop cancer. The details of each element are described next.

The computer-aided prediction system 1 may be a data processing device implemented by any device having a microprocessor, such as a desktop computer, a notebook computer, a smart mobile device, a server, a cloud host, or a similar device. In one embodiment, the computer-aided prediction system 1 may have a network communication function for transmitting data over a network, where the network may be wired or wireless; the computer-aided prediction system 1 can therefore also obtain data through the network. In one embodiment, the computer-aided prediction system 1 performs its functions by having the microprocessor execute a computer program product 30. The computer program product 30 may contain a plurality of instructions that cause the processor to perform specific operations and thereby implement functions such as the DNN model 10 or the test module 40. In one embodiment, the computer program product 30 may be stored in a non-transitory computer-readable medium (such as a memory), but is not limited thereto. In one embodiment, the computer program product 30 may also be stored in advance on a network server for users to download.

In one embodiment, the data acquisition interface 20 may be a physical port for obtaining external data. For example, when the computer-aided prediction system 1 is implemented by a computer, the data acquisition interface 20 may be a USB interface or any of various cable connectors on the computer, but is not limited thereto. The data acquisition interface 20 may also be integrated with a wireless communication chip so that it can receive data by wireless transmission.

The DNN model 10 of the present invention is an artificial intelligence model for data analysis. It uses a plurality of computation nodes as the neurons of a neural network, and the operation of each neuron can be regarded as a feature analysis of the pathological factors. After training on a large amount of data, the DNN model 10 establishes the weight value corresponding to each neuron. In one embodiment, the basic model of the DNN model 10 (that is, the untrained basic architecture) is established before training, for example by presetting the number of neurons, the number of hidden neural layers 14, and the connections between neurons. The system 1 then trains the untrained DNN model 10 through instructions in the computer program product 30 to determine the weight value of each basic neuron 13 and hidden neuron 15, thereby establishing the deep neuron path 18. In one embodiment, the basic model may undergo multiple training runs to generate multiple neuron paths, and the accuracy of each neuron path can be tested by the test module 40. Note that, to distinguish the DNN model 10 before and after training, the untrained DNN model 10 is hereinafter called the basic model, and the model after training is called the DNN model 10.

FIG. 2 is a detailed architecture diagram of the DNN model 10 (training completed) according to an embodiment of the present invention; please also refer to FIGS. 1(A) and 1(B). To accurately predict the likelihood that a diabetic patient will develop rectal cancer, the number of hidden neural layers 14, the number of basic neurons 13, and the number of hidden neurons 15 of the DNN model 10 (or basic model) can all be treated as variable parameters. In the embodiment of FIG. 2, the input layer 12 has 37 basic neurons 13; that is, the DNN model 10 uses 37 pathological factors as the basis for the feature analysis. In addition, the DNN model 10 has three hidden neural layers 14, each containing 30 hidden neurons 15. As shown in FIG. 2, the input layer 12 is connected to the first hidden neural layer 141, the first hidden neural layer 141 is connected to the second hidden neural layer 142, the second hidden neural layer 142 is connected to the third hidden neural layer 143, and the third hidden neural layer 143 is connected to the output layer 16. Therefore, when a patient's 37 pathological factor data are input to the DNN model 10, they are first analyzed at the input layer 12, then pass through the hidden neural layers 141 to 143 in sequence for analysis, after which the output layer 16 generates the output results based on the analysis.
Here "analysis" refers to the "operation" that each neuron performs on the data it receives. In one embodiment, data passing through one neuron can be regarded as one execution of the operation.

In one embodiment, each basic neuron 13 in the input layer 12 is connected to every hidden neuron 15 in the first hidden neural layer 141; that is, the operation result of each basic neuron 13 is transmitted to every hidden neuron 15 of the first hidden neural layer 141. Each hidden neuron 15 of the first hidden neural layer 141 is connected to every hidden neuron 15 in the second hidden neural layer 142, so its operation result is transmitted to every hidden neuron 15 of the second hidden neural layer 142. Likewise, each hidden neuron 15 in the second hidden neural layer 142 is connected to every hidden neuron 15 in the third hidden neural layer 143 and transmits its operation result to each of them. Each hidden neuron 15 in the third hidden neural layer 143 is connected to every output neuron 17 in the output layer 16 and transmits its operation result to each output neuron 17. After the operations of the output neurons 17, the output layer 16 generates the patient's cancer probability and non-cancer probability.

The details of the computation are described next. As shown in FIG. 2, in one embodiment, after the 37 pathological factor data enter the input layer 12, each pathological factor datum undergoes a weighting operation with its corresponding weight value (that is, it is multiplied by the weight value), and all the weighted data are then transmitted together to every hidden neuron 15 in the first hidden neural layer 141. In one embodiment, each hidden neuron 15 of each hidden neural layer 141 to 143 first performs a summation operation on the data it receives, then performs a first-type activation operation on the result of the summation, and then performs a weighting operation on the activation result with the weight value corresponding to that hidden neuron 15. The weighted results of all hidden neurons 15 then enter the next hidden neural layer 14 or the output layer 16 together. In one embodiment, each output neuron 17 of the output layer 16 first performs a summation on the data it receives and then performs a second-type activation operation, the result of which forms a probability value.

In one embodiment, the first-type activation operation and the second-type activation operation may differ. In one embodiment, the first-type activation operation is defined as an operation that uses the rectified linear unit (ReLU) as the activation function. In one embodiment, the second-type activation operation is defined as an operation that uses the Softmax function as the activation function. Because the output range of the ReLU function is 0 to infinity, it is suitable as the activation for neurons in the middle part of a neural network; because the output range of the Softmax function is 0 to 1, it is suitable as the activation for neurons at the output end of a neural network, for example to make the output results form probabilities. Note that the present invention is not limited to this; other activation functions may also be used for the activation operations.
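The computation described for FIG. 2 can be sketched roughly as follows, purely for illustration. This is a minimal sketch in plain Python, not the implementation of the embodiment: it assumes the conventional fully connected formulation with one weight per connection (the text above speaks of one weight value per neuron), it omits bias terms, it uses random untrained weights, and the function names and the order of the two outputs are hypothetical.

```python
import math
import random

def relu(values):
    """First-type activation: ReLU, output range 0 to infinity."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Second-type activation: Softmax, outputs lie in 0..1 and sum to 1."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sums(inputs, weights):
    """Weighting then summation: weights[j][i] links input i to neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def init_weights(n_in, n_out, rng):
    """Random placeholder weights (assumption: the real values come from training)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def forward(scores, layers):
    """Pass the factor scores through the hidden layers, then the output layer."""
    activations = scores
    for weights in layers[:-1]:
        activations = relu(weighted_sums(activations, weights))
    return softmax(weighted_sums(activations, layers[-1]))

rng = random.Random(0)
sizes = [37, 30, 30, 30, 2]  # 37 inputs, three hidden layers of 30, 2 outputs
layers = [init_weights(a, b, rng) for a, b in zip(sizes, sizes[1:])]
patient_scores = [rng.random() for _ in range(37)]  # made-up scores, not patient data
p_cancer, p_no_cancer = forward(patient_scores, layers)
```

Because the output layer uses Softmax, the two values always sum to 1, which matches the description of the two output results as a cancer probability and a non-cancer probability.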

Thus, once the DNN model 10 has been trained, inputting a patient's 37 pathological factors into the DNN model 10 allows it to predict the likelihood that the patient will develop rectal cancer. In one embodiment, the pathological factor data may first undergo a normalization or standardization procedure so that all values are on a common scale. For example, each pathological factor may be normalized or standardized into a score; these scores are then weighted by the neurons and ultimately yield the cancer probability and the non-cancer probability.
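As an illustration of what such a normalization or standardization procedure might look like, the sketch below shows two common choices, min-max normalization and z-score standardization. The text does not specify which formula is used, so both functions, their names, and the sample ages are assumptions.

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Convert values to z-scores (zero mean, unit population standard deviation)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

ages = [25, 30, 45, 60, 75]  # hypothetical ages, for illustration only
normalized_ages = min_max_normalize(ages)
standardized_ages = standardize(ages)
```

Either transform puts factors with very different ranges, such as age and a binary medication flag, on a comparable scale before they reach the input layer.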

In addition, the source of the data fed to the neurons may affect the predictive ability of the DNN model 10. In one embodiment, the patient's 37 pathological factor data may include physiological (biographical) data, comorbidity data, diabetes complication data, medication data, and scoring system (index) data, but are not limited thereto.

In one embodiment, the "physiological data" may include age, sex, lowest urbanization, medium urbanization, high urbanization, highest urbanization, white-collar occupation, blue-collar occupation, and other occupation, but are not limited thereto. In one embodiment, the "comorbidity data" may include hypertension, hyperlipidemia, stroke, congestive heart failure, colorectal polyps, obesity, COPD, CAD, asthma, smoking, inflammatory bowel disease, irritable bowel syndrome, CKD, and alcohol-related disease, but are not limited thereto. In one embodiment, the "diabetes complication data" may include retinopathy, nephropathy, neuropathy, cerebrovascular, cardiovascular, and metabolic complications, but are not limited thereto. In one embodiment, the "medication data" may include metformin, statins, insulin, sulfonylureas, other antidiabetic drugs, TZD, and PVD, but are not limited thereto. In one embodiment, the "index data" may include the aDCSI index, but are not limited thereto.

In one embodiment, each pathological factor datum can be converted into a corresponding numerical score, and the conversion method may vary with the nature of the data. For example, some features map to different scores according to the presence or absence of the feature (different sexes map to different scores, use or non-use of a drug maps to different scores, and so on), while other features are divided into multiple bands and map to different scores through those bands (for example, age 25 maps to one score and age 30 to another). The above is only an example, and the present invention is not limited to it.
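The two kinds of conversion mentioned above, presence or absence and banded values, might be sketched as follows. The actual scores and band widths are not given in the text; the 0/1 coding, the five-year age band, and the field names in the sample record are all hypothetical.

```python
def encode_presence(has_feature):
    """Presence/absence feature: 1.0 if present, 0.0 if absent (assumed coding)."""
    return 1.0 if has_feature else 0.0

def encode_age_band(age, band_width=5):
    """Banded feature: ages in the same band share a score (assumed 5-year bands)."""
    return float(age // band_width)

# Hypothetical patient record with a handful of the 37 factors.
patient = {"age": 62, "male": True, "metformin": True, "insulin": False}

scores = [
    encode_age_band(patient["age"]),
    encode_presence(patient["male"]),
    encode_presence(patient["metformin"]),
    encode_presence(patient["insulin"]),
]
```

With this coding, ages 25 and 30 fall into different bands and therefore receive different scores, in line with the example in the text.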

The basic operation of the computer-aided prediction system 1 is described next. FIG. 3 is a flowchart of the basic steps of a computer-aided prediction method according to an embodiment of the present invention. The method is executed by the computer-aided prediction system 1 of FIG. 1(A), with the DNN model 10 already trained; please also refer to FIGS. 1(A) to 3. As shown in FIG. 3, step S31 is executed first: the data acquisition interface 20 obtains the pathological factor data of a diabetic patient. Next, in step S32, the input layer 12 of the DNN model 10 receives the pathological factor data and transmits the weighted pathological factor data to the hidden neural layers 14. Next, in step S33, each hidden neuron 15 of each hidden neural layer 14 performs operations on the data it receives, and the last hidden neural layer 14 transmits its results to the output layer 16. Next, in step S34, the output layer 16 performs operations on the data it receives and outputs the patient's cancer probability and non-cancer probability. Finally, in step S35, the system 1 predicts the patient's likelihood of developing cancer based on the output results of the output layer 16.

Regarding step S31, in one embodiment, the pathological factor data may be the aforementioned 37 pathological factors, each already converted into a score by a normalization or standard-deviation (standardization) operation.

Regarding step S32, in one embodiment, the weight value corresponding to each pathological factor (the weight value of each basic neuron 13) has already been determined during the training of the DNN model 10 (basic model). In other words, one purpose of training the DNN model 10 is to determine the weight value corresponding to each pathological factor. In one embodiment, the score of each pathological factor undergoes the weighting operation in its basic neuron 13 and is then transmitted to every hidden neuron 15 in the first hidden neural layer 14.

Regarding step S33, in one embodiment, each hidden neuron 15 of each hidden neural layer 14 performs a summation operation, an activation operation (the first-type activation operation), and a weighting operation on the data it receives. The weight value corresponding to each hidden neuron 15 is likewise determined during the training of the DNN model 10 (basic model); that is, one purpose of training the DNN model 10 is to determine the weight value corresponding to each hidden neuron. In one embodiment, the weighted result of each hidden neuron 15 in the last hidden neural layer 14 is transmitted to every output neuron 17 in the output layer 16.

Regarding step S34, in one embodiment, each output neuron 17 of the output layer 16 performs a summation and an activation operation (the second-type activation operation) on the data it receives. In one embodiment, the two output results of the output layer 16 sum to 1 (that is, the sum corresponds to a total predicted probability of 100%).

Regarding step S35, in one embodiment, the system 1 compares the two output results of the output layer 16 and takes the one with the higher probability as the prediction. For example, when the output result corresponding to the non-cancer probability is 0.75 (that is, 75%) and the output result corresponding to the cancer probability is 0.25 (that is, 25%), the system 1 predicts that the patient has a lower likelihood of developing cancer. The present invention is not limited to this.
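The comparison in step S35 can be sketched as a small function. The function name, the returned labels, the tolerance check, and the handling of an exact tie (treated here as the lower-likelihood case) are assumptions added for illustration.

```python
def predict_label(p_cancer, p_no_cancer, tolerance=1e-6):
    """Return the prediction implied by the larger of the two output probabilities."""
    if abs(p_cancer + p_no_cancer - 1.0) > tolerance:
        raise ValueError("the two output results are expected to sum to 1")
    # Assumption: an exact tie is reported as the lower-likelihood case.
    if p_cancer > p_no_cancer:
        return "higher likelihood of cancer"
    return "lower likelihood of cancer"

# Example from the text: non-cancer 0.75, cancer 0.25.
label = predict_label(p_cancer=0.25, p_no_cancer=0.75)
```

For the example values in the text, the non-cancer output is larger, so the function reports the lower-likelihood label.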

Thus, once the DNN model 10 has been established, entering a patient's pathological factor data into the computer-aided prediction system 1 allows the DNN model 10 to predict the patient's likelihood of developing cancer. The patient can then take preventive measures earlier, which can greatly improve the chance of survival.

In addition, before the DNN model 10 can perform steps S31 to S35, it must first be trained to establish the weight value of each neuron. The process of establishing the DNN model 10 is described in detail below.

FIG. 4 is a flowchart of the steps for establishing the DNN model 10 according to an embodiment of the present invention. These steps can be carried out by the processor of the computer-aided prediction system 1 executing instructions in the computer program product 30; please also refer to FIGS. 1 to 4.

First, step S41 is executed, in which the basic model of the DNN model 10 is set up. Next, step S42 is executed, in which the input layer 12 obtains one mini-batch of data from all of the training data. Next, step S43 is executed, in which the basic model is trained with the obtained data to determine the weight value of each neuron in the basic model, thereby establishing a candidate neuron path. Next, step S44 is executed, in which the input layer 12 of the basic model obtains another mini-batch of training data and step S43 is executed again. Next, step S45 is executed, in which step S44 is repeated until a preset condition is reached. Next, step S46 is executed, in which steps S42 to S45 are re-executed a plurality of times (an iteration procedure). Next, step S47 is executed, in which the test module 40 evaluates the prediction ability of each candidate neuron path. Next, step S48 is executed, in which the system 1 sets the candidate neuron path with the best prediction ability as the deep neuron path 18 actually used by the DNN model 10. The above steps can be implemented at least by the processor of the system 1 executing the instructions of the computer program product 30 or the instructions of other computer program products.

Regarding step S41, this step is used to find the optimal variable parameters of the DNN model 10 (the basic model). The variable parameters may be, for example, the number of hidden neural layers and the choice of activation function, and are not limited thereto. This step can be realized by the system 1 receiving instructions input by the user and setting up the basic model according to those instructions. In one embodiment, this step first uses a small portion of the training data to build a plurality of simplified basic models with different parameters, then uses K-fold cross validation to find the simplified basic model with the best performance, and sets that simplified basic model as the basic model of the DNN model 10 (that is, the optimal parameter values are found). Here, "a small portion of the training data" may be, for example, 1/100 of all the training data, but is not limited thereto. In one embodiment, K-fold cross validation performs K validations on each simplified basic model. Each validation includes a training process and a testing process, where the training process determines the weight value of each neuron of the simplified basic model, and the testing process tests the prediction ability of the simplified basic model. For one simplified basic model, each validation divides the aforementioned small portion of the training data into a training group and a test group in a ratio of (K-1):1, where the training group is used for the training process and the test group is used for the testing process. When the K validations are completed, the system 1 averages the accuracy obtained in each validation of the simplified basic model and uses the average as the accuracy of the simplified basic model. In one embodiment, K is 10, that is, each simplified basic model is validated 10 times, and each validation divides the training data into a training group and a test group in a ratio of 9:1, but the present invention is not limited to this. In addition, if the DNN model 10 needs to use best hyperparameters and an optimizer, the best hyperparameters and the optimizer can also be set in step S41. In this way, step S41 finds the simplified basic model with the best parameters to serve as the basic model for the subsequent deep training.
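The K-fold procedure described for step S41 can be sketched as follows. This is a minimal illustration under stated assumptions: `k_fold_indices` and `cross_validate` are hypothetical names, and `train_and_score` stands in for whatever routine trains a simplified basic model on the training group and returns its accuracy on the test group.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each fold serves as the test group once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def cross_validate(train_and_score, n_samples, k=10):
    """Average the accuracy over the k validations of one simplified basic model."""
    scores = [train_and_score(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# With K = 10 and 100 records, each validation uses a 90:10 split (the 9:1 ratio)
folds = list(k_fold_indices(100, k=10))
```

Running `cross_validate` once per parameter setting and keeping the setting with the highest average is one way to realize the selection described above.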

Regarding step S42, the system 1 can first obtain all of the training data, extract one mini-batch of records from all of the training data, and input it into the basic model, so that the basic model uses that mini-batch for deep training (the first deep training). In one embodiment, the total number of training records is defined as at least one million, and the mini-batch size is defined as at least 100 records. In one embodiment, there are 1,315,899 training records in total and the mini-batch size is 128, so the system 1 randomly selects 128 records from the 1,315,899 training records and inputs them into the basic model, but the invention is not limited to this. It should be noted that each training record contains the data of 37 pathological factors of one diabetic patient and the information on whether that patient actually developed cancer.
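The random selection in step S42 can be sketched as below, using the record count and mini-batch size from the embodiment. The function name `draw_mini_batch` and the fixed seed are assumptions for illustration; records are represented only by their indices.

```python
import random

TOTAL_RECORDS = 1_315_899  # total training records in the embodiment
BATCH_SIZE = 128           # mini-batch size in the embodiment

def draw_mini_batch(rng, total=TOTAL_RECORDS, batch_size=BATCH_SIZE):
    """Pick batch_size distinct record indices uniformly at random."""
    return rng.sample(range(total), batch_size)

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
batch = draw_mini_batch(rng)
```

Indices within one mini-batch are distinct here, while separate calls may return overlapping indices.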

Regarding step S43, in this step the basic model is trained with the data obtained in step S42. Because the training data contain the information on whether each diabetic patient actually developed cancer, the basic model can analyze the likely characteristics of the pathological factors in cases with cancer and in cases without cancer, and thereby determine the weight value of each neuron. In one embodiment, the basic model is trained by executing a gradient descent algorithm, which determines the weight value of each neuron. In one embodiment, the gradient descent algorithm may be at least one of Stochastic gradient descent and Adam with Nesterov's accelerated gradient descent, and is not limited thereto. One purpose of using Stochastic gradient descent is to reduce the difference (loss) between the predicted value of the basic model and the actual result, for example so that the difference between the prediction and the actual result reaches a local minimum. One purpose of using Adam with Nesterov's accelerated gradient descent is to increase the probability that the difference between the prediction and the actual result reaches the absolute (global) minimum. In addition, one of the key points of the present invention is to use the characteristics of Stochastic gradient descent and Adam with Nesterov's accelerated gradient descent to improve the accuracy of the basic model's predictions, whereas the way these algorithms are executed is not the focus, so their execution is not detailed here. When step S43 is completed, the basic model has completed one training and one candidate neuron network has been established.
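To make the idea of a gradient descent update concrete, the following is a minimal sketch of one mini-batch step for a single sigmoid neuron with log loss. It is a deliberately reduced stand-in, not the multi-layer model or the Adam with Nesterov variant named above; the function names, learning rate and toy data are all assumptions.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(weights, bias, batch):
    """Mean log loss of a single sigmoid neuron over a batch of (features, label)."""
    total = 0.0
    for features, label in batch:
        p = _sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(batch)

def sgd_step(weights, bias, batch, lr=0.1):
    """One mini-batch gradient descent update of the weights and bias."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for features, label in batch:
        p = _sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
        error = p - label  # derivative of the log loss with respect to the weighted sum
        for i, x in enumerate(features):
            grad_w[i] += error * x
        grad_b += error
    n = len(batch)
    new_weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b / n

toy_batch = [([0.0, 1.0], 1), ([1.0, 0.0], 0)]  # made-up records, not patient data
w, b = [0.0, 0.0], 0.0
loss_before = log_loss(w, b, toy_batch)
w, b = sgd_step(w, b, toy_batch)
loss_after = log_loss(w, b, toy_batch)
```

After one step the loss on the toy batch is lower than before, which is the behavior the paragraph attributes to gradient descent: each update moves the weights in the direction that reduces the difference between prediction and actual result.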

Regarding step S44, the system 1 inputs another mini-batch of data (that is, another 128 records) into the basic model, and the basic model uses that mini-batch to repeat the training of step S43, thereby generating another candidate neuron network. In one embodiment, every mini-batch selected by the system is chosen at random, so the same record may be selected in more than one mini-batch, but this is not a limitation.

Regarding step S45, the system 1 repeatedly executes step S44, thereby generating a plurality of neuron networks of the basic model, until a preset condition is met. In one embodiment, the "preset condition" is that all of the training data have been input into the basic model and the basic model has been trained on them; in another embodiment, the "preset condition" may be that a specified number of neuron networks have been established. These descriptions of the "preset condition" are only examples, and the present invention is not limited thereto.

Regarding step S46, this step applies an iteration procedure to the training of the basic model, that is, steps S42 to S45 are re-executed until a specified number of times is reached, thereby further improving the prediction ability of the neuron networks. In one embodiment, the "specified number of times" is set to at least 1000, but this is not a limitation.
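A rough sense of the training volume implied by steps S42 to S46 can be obtained with the following arithmetic sketch. It assumes that one pass ends when every record has been used once in non-overlapping mini-batches, with a smaller final batch; this is only one reading of the preset condition, since the text also allows mini-batches that share records.

```python
import math

TOTAL_RECORDS = 1_315_899  # from the embodiment
BATCH_SIZE = 128           # from the embodiment
ITERATIONS = 1000          # the lower bound given for the specified number of times

# 10,280 full mini-batches plus one partial mini-batch of 59 records
batches_per_pass = math.ceil(TOTAL_RECORDS / BATCH_SIZE)

# Number of step-S43 weight updates across all iterations under this assumption
total_updates = batches_per_pass * ITERATIONS
```

Under these assumptions one pass takes 10,281 updates, and 1000 iterations take about 10.3 million updates.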

Regarding step S47, in this step the test module 40 evaluates the performance of each candidate neuron network. In one embodiment, the test performed by the test module 40 may include a weighted average recall analysis, which analyzes the sensitivity of the candidate neuron networks. In one embodiment, the test performed by the test module 40 may include a positive predictive value analysis, which analyzes the precision of the candidate neuron networks. In one embodiment, the test performed by the test module 40 may include an F1 analysis, which determines the F1 value of each candidate neuron network (that is, the harmonic mean of sensitivity and precision). In one embodiment, at least two of the weighted average recall, positive predictive value and F1 analyses are performed together. In this way, the prediction performance of each candidate neuron network can be evaluated.
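The three measures named for step S47 follow standard definitions, sketched below in plain Python. The function names and the small label lists are made up for illustration; label 1 stands for cancer and label 0 for no cancer.

```python
def precision_recall_f1(y_true, y_pred, cls):
    """Positive predictive value (precision), recall (sensitivity) and F1 for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def weighted_average_recall(y_true, y_pred):
    """Per-class recall averaged with weights equal to each class's share of the records."""
    n = len(y_true)
    return sum(
        precision_recall_f1(y_true, y_pred, c)[1] * y_true.count(c) / n
        for c in set(y_true)
    )

y_true = [1, 1, 0, 0, 0, 0]  # made-up actual outcomes
y_pred = [1, 0, 0, 0, 0, 1]  # made-up predictions
ppv, sensitivity, f1 = precision_recall_f1(y_true, y_pred, cls=1)
war = weighted_average_recall(y_true, y_pred)
```

For this toy data the cancer class has a precision, recall and F1 of 0.5 each, and the weighted average recall is 4/6.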

Regarding step S48, this step selects the candidate neuron network with the best prediction performance as the deep neuron network 18 actually used by the DNN model 10. When step S48 is completed, the training of the DNN model 10 is complete. From then on, the user (a physician) only needs to input a patient's pathological factor data into the DNN model 10, and the DNN model 10 can analyze the patient's likelihood of developing cancer.
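The selection in step S48 amounts to taking the candidate with the highest evaluation score. A minimal sketch follows; the candidate names and F1 values are made up, and using F1 alone as the ranking score is an assumption, since the text allows several measures.

```python
# Hypothetical F1 values produced by the test module for three candidates
candidate_scores = {"candidate_a": 0.71, "candidate_b": 0.74, "candidate_c": 0.69}

# Keep the candidate whose score is highest
best_candidate = max(candidate_scores, key=candidate_scores.get)
```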

In addition, in one embodiment, dropout (a regularization technique for avoiding overfitting) may be applied to each hidden neural layer 14 and to the output layer 16 during training. In one embodiment, the output layer 16 may use a categorical cross entropy function as a loss function. In addition, in one embodiment, the weight value of each neuron may be initialized using normalized He initialization. The present invention is not limited to this.
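Two of the items above have compact standard formulas, sketched here. The categorical cross entropy is computed for a one-hot target, and `he_std` gives the standard deviation used by the common He-normal scheme, sqrt(2 / fan_in). Whether the "normalized" variant mentioned in the text matches this exact form is not stated, so treat `he_std` as an assumption; both function names are hypothetical.

```python
import math

def categorical_cross_entropy(one_hot_target, probs, eps=1e-12):
    """Loss for one record: minus the log of the probability given to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probs))

def he_std(fan_in):
    """Standard deviation of He-normal initialization for a neuron with fan_in inputs."""
    return math.sqrt(2.0 / fan_in)

# Output [no cancer, cancer] = [0.75, 0.25], as in the step S35 example
loss_if_no_cancer = categorical_cross_entropy([1, 0], [0.75, 0.25])
loss_if_cancer = categorical_cross_entropy([0, 1], [0.75, 0.25])
```

The loss is small when the true class received the high probability and large when it received the low one, which is what drives the weight updates during training.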

FIG. 5 is a schematic diagram of experimental data according to an embodiment of the present invention. It uses ROC curves to present the accuracy of the DNN model 10 of the present invention and of the traditional aDCSI model in estimating the cancer probability of diabetic patients. The Y-axis is the true positive rate (labeled "True positive rate") and the X-axis is the false positive rate (labeled "False positive rate"); both models were tested with the same data. As shown in FIG. 5, the area under the curve (AUC) of the ROC curve of the DNN model 10 is about 0.738, while the AUC of the aDCSI model is about 0.492. It can be seen that the DNN model 10 of the present invention has better prediction ability than the traditional aDCSI model.
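For readers unfamiliar with the AUC, the sketch below computes it through its rank interpretation: the probability that a randomly chosen patient with cancer receives a higher score than a randomly chosen patient without cancer. The function name and the four records are made up; they are not the experimental data of FIG. 5.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half a win."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

labels = [1, 1, 0, 0]          # made-up outcomes: 1 = cancer, 0 = no cancer
scores = [0.9, 0.4, 0.6, 0.2]  # made-up predicted cancer probabilities
toy_auc = auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking, so a value near 0.492 indicates almost no discriminative ability, while 0.738 indicates a ranking that is clearly better than chance.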

In this way, the DNN model used in the present invention can be established. In other words, as long as a patient's pathological factor data are input into the DNN model, the DNN model can automatically predict the patient's likelihood of suffering from rectal cancer. Through deep learning training, the computer-aided prediction system of the present invention can accurately predict a patient's probability of developing cancer and can assist the patient in seeking the best form of medical care.

In addition, in one embodiment, the computer-aided prediction system, method and computer program product of the present invention can be implemented according to the content described in the paper "Development of a Prediction Model for Colorectal Cancer among Patients with Type 2 Diabetes Mellitus Using a Deep Neural Network", by Meng-Hsuen Hsieh, Li-Min Sun, Cheng-Li Lin, Meng-Ju Hsieh, Kyle Sun, Chung-Y. Hsu, An-Kuo Chou, and Chia-Hung Kao, but are not limited thereto.

Although the present invention has been described through the above embodiments, it is to be understood that many modifications and variations are possible in accordance with the spirit of the present invention and the scope of the claims.

1‧‧‧Computer-aided prediction system

20‧‧‧Data acquisition interface

30‧‧‧Computer program product

40‧‧‧Test module

Claims

A computer-aided prediction system for predicting the possibility of a diabetic patient suffering from rectal cancer, including: a deep neural network (DNN) model, which performs a feature analysis on a plurality of pathological factor data of the diabetic patient through a deep neuron path, wherein the DNN model includes: a plurality of neurons, at least a part of which correspond to the pathological factor data; and an output layer, which outputs at least one output result related to the possibility of cancer according to the feature analysis; wherein the DNN model determines a weight value corresponding to each neuron through a plurality of trainings, and thereby establishes the deep neuron path; wherein the pathological factor data include physiological data, comorbidity data, diabetes complication data, therapeutic drug data and index data.

The computer-aided prediction system according to claim 1, wherein the DNN model further includes an input layer and a plurality of hidden neural layers, wherein the input layer includes a plurality of basic neurons corresponding to the pathological factor data, each of the hidden neural layers contains a plurality of hidden neurons, and each hidden neuron in one hidden neural layer is connected to all the basic neurons.

The computer-aided prediction system according to claim 2, wherein the number of hidden neural layers is determined by performing a k-fold cross-validation method, where k is set to 10.

The computer-aided prediction system according to claim 2, wherein the DNN model performs each training by randomly selecting a plurality of training data, and creates a plurality of candidate deep neural networks, wherein each training is defined as performing at least one gradient descent operation on the training data to determine the weight value corresponding to each basic neuron and each hidden neuron, and each training data record contains data on 37 pathological factors of a diabetic patient and information on whether the diabetic patient suffers from rectal cancer.

The computer-aided prediction system according to claim 4, wherein the deep neural network actually used by the DNN model is the one of the candidate deep neural networks with the best prediction ability, and the prediction ability is determined by at least one of weighted average recall analysis, positive predictive value analysis and F1 analysis.

A computer-aided prediction method for predicting the possibility of a diabetic patient suffering from rectal cancer, the method being implemented by a computer-aided prediction system, wherein the computer-aided prediction system includes a DNN model, and the DNN model includes a plurality of neurons and an output layer, the method including the steps of: obtaining a plurality of pathological factor data of the diabetic patient; performing, by the DNN model, a feature analysis on the pathological factor data through a deep neuron path; and outputting, by the output layer, at least one output result related to the possibility of cancer according to the feature analysis; wherein the DNN model determines a weight value of each basic neuron through a plurality of trainings, and thereby establishes the deep neuron path; wherein the pathological factor data include physiological data, comorbidity data, diabetes complication data, therapeutic drug data and index data.

The computer-aided prediction method according to claim 6, wherein the DNN model further includes an input layer and a plurality of hidden neural layers, wherein the input layer includes a plurality of basic neurons corresponding to the pathological factor data, each of the hidden neural layers contains a plurality of hidden neurons, and each hidden neuron in one hidden neural layer is connected to all the basic neurons.

The computer-aided prediction method according to claim 7, wherein the DNN model performs each training by randomly selecting a plurality of training data, and creates a plurality of candidate deep neural networks, wherein each training is defined as performing at least one gradient descent operation on the training data to determine the weight value corresponding to each basic neuron and each hidden neuron, and each training data record contains data on 37 pathological factors of a diabetic patient and information on whether the diabetic patient suffers from rectal cancer.

The computer-aided prediction method according to claim 8, wherein the deep neural network actually used by the DNN model is the one of the candidate deep neural networks with the best prediction ability, and the prediction ability is determined by at least one of weighted average recall analysis, positive predictive value analysis and F1 analysis.

A computer program product, stored in a non-transitory computer-readable medium, for operating a computer-aided prediction system, wherein the computer-aided prediction system is used to predict the likelihood of a diabetic patient suffering from rectal cancer, and the computer-aided prediction system includes a DNN model with a plurality of basic neurons and an output layer, the computer program product including: an instruction to obtain a plurality of pathological factor data of the diabetic patient; an instruction to cause the DNN model to perform a feature analysis on the pathological factor data through a deep neuron path; and an instruction to cause the output layer to output at least one output result related to the possibility of cancer according to the feature analysis; wherein the DNN model determines a weight value of each basic neuron through a plurality of trainings, and thereby establishes the deep neuron path; wherein the pathological factor data include physiological data, comorbidity data, diabetes complication data, therapeutic drug data and index data.