TWI836965B - Preoperative risk prediction system - Google Patents

Info

Publication number
TWI836965B
Authority
TW
Taiwan
Prior art keywords
data
knowledge base
model
time series
surgery
Prior art date
Application number
TW112114207A
Other languages
Chinese (zh)
Inventor
李偉柏
高子平
程廣義
王照元
Original Assignee
高雄醫學大學
國立中山大學
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 高雄醫學大學 and 國立中山大學
Priority to TW112114207A
Application granted
Publication of TWI836965B

Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present invention provides a preoperative risk prediction system, comprising: a database unit, which stores a plurality of historical data of a plurality of patients; and a processing unit, which is electrically connected to the database unit and includes a knowledge base model, a time-series overlay knowledge base model and a machine learning model. The plurality of historical data is input to the time-series overlay knowledge base model, which performs time-series overlay data processing on the historical data to generate a plurality of time-series overlay knowledge base data. The time-series overlay knowledge base data is then input to the knowledge base model, which performs data fusion processing on it to generate a plurality of fused knowledge base data. The fused knowledge base data is input to the machine learning model, which performs knowledge-base-based machine learning on the fused knowledge base data to generate a plurality of output data, including predicted length-of-stay data, predicted mortality data and influencing factor analysis data for each patient.

Description

Preoperative risk prediction system

The present invention relates to a risk prediction system, and in particular to a preoperative risk prediction system that uses machine learning based on knowledge base learning to accurately predict, before surgery, a patient's postoperative mortality and length of hospital stay.

Generally, before a patient undergoes surgery, an anesthesiologist conducts a pre-anesthesia consultation. The anesthesiologist reviews relevant information such as the patient's medical history and physical examination reports, and the patient informs the anesthesiologist of his or her physical condition and medication use during the consultation. Based on this information, and in combination with anesthesia risk grading tools such as ASAPS (American Society of Anesthesiologists physical status classification system) and POSPOM (preoperative score to predict postoperative mortality), the anesthesiologist assesses the risks of anesthesia and surgery, thereby ensuring the patient's safety during the subsequent anesthesia and surgery.

However, in most current pre-anesthesia consultations the physician obtains the patient's relevant information and then relies on personal experience to assess risk from that information, so differences in individual experience may affect the accuracy of the final judgment. In addition, if a model is to assist the anesthesiologist before the risk grading is completed, the predictive model must not take an anesthesia risk grading tool, such as the ASA classification, as input data. In recent years, with the development of technology, artificial intelligence (AI) techniques such as machine learning (ML) and natural language processing (NLP) have gradually been applied in the medical industry. Applying AI to the risk assessment of the pre-anesthesia consultation would therefore help physicians perform preoperative risk assessment.

Therefore, how to design a preoperative risk prediction system that uses machine learning based on knowledge base learning to accurately predict a patient's postoperative mortality and length of hospital stay, and thereby solve the technical problems of the prior art, is an important subject studied by the inventors of the present invention.

In view of this, the inventors, drawing on many years of research and practical experience in related fields, provide a preoperative risk prediction system to address the problems described in the prior art.

To solve the problems of the prior art described above, the present invention provides a preoperative risk prediction system comprising: a database unit, which stores a plurality of historical data of a plurality of patients; and a processing unit, which is electrically connected to the database unit and includes a time-series overlay knowledge base model, a knowledge base model (knowledge-based model), and a machine learning model. The plurality of historical data is input to the time-series overlay knowledge base model, which performs time-series overlay data processing on the historical data to generate a corresponding plurality of time-series overlay knowledge base data, which includes average length-of-stay data and mortality data for each patient. The time-series overlay knowledge base data is input to the knowledge base model, which performs data fusion processing on it to generate a plurality of fused knowledge base data, which likewise includes average length-of-stay data and mortality data for each patient. The fused knowledge base data is input to the machine learning model, which performs knowledge-base-based machine learning on it to generate a plurality of output data, which includes predicted length-of-stay data, predicted mortality data and influencing factor analysis data for each patient.

Preferably, the plurality of historical data includes a plurality of data sets, each from a different medical institution.

Preferably, the time-series overlay knowledge base model is configured to extract the surgery code, death flag, and year of the historical data in the plurality of data sets, and to compute mortality statistics for each surgery code in each year, so as to output the plurality of time-series overlay knowledge base data and build a knowledge base. In the knowledge base, the time-series overlay knowledge base data output by the model and labeled as year m is overlaid on the time-series overlay knowledge base data labeled as year m-1, where m is a positive integer.
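The overlay step above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that "overlaying" year m on year m-1 means pooling deaths and case counts cumulatively per surgery code, and the function name `build_overlay_knowledge_base` and the record layout are hypothetical.

```python
from collections import defaultdict

def build_overlay_knowledge_base(records):
    """records: iterable of (surgery_code, year, died), with died in {0, 1}.

    Returns {year: {surgery_code: cumulative_mortality}}, where the counts
    for year m are added to those accumulated through year m-1 (assumed
    interpretation of the overlay).
    """
    # year -> surgery code -> [deaths, cases]
    per_year = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for code, year, died in records:
        cell = per_year[year][code]
        cell[0] += died
        cell[1] += 1

    running = defaultdict(lambda: [0, 0])  # cumulative [deaths, cases] per code
    knowledge_base = {}
    for year in sorted(per_year):
        for code, (deaths, cases) in per_year[year].items():
            running[code][0] += deaths
            running[code][1] += cases
        knowledge_base[year] = {c: d / n for c, (d, n) in running.items()}
    return knowledge_base

# Toy records with made-up surgery codes.
records = [
    ("A01", 2019, 0), ("A01", 2019, 1),
    ("A01", 2020, 0), ("A01", 2020, 0),
    ("B02", 2020, 1),
]
kb = build_overlay_knowledge_base(records)
```

Under this reading, each year's entry reflects all cases seen up to and including that year, so sparse surgery codes gain support as years accumulate.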

Preferably, the knowledge base model is connected to the knowledge base and is configured to identify the source medical institution of the historical data, and then to: perform early fusion by surgery code using the past data of the source medical institution in the knowledge base; perform early fusion by surgery code using the past data of medical institutions other than the source medical institution in the knowledge base; or fill in data using last year's average mortality of the source medical institution or of medical institutions other than the source medical institution.
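One way to read the three options above is as a fallback chain, sketched below. The ordering (source institution first, then other institutions, then last year's average) and the simple averaging across other institutions are assumptions made for illustration; the text only lists the options as alternatives. The function name `fuse_mortality` and the dictionary layouts are hypothetical.

```python
def fuse_mortality(kb, hospital, surgery_code, last_year_avg):
    """kb: {hospital: {surgery_code: mortality}} read from the knowledge base.
    last_year_avg: {hospital: average mortality in the previous year}.
    Returns (value, tag), where tag records which rule supplied the value.
    """
    # 1) Past data of the source institution, matched by surgery code.
    own = kb.get(hospital, {})
    if surgery_code in own:
        return own[surgery_code], "source"
    # 2) Past data of the other institutions, matched by surgery code.
    others = [rates[surgery_code] for h, rates in kb.items()
              if h != hospital and surgery_code in rates]
    if others:
        return sum(others) / len(others), "other"
    # 3) Fill with last year's average mortality (source first, then others).
    if hospital in last_year_avg:
        return last_year_avg[hospital], "fill_source"
    rest = [v for h, v in last_year_avg.items() if h != hospital]
    if rest:
        return sum(rest) / len(rest), "fill_other"
    return None, "missing"

# Toy knowledge base with made-up codes and rates.
kb = {
    "KMUH": {"A01": 0.02},
    "KMHK": {"A01": 0.04, "B02": 0.10},
    "KMTTH": {"B02": 0.20},
}
last_year = {"KMUH": 0.01, "KMHK": 0.03}
hit_source = fuse_mortality(kb, "KMUH", "A01", last_year)
hit_other = fuse_mortality(kb, "KMUH", "B02", last_year)
hit_fill = fuse_mortality(kb, "KMUH", "C03", last_year)
```

Returning the tag alongside the value makes it possible to audit how often each rule fires for a given institution.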

Preferably, the machine learning model is configured to extract a plurality of feature parameters from the fused knowledge base data, the feature parameters including at least one selected from: International Statistical Classification of Diseases and Related Health Problems (ICD) codes, surgery codes, the historical average mortality of each surgery, Diagnosis Related Groups (DRG) codes, blood test items, sex, age, number of days hospitalized before surgery, number of surgeries in a single hospitalization, and hospitalization department.

Preferably, the preoperative risk prediction system of the present invention further includes a data preprocessing model configured to preprocess the historical data in each data set: it fills in missing values in the historical data, standardizes the historical data using the training data set as the reference, encodes the historical data using the training data set, and outputs the preprocessed historical data to the knowledge base model.
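The point of using the training data set as the reference is that fill values and scaling constants are learned once from training data and then reused unchanged on every other data set. A minimal sketch follows, assuming mean imputation and z-score standardization; the text names neither method, and `fit_scaler` and `transform` are hypothetical helpers.

```python
from statistics import mean, pstdev

def fit_scaler(train_values):
    """Learn the fill value and scale from the training set only (None = missing)."""
    observed = [v for v in train_values if v is not None]
    mu = mean(observed)
    sigma = pstdev(observed) or 1.0  # guard against zero variance
    return mu, sigma

def transform(values, mu, sigma):
    """Impute missing entries with the training mean, then z-score."""
    return [((mu if v is None else v) - mu) / sigma for v in values]

# Toy ages: statistics come from the training set and are applied to external data.
train_age = [40, 60, None, 50]
mu, sigma = fit_scaler(train_age)
external_age = [50, None, 70]
scaled = transform(external_age, mu, sigma)
```

Because the external data never influences `mu` or `sigma`, the external evaluation stays free of information leakage from the data being tested.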

Preferably, the preoperative risk prediction system of the present invention further includes a user interface for entering the historical data of a patient before surgery, so as to obtain output data including the patient's predicted length-of-stay data, predicted mortality data and influencing factor analysis data.

Preferably, the time-series overlay knowledge base model, the knowledge base model, and the machine learning model are each at least one selected from the group consisting of a gradient boosting decision tree model, a histogram gradient boosting decision tree model, a random forest model, and a long short-term memory model.

In summary, the preoperative risk prediction system of the present invention includes a data preprocessing model, a knowledge base model, a time-series overlay knowledge base model, and a machine learning model, and thus uses different algorithms to process the historical data in the data sets from different medical institutions. Specifically, the historical data is input to the time-series overlay knowledge base model, which performs time-series overlay data processing on it to generate corresponding time-series overlay knowledge base data, which includes average length-of-stay data and mortality data for each patient. The time-series overlay knowledge base data is then input to the knowledge base model, which performs data fusion processing on it to generate fused knowledge base data, which includes average length-of-stay data and mortality data for each patient. Finally, the fused knowledge base data is input to the machine learning model, which performs knowledge-base-based machine learning on it to generate output data, which includes predicted length-of-stay data, predicted mortality data and influencing factor analysis data for each patient, for the physician to use as a reference in the pre-anesthesia consultation. In this way, the preoperative risk prediction system of the present invention can accurately predict a patient's postoperative length of stay and mortality through knowledge-base-based machine learning, and according to the experimental results it achieves an accuracy of more than 90% in predicting both length of stay and mortality.

With the above configuration, the preoperative risk prediction system of the present invention can work alongside the pre-anesthesia consultation process to assist physicians in performing preoperative risk assessment, reducing errors of human judgment and improving the accuracy of preoperative risk assessment, which helps ensure the patient's safety during the subsequent anesthesia and surgery.

1: Preoperative risk prediction system

10: Database unit

111: First data set

112: Second data set

113: Third data set

20: Processing unit

21: Time-series overlay knowledge base model

22: Knowledge base

23: Knowledge base model

24: Machine learning model

241: Output data

25: Data preprocessing model

30: User interface

Figure 1 is a schematic architecture diagram of a preoperative risk prediction system according to an embodiment of the present invention;

Figure 2 is a schematic flow chart of the time-series overlay data processing of the time-series overlay knowledge base model of the preoperative risk prediction system according to an embodiment of the present invention;

Figure 3 is a schematic flow chart of the data fusion processing of the knowledge base model of the preoperative risk prediction system according to an embodiment of the present invention;

Figure 4 is a schematic block diagram of the preoperative risk prediction system according to an embodiment of the present invention;

Figure 5 is a receiver operating characteristic (ROC) curve for mortality prediction by the preoperative risk prediction system according to an embodiment of the present invention;

Figure 6 is a receiver operating characteristic (ROC) curve for length-of-stay prediction by the preoperative risk prediction system according to an embodiment of the present invention;

Figures 7A to 7C are schematic diagrams of bootstrap comparison results for a histogram gradient boosting decision tree (HGBT) model with and without the knowledge base, for the preoperative risk prediction system according to an embodiment of the present invention; and

Figures 8A to 8C are schematic diagrams of bootstrap comparison results for a gradient boosting decision tree (Gradient Boosting Decision Tree 300, GBT300) model with and without the knowledge base, for the preoperative risk prediction system according to an embodiment of the present invention.

To help the examiner understand the technical features, content and advantages of the present invention and the effects it can achieve, the present invention is described in detail below by way of embodiments with reference to the accompanying drawings. The drawings are intended only for illustration and to support the description; they do not necessarily reflect the true proportions or precise configuration of the invention as implemented, and the proportions and arrangement shown in them should not be read as limiting the scope of rights of the present invention in actual implementation.

It should be understood that although the terms "first", "second", etc. may be used herein to describe various elements, components, regions, layers and/or parts, these elements, components, regions, layers and/or parts should not be limited by these terms. The terms are used only to distinguish one element, component, region, layer and/or part from another. Accordingly, a "first element", "first component", "first region", "first layer" and/or "first part" discussed below could be termed a "second element", "second component", "second region", "second layer" and/or "second part" without departing from the spirit and teachings of the present invention.

In addition, the terms "comprise" and/or "include" specify the presence of the stated features, regions, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, regions, integers, steps, operations, elements, components and/or combinations thereof.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art and the present invention, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined herein.

The present invention is described in further detail below with reference to the accompanying drawings. The drawings are simplified schematic diagrams that illustrate only the basic structure of the present invention, and are therefore not intended to limit it.

Please refer to Figures 1 to 6, Figures 7A to 7C, and Figures 8A to 8C. Figure 1 is a schematic architecture diagram of a preoperative risk prediction system according to an embodiment of the present invention; Figure 2 is a schematic flow chart of the time-series overlay data processing of the time-series overlay knowledge base model; Figure 3 is a schematic flow chart of the data fusion processing of the knowledge base model; Figure 4 is a schematic block diagram of the preoperative risk prediction system; Figure 5 is a receiver operating characteristic (ROC) curve for mortality prediction; Figure 6 is a receiver operating characteristic (ROC) curve for length-of-stay prediction; Figures 7A to 7C are schematic diagrams of bootstrap comparison results for a histogram gradient boosting decision tree (HGBT) model with and without the knowledge base; and Figures 8A to 8C are schematic diagrams of bootstrap comparison results for a gradient boosting decision tree (Gradient Boosting Decision Tree 300, GBT300) model with and without the knowledge base, each according to an embodiment of the present invention.

As shown in Figure 1, an embodiment of the present invention provides a preoperative risk prediction system 1, which includes: a database unit 10, which stores a plurality of historical data of a plurality of patients; and a processing unit 20, which is electrically connected to the database unit 10 and includes a time-series overlay knowledge base model 21, a knowledge base model (knowledge-based model) 23, and a machine learning model 24. The plurality of historical data is input to the time-series overlay knowledge base model 21, which performs time-series overlay data processing on the historical data to generate a corresponding plurality of time-series overlay knowledge base data, which includes average length-of-stay data and mortality data for each patient. The time-series overlay knowledge base data is input to the knowledge base model 23, which performs data fusion processing on it to generate a plurality of fused knowledge base data, which includes average length-of-stay data and mortality data for each patient. Finally, the fused knowledge base data is input to the machine learning model 24, which uses it to perform knowledge-base-based machine learning and generate a plurality of output data 241, which includes predicted length-of-stay data, predicted mortality data and influencing factor analysis data for each patient.
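The data flow through models 21, 23 and 24 can be sketched as three chained functions. The stage bodies below are deliberately trivial stand-ins chosen for illustration; they are not the models described in the patent, and the names `overlay_stage`, `fusion_stage` and `learner` are hypothetical. Passing the raw history into the fusion stage alongside the overlay data is also an assumption.

```python
def overlay_stage(history):
    """Stand-in for model 21: per-surgery-code mortality computed from history."""
    stats = {}
    for rec in history:
        deaths, cases = stats.get(rec["code"], (0, 0))
        stats[rec["code"]] = (deaths + rec["died"], cases + 1)
    return {code: d / n for code, (d, n) in stats.items()}

def fusion_stage(history, overlay):
    """Stand-in for model 23: attach knowledge-base mortality to each record."""
    return [{**rec, "kb_mortality": overlay[rec["code"]]} for rec in history]

def learner(fused):
    """Stand-in for model 24: a placeholder that echoes the KB mortality as the risk."""
    return [{"predicted_mortality": rec["kb_mortality"]} for rec in fused]

# Toy history with made-up surgery codes.
history = [
    {"code": "A01", "died": 0},
    {"code": "A01", "died": 1},
    {"code": "B02", "died": 0},
]
outputs = learner(fusion_stage(history, overlay_stage(history)))
```

In a real system the last stage would be a trained model (for example one of the tree ensembles named elsewhere in the description) rather than a pass-through.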

Specifically, the predicted length-of-stay data may be the number of days the patient is expected to stay in hospital after surgery, for example 3 days or 5 days, and the predicted mortality data may be the mortality under conditions such as death within 24 hours after surgery, death within 48 hours after surgery, death within 72 hours after surgery, or in-hospital death. The influencing factor analysis data may be the factors that each model has learned to associate with postoperative mortality prediction, which may include cardiac test data (for example, myocardial enzymes and B-type natriuretic peptide (BNP)), renal test data (for example, creatinine and blood urea nitrogen (BUN)), coagulation test data (for example, the international normalized ratio (INR)), and International Statistical Classification of Diseases and Related Health Problems (ICD) codes.

Specifically, the plurality of historical data includes a first data set 111, a second data set 112 and a third data set 113, each from a different medical institution. The first data set 111 is the data set of the primary hospital that uses the preoperative risk prediction system 1 of the present invention to assist physicians in preoperative risk assessment, that is, the internal data set; the second data set 112 and the third data set 113 are data sets from other hospitals, that is, external data sets. The accuracy of the preoperative risk prediction system 1 is verified below using these internal and external data sets, so the details are not repeated here. The first data set 111, the second data set 112 and the third data set 113 may include numerical data such as the patient's age, preoperative blood test data, the number of surgeries in the current hospitalization, the dates of past surgeries, and the number of days hospitalized before surgery, as well as categorical data such as preoperative diagnosis codes including comorbidities (for example, five ICD codes in order), planned procedure codes (PCS codes), Diagnosis Related Groups (DRG) codes, surgical department codes, and sex.

In this embodiment, the first data set 111 is from Kaohsiung Medical University Chung-Ho Memorial Hospital (KMUH), and the second data set 112 and the third data set 113 are from Kaohsiung Municipal Hsiao-Kang Hospital (KMHK) and Kaohsiung Municipal Ta-Tung Hospital (KMTTH), respectively, but the present invention is not limited thereto. In other embodiments, data from other medical institutions may also be used.

In addition, in this embodiment the historical data is de-identified patient data. It excludes patients under 20 years of age at the time of surgery, patients under local anesthesia, and outpatient surgeries not performed in the operating room, so as to improve the accuracy of the subsequent prediction of postoperative mortality and length of stay. A death is counted when the patient's electronic medical record shows in-hospital death within 30 days after surgery, or when cross-matching against the cause-of-death statistics file of the Taiwan National Health Insurance database shows a date of death within 30 days after surgery. The outcome is then encoded as binary data, where 0 means the patient survived and 1 means the patient died, so the prediction endpoint covers patients who died either in hospital or out of hospital after surgery.
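The exclusion criteria and the binary endpoint above can be sketched as two small functions. This is an illustrative sketch: the helper names, the `"local"` anesthesia label, and the choice to treat day 30 itself as inside the window are assumptions not specified in the text.

```python
from datetime import date

def is_eligible(age_at_surgery, anesthesia, in_operating_room):
    """Apply the three exclusions: under 20, local anesthesia, not in the operating room."""
    return age_at_surgery >= 20 and anesthesia != "local" and in_operating_room

def mortality_label(surgery_date, death_date, window_days=30):
    """Return 1 if death occurred within `window_days` after surgery, else 0.

    `death_date` is None for survivors; it may come from the hospital record
    or from a registry match, so out-of-hospital deaths are also counted.
    """
    if death_date is None:
        return 0
    delta = (death_date - surgery_date).days
    return 1 if 0 <= delta <= window_days else 0

label_day_30 = mortality_label(date(2020, 1, 1), date(2020, 1, 31))
label_day_31 = mortality_label(date(2020, 1, 1), date(2020, 2, 1))
label_alive = mortality_label(date(2020, 1, 1), None)
```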

For example, in this embodiment, the first dataset 111 from Kaohsiung Medical University Chung-Ho Memorial Hospital (KMUH), which serves as the internal dataset, contains historical data for 87,907 patients and 118,442 surgeries. After excluding 7,629 surgeries performed on patients who were minors at the time of surgery, 82,573 patients and 110,813 surgeries remain. These data are further divided into a KMUH training dataset (75,724 surgeries), a KMUH validation dataset (16,750 surgeries), and a KMUH test dataset (18,339 surgeries). The second dataset 112 from Kaohsiung Municipal Hsiao-Kang Hospital (KMHK), which serves as an external dataset, contains historical data for 22,512 patients and 27,731 surgeries. After excluding 1,491 surgeries performed on minors, 21,195 patients and 26,240 surgeries remain, and these data can be used as the KMHK training dataset (26,240 surgeries).

Similarly, the third dataset 113 from Kaohsiung Municipal Ta-Tung Hospital (KMTTH), which also serves as an external dataset, contains historical data for 227,822 [sic] patients and 27,859 surgeries. After excluding 826 surgeries performed on minors, 22,066 patients and 27,033 surgeries remain, and these data can be used as the KMTTH training dataset (27,033 surgeries).
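The surgery counts quoted for the three hospitals can be cross-checked with simple arithmetic. The sketch below only restates figures from the text; the dictionary layout and function names are illustrative assumptions.

```python
# Surgery counts as stated in the description (before and after excluding minors).
cohorts = {
    "KMUH":  {"surgeries": 118_442, "excluded": 7_629, "kept": 110_813},
    "KMHK":  {"surgeries": 27_731,  "excluded": 1_491, "kept": 26_240},
    "KMTTH": {"surgeries": 27_859,  "excluded": 826,   "kept": 27_033},
}

# Internal (KMUH) split by number of surgeries.
kmuh_split = {"train": 75_724, "validation": 16_750, "test": 18_339}


def is_consistent(cohort: dict) -> bool:
    """Total minus excluded surgeries must equal the retained surgeries."""
    return cohort["surgeries"] - cohort["excluded"] == cohort["kept"]


def split_shares(split: dict) -> dict:
    """Fraction of surgeries assigned to each subset."""
    total = sum(split.values())
    return {name: count / total for name, count in split.items()}
```

All three cohorts are internally consistent, and the KMUH split sums to the 110,813 retained surgeries, which corresponds to roughly 68% training, 15% validation, and 17% test.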

Furthermore, the preoperative risk prediction system 1 of the present invention further includes a data preprocessing model 25, which performs data preprocessing on the plurality of historical data. The data preprocessing includes the following steps: missing data imputation, data splitting, data normalization, and encoding of categorical features, each of which is described in detail below. The preprocessed historical data are then input to the knowledge base model 23.

Specifically, the data preprocessing model 25 separately preprocesses the first dataset 111 (internal dataset) and the second dataset 112 and third dataset 113 (external datasets), which come from different medical institutions, and transmits the preprocessed historical data to the knowledge base model 23. The knowledge base model 23 then computes, for each medical institution, statistics such as the total number of surgeries per year, the total number of deaths, the cumulative average mortality of each procedure, the overall average mortality per year, the average mortality across all procedures, and the standard deviation of mortality for each procedure.

Further, the data preprocessing performed by the data preprocessing model 25 proceeds as follows. In missing data imputation, missing values in the numeric data are filled with -1, and values above the upper limit or below the lower limit are replaced with the corresponding limit value. Categorical data are encoded according to the ICD and DRG dictionary files, and items that are missing or not present in the dictionary files are encoded as a separate category. In data splitting, the historical data are split chronologically, so that the data computed for an individual case do not contain repeated information from the past. In data normalization, the numeric data are standardized against a selected training dataset to a mean of 0, with values expressed in units of standard deviation, and are then rescaled to the interval from 0 to 1; datasets from different medical institutions are all standardized against the same training dataset. In encoding of categorical features, an encoding dictionary is built for all codes from the selected training dataset, and nominal encoding is then applied.
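The imputation, normalization, and encoding steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions: out-of-range values are clipped to the nearest limit, the 0-to-1 rescaling uses the range of the training z-scores (with clipping for other datasets), code 0 is reserved for missing or unknown categories, and how the -1 sentinel interacts with normalization is not specified in the text and is not handled here. All function names are hypothetical.

```python
from statistics import mean, pstdev


def impute_and_clip(values, lower, upper):
    """Missing (None) becomes -1; values outside [lower, upper] become the nearest limit."""
    return [-1.0 if v is None else float(min(max(v, lower), upper)) for v in values]


def fit_scaler(train):
    """Learn mean, standard deviation, and z-score range from the training set only."""
    mu, sigma = mean(train), pstdev(train) or 1.0
    z = [(v - mu) / sigma for v in train]
    return {"mu": mu, "sigma": sigma, "zmin": min(z), "zmax": max(z)}


def transform(values, scaler):
    """Standardize with training statistics, then rescale to [0, 1] (clipped)."""
    span = (scaler["zmax"] - scaler["zmin"]) or 1.0
    out = []
    for v in values:
        z = (v - scaler["mu"]) / scaler["sigma"]
        out.append(min(max((z - scaler["zmin"]) / span, 0.0), 1.0))
    return out


def fit_codebook(train_categories):
    """Nominal encoding dictionary built from training categories; 0 means unknown."""
    known = sorted({c for c in train_categories if c is not None})
    return {c: i for i, c in enumerate(known, start=1)}


def encode(categories, codebook):
    """Map each category to its integer code; missing or unseen values map to 0."""
    return [codebook.get(c, 0) for c in categories]
```

Fitting the scaler and the codebook once on the selected training dataset, then reusing them for every hospital, mirrors the requirement that all datasets be standardized against the same training dataset.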

The preoperative risk prediction system 1 of the present invention treats the first dataset 111 as the internal dataset and the second dataset 112 and third dataset 113 as external datasets. The historical data of the first dataset 111 and those of the second dataset 112 and third dataset 113 are input separately to the time-series overlay knowledge base model 21 for time-series overlay data processing, which produces time-series overlay knowledge base data that distinguish internal from external data, and the knowledge base 22 is built from these data. Likewise, the historical data of the internal and external datasets are input separately to the data preprocessing model 25 for data preprocessing, which facilitates subsequent verification of the accuracy of the preoperative risk prediction system 1.

In addition, the data produced by the data preprocessing model 25 are input to the knowledge base model 23 for data fusion processing, and the resulting fused knowledge base data are input to the machine learning model 24 to generate the output data 241. This effectively reduces the influence of noise in the input historical data and improves the accuracy of the final output. When performing a pre-anesthesia assessment, a physician can therefore also consult the predicted length-of-stay data, predicted mortality data, and influencing factor analysis data in the output data 241 generated by the preoperative risk prediction system 1, which reduces the human judgment errors that may arise when preoperative risk is assessed from the physician's experience alone and improves the accuracy of preoperative risk assessment.

Further, the time-series overlay knowledge base model 21 is configured to extract the surgery code, death flag, and year of each of the historical data in the datasets, and to compute mortality statistics for each surgery code in each year, thereby outputting a plurality of time-series overlay knowledge base data from which the knowledge base 22 is built. In the knowledge base 22, the time-series overlay knowledge base data labeled as year m output by the time-series overlay knowledge base model 21 are overlaid on the time-series overlay knowledge base data labeled as year m-1, where m is a positive integer.

Specifically, as shown in FIG. 2, the time-series overlay knowledge base model 21 receives the plurality of historical data from the database model and performs the time-series overlay data processing as follows. First, the time-series overlay knowledge base model 21 identifies the source medical institution of each item of historical data, computes mortality statistics for the historical data by surgery code and by year, and sorts the results by year.

The time-series overlay knowledge base model 21 then loops over the sorted historical data by surgery code. For each surgery code, it first determines whether any years are missing from the historical data. If years are missing, the n missing years are deducted and the average mortality for the surgery code is computed with a moving window of (10 - n) years, where n is a positive integer, and the time-series overlay knowledge base data are output. If no years are missing, the average mortality for the surgery code is computed with a moving window of 10 years, and the time-series overlay knowledge base data are output. Finally, the time-series overlay knowledge base data labeled as year m are overlaid on those labeled as year m-1, and the knowledge base 22 is built from the time-series overlay knowledge base data.
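The moving-window calculation described above can be sketched as follows. This is one possible reading, not the system's implementation: the window is assumed to end at year m inclusive, the "average mortality" is taken as the mean of the yearly mortality rates, and years without data simply drop out of the window, which shrinks it from 10 to (10 - n) years. The record layout and function names are hypothetical.

```python
from collections import defaultdict

WINDOW = 10  # years, as stated in the description


def yearly_mortality(records):
    """records: iterable of (surgery_code, year, died) with died in {0, 1}.

    Returns {surgery_code: {year: mortality_rate}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [deaths, surgeries]
    for code, year, died in records:
        counts[code][year][0] += died
        counts[code][year][1] += 1
    return {code: {y: d / n for y, (d, n) in years.items()}
            for code, years in counts.items()}


def overlay_knowledge_base(records):
    """Returns {surgery_code: {year m: mean mortality over available years in [m-9, m]}}."""
    kb = {}
    for code, rates in yearly_mortality(records).items():
        kb[code] = {}
        for m in sorted(rates):  # process years in chronological order
            # Years with no data are skipped, so n gaps give a (10 - n)-year window.
            window_years = [y for y in range(m - WINDOW + 1, m + 1) if y in rates]
            kb[code][m] = sum(rates[y] for y in window_years) / len(window_years)
    return kb
```

Because each year's entry is built on the years before it, the entry for year m extends the entry for year m-1, which is the overlay behavior the text describes.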

Further, the knowledge base model 23 is connected to the knowledge base 22 and is configured to: identify the source medical institution of each of the historical data and perform early fusion by surgery code using the past data of that source medical institution in the knowledge base 22; perform early fusion by surgery code using the past data of medical institutions other than the source medical institution in the knowledge base 22; or fill in the data using the previous year's average mortality of the source medical institution or of medical institutions other than the source medical institution.

Specifically, as shown in FIG. 3, the knowledge base model 23 receives the preprocessed historical data from the data preprocessing model 25 and the time-series overlay knowledge base data from the knowledge base 22. The data fusion processing that the knowledge base model 23 performs on the received data can be divided into three blocks: internal knowledge fusion, external knowledge fusion, and knowledge filling. Internal knowledge fusion first obtains the past knowledge base of the source medical institution to which each item of data (that is, the preprocessed historical data and the time-series overlay knowledge base data) belongs, and performs early fusion with that hospital's knowledge base by surgery code. If missing values remain after this fusion, external knowledge fusion follows. External knowledge fusion obtains the knowledge bases of other medical institutions and performs early fusion with them by surgery code. If no past data exist for the surgery corresponding to the historical data, knowledge filling follows. Knowledge filling uses the same hospital's average mortality from the previous year to fill in the data; if no average mortality is available for the previous year, -1 is filled in to represent unknown information.
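The three-stage fallback (internal fusion, external fusion, knowledge filling) can be sketched as a lookup chain. The sketch below rests on several assumptions that the text does not specify: "past" knowledge is taken as the most recent year strictly before the surgery year, multiple external hospitals are combined by a plain mean, and the data structures and names are hypothetical.

```python
UNKNOWN = -1.0  # sentinel for "no information", as in the description


def latest_before(by_year, year):
    """Most recent value from a year strictly before `year`, or None."""
    past = [y for y in by_year if y < year]
    return by_year[max(past)] if past else None


def fuse_mortality(code, hospital, year, kb, hospital_year_avg):
    """kb: {hospital: {surgery_code: {year: rate}}}
    hospital_year_avg: {hospital: {year: overall average mortality}}

    Returns (value, stage) where stage names the block that supplied the value.
    """
    # 1. Internal knowledge fusion: the source hospital's own past data.
    own = latest_before(kb.get(hospital, {}).get(code, {}), year)
    if own is not None:
        return own, "internal"
    # 2. External knowledge fusion: past data from the other hospitals.
    external = []
    for other in sorted(kb):
        if other != hospital:
            value = latest_before(kb[other].get(code, {}), year)
            if value is not None:
                external.append(value)
    if external:
        return sum(external) / len(external), "external"
    # 3. Knowledge filling: same hospital's average mortality from the previous year.
    previous = hospital_year_avg.get(hospital, {}).get(year - 1)
    if previous is not None:
        return previous, "filled"
    return UNKNOWN, "unknown"
```

The returned value would then be attached to the surgery record as an extra input feature before modeling, which is what "early fusion" refers to here.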

The fused knowledge base data produced by the data fusion processing of the knowledge base model 23 are then input to the machine learning model 24, which performs knowledge base-based machine learning on the fused knowledge base data to generate a plurality of output data 241. The output data 241 include the predicted length-of-stay data, predicted mortality data, and influencing factor analysis data of each patient. Performing knowledge base-based machine learning on the fused knowledge base data processed by the knowledge base model 23 effectively improves the accuracy of the preoperative risk prediction system 1, as described in detail below.

Further, the machine learning model 24 (also referred to as the composite deep learning model) extracts a plurality of feature parameters from the fused knowledge base data it receives and performs machine learning on the extracted feature parameters, thereby improving the accuracy of the preoperative risk prediction system 1. The feature parameters may include at least one of the following: International Statistical Classification of Diseases and Related Health Problems (ICD) code; surgery code; historical average mortality of each surgery; total number of operations for each surgery; diagnosis-related group (DRG) code; blood test items (for example, coagulation index (INR), inflammation index (CRP), or white blood cell count (WBC)); sex; age; body mass index; number of hospitalization days before surgery; number of surgeries in a single hospitalization and admitting department; smoking status; medical history such as diabetes, hypertension, cardiovascular disease, liver cirrhosis, or malignant tumor; and anesthesia risk grade. The present invention is not limited to these. In other embodiments, the types of feature parameters extracted by the composite deep learning model can be set according to the actual situation.

After computation by the machine learning model 24, the output data 241 containing the predicted postoperative mortality and length of hospital stay are produced. The evaluation metrics are the area under the curve (AUC) and the area under the precision-recall curve (AUPRC), the latter being used to analyze the positive predictive rate. The AUC is calculated according to formula (1):

Figure 112114207-A0305-02-0017-1

Here, B is the set of positive samples, W is the set of negative samples, x is a predicted value among the positive samples, y is a predicted value among the negative samples, and the set of all possible pairings of x and y is the Cartesian product B × W.
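Formula (1) appears only as an image, so its exact form is not reproduced here. A form consistent with the definitions of B, W, x, and y is the standard pairwise expression, AUC = |{(x, y) in B × W : x > y}| / (|B| · |W|), that is, the fraction of positive-negative pairs in which the positive sample receives the higher predicted value. The sketch below implements that expression; counting ties as one half is a common convention and an assumption here, and the function name is hypothetical.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: predicted values; labels: 1 for positive (B), 0 for negative (W).
    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]  # predicted values x in B
    neg = [s for s, l in zip(scores, labels) if l == 0]  # predicted values y in W
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in pos for y in neg)  # iterate over the Cartesian product B x W
    return wins / (len(pos) * len(neg))
```

The double loop is quadratic in the number of samples, which is fine for illustration; rank-based implementations give the same value more efficiently on large datasets.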

In addition, the time-series overlay knowledge base model 21, the knowledge base model 23, and the machine learning model 24 may each be any one of machine learning models such as a decision tree (DT) model, a gradient boosting tree (GBT) model, a histogram-based gradient boosting tree model, a random forest (RF) model, and a long short-term memory (LSTM) model.

As shown in FIG. 4, the preoperative risk prediction system 1 of the present invention further includes a user interface 30 for entering the historical data of a patient before surgery, so that the output data 241, containing the patient's predicted length-of-stay data, predicted mortality data, and influencing factor analysis data, can be obtained through the data processing of the models in the preoperative risk prediction system 1. For example, after identity authentication, a physician can enter the patient's name through the user interface 30 to retrieve the patient's past medical records, physical examination reports, and other related data from the database unit 10, and can trigger the processing unit 20 to execute the data processing flow described above. The resulting output data 241 can serve as reference material for the physician's pre-anesthesia consultation and help improve the accuracy of preoperative risk assessment.

The clinical trial of the preoperative risk prediction system 1 of the present invention is conducted as follows. After the patient has been scheduled for surgery and admitted, the routine pre-anesthesia consultation is carried out before the surgery, and the patient is informed of the clinical trial procedure for the preoperative risk prediction system 1. Once the patient's consent has been obtained, the hospital's medical record system is annotated to indicate that the patient is participating in the clinical trial. The identifiable items in the relevant data already stored in the medical record system are de-identified using an encryption method, and after de-identification the data are transmitted to the database unit 10 for storage.

In addition, if the total number of operations for a single procedure is fewer than 10, the procedure has been performed on average fewer than 2 times per year over the past 7 years. Such a procedure is clinically rare or entirely new and would affect the learning of the model, so data for procedures with fewer than 10 operations are excluded.
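The rare-procedure exclusion can be sketched as a simple count-and-filter step. The record layout (a list of dictionaries with a `surgery_code` key) and the function name are illustrative assumptions; only the threshold of 10 comes from the text.

```python
from collections import Counter

MIN_SURGERIES = 10  # procedures with fewer total operations are excluded


def drop_rare_procedures(records, min_count=MIN_SURGERIES):
    """Keep only records whose surgery code occurs at least `min_count` times."""
    counts = Counter(r["surgery_code"] for r in records)
    return [r for r in records if counts[r["surgery_code"]] >= min_count]
```

With a 7-year history, a threshold of 10 corresponds to an average of about 1.4 operations per year, which is consistent with the "fewer than 2 per year" reasoning above.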

The processing unit 20 of the preoperative risk prediction system can process the data in the database unit 10 automatically and encrypt the resulting predicted length-of-stay data and predicted mortality data; for example, the time-series overlay knowledge base model 21 and the data preprocessing model 25 may have a data encryption function. The encrypted predicted length-of-stay data and predicted mortality data are transmitted to an independent server within the hospital's medical record system and sealed. The data thus remain sealed from the start of the clinical trial until after the patient is discharged, so that none of the trial participants (physicians, patients, nurses, and analysts) can learn the predicted length-of-stay data or predicted mortality data generated by the preoperative risk prediction system 1. This ensures that physicians, patients, and other parties are not influenced by these data and that anesthesia and surgery proceed according to standard medical practice.

After the patient is discharged, data such as the actual length of this hospital stay are likewise encrypted and transmitted to the independent server within the hospital's medical record system for sealing. Finally, the data are unsealed two months after the patient's discharge. At that point, an analyst logs in to the independent server after identity authentication, retrieves the data, and compares them to evaluate the accuracy of the preoperative risk prediction system 1.

As shown in FIG. 5, when the preoperative risk prediction system 1 of the present invention is used to assess patient mortality, the area under the receiver operating characteristic curve (AUROC) reaches 0.929, that is, a mortality prediction accuracy of approximately 92% as measured by AUROC. As shown in FIG. 6, when the preoperative risk prediction system 1 is used to assess a patient's length of hospital stay, the AUROC reaches 0.913, that is, a length-of-stay prediction accuracy of approximately 91% as measured by AUROC. The preoperative risk prediction system 1 therefore assesses both postoperative length of stay and mortality with high accuracy.

Further, in this embodiment, a histogram gradient boosting decision tree (HGBT) model and a gradient boosting decision tree model (Gradient Boosting Decision Tree 300, GBT300) are used as the main models. The KMUH dataset is used for internal validation, and the Kaohsiung Municipal Ta-Tung Hospital (KMTTH) and Kaohsiung Municipal Hsiao-Kang Hospital (KMHK) datasets are used for external validation.

Specifically, FIGS. 7A to 7C are schematic diagrams of bootstrap comparison results, according to an embodiment of the present invention, for the histogram gradient boosting decision tree (HGBT) model with and without the knowledge base. FIG. 7A shows the result of internal validation using KMUH as the test dataset, FIG. 7B shows the result of external validation using KMHK as the test dataset, and FIG. 7C shows the result of external validation using KMTTH as the test dataset. In each of FIGS. 7A to 7C, part A shows the area under the curve (AUC) performance of the HGBT model with the knowledge base, and part B shows its area under the precision-recall curve (PRAUC) performance.

Specifically, as shown in part A of FIG. 7A, with KMUH as the test dataset for internal validation, the AUC of the HGBT model is approximately 0.950 with the knowledge base and approximately 0.945 without it. As shown in part A of FIG. 7B, with KMHK as the test dataset for external validation, the AUC of the HGBT model is approximately 0.944 with the knowledge base and approximately 0.939 without it. As shown in part A of FIG. 7C, with KMTTH as the test dataset for external validation, the AUC of the HGBT model is approximately 0.947 with the knowledge base and approximately 0.945 without it. Thus, when the HGBT model is used, combining the knowledge base with the historical data in the machine learning model yields better accuracy.
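The text states that the with/without-knowledge-base comparison uses the bootstrap method but does not give its details. The sketch below shows one generic way to bootstrap an AUC estimate; the number of resamples, the percentile interval, the fixed seed, and all function names are assumptions, and running it once per model variant on the same test set would give two distributions to compare.

```python
import random


def auc(scores, labels):
    """Pairwise AUC; ties between a positive and a negative count as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in pos for y in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(scores, labels, n_boot=200, seed=0):
    """Resample the test set with replacement and recompute the AUC each time."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    indices = range(len(labels))
    results = []
    while len(results) < n_boot:
        sample = [rng.choice(indices) for _ in indices]
        sampled_labels = [labels[i] for i in sample]
        if 0 < sum(sampled_labels) < len(sampled_labels):  # skip one-class resamples
            results.append(auc([scores[i] for i in sample], sampled_labels))
    return results


def percentile_interval(values, alpha=0.05):
    """Simple percentile interval from the bootstrap distribution."""
    ordered = sorted(values)
    low = ordered[int((alpha / 2) * (len(ordered) - 1))]
    high = ordered[int((1 - alpha / 2) * (len(ordered) - 1))]
    return low, high
```

Differences as small as those reported (for example 0.947 versus 0.945) are easier to interpret when the bootstrap intervals of the two variants are shown side by side.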

Specifically, FIGS. 8A to 8C are schematic diagrams of bootstrap comparison results, according to an embodiment of the present invention, for the gradient boosting decision tree (GBT300) model with and without the knowledge base. FIG. 8A shows the result of internal validation using KMUH as the test dataset, FIG. 8B shows the result of external validation using KMHK as the test dataset, and FIG. 8C shows the result of external validation using KMTTH as the test dataset. In each of FIGS. 8A to 8C, part A shows the area under the curve (AUC) performance of the GBT300 model with the knowledge base, and part B shows its area under the precision-recall curve (PRAUC) performance.

Specifically, as shown in part A of FIG. 8A, with KMUH as the test dataset for internal validation, the AUC of the GBT300 model is approximately 0.949 with the knowledge base and approximately 0.945 without it. As shown in part A of FIG. 8B, with KMHK as the test dataset for external validation, the AUC of the GBT300 model is approximately 0.955 with the knowledge base and approximately 0.934 without it. As shown in part A of FIG. 8C, with KMTTH as the test dataset for external validation, the AUC of the GBT300 model is approximately 0.937 with the knowledge base and approximately 0.941 without it. Thus, when the GBT300 model is used, combining the knowledge base with the historical data in the machine learning model can yield better accuracy, as seen in the KMUH and KMHK results.

In summary, the preoperative risk prediction system of the present invention includes a data preprocessing model, a knowledge base model, a time-series overlay knowledge base model, and a machine learning model, and uses different algorithms to process the historical data in the datasets from different medical institutions. Specifically, the historical data are input to the time-series overlay knowledge base model, which performs time-series overlay data processing on them to generate the corresponding time-series overlay knowledge base data, including the average length-of-stay data and mortality data of each patient. The time-series overlay knowledge base data are then input to the knowledge base model, which performs data fusion processing on them to generate fused knowledge base data, including the average length-of-stay data and mortality data of each patient.

Finally, the fused knowledge base data are input to the machine learning model, which performs knowledge base-based machine learning on them to generate output data, including the predicted length-of-stay data, predicted mortality data, and influencing factor analysis data of each patient, for physicians to use as reference material in the pre-anesthesia consultation. The preoperative risk prediction system of the present invention can thus predict a patient's postoperative length of stay and mortality through knowledge base-based machine learning, and according to the experimental results it achieves an accuracy above 90% for both length of stay and mortality.

With the above configuration, the preoperative risk prediction system of the present invention can work alongside the preoperative anesthesia consultation scheme to assist physicians in performing preoperative risk assessment. This reduces errors of human judgment and improves the accuracy of preoperative risk assessment, which helps ensure the safety of the patient's subsequent anesthesia and surgery.

The above description is illustrative only and not restrictive. Any equivalent modification or change that does not depart from the spirit and scope of the present invention shall be included in the scope of the appended claims.

1: Preoperative risk prediction system

10: Database unit

111: First data set

112: Second data set

113: Third data set

20: Processing unit

21: Time series overlay knowledge base model

22: Data statistics model knowledge base

23: Composite deep learning knowledge base model

24: Machine learning model

241: Output data

25: Data preprocessing model

Claims (8)

1. A preoperative risk prediction system, comprising: a database unit storing a plurality of historical data of a plurality of patients; and a processing unit electrically connected to the database unit and including a time series overlay knowledge base model, a knowledge base model (knowledge-based model), and a machine learning model; wherein the plurality of historical data are respectively input to the time series overlay knowledge base model, and the time series overlay knowledge base model performs time series overlay data processing based on the plurality of historical data to generate a corresponding plurality of time series overlay knowledge base data, the plurality of time series overlay knowledge base data including average length-of-stay data and mortality data of each patient; the plurality of time series overlay knowledge base data are input to the knowledge base model, and the knowledge base model performs data fusion processing based on the plurality of time series overlay knowledge base data to generate a plurality of fused knowledge base data, the plurality of fused knowledge base data including average length-of-stay data and mortality data of each patient; the plurality of fused knowledge base data are input to the machine learning model, and the machine learning model performs knowledge-base-based machine learning based on the plurality of fused knowledge base data to generate a plurality of output data, the plurality of output data including predicted length-of-stay data, predicted mortality data, and influencing factor analysis data of each patient; and wherein the plurality of historical data are sorted by a time unit, and in the time series overlay data processing, if the plurality of historical data have no gap in any time unit, the corresponding average mortality data is calculated with a first moving window and the plurality of time series overlay knowledge base data are output accordingly, and if the plurality of historical data have a gap in any time unit, the data of the n missing time units are deducted, the corresponding average mortality data is calculated with a second moving window equal to the first moving window minus n, and the plurality of time series overlay knowledge base data are output accordingly, where n is a positive integer.

2. The preoperative risk prediction system of claim 1, wherein the plurality of historical data include a plurality of data sets respectively from different medical institutions.

3. The preoperative risk prediction system of claim 2, wherein the time series overlay knowledge base model is configured to: extract the surgery codes, death flags, and years of the plurality of historical data in the plurality of data sets, and compile mortality statistics for each surgery code in each year, so as to output the plurality of time series overlay knowledge base data and build a knowledge base; wherein, in the knowledge base, the plurality of time series overlay knowledge base data labeled as year m and output by the time series overlay knowledge base model are overlaid onto the plurality of time series overlay knowledge base data labeled as year m-1, m being a positive integer.

4. The preoperative risk prediction system of claim 3, wherein the knowledge base model is connected to the knowledge base and is configured to: identify a source medical institution of the plurality of historical data, and perform pre-fusion by surgery code based on the past data of the source medical institution in the knowledge base; perform pre-fusion by surgery code based on the past data of medical institutions other than the source medical institution in the knowledge base; or perform data filling based on the previous year's average mortality of the source medical institution or of medical institutions other than the source medical institution.

5. The preoperative risk prediction system of claim 1, wherein the machine learning model is configured to: extract a plurality of feature parameters from the plurality of fused knowledge base data to perform knowledge-base-based machine learning, the plurality of feature parameters including at least one selected from International Statistical Classification of Diseases and Related Health Problems (ICD) codes, surgery codes, the historical average mortality of each surgery, Diagnosis Related Groups (DRG) codes, blood test items, sex, age, length of hospital stay before surgery, number of surgeries in a single hospitalization, and inpatient department.

6. The preoperative risk prediction system of claim 2, further comprising a data preprocessing model configured to perform data preprocessing on the plurality of historical data in each data set so as to fill in missing values in the plurality of historical data, standardize the plurality of historical data using a training data set as the standard, and encode the plurality of historical data using the training data set, and to output the preprocessed plurality of historical data to the knowledge base model.

7. The preoperative risk prediction system of claim 1, further comprising a user interface for inputting the historical data of the patient before surgery, so as to obtain the output data including the patient's predicted length-of-stay data, predicted mortality data, and influencing factor analysis data.

8. The preoperative risk prediction system of claim 1, wherein the time series overlay knowledge base model, the knowledge base model, and the machine learning model are at least one selected from the group consisting of a gradient boosting decision tree model, a histogram-based gradient boosting decision tree model, a random forest model, and a long short-term memory model.
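The gap handling recited in claim 1 can be illustrated with a short sketch. It is only one plausible reading of the claim: the function name `windowed_mortality`, the use of a year as the time unit, the default window of 3, and the representation of a gap as a missing or `None` entry are all assumptions, not details taken from the patent.

```python
def windowed_mortality(yearly_rate, end_year, window=3):
    """Average mortality over the `window` years ending at `end_year`.

    `yearly_rate` maps year -> mortality rate for one surgery code.
    A year that is absent (or mapped to None) counts as a gap.
    """
    years = range(end_year - window + 1, end_year + 1)
    present = [yearly_rate[y] for y in years if yearly_rate.get(y) is not None]

    n_gaps = window - len(present)     # n missing time units
    second_window = window - n_gaps    # first moving window minus n
    if second_window == 0:
        return None                    # nothing to average in this window
    return sum(present) / second_window


# No gaps: the full (first) window of 3 years is used.
complete = {2019: 0.10, 2020: 0.20, 2021: 0.30}
print(windowed_mortality(complete, 2021))

# One gap (2020 missing): the average is taken over 3 - 1 = 2 years.
with_gap = {2019: 0.10, 2021: 0.40}
print(windowed_mortality(with_gap, 2021))
```

Dividing by the reduced window rather than the full one keeps a missing year from being counted as a year with zero mortality, which would bias the average downward.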
TW112114207A 2023-04-17 2023-04-17 Preoperative risk prediction system TWI836965B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW112114207A TWI836965B (en) 2023-04-17 2023-04-17 Preoperative risk prediction system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW112114207A TWI836965B (en) 2023-04-17 2023-04-17 Preoperative risk prediction system

Publications (1)

Publication Number Publication Date
TWI836965B true TWI836965B (en) 2024-03-21

Family

ID=91269921

Family Applications (1)

Application Number Title Priority Date Filing Date
TW112114207A TWI836965B (en) 2023-04-17 2023-04-17 Preoperative risk prediction system

Country Status (1)

Country Link
TW (1) TWI836965B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201725526A (en) * 2015-09-30 2017-07-16 伊佛曼基因體有限公司 Systems and methods for predicting treatment-regimen-related outcomes
US20210315581A1 (en) * 2017-12-28 2021-10-14 Cilag Gmbh International Method of hub communication, processing, display, and cloud analytics
TW202215358A (en) * 2020-10-13 2022-04-16 奇美醫療財團法人奇美醫院 System and program product for auxiliary assessment of surgical anesthesia risk as well as method for establishing and using the same
