TWI795928B

TWI795928B - System and method for prediction of intradialytic adverse event and computer readable medium thereof

Info

Publication number: TWI795928B
Application number: TW110136257A
Authority: TW
Inventors: 李光申; 楊智宇; 劉懿璇
Original assignee: 國立陽明交通大學
Priority date: 2021-09-29
Filing date: 2021-09-29
Publication date: 2023-03-11
Also published as: TW202313132A

Abstract

Provided are a system and a method for predicting intradialytic adverse events, in which a two-class (binary) classification machine learning model predicts intradialytic adverse events in quasi-real time, so that features extracted in real time from an ongoing hemodialysis session can trigger an alert for forthcoming adverse events. Clinicians can thus be warned to take necessary actions and adjust the hemodialysis machine settings in advance. A computer readable medium thereof is also provided.

Description

System, method and computer readable medium for predicting adverse events in dialysis

The present disclosure relates to medical monitoring applications, and more particularly to a system, a method, and a computer readable medium for real-time prediction of intradialytic adverse events.

Hemodialysis (HD) therapy plays an important role in care management. Because of oliguria or even anuria, most patients with renal failure must have fluid removed during a hemodialysis session (HD session) to maintain euvolemia. The volume-dependent component of hypertension can be corrected by fluid removal, but ultrafiltration exposes hemodialysis patients to the risk of hemodynamic instability, which can lead to fatal consequences such as cardiac arrest. Intradialytic hypotension is the most common complication during hemodialysis and has been identified as a cause of reduced hemodialysis efficacy. Acutely, intradialytic adverse events may be fatal; in the long term, frequent intradialytic adverse events increase patient morbidity and long-term all-cause mortality.

Devices have been developed (e.g., the Crit-Line monitor from Fresenius Medical Care) that use optical transmission to non-invasively monitor real-time hematocrit, oxygen saturation, and intradialytic volume status in order to assist fluid removal during ultrafiltration. Although uncontrolled studies have suggested that this device reduces intradialytic symptoms and helps assess target weight, an open-label randomized controlled trial showed a higher hospitalization rate in the Crit-Line group than in the control group.

Artificial intelligence has also been applied to hemodialysis patients to assist clinical practice, for example in predicting urea clearance, dietary protein intake, volume status, response to erythropoiesis-stimulating agents, response to iron supplementation, hemoglobin level, hemodialysis quality, and mortality. Although artificial intelligence has also been applied to predicting the risk of intradialytic hypotension, previous studies on this application have not taken time series data input into account.

Therefore, how to account for time series data in a machine learning approach, and how to predict, in an unbiased manner, intradialytic adverse events beyond the risk of hypotension alone, remains an unmet need in this technical field.

In view of the foregoing, the present disclosure provides a system for predicting an intradialytic adverse event, comprising: a feature extraction module configured to collect and process data about a hemodialysis session of a patient; and a model building and optimization module configured to build, based on the data, a machine learning model for predicting the intradialytic adverse event during the hemodialysis session.

The present disclosure further provides a method for predicting an intradialytic adverse event, comprising: configuring a feature extraction module to collect and process data about a hemodialysis session of a patient; and configuring a model building and optimization module to build, based on the data, a machine learning model for predicting the intradialytic adverse event during the hemodialysis session.

In at least one embodiment of the present disclosure, the data about the hemodialysis session of the patient includes one or more of demographic information, physiological data, dialysis data, and registered intradialytic adverse events.

In at least one embodiment of the present disclosure, the data includes a data set having a plurality of records, the records having measurements at corresponding timestamps.

In at least one embodiment of the present disclosure, the feature extraction module collects and processes the data about the hemodialysis session of the patient by deriving, from the measurements, at least one of a mean, a standard deviation of the mean, a coefficient of variation, a linear regression slope, and a linear regression R-squared as features of the plurality of records.

In at least one embodiment of the present disclosure, the measurements include venous pressure and transmembrane pressure, and the feature extraction module collects and processes the data about the hemodialysis session of the patient by deriving, from the measurements of the venous pressure and the transmembrane pressure, at least one of a maximum, a minimum, and a mean of the rate of change, and a second derivative, as features of the plurality of records.

In at least one embodiment of the present disclosure, the machine learning model is built on a first dimension relating to the target intradialytic adverse event to be predicted and a second dimension relating to the target time period within the dialysis session to be predicted.

In at least one embodiment of the present disclosure, the machine learning model is trained with the data labeled with outcomes relating to the intradialytic adverse event.

In at least one embodiment of the present disclosure, the machine learning model is trained with a combination of key features extracted from the data.

In at least one embodiment, the system described in the present disclosure further includes a data storage module configured to store the data, and the method described in the present disclosure further includes configuring a data storage module to store the data.

The present disclosure further provides a computer readable medium storing computer-executable code that, when executed, implements the method described above.

1: system

10: feature extraction module

20: data storage module

30: model building and optimization module

S1~S4: steps

The present disclosure can be more fully understood by reading the following description of the embodiments with reference to the accompanying drawings, in which:

FIG. 1 is a schematic diagram showing an exemplary structure of a system for predicting intradialytic adverse events according to an embodiment of the present disclosure;

FIG. 2 is a flowchart showing the steps of the decision process for selecting patients and hemodialysis sessions according to an embodiment of the present disclosure;

FIG. 3A and FIG. 3B are schematic diagrams showing the performance of the machine learning model for predicting adverse events belonging to Group 1 according to an embodiment of the present disclosure; ctr: control; f72, f76, f77, f78, f82: numbers of the features referred to in Table 4 and Table 5; mean Δ(UF rate): mean change in ultrafiltration rate;

FIG. 4 is a schematic diagram showing the performance of the machine learning model for predicting adverse events belonging to Group 2 according to an embodiment of the present disclosure; ctr: control; mean Δ(UF rate): mean change in ultrafiltration rate; mean UF volume: mean ultrafiltration volume;

FIG. 5 is a schematic diagram showing the performance of the machine learning model for predicting adverse events belonging to Group 3 according to an embodiment of the present disclosure; ctr: control; and

FIG. 6 and FIG. 7 are schematic diagrams showing the consistency over time of the adverse event probabilities predicted by the machine learning model according to an embodiment of the present disclosure.

The following embodiments are provided to illustrate the present disclosure. After reading this disclosure, a person having ordinary skill in the art can readily understand its advantages and effects, and the disclosure can also be implemented or applied in other, different embodiments. Accordingly, any element or method within the scope of the present disclosure may be combined with any other element or method disclosed in any embodiment of the present disclosure.

The proportions, structures, dimensions, and other features shown in the drawings of the present disclosure serve only to illustrate the embodiments described herein, so that a person having ordinary skill in the art can read and understand the disclosure; they are not intended to limit the scope of the present disclosure. Any change, modification, or adjustment of these features that does not affect the intended purpose and effect of the present disclosure shall fall within the scope of its technical content.

Herein, when an object is described as "comprising", "including", or "having" a technical feature, it may additionally include other elements, components, structures, regions, parts, devices, systems, steps, connections, and the like, unless otherwise stated, and other features are not excluded.

Herein, ordinal terms such as "first" and "second" are used only to facilitate description or to distinguish technical features such as elements, components, structures, regions, parts, devices, and systems; they are not intended to limit the scope of the present disclosure or the spatial order among such technical features. In addition, unless otherwise stated, singular forms such as "a", "an", and "the" also refer to the plural, and terms such as "or" and "and/or" may be used interchangeably.

As used herein, the terms "subject", "individual", and "patient" are used interchangeably and may refer to an animal, e.g., a mammal, including a human. Unless one sex is specifically indicated, the term "subject" may refer to both males and females.

As used herein, the terms "comprises", "comprising", "includes", "including", "has", "having", "contains", "containing", or any other variation thereof are intended to cover a non-exclusive inclusion. For example, a composition, mixture, process, or method comprising a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such composition, mixture, process, or method.

Herein, the phrase "at least one", in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list, but not necessarily including at least one of each element specifically listed, and not excluding any combination of elements in the list. This definition also allows elements other than those specifically identified within the list to which the phrase "at least one" refers to be optionally present, whether related or unrelated to the identified elements. Thus, as a non-limiting example, "at least one of A and B" (or, equivalently, "at least one of A or B", or, equivalently, "at least one of A and/or B") can refer: in one embodiment, to at least one A, optionally more than one, with no B present (and optionally including elements other than B); in another embodiment, to at least one B, optionally more than one, with no A present (and optionally including elements other than A); and in yet another embodiment, to at least one A, optionally more than one, and at least one B, optionally more than one (and optionally including other elements).

Herein, the terms "one or more" and "at least one" may have the same meaning and include one, two, three, or more.

As used herein, the terms "measuring" and "measurement" are interchangeable with "determining", "evaluating", "assessing", "detecting", and the like, and refer to both quantitative and qualitative determinations. Where a quantitative determination is intended, phrases such as "measuring an amount" may be used. Where either a qualitative or a quantitative determination is intended, the phrase "measuring a level" or "determining a level" may be used.

Referring to FIG. 1, a system 1 for predicting intradialytic adverse events includes a feature extraction module 10, a data storage module 20, and a model building and optimization module 30. The elements of the system 1 may be connected to one another by any suitable wired or wireless means, and the present disclosure is not limited thereto.

In some embodiments, the feature extraction module 10 may be coupled to or implemented in a hemodialysis (HD) machine (not shown), so that any records obtained from a patient during a hemodialysis session can be collected and used for on-site feature extraction.

In some embodiments, the data storage module 20 is configured to maintain the data received and processed by the feature extraction module 10 for later data inspection and/or model building. The data storage module 20 may be implemented as any suitable data storage device, system, database, cloud storage, or the like, and the present disclosure is not limited thereto.

In some embodiments, the model building and optimization module 30 is configured to build a machine learning model for predicting intradialytic adverse events in patients, and/or to improve the performance of the machine learning model based on improved data quality from the patients' hemodialysis sessions. In at least one embodiment, the model building and optimization module 30 may build more than one machine learning model, corresponding to the various kinds of data collected by the feature extraction module 10. For example, the machine learning models may be built along two dimensions: the target intradialytic adverse event to be predicted (e.g., elevated blood pressure, muscle cramps, and all events other than elevated blood pressure); and the target time period within the hemodialysis session to be predicted (e.g., different 30-minute periods of a hemodialysis session). However, the number of machine learning models built and their prediction targets are not intended to limit the scope of the present disclosure and may be varied in any suitable way according to design needs.

In some embodiments, the elements of the system may each be implemented as any suitable computing device, apparatus, application, system, or the like, but the present disclosure is not limited thereto. For example, any two or all three of the feature extraction module 10, the data storage module 20, and the model building and optimization module 30 may be integrated together rather than implemented as three independent units. In some embodiments, the three elements may also be integrated and implemented in a cloud computing environment. The elements of the system may be configured in any suitable form without departing from the operating concept of the present disclosure, and their configuration should not limit the scope of the present disclosure.

The operational relationships among the above elements of the system 1 are further depicted in FIG. 1 by arrows (described herein as "steps") and are detailed below.

In some embodiments, step S1 indicates that the feature extraction module 10 collects and processes data from the patient's records in real time during a hemodialysis session and stores the data in the data storage module 20.

In some embodiments, step S2 indicates that the model building and optimization module 30 uses the data stored in the data storage module 20 to build and/or optimize a machine learning model for predicting intradialytic adverse events in patients. For example, the machine learning model may be based on any one of a linear model, random forest, support vector regression, XGBoost, LASSO regression, ensemble methods, deep learning, or the like, or any combination thereof, and the present disclosure is not limited thereto.

In some embodiments, step S3 indicates that, once a machine learning model has been built and sufficiently trained by the model building and optimization module 30, the feature extraction module 10 may also send data from the patient's records during a hemodialysis session to that machine learning model for real-time prediction of intradialytic adverse events.

In some embodiments, step S4 indicates that, after the machine learning model completes its prediction, the prediction result is sent back to the feature extraction module 10 (or the hemodialysis machine paired with it) to give notice of the patient's risk of intradialytic adverse events.

In some embodiments, a computer readable medium is also provided that stores computer-executable code configured to implement, when executed, the steps of the present disclosure discussed above.

The following sections describe in detail how the working mechanisms of the feature extraction module 10, the data storage module 20, and the model building and optimization module 30 are designed.

Methods

Study protocol and study subjects

In practice, a single-center retrospective observational study was conducted in which the records of all patients receiving maintenance hemodialysis at Changhua Christian Hospital were reviewed. FIG. 2 shows the decision process by which patients and hemodialysis sessions were selected for the study in the embodiments described herein. For example, 108 of 129 eligible patients completed the three-month study, and hemodialysis sessions of these 108 patients were excluded for the following reasons: (1) sessions interrupted by replacement of the dialysis machine; (2) sessions interrupted more than once because of patient urination or defecation; and/or (3) sessions during which the patient was unable to freely express discomfort. Ultimately, a total of 4221 hemodialysis sessions from the 108 patients were used to build machine learning models for predicting intradialytic adverse events, with each patient having received 39 or 40 hemodialysis sessions during the 3-month study period.

Dialysis and physiological data collection

In the embodiments described herein, the feature extraction module 10 collects several types of data for building machine learning models, such as demographic information, physiological data, dialysis data, and registered intradialytic adverse events. These types of data may be collected manually (e.g., measured by medical staff) or automatically (e.g., measured by the hemodialysis machine itself), and the present disclosure is not limited thereto.

In at least one embodiment, the demographic information may be derived from medical records and may include information such as a patient's age, sex, and years on dialysis.

In at least one embodiment, the physiological data are measured and recorded at intervals of approximately 30 to 60 minutes during each hemodialysis session of the participating patients (each session lasting approximately 4 hours).

In at least one embodiment, the dialysis data are collected from the hemodialysis machine during each hemodialysis session of the participating patients. Examples of hemodialysis machine readings relating to the dialysis data are shown in Table 1 below.

In at least one embodiment, the registered intradialytic adverse events are recorded based on measured physiological data or patient complaints. Examples of intradialytic adverse events recorded for the participating patients are shown in Table 2 below.

Refer to Tables 3-1 and 3-2 below, which show examples of the data collected by the feature extraction module 10. In the study of 4221 hemodialysis sessions from 108 patients, the data set HD_i collected by the feature extraction module 10 for each hemodialysis session i (i = 1 to 4221) consists of a plurality of records {Y_{j,k}, T_k}, where j (ranging from 1 to 9; see Table 3-1) indexes the category of data from the dialysis and physiological measurements, k indexes the time at which a measurement occurred, and Y_{j,k} is the value of measurement j at time T_k. The record marked with a thick border in Table 3-1 shows the data points Y_{j,4} of all categories measured at the timestamp "T_4 = 9:35:44" (i.e., {Y_{j,4}, T_4}). Under the manufacturer's default settings, dialysis data and/or physiological data are typically recorded automatically from the hemodialysis machine whenever the value of venous pressure (VP) or transmembrane pressure (TMP) changes from the previous measurement at T = T_{k-1}. Consequently, the time intervals T_k - T_{k-1} between consecutive records may not be equal.

Continuing with Table 3-2, each data set HD_i further includes time-invariant, patient-specific information Y_j (j = 10 to 13), representing respectively the patient's age, sex, years on hemodialysis, and pre-dialysis weight for the corresponding hemodialysis session.

It should be understood that, once the system is ready for practical use (e.g., once a sufficiently trained machine learning model has been developed), the data set HD_i described in Table 3-1 can be used directly for feature extraction and prediction of intradialytic adverse events. Intradialytic adverse events may, however, also be registered in the data set HD_i so that it can serve as training data for building machine learning models.

Feature extraction

After data collection, the feature extraction module 10 extracts data from the data set HD_i of each hemodialysis session as features for analysis. In the embodiments described herein, feature extraction is preferably performed using an AWK program, but any suitable program or application may be used, and the present disclosure is not limited thereto.

To avoid artifacts at the start of a data set HD_i caused by differences in the procedures for setting up and starting dialysis in each hemodialysis session, if the blood flow rate (i.e., data point Y_{5,k}) changes between timestamps T_1 and T_2 at the start of a data set HD_i, the first data point Y_{j,1} of that data set (i.e., record {Y_{j,1}, T_1}) is excluded. Any record {Y_{j,k}, T_k} at a given timestamp T_k is also excluded if the blood flow rate is zero or lower owing to a dialysis interruption (e.g., dialysis machine replacement or patient urination/defecation). If a hemodialysis session is interrupted more than once, the entire session (the complete data set HD_i) is excluded from feature extraction.
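The exclusion rules above can be sketched in Python as follows. This is a minimal illustration rather than the AWK implementation mentioned in the text: the `Record` type, its field names, the time unit, and the choice to count an "interruption" as a maximal run of consecutive records with non-positive blood flow are all assumptions made for the example.

```python
from typing import List, NamedTuple, Optional


class Record(NamedTuple):
    t: float           # timestamp T_k, in seconds (unit is an assumption)
    blood_flow: float  # blood flow rate, Y_{5,k} in the text
    vp: float          # venous pressure
    tmp: float         # transmembrane pressure


def clean_session(records: List[Record]) -> Optional[List[Record]]:
    """Apply the exclusion rules; return None if the whole session is excluded."""
    recs = list(records)

    # Rule 1: drop the first record if blood flow changes between T_1 and T_2.
    if len(recs) >= 2 and recs[0].blood_flow != recs[1].blood_flow:
        recs = recs[1:]

    # Count interruptions as runs of records with blood flow <= 0 (assumption).
    interruptions = 0
    previously_stopped = False
    for r in recs:
        stopped = r.blood_flow <= 0
        if stopped and not previously_stopped:
            interruptions += 1
        previously_stopped = stopped

    # Rule 3: a session interrupted more than once is excluded entirely.
    if interruptions > 1:
        return None

    # Rule 2: drop individual records taken while blood flow was <= 0.
    return [r for r in recs if r.blood_flow > 0]
```

Returning `None` for an excluded session keeps that case distinct from a session that is valid but happens to have few usable records.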

For the training data, when no intradialytic adverse event (such as any of the adverse events listed in Table 2 above) is registered for a hemodialysis session, the complete set of records {Y_{j,k}, T_k} of its data set HD_i is included for feature extraction. For hemodialysis sessions with registered intradialytic adverse events, by contrast, only the records {Y_{j,k}, T_k} of the corresponding data set HD_i that precede the first adverse event are included for feature extraction, which means that the duration covered for such a session is less than 4 hours.
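As a small sketch of this truncation rule, the function below keeps only the records that precede the first registered adverse event. The record layout (tuples whose first element is the timestamp), the function name, and the choice of a strict "before" comparison are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def truncate_before_first_event(
    records: Sequence[Tuple[float, ...]],
    event_times: Sequence[float],
) -> List[Tuple[float, ...]]:
    """Keep records whose timestamp is strictly before the first adverse event.

    Each record is assumed to be a tuple whose first element is T_k.
    Sessions without registered events are returned unchanged.
    """
    if not event_times:
        return list(records)
    first_event = min(event_times)
    return [r for r in records if r[0] < first_event]
```

With this rule, features for an event session are computed only from data that would have been available before the event, which matches how the model is meant to be used for prediction.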

Because both the time interval between two adjacent records {Y_{j,k}, T_k} and the duration of the hemodialysis sessions vary, regression analysis is challenging, and the temporal characteristics of the measured variables need to be incorporated into the analysis used for classification. To this end, the feature extraction module 10 derives the mean, the standard deviation of the mean, and the coefficient of variation from the records {Y_{j,k}, T_k} of the dialysis and physiological measurements, together with the slope and R-squared of a linear regression, as features for analysis. In addition, the feature extraction module 10 derives the maximum, minimum, and mean of the rate of change (first derivative) of venous pressure (VP) and transmembrane pressure (TMP), together with the second derivative, as features for analysis.
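The summary statistics named above can be illustrated with the following sketch, which computes the mean, standard deviation, coefficient of variation, and the least-squares slope and R-squared of a measurement against time. The function name and the use of the sample standard deviation are assumptions; the text does not specify which estimator is used, and the sketch expects at least two records with distinct timestamps.

```python
import statistics
from typing import Dict, Sequence


def summary_features(t: Sequence[float], y: Sequence[float]) -> Dict[str, float]:
    """Mean, SD, CV, and least-squares slope and R^2 of y regressed on t."""
    mean_y = statistics.fmean(y)
    sd_y = statistics.stdev(y)  # sample SD; the estimator is an assumption
    cv = sd_y / mean_y if mean_y != 0 else float("nan")

    mean_t = statistics.fmean(t)
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    sxy = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    slope = sxy / sxx
    # For simple linear regression, R^2 equals the squared correlation.
    r2 = (sxy * sxy) / (sxx * syy) if syy != 0 else float("nan")
    return {"mean": mean_y, "sd": sd_y, "cv": cv, "slope": slope, "r2": r2}
```

Because the regression uses the actual timestamps rather than record indices, unequal intervals between records are handled without resampling.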

Refer to Table 4 and Table 5 below, which show examples of features extracted by the feature extraction module 10 from hemodialysis sessions. For example, a total of 84 features {X_h} (h = 1 to 84, referring to the feature numbers shown below, where "#" stands for "number") are extracted, including features from the raw measurements of the HD_i data sets described above and temporal features derived from them. However, depending on the content of the data sets HD_i in practice, the total number of features may be greater or less than 84, and the present disclosure is not limited thereto.

Notes:

A: Time interval ≡ T_i - T_{i-1}, i = 1 to n

B: K_i = ΔX_i / (T_i - T_{i-1}) = (X_i - X_{i-1}) / (T_i - T_{i-1})

C: ΔX_i ≡ X_i - X_{i-1}

D: Mean* is weighted by duration and then divided by the total recording time:

Σ((X_i - X_{i-1}) × (T_i - T_{i-1})) / Σ(T_i - T_{i-1}) = Σ((X_i - X_{i-1}) × (T_i - T_{i-1})) / (T_n - T_0)
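Notes A to D can be read as a short computation over one recorded variable. The sketch below follows the formulas exactly as written in the notes; the function name `rate_features` and the returned keys are hypothetical, and the timestamps are assumed to be strictly increasing.

```python
def rate_features(times, values):
    """Apply notes A-D to one variable recorded at strictly increasing times."""
    dts = [t1 - t0 for t0, t1 in zip(times, times[1:])]    # note A: T_i - T_{i-1}
    dxs = [x1 - x0 for x0, x1 in zip(values, values[1:])]  # note C: X_i - X_{i-1}
    rates = [dx / dt for dx, dt in zip(dxs, dts)]          # note B: K_i
    # note D: duration-weighted sum divided by the total recording time T_n - T_0
    weighted = sum(dx * dt for dx, dt in zip(dxs, dts)) / (times[-1] - times[0])
    return {"max_rate": max(rates), "min_rate": min(rates), "weighted_mean": weighted}

# Venous pressure rising by 10 over 10 minutes, then falling by 10 over 20 minutes
result = rate_features([0, 10, 30], [100, 110, 100])
```

In this example the weighted value is (10 × 10 + (-10) × 20) / 30, so the longer falling interval dominates the result.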

After feature extraction, the feature extraction module 10 either stores the extracted features in the data storage module 20 for later use (for example, for building a machine learning model) or sends them directly to the model building and optimization module 30 to predict intradialytic adverse events. As mentioned above, the records {Y_j,k, T_k} of the HD_i dataset are logged whenever the venous pressure or transmembrane pressure changes. Therefore, any measurement at a time T_p between two adjacent measurement timestamps T_{k-1} and T_k can be assigned as {Y_j,k, T_p} = {Y_j,k, T_{k-1}}. That is, when the feature extraction module 10 operates in real time, feature extraction from the dataset HD_i of a hemodialysis session can be terminated at any time (for example, T_p) for storage or prediction.
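The assignment of a value at an arbitrary time T_p amounts to carrying the last recorded value forward. A minimal sketch under that reading is shown below; the helper names `value_at` and `truncate_at` are hypothetical and do not come from the disclosure.

```python
from bisect import bisect_right

def value_at(times, values, t_p):
    """Return the value recorded at the latest timestamp not later than t_p."""
    idx = bisect_right(times, t_p) - 1
    if idx < 0:
        raise ValueError("t_p precedes the first record")
    return values[idx]

def truncate_at(times, values, t_p):
    """Cut a session at t_p, appending a carried-forward record if t_p is not a timestamp."""
    idx = bisect_right(times, t_p)
    if idx == 0:
        raise ValueError("t_p precedes the first record")
    kept_t, kept_v = list(times[:idx]), list(values[:idx])
    if kept_t[-1] < t_p:
        kept_t.append(t_p)
        kept_v.append(kept_v[-1])
    return kept_t, kept_v

# Records at 0, 10 and 30 minutes; terminate feature extraction at minute 20
cut_times, cut_values = truncate_at([0, 10, 30], [50, 55, 60], 20)
```

The truncated series can then be passed to the same feature computations as a complete session, which is what allows extraction to stop at any time during an ongoing session.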

Outcome labeling for model building

During training, the outcome associated with each hemodialysis session may be labeled before the corresponding dataset HD_i and/or its extracted features are used for model building. For example, among the 4221 hemodialysis sessions studied, sessions with one or more adverse events were labeled 1, and sessions without adverse events were labeled 0. A negative control set may also be created by randomly relabeling the 4221 hemodialysis sessions without regard to the true outcomes while keeping the same ratio of 0 to 1 as in the experimental set.
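The labeling rule and the negative control can be illustrated with a few lines of code. This is a sketch only: the function names are hypothetical, and shuffling the label vector is one assumed way of relabeling at random while preserving the ratio of 0 to 1.

```python
import random

def label_sessions(event_counts):
    """Label a session 1 if it has one or more adverse events, otherwise 0."""
    return [1 if n > 0 else 0 for n in event_counts]

def negative_control(labels, seed=0):
    """Randomly reassign labels to sessions; the counts of 0s and 1s are unchanged."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    return shuffled

labels = label_sessions([0, 2, 0, 1, 0, 0])  # adverse event counts per session
control = negative_control(labels, seed=1)
```

A model trained on the control labels should perform at chance level, which gives a baseline against which the real models can be compared.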

After outcome labeling, the model building and optimization module 30 carries out a machine learning model building procedure that includes: (1) building a binary classification model (for example, using an algorithm such as an ensemble or a perceptron) that outputs a label of 0 or 1 for a given input dataset HD_i; and (2) evaluating the binary classification model by four-fold cross-validation (for example, using the Azure service developed by Microsoft). For each machine learning model, model building is performed by repeating the above procedure at least three times with different random numbers.
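Repeated four-fold cross-validation can be sketched without any particular machine learning service. The code below is an illustration under stated assumptions: `train_fn` is a hypothetical callable that takes training data and returns a prediction function, accuracy is used as a stand-in metric, and the folds are not stratified, whereas the disclosure reports AUC and F1 scores from the Azure service.

```python
import random

def k_fold_indices(n, k, seed):
    """Shuffle indices 0..n-1 with the given seed and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def repeated_cv(train_fn, X, y, k=4, seeds=(0, 1, 2)):
    """Return one held-out accuracy per (seed, fold) pair."""
    scores = []
    for seed in seeds:
        for held_out in k_fold_indices(len(X), k, seed):
            held = set(held_out)
            train = [i for i in range(len(X)) if i not in held]
            predict = train_fn([X[i] for i in train], [y[i] for i in train])
            hits = sum(predict(X[i]) == y[i] for i in held_out)
            scores.append(hits / len(held_out))
    return scores

# Toy data and a fixed threshold rule standing in for a trained model
X = [[-1.0], [1.0]] * 6
y = [0, 1] * 6
scores = repeated_cv(lambda X_tr, y_tr: (lambda x: 1 if x[0] > 0 else 0), X, y)
```

Three seeds with four folds each yield twelve held-out scores, from which a mean and standard deviation can be reported.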

Selection of best-performing features

In addition to the labeled outcomes, a machine learning model may be built on key features selected from the full set of 84 features. For example, the model building and optimization module 30 is further configured to run an algorithm during model building that selects the key features associated with the occurrence of intradialytic adverse events. Identifying the key features for model building can effectively reduce the computational load without loss of prediction accuracy. Furthermore, after the machine learning model predicts an intradialytic adverse event based on the key features, those key features can serve as a reference for adjusting the parameters of the hemodialysis machine.

For example, to find out which features matter more than others in predicting a target intradialytic adverse event, key features may be selected (from the full set of 84) and used for model building by the model building and optimization module 30, so that the predictions of a machine learning model built on the selected key features can be compared with those of a model built on all 84 features. The selection of key features may be performed using MATLAB (MATrix LABoratory, developed by MathWorks Inc.), but any suitable program or application may be used without limiting the scope of the present disclosure. The selected key features can then be used to build a machine learning model as discussed above (for example, a binary classification model built with ensemble random-undersampling boosted trees and evaluated by four-fold cross-validation). Each model so built can then be scored by summing the percentages of true positives and true negatives in its prediction results, for comparison at a later stage.

The process of key feature selection is detailed further here. First, a first set of machine learning models is built by using each of the 84 features singly in turn, and each model in the first set is given a score according to its prediction results (based on the percentages of true positives and true negatives). Next, a second set of machine learning models is built from a pool of two-feature combinations, which is formed by pairing the best feature identified from the scores of the first set (for example, the feature of the highest-scoring model) with each of the remaining 83 features. The two-feature combinations whose scores in the second set exceed the highest score of the first set are retained for the next step. Likewise, a third set of machine learning models is built from a pool of three-feature combinations, which is formed by combining the best two-feature combination identified from the scores of the second set (for example, the two-feature combination of the highest-scoring model) with each of the remaining 82 features, and the three-feature combinations whose scores exceed the highest score of the second set are retained for the next step. This process is repeated until the best 20-feature combinations are selected, and the features that appear most frequently in these 20-feature combinations are defined as the key features.
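The stepwise procedure above resembles greedy forward selection followed by a frequency count. The sketch below is a simplified reading, not the MATLAB routine used in the study: `score_fn` is a hypothetical callable that builds and scores a model for a feature list, only the single best combination is carried to the next step, and stopping when no combination beats the previous best is an assumption.

```python
from collections import Counter

def forward_select(features, score_fn, max_size=20):
    """Grow a feature set one feature at a time while the score keeps improving."""
    selected, best = [], float("-inf")
    while len(selected) < max_size:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        scored = [(score_fn(selected + [f]), f) for f in candidates]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:  # assumption: stop when no combination improves the score
            break
        selected.append(top_feature)
        best = top_score
    return selected, best

def key_features(combinations, min_hits):
    """Features that appear in at least min_hits of the selected combinations."""
    hits = Counter(f for combo in combinations for f in combo)
    return sorted(f for f, n in hits.items() if n >= min_hits)

# Toy score: each feature adds a fixed, made-up contribution
weights = {"a": 3, "b": 2, "c": -1}
chosen, score = forward_select(list(weights), lambda fs: sum(weights[f] for f in fs), 3)
```

Counting how often each feature appears across several selected combinations corresponds to the "hits" reported for the best features later in the description.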

Results

Demographics of study participants

Table 6 below summarizes the characteristics of the 108 participating patients according to the embodiments described herein. For example, among these 108 patients, the mean age was 63.6 years; 60 patients (55.6%) were male; the mean hemodialysis vintage was 7.7 years; 47 patients (43.5%) had diabetes; 69 patients (63.9%) had hypertension; 11 patients (10.2%) had coronary artery disease; 12 patients (11.1%) had congestive heart failure; 7 patients (6.5%) had a history of stroke; 3 patients (2.8%) had chronic obstructive pulmonary disease; 2 patients (1.9%) had peripheral vascular disease; and 2 patients (1.9%) had malignancy.

Data are presented as mean ± standard deviation or percentage, as appropriate.

In addition, with respect to the occurrence of the intradialytic adverse events (see Table 2) recorded from these 108 patients, 4 hemodialysis sessions had more than 3 adverse events; 19 sessions had 3 adverse events; 106 sessions had 2 adverse events; and 276 sessions had 1 adverse event. Of the 4221 hemodialysis sessions in total, 406 had adverse events.

Performance of the prediction models

To increase the ratio of outcomes labeled 1 to outcomes labeled 0 (that is, hemodialysis sessions with adverse events are labeled 1 and sessions without adverse events are labeled 0), the 27 adverse events listed in Table 2 were divided into three groups for building machine learning models. The first group covers all intradialytic adverse events except elevated blood pressure, vascular access obstruction, and vascular access embolism; a total of 323 hemodialysis sessions were assigned to this group (Group 1). The second group covers muscle cramps; 138 hemodialysis sessions were assigned to this group (Group 2). The third group covers elevated blood pressure; 108 hemodialysis sessions were assigned to this group (Group 3).

Group 1: All events except elevated blood pressure

Figure 3A, Figure 3B, and Table 7 describe the performance of the machine learning models for predicting adverse events belonging to Group 1, where curves a to k represent models built with different feature combinations. In this scenario, a binary classification averaged perceptron with a learning rate of 20 and a maximum of 20 iterations was used for model building.
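For readers unfamiliar with the averaged perceptron, a textbook version is sketched below. It is not the Azure implementation used in the study, whose internals are not described here; the function name is hypothetical, and only the default learning rate and iteration count are taken from the text.

```python
def train_averaged_perceptron(X, y, learning_rate=20.0, max_iter=20):
    """Train on 0/1 labels and return a predictor based on the averaged weights."""
    n = len(X[0])
    w, b = [0.0] * n, 0.0
    w_sum, b_sum, steps = [0.0] * n, 0.0, 0
    for _ in range(max_iter):
        for x, label in zip(X, y):
            target = 1 if label == 1 else -1
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if target * activation <= 0:  # misclassified: move the boundary toward x
                w = [wi + learning_rate * target * xi for wi, xi in zip(w, x)]
                b += learning_rate * target
            # accumulate the weights after every example so they can be averaged
            w_sum = [s + wi for s, wi in zip(w_sum, w)]
            b_sum += b
            steps += 1
    w_avg = [s / steps for s in w_sum]
    b_avg = b_sum / steps

    def predict(x):
        return 1 if sum(wi * xi for wi, xi in zip(w_avg, x)) + b_avg > 0 else 0

    return predict

# Two linearly separable toy classes
X = [[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]]
y = [1, 1, 0, 0]
predict = train_averaged_perceptron(X, y)
```

Averaging the weights over all update steps, rather than keeping only the final weights, makes the resulting classifier less sensitive to the order of the training examples.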

For the 84-feature model (curve a), the mean area under the curve (AUC) was 0.83 with a standard deviation (SD) of 0.03, the F1 score was 0.53, the sensitivity was 0.53, and the specificity was 0.96. Compared with the negative control (curve b; mean AUC 0.50, SD 0.04, F1 score 0.15), the 84-feature binary classification averaged perceptron model predicted adverse events reasonably well. Predictions by other algorithms were also tested. For example, a binary classification support vector machine (SVM) yielded a mean AUC of 0.83 (SD 0.02), an F1 score of 0.55, a sensitivity of 0.53, and a specificity of 0.96, which is similar to the result obtained with the averaged perceptron. Compared with the averaged perceptron and SVM algorithms, binary classification logistic regression and decision forest did not predict adverse events as well: logistic regression yielded a mean AUC of 0.82 (SD 0.02) and an F1 score of 0.48, and the decision forest yielded a mean AUC of 0.83 (SD 0.02) and an F1 score of 0.46. Additionally, within-patient partitioning for sampling (mean AUC 0.83, SD 0.03; mean F1 score 0.53, SD 0.02) and between-patient partitioning (mean AUC 0.82, SD 0.04; mean F1 score 0.50, SD 0.06) showed no significant difference in prediction.
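The metrics reported here follow standard definitions, which can be written out as a short sketch. The function names are hypothetical, both classes are assumed to be present, and the toy inputs are made up and unrelated to the study data.

```python
def confusion_metrics(predicted, labels):
    """Sensitivity, specificity and F1 score from 0/1 predictions and labels."""
    tp = sum(p == 1 and l == 1 for p, l in zip(predicted, labels))
    tn = sum(p == 0 and l == 0 for p, l in zip(predicted, labels))
    fp = sum(p == 1 and l == 0 for p, l in zip(predicted, labels))
    fn = sum(p == 0 and l == 1 for p, l in zip(predicted, labels))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

def auc(scores, labels):
    """Probability that a positive outranks a negative (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

metrics = confusion_metrics([1, 0, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0])
area = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

With imbalanced outcomes such as those described above, a model can reach high specificity and AUC while its F1 score stays moderate, which is why the text reports these metrics side by side.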

Ultrafiltration rate and ultrafiltration volume are hemodialysis-related parameters. However, the performance of the models herein shows that a single feature, such as the maximum ultrafiltration volume (feature 78; see curve k) or the mean change in ultrafiltration rate (feature 77; see curve j), cannot adequately predict adverse events. The model built on the maximum ultrafiltration volume (defined as the ultrafiltration volume recorded at the last time point) had an AUC of 0.48 and an F1 score of 0.15, similar to the negative control. The model built on the mean change in ultrafiltration rate during the hemodialysis session, on the other hand, had an AUC of 0.70 and an F1 score of 0.28. Combining the two ultrafiltration-related features (curve h) also failed to predict adverse events. When as many as 6 features related to ultrafiltration volume were used for prediction (features 78 to 83; see curve f), the AUC increased from 0.48 to 0.82 and the F1 score increased from 0.15 to 0.46. The model with 14 ultrafiltration-related features (features 70 to 83; see curve e) had an AUC of 0.83 and an F1 score of 0.52.

Next, the 21 features appearing most frequently in the 20-feature combinations were selected for evaluation (curve c). A binary classification averaged perceptron model based on these 21 best-performing features, with ultrafiltration-related features omitted, showed a mean AUC of 0.82 (SD 0.02) and an F1 score of 0.45. Adding one or two features to this model did not significantly improve prediction (for example, the model with the best 23 features yielded only a mean AUC of 0.82, SD 0.02, and an F1 score of 0.46). Compared with the model based on all features other than the ultrafiltration-related ones (see curve d, which uses 70 features and has an AUC of 0.81 and an F1 score of 0.45), the result of the model with the best 21 features (without ultrafiltration-related features) shows that a quarter of the full 84 features is sufficient for predicting adverse events.

Specifically, the 21 features selected from the full 84 features are age, maximum transmembrane pressure, minimum systolic blood pressure (SBP), minimum diastolic blood pressure (DBP), minimum pulse pressure, minimum blood flow rate, mean SBP, mean venous pressure, mean transmembrane pressure, SBP linear regression slope, DBP linear regression slope, pulse pressure linear regression slope, pulse rate linear regression slope, transmembrane pressure linear regression slope, standard deviation of the mean blood flow rate, pulse pressure linear regression R-squared, and parameters related to the second derivative of venous pressure (see features 2, 5, 6, 8, 11, 14, 17, 20, 21, 26, 29, 31, 36, 47 to 52, 57, and 59 listed in Tables 4 and 5).

Group 2: Muscle cramps

Figure 4 and Table 8 describe the performance of the machine learning models for predicting adverse events belonging to Group 2, where curves a to k represent models built with different feature combinations.

As seen in Figure 4 and Table 8, the model based on the 14 ultrafiltration-related features (see curve d) predicted the occurrence of muscle cramps with a mean AUC of 0.85 (SD 0.04) and an F1 score of 0.45. This result is similar to that of the 84-feature model (see curve a; mean AUC 0.82, SD 0.04, F1 score 0.42) and better than that of the model built on all features other than the ultrafiltration-related ones (see curve c; mean AUC 0.79, SD 0.04, F1 score 0.30). However, a single ultrafiltration-related feature (see curves i and k) cannot adequately predict muscle cramps. A combination of two ultrafiltration-related features also failed to predict muscle cramps (see curve f, which includes features 70 to 77 and has an AUC of 0.79 and an F1 score of 0.29; or curve e, which includes features 78 to 83 and has an AUC of 0.84 and an F1 score of 0.37). These results indicate that ultrafiltration-related features contribute more to the prediction of muscle cramps than other features do.

Group 3: Elevated blood pressure

Figure 5 and Table 9 describe the performance of the machine learning models for predicting adverse events belonging to Group 3, where curves a to d represent models built with different feature combinations.

As seen in Figure 5 and Table 9, the model that predicts the occurrence of hypertension based on all 84 features (see curve a) had a mean AUC of 0.93 (SD 0.02) and an F1 score of 0.41. The comparison with the model built on the 14 ultrafiltration-related features (see curve c; AUC 0.72, F1 score 0.22) shows that ultrafiltration parameters do not play an important role in predicting intradialytic hypertension. Although the model based on the 24 blood pressure-related features (see curve d; AUC 0.92, SD 0.03, F1 score 0.38) had an AUC above 0.9, features other than blood pressure provided an additional improvement in the F1 score.

Consistency of predicted probability of adverse events over time

Figure 6, Figure 7, and Table 10 describe the performance of the machine learning models that predict intradialytic adverse events based on time-series features. The curves labeled 0, 5, 10, 15, 20, and 60 minutes represent the predictive ability of the model at the indicated time before the occurrence of any intradialytic adverse event.

In the embodiments described herein, the time-series features are collected over the hemodialysis session, from the start of the session until the time point just before an adverse event is recorded or, when no adverse event is recorded, until the time point at which the session is about to end. In this case, the end of feature collection is defined as 0 minutes if its end time point falls immediately before the recorded adverse event. In addition to 0 minutes, the cutoff end time point of feature collection was also set to 5, 10, 15, 20, and 60 minutes before the adverse event in order to evaluate prediction accuracy.
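The cutoff scheme can be illustrated by truncating one session's records at each lead time before the reference end point. This is a sketch under assumptions: the function name `truncate_by_lead` is hypothetical, times are in minutes from the start of the session, and `end_time` stands for the time point just before the recorded adverse event (or the end of an event-free session).

```python
LEAD_MINUTES = (0, 5, 10, 15, 20, 60)  # cutoffs evaluated in the text

def truncate_by_lead(times, values, end_time, leads=LEAD_MINUTES):
    """Map each lead time to the records available at or before end_time - lead."""
    truncated = {}
    for lead in leads:
        cutoff = end_time - lead
        truncated[lead] = [(t, v) for t, v in zip(times, values) if t <= cutoff]
    return truncated

# One toy session with a record every 30 minutes and an event just after minute 150
by_lead = truncate_by_lead([0, 30, 60, 90, 120, 150], [10, 11, 12, 13, 14, 15], 150)
```

Each truncated record list would then go through the same feature extraction as a full session, so that one model can be evaluated at every lead time.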

As shown in Figure 6 and Table 10, the machine learning models for predicting all intradialytic adverse events (except elevated blood pressure) show that features with the 0-minute cutoff gave the best AUC and F1 score (see the 0-minute curve, with an AUC of 0.83 and an F1 score of 0.53) compared with the performance learned from features at earlier cutoff time points, even though features with cutoff end time points 5, 10, 15, and 20 minutes before the recorded adverse event (or before the end of a session without adverse events) still gave AUCs of about 0.80, with F1 scores all below 0.5. This result indicates that the information within the 20-minute window preceding the index adverse event is valuable, but the information within the 5-minute window preceding the index adverse event has a greater impact on event prediction.

Referring to Figure 7, to further understand how prediction accuracy depends on the cutoff end time point, 500 hemodialysis sessions were randomly selected to compare the predicted probabilities of adverse events obtained from the 84 features at 0, 5, 10, 15, and 20 minutes before the recorded adverse event. As shown by the circled data points, 5 hemodialysis sessions showed strong consistency in the predicted probability of adverse events across the features extracted at the different cutoff end time points, and adverse events did occur in these 5 sessions. Since about 40 of the 500 randomly selected sessions would be expected to have adverse events, the results indicate that at least one in ten sessions with adverse events could be predicted as early as 20 minutes in advance, and that the prediction could be further confirmed by real-time machine learning using features from subsequent cutoff end time points.

Although none of the 84 features contains explicit time-series information, the linear and difference analyses used for feature extraction may be affected by the length of the hemodialysis session. Therefore, hemodialysis sessions without adverse events (negative sessions) were truncated, and their prediction results were compared with those of untruncated sessions. Since the mean length of sessions with adverse events (positive sessions) was 3.3 hours, the negative sessions were truncated at end points (T_end) set randomly between 3 and 3.5 hours, while the end points of the positive sessions were left unchanged. The record {Y_j,k, T_k} at the end point T_end is defined in the same way as the record {Y_j,k, T_k} at an arbitrary time T_p. As a result, the mean AUC was 0.89 (SD 0.019), the F1 score was 0.55, the sensitivity was 0.52, and the specificity was 0.97. Alternatively, when the end point was set at exactly 3.3 hours, the AUC was 0.86 and the F1 score was 0.55. Compared with the original results obtained with untruncated negative HD sessions lasting about 4 hours (AUC 0.83, F1 score 0.53, sensitivity 0.53, and specificity 0.96), the prediction results were better when the end point was set earlier. Indeed, when the end points of the negative sessions were set randomly between 2.5 and 3.5 hours, the AUC was 0.92, the F1 score was 0.62, the sensitivity was 0.61, and the specificity was 0.98.

Contributions and key observations

According to the findings of the embodiments described herein, an algorithm combining linear and difference analyses with binary classification machine learning achieves a high AUC in predicting intradialytic adverse events. In the attempt to identify, among the full 84 features, those contributing most to the prediction of all adverse events other than hypertension (Group 1), only features 76 and 82 among the best 23 features were related to ultrafiltration (the number of ultrafiltration rate changes and the linear regression slope of ultrafiltration volume). After these two ultrafiltration-related features were excluded, the remaining 21 features were found to be sufficient for accurate prediction with good discrimination, with the AUC dropping only slightly from 0.83 (84 features) to 0.82 (21 features). The model built on the 14 ultrafiltration-related features also had a good AUC of 0.83. Therefore, selecting the best 21 features unrelated to ultrafiltration, or combining the 14 features related to ultrafiltration, rather than including all 84 features in model building, can reduce the computational load. The results shown in Figures 3A and 3B also suggest that the two groups of features (curves c and e) may embed similar factors leading to the occurrence of adverse events.

In the findings according to the embodiments described herein, muscle cramps were the most frequently occurring adverse event during hemodialysis sessions. Muscle cramps are a common adverse event during hemodialysis treatment, with a prevalence of 28% across all hemodialysis treatments. Muscle cramps are caused by ischemia of skeletal muscle tissue, represent an early sign of hypotension, and may lead to premature termination of the hemodialysis session. Tissue ischemia during hemodialysis is positively correlated with the ultrafiltration rate. In the attempt to identify the features contributing most to the prediction of muscle cramps (Group 2), the model built on the 14 ultrafiltration-related features predicted the occurrence of muscle cramps with an AUC of 0.85 in this study. When all ultrafiltration-related features (including ultrafiltration rate and ultrafiltration volume) were excluded to test the prediction accuracy, the AUC dropped from 0.82 (84 features) to 0.79 (70 features), indicating that ultrafiltration-related features are relevant to, but not essential for, the prediction of muscle cramps. The machine learning results show that features unrelated to ultrafiltration also help predict intradialytic muscle cramps.

In general, symptomatic hypotension occurs in 20% to 30% of hemodialysis sessions. There are two main pathophysiological mechanisms of intradialytic hypotension. First, blood volume decreases when the rate at which plasma fluid is removed by ultrafiltration exceeds the rate at which plasma refills the blood vessels. Second, hypotension occurs if the cardiovascular and neurohormonal systems fail to compensate for the acute loss of vascular volume during ultrafiltration. Frequent episodes of intradialytic hypotension may lead to reduced ultrafiltration, failure to reach "dry weight", increased preload, and impaired cardiac function, which eventually lead to more hypotensive episodes and thus form a vicious cycle. Meanwhile, frequent intradialytic hypotension undermines the efficiency and efficacy of dialysis and is associated with higher morbidity and mortality, which partly explains why cardiovascular disease is the leading cause of morbidity and mortality in hemodialysis patients. An intelligent system for predicting intradialytic hypotension has recently been developed in an existing product. However, the machine learning model according to the embodiments described herein can not only further exclude ultrafiltration-related features but also examine intradialytic adverse events as a whole, rather than focusing only on hypotensive events.

For example, Table 11 below lists the best 16 features that contribute most to the prediction of muscle cramps, including patient characteristics, venous pressure, transmembrane pressure, ultrafiltration, blood flow rate, and pulse pressure. Here, the minimum venous pressure and the mean transmembrane pressure were the features with the most hits (20 and 17, respectively) and are therefore the top two features. These top two features for predicting muscle cramps are derived from output parameters of the hemodialysis machine, which suggests that the algorithm discussed herein could be integrated into hemodialysis machine software to alert clinicians and adjust the machine settings in advance. However, unlike the prediction of Group 1 adverse events (where only 2 of the best 23 features are related to ultrafiltration), 8 of the best 16 features for predicting muscle cramps are related to ultrafiltration, which indicates that ultrafiltration-related parameters are relevant factors in muscle cramps.

Among the several binary classification models built according to the embodiments described herein (Bayes point machine, boosted decision tree, and SVM), the model built with the binary classification averaged perceptron had the best AUC and F1 score. Clinicians are facing a new era of artificial intelligence, and the integration of computer science and dialysis medicine is considered the first step toward comprehensively improving the quality of care for hemodialysis patients. The embodiments described herein demonstrate the feasibility of such integration. In addition, combining machine learning with hemodialysis machines and adjusting the algorithm in real time through cloud computing and the accumulation of datasets is expected to further improve predictive performance.

Regarding the early prediction of intradialytic adverse events, given that the predicted probabilities of adverse events obtained from features based on different cut-off end time points were consistent in about one-tenth of the hemodialysis sessions predicted to have an adverse event (see FIGS. 6 and 7 and Table 10), it can be expected that increasing the number of hemodialysis sessions with adverse events in model training would mitigate the data imbalance and might allow alerts to be issued earlier. Further, since most adverse events occur in the second half of a hemodialysis session, it can be expected that, if more hemodialysis data sets were included, data from the second half of the session alone would suffice for prediction. In addition, if more hemodialysis sessions with adverse events were enrolled, building a separate prediction model for each adverse event should become feasible, without the need to group adverse events in order to reduce the data imbalance.

In the embodiments described herein, a binary classification machine learning model with an AUC above 0.8 was built to predict intradialytic adverse events in near real time. Consistency in the predicted probabilities of intradialytic adverse events, obtained from features extracted in real time from an ongoing hemodialysis session, can raise an alert for an impending adverse event in that session. This approach, enabled by cloud computing, can warn clinicians to take the necessary measures and adjust the hemodialysis machine settings in advance.
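One way to picture an alert driven by consistent predicted probabilities is sketched below. This passage does not specify the alert rule, so everything here is an assumption: the function name `first_alert_index`, the probability threshold, and the rule that a fixed number of consecutive cut-off time points must all exceed that threshold.

```python
from typing import Optional, Sequence


def first_alert_index(
    probabilities: Sequence[float],
    threshold: float = 0.5,      # hypothetical decision threshold
    min_consecutive: int = 3,    # hypothetical consistency requirement
) -> Optional[int]:
    """Return the index of the first cut-off time point at which the predicted
    probability has stayed at or above `threshold` for `min_consecutive`
    successive cut-off points, or None if that never happens."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    run = 0  # length of the current streak of above-threshold predictions
    for i, p in enumerate(probabilities):
        run = run + 1 if p >= threshold else 0
        if run >= min_consecutive:
            return i
    return None
```

Requiring several consecutive above-threshold predictions trades earlier warning for fewer spurious alerts; a real deployment would need to tune both parameters against clinical data.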

The present disclosure illustrates its features and effects by way of exemplary embodiments, which are not intended to limit its scope. Those of ordinary skill in the art may make various changes and modifications to the present disclosure without departing from its scope. Any equivalent changes and modifications made according to the present disclosure shall be deemed to fall within its scope. The scope of the present disclosure shall be defined by the appended claims.

1: system

10: feature extraction module

20: data storage module

30: model building and optimization module

S1~S4: steps

Claims

A system for predicting intradialytic adverse events, comprising: a feature extraction module configured to collect and process data about a patient's hemodialysis session; and a model building and optimization module configured to establish, based on the data, a machine learning model for predicting an intradialytic adverse event during the hemodialysis session; wherein the machine learning model is established based on a first dimension of a target intradialytic adverse event to be predicted and a second dimension of a target time period of the hemodialysis session to be predicted.

The system of claim 1, wherein the data about the patient's hemodialysis session includes one or more of demographic information, physiological data, dialysis data, and registered intradialytic adverse events.

The system of claim 1, wherein the data comprises a data set having a plurality of records having measurements at corresponding time stamps.

The system of claim 3, wherein the feature extraction module collects and processes the data about the patient's hemodialysis session by: deriving from the measurements at least one of a mean value, a standard deviation of the mean value, a coefficient of variation, a linear regression slope, and a linear regression R-squared as features of the plurality of records.

The system of claim 3, wherein the measurements include venous pressure and transmembrane pressure, and wherein the feature extraction module collects and processes the data about the patient's hemodialysis session by: deriving, from the measurements of the venous pressure and the transmembrane pressure, at least one of a maximum, a minimum, an average, and a second differential of rates of change as features of the plurality of records.

The system of claim 1, wherein the machine learning model is trained by using the data labeled with outcomes related to the adverse event in dialysis.

The system as claimed in claim 1, wherein the machine learning model is trained by combining key features extracted from the data.

The system of claim 1, further comprising a data storage module configured to store the data.

A method for predicting intradialytic adverse events, comprising: configuring a feature extraction module to collect and process data about a patient's hemodialysis session; and configuring a model building and optimization module to establish, based on the data, a machine learning model for predicting an intradialytic adverse event during the hemodialysis session; wherein the machine learning model is established based on a first dimension of a target intradialytic adverse event to be predicted and a second dimension of a target time period of the hemodialysis session to be predicted.

The method of claim 9, wherein the data about the patient's hemodialysis session includes one or more of demographic information, physiological data, dialysis data, and registered intradialytic adverse events.

The method of claim 9, wherein the data comprises a data set having a plurality of records having measurements at corresponding time stamps.

The method of claim 11, wherein the feature extraction module collects and processes the data about the patient's hemodialysis session by: deriving from the measurements at least one of a mean value, a standard deviation of the mean value, a coefficient of variation, a linear regression slope, and a linear regression R-squared as features of the plurality of records.

The method of claim 11, wherein the measurements include venous pressure and transmembrane pressure, and wherein the feature extraction module collects and processes the data about the patient's hemodialysis session by: deriving, from the measurements of the venous pressure and the transmembrane pressure, at least one of a maximum, a minimum, an average, and a second differential of rates of change as features of the plurality of records.

The method according to claim 9, wherein the machine learning model is trained by using the data labeled with outcomes related to the adverse events in the dialysis.

The method according to claim 9, wherein the machine learning model is trained by combining key features extracted from the data.

The method of claim 9, further comprising configuring a data storage module to store the data.

A computer-readable medium storing computer-executable code which, when executed, implements the method of claim 9.
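The summary features recited in the claims above (mean, standard deviation, coefficient of variation, linear regression slope and R-squared, and the maximum, minimum, and average of rates of change) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names, the use of the sample standard deviation, and the definition of a rate of change as the difference between successive measurements divided by the elapsed time are all assumptions. The second differential is omitted because the text does not state how it is computed.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def summary_features(timestamps: Sequence[float], values: Sequence[float]) -> Dict[str, float]:
    """Mean, SD, coefficient of variation, and least-squares slope / R-squared of
    one measurement series (e.g. venous pressure) against its time stamps."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    t_mean = mean(timestamps)
    sxx = sum((t - t_mean) ** 2 for t in timestamps)
    syy = sum((v - m) ** 2 for v in values)
    sxy = sum((t - t_mean) * (v - m) for t, v in zip(timestamps, values))
    return {
        "mean": m,
        "sd": sd,
        "cv": sd / m if m != 0 else float("nan"),
        "slope": sxy / sxx,
        # R-squared of simple linear regression equals the squared correlation
        "r_squared": (sxy * sxy) / (sxx * syy) if syy > 0 else float("nan"),
    }


def rate_of_change_features(timestamps: Sequence[float], values: Sequence[float]) -> Dict[str, float]:
    """Maximum, minimum, and average of the rates of change between successive records."""
    rates = [
        (v2 - v1) / (t2 - t1)
        for (t1, v1), (t2, v2) in zip(zip(timestamps, values), zip(timestamps[1:], values[1:]))
    ]
    return {"rate_max": max(rates), "rate_min": min(rates), "rate_mean": mean(rates)}
```

Both functions expect at least two records with strictly increasing time stamps; a production feature extractor would also need to handle missing or irregular measurements.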