TW201837745A - Cross-platform analyzing and display system of clinical data - Google Patents
- Publication number
- TW201837745A, TW106112205A
- Authority
- TW
- Taiwan
- Prior art keywords
- disease
- module
- data
- clinical medical
- block
Landscapes
- Medical Treatment And Welfare Office Work (AREA)
Abstract
Description
The present invention relates to a clinical medical data analysis and display system, and in particular to one capable of accelerating computation on massive data sets.
Comorbidity refers to two or more diseases that exist simultaneously or non-simultaneously, and the association between comorbid diseases is very important in the nosology of clinical medicine. Comorbidity reveals the multi-level temporal relationships between co-existing diseases, and the chronology of chronic diseases arising at different ages often affects the establishment of disease classification and clinical decision-making. A complex history of chronic disease is closely related to patient prognosis (BMJ 334, 1016-1017 (2007) and Annals of Family Medicine 4, 417-422 (2006)), particularly in geriatric medicine and cancer (JAMA 291, 2441-2447 (2004)). As the elderly population grows year by year, research on comorbidity should adapt accordingly, but there is currently insufficient evidence-based medical evidence to support the related diagnostic and treatment decisions (JAMA 294, 716-724 (2005) and Lancet 367, 550-551 (2006)).
By cause, comorbidity can be divided into: (1) causal, meaning that two or more diseases share a common pathophysiology; and (2) complicating, meaning that it is related to disease-specific death. By time of occurrence, it can further be divided into concurrent, intercurrent, and successive comorbidity (Journal of Child Psychology and Psychiatry, and Allied Disciplines 40, 57-87 (1999)). Concurrent means that two unrelated diseases exist at the same time, while intercurrent means that the interaction between the comorbid diseases is affected by the acute phase of a disease and is usually time-limited.
Medical research on comorbidity has advanced rapidly in the last 10 years (Cell March 18, 2011 vol. 144 no. 6 986-998). One earlier study used 1.5 million medical records from a single medical center to compute the associations between 161 diseases and genes (PNAS July 10, 2007 vol. 104 no. 28 11694-11699), building a model and calculating the time course of phenotypes and the probability of disease occurrence. Harvard researchers also used a database of 32 million patients (PLoS Comput Biol 5(4): e1000353) to tabulate the past medical history of patients older than 65 in a cross-sectional study, and calculated the relative risk between the diseases covered by the ICD9 diagnostic codes. These studies are highly significant for the analysis of comorbidity and clinical big data, and they introduced the concept of combining rationalized analysis and automated analysis of human biological databases.
Although these innovative studies made academic contributions, many obstacles remain before their results can be put to experimental and clinical use, for the following reasons: (1) the databases used are poorly representative and do not record the general public's usual medical-care habits; (2) the analytical methods used are cross-sectional studies and self-devised methods resembling cohort follow-up, whose strength of evidence, causal inference, and internal validity are lower than those of traditional cohort studies; (3) cross-sectional studies cannot compute important clinical data such as the complete disease cycle and mortality; (4) the automated analysis lacks highly reliable validation, with both internal validation within the database and external validation outside it being insufficient; (5) there is no rationalized compilation of the sets of diagnostic codes for diseases; and (6) complications caused by drugs and surgery are not excluded.
Taiwanese researchers have previously used data provided by the National Health Insurance Administration, Ministry of Health and Welfare, and, with reference to pilot studies abroad, built big data for automated (cross-sectional) analysis. However, the efficiency of a cross-sectional study and that of a time-dependent longitudinal study can differ by a factor of several tens, so computational efficiency is also an obstacle.
Therefore, providing an analysis system that can accurately predict the likelihood of a user's disease progression is an important issue in the industry today.
In view of this, an embodiment of the present invention provides a cross-platform clinical medical data analysis and display system installed on a server, wherein the server includes a central control device, a storage device, and a communication device. The cross-platform clinical medical data analysis and display system includes a core computing main module, an automated statistics module, and a verification module. The core computing main module collects and generates a plurality of clinical medical data and includes: a clinical medical data collection module, which, based on a plurality of records from at least one first clinical medical database, determines and analyzes disease diagnostic codes and various dates, classifies and collects the medical data, and converts the records of the first clinical medical database into a plurality of records in a predetermined format; and an automated acceleration module, which performs disease relationship analysis based on a multiple nested longitudinal statistical method and the predetermined-format records and generates a disease matrix, wherein the disease matrix is an M×N matrix, and which builds a large Trajectory database that differs from the original data. All subsequent computations use the same Trajectory database unless the original database is replaced. The automated statistics module performs statistical analysis on the data of the core computing main module using a plurality of regression diagnostic methods to generate an analysis result. The verification module verifies the disease matrix and the plurality of clinical medical data of the core computing main module based at least on a plurality of published papers corresponding to the clinical data of the first clinical medical database.
In summary, the cross-platform clinical medical data analysis and display system of the present invention performs further analysis and statistics on the data of a clinical medical database, greatly reducing the amount of computation and speeding up processing without reducing accuracy. In addition, professional-edition or general-edition analysis results can be obtained from the web pages or applications provided by the system, which not only gives users clear and fast medical analysis information but also lets them understand, in real time, how their own health may develop.
To make the above features and advantages of the present invention more readily understood, preferred embodiments are described in detail below with reference to the accompanying drawings.
1‧‧‧Server
10‧‧‧Central control device
11‧‧‧Storage device
12‧‧‧Display device
13‧‧‧Database
14‧‧‧Communication device
121‧‧‧Desktop web page display module
122‧‧‧Mobile application display module
A‧‧‧Core computing main module
A1‧‧‧Clinical medical data collection module
A2‧‧‧Automated acceleration module
B‧‧‧Automated statistics module
C‧‧‧Verification module
A000-A063, B001-B020, C000-1, C000-2, C001-C010, D001-D028, E001-E012, E014-E016‧‧‧Steps
F001-F016, G001-G011, H001-H023‧‧‧Blocks
FIG. 1 is a schematic diagram of a server according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a cross-platform clinical medical data analysis and display system according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a display device according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of a core computing main module according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of an automated acceleration module according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of a verification module according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of the automated regression diagnosis sub-module of an automated statistics module according to an embodiment of the present invention.
FIG. 8 is a flowchart of a server according to an embodiment of the present invention.
FIG. 9 is a schematic diagram of a professional-edition web page according to an embodiment of the present invention.
FIG. 10 is a schematic diagram of a general-edition web page according to an embodiment of the present invention.
FIG. 11 is a schematic diagram of an application according to an embodiment of the present invention.
Various exemplary embodiments are described more fully below with reference to the accompanying drawings, in which some exemplary embodiments are shown. The inventive concept may, however, be embodied in many different forms and should not be construed as limited to the exemplary embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the inventive concept to those skilled in the art. In the drawings, the sizes and relative sizes of layers and regions may be exaggerated for clarity. Like numerals refer to like elements throughout.
It should be understood that, although the terms first, second, third, and so on may be used herein to describe various elements, these elements should not be limited by these terms. The terms are used only to distinguish one element from another. Thus, a first element discussed below could be termed a second element without departing from the teachings of the inventive concept. As used herein, the term "and/or" includes any one of the associated listed items and all combinations of one or more of them.
The cross-platform clinical medical data analysis and display system is described below through at least one embodiment in conjunction with the drawings; however, the following embodiments are not intended to limit the present disclosure.
[Embodiment of the cross-platform clinical medical data analysis and display system of the present invention]
Refer to FIG. 1 to FIG. 3. FIG. 1 is a schematic diagram of a server according to an embodiment of the present invention. FIG. 2 is a schematic diagram of a cross-platform clinical medical data analysis and display system according to an embodiment of the present invention. FIG. 3 is a schematic diagram of a display device according to an embodiment of the present invention.
An embodiment of the present invention provides a cross-platform clinical medical data analysis and display system which, in this embodiment, runs on a server 1. In this embodiment, the server 1 may be located locally or remotely; the present invention imposes no limitation in this respect.
The server 1 includes a central control device 10, a storage device 11, a display device 12, a database 13, and a communication device 14.
The central control device 10 is electrically connected to the storage device 11, the display device 12, the database 13, and the communication device 14. In other embodiments, the central control device 10, the storage device 11, the display device 12, the database 13, and the communication device 14 may each be a subsystem of a virtual system; the present invention imposes no limitation in this respect.
The central control device 10 of the server 1 processes the various data of the cross-platform clinical medical data analysis and display system. It can identify users, receive the parameters passed in from the front-end interface, and send the computation results back to the front-end interface.
The storage device 11 stores the various data of the cross-platform clinical medical data analysis and display system. In this embodiment, the database 13 resides in the storage device 11.
In this embodiment, the display device 12 is used to display the various display data of the cross-platform clinical medical data analysis and display system, such as the desktop web page, the mobile web page, or the mobile application. Through the display device 12, the system can deliver the relevant display data to users, so that they can use the cross-platform clinical medical data analysis and display system of the present invention on a computer or on different mobile devices.
In this embodiment, the central control device 10 sends the various display data of the cross-platform clinical medical data analysis and display system to the user's computer or mobile device through the communication device 14. The communication device 14 may be a wired or a wireless communication device; the present invention imposes no limitation in this respect.
In this embodiment, the cross-platform clinical medical data analysis and display system includes a core computing main module A, an automated statistics module B, and a verification module C.
The core computing main module A includes a clinical medical data collection module A1 and an automated acceleration module A2.
The clinical medical data collection module A1 collects various clinical medical data and converts them into a unified format. It classifies disease occurrence events and disease types, and it classifies surviving patients, experimental-group patients, and control-group patients.
When collecting clinical medical data, the clinical medical data collection module A1 performs steps that include splitting date data and disease diagnostic code data out of the original clinical medical data. The automated acceleration module A2 applies statistical principles to combine all diseases automatically until every disease combination has been processed.
The automated statistics module B mainly includes an automated regression diagnosis sub-module, which performs regression diagnostics using regression statistics according to the input conditions. The automated statistics sub-module compiles statistics on the data for a given disease combination in the system according to the input conditions.
The verification module C further includes an expert verification sub-module, which is used by medical experts to check the authenticity of the data in the database and the statistical results in the verification module.
I. Core computing main system
Refer to FIG. 4, which is a schematic diagram of the core computing main module according to an embodiment of the present invention.
In this embodiment, data from Taiwan's National Health Insurance Research Database (NHIRD) is used. The National Health Insurance Program in Taiwan has collected the medical-care records of more than 23 million residents of Taiwan, and the NHIRD was established to make them available to researchers. The NHIRD covers more than 90% of Taiwan's population. From this parent population, samples have been drawn to build the Longitudinal Health Insurance Database (LHID, the one-million-person cohort file), the cancer registry file, the nationwide all-inpatient file, the nationwide catastrophic illness file, and the nationwide one-half children file. These are widely used in clinical medicine, epidemiology, and public health research; as of early November 2016, more than two thousand papers had been published using them. Among the databases provided by the NHIRD, the LHID is the most representative. The LHID covers fifteen years (1996-2011) of medical records, including patients' basic information and visit records such as outpatient visits and hospitalizations. Disease diagnoses use the International Classification of Diseases, Ninth Revision (ICD9), comprising about 928 three-digit diagnostic codes and 13,813 five-digit diagnostic codes, or 1,234 three-digit and 16,327 five-digit diagnostic codes if the E and V codes are included. The other databases have a data format similar to that of the LHID: the nationwide catastrophic illness file records major diseases in more detail; the nationwide one-half children file contains only the medical records, over the 12 years from 1996 to 2008, of half of all children in Taiwan under 18 or 20 years of age; and the all-inpatient file contains only the records of all patients hospitalized under national health insurance over 15 years. Well-known researchers abroad have published an article in a prominent medical journal describing the advantages and importance of the NHIRD, namely that it is an important tool for rapidly analyzing medical big data while retaining high reliability (JAMA Intern Med. 2015 Sep;175(9):1527-9). The NHIRD is the basic original source of clinical medical data for this patent.
(1) Clinical medical data collection module
The clinical medical data collection module A1 uses the central control device 10, the storage device 11, and the database 13 of the server 1 to import the NHIRD and other domestic and foreign medical databases and convert them into a format that supports accelerated statistics, automation, and rationalized analysis. Various clinical data are imported to build the clinical medical database; the sources include the LHID, the cancer registry file, the nationwide all-inpatient file, the nationwide catastrophic illness file, and the nationwide one-half children file, which are converted and stored. The files obtained are the LHID, the cancer registry file, the nationwide all-inpatient file, the nationwide catastrophic illness file, the nationwide one-half children file, and the chronic kidney disease registry cohort file.
Referring to FIG. 4, the system starts in step A000. In step A001, the raw, unprocessed data is obtained from the various storage devices. In step A002, the clinical medical data collection module A1 processes this raw data with SAS® (Statistical Analysis System, statistical analysis software): it decodes the data according to the latest codebook provided by the National Health Insurance Program in Taiwan, splits the raw data into *.SAS7BDAT files, converts them into the *.CSV format, and confirms that no data was lost during splitting and conversion. In steps A003 and A004, the data is classified according to the record scope of each source, that is, into the inpatient file (DD, admission) and the outpatient file (CD, OPD). In this embodiment, step A005 uses Structured Query Language (SQL) to import the split data into PostgreSQL (PostgreSQL Global Development Group, version 9.5.2), SQL Server 2016 (Microsoft), and MariaDB version 10.1.13 (MariaDB Corporation Ab, MariaDB Foundation) databases, where it is stored and formatted.
In step A006, the data stored in the database 13 is split according to the different diagnostic codes of the outpatient file (CD) and the inpatient file (DD). Apart from the nationwide all-inpatient file, the other databases (the LHID, the cancer registry file, the nationwide catastrophic illness file, the nationwide one-half children file, and the chronic kidney disease registry cohort file) contain both an outpatient file (CD) and an inpatient file (DD); that is, only the nationwide all-inpatient file has no outpatient file (CD).
In step A007, date-type conversion is performed on the data stored in the database 13. Step A007 first determines whether a data field holds date-type data. If the field is determined not to be date-type data, it is stored as text data and step A012 is executed; if it is determined to be date-type data, it is stored as date data and step A010 is executed. In step A011, the diagnostic codes (ICD) of the same field are merged with the patient identification code (ID) according to their original dates. In the diagnostic-code splitting and conversion of step A006, the new identification codes (new ID) of the patients in the outpatient file (CD) and inpatient file (DD) processed in step A011 are processed further, for example by the format conversion of step A008 and the reversal step of step A009. In this embodiment, the diagnostic code entry date (the date on which the diagnostic code was claimed at the outpatient visit or hospitalization) is merged with the separate diagnostic code (ICD) fields and the format is reversed. That is, the separate fields of the outpatient file (CD) and the inpatient file (DD) (the CD file has five ICD code fields, while the DD file has three) are brought into a single field by the format reversal.
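The format reversal described above, in which several separate ICD fields are brought into one field alongside the visit date, can be sketched as a wide-to-long reshaping. This is only a minimal illustration under stated assumptions: the field names `new_id`, `func_date`, and `icd_1` to `icd_5`, and the helper `unpivot_diagnoses`, are hypothetical and are not taken from the patent or from the NHIRD codebook.

```python
from datetime import date

def unpivot_diagnoses(record, icd_fields, date_field):
    """Turn one visit record with several ICD columns into one row per code.

    Field names are hypothetical; the patent only states that the CD file
    has five ICD fields and the DD file has three.
    """
    rows = []
    for field in icd_fields:
        code = record.get(field)
        if code:  # skip empty diagnosis slots
            rows.append({
                "new_id": record["new_id"],
                "date": record[date_field],
                "icd": code,
            })
    return rows

# Hypothetical outpatient (CD) visit with two of five ICD slots filled.
cd_visit = {
    "new_id": "P001_19500102",
    "func_date": date(2001, 3, 5),
    "icd_1": "250", "icd_2": "401",
    "icd_3": None, "icd_4": None, "icd_5": None,
}
CD_FIELDS = [f"icd_{k}" for k in range(1, 6)]
long_rows = unpivot_diagnoses(cd_visit, CD_FIELDS, "func_date")
```

An inpatient (DD) record would be handled the same way with a three-field list and its own date field.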
In this embodiment, date-type data is used to: (1) determine the patient's identifier (ID_birthday, the patient's date of birth); (2) determine the date on which the patient's diagnostic code was assigned at an outpatient visit (func_date); and (3) determine the admission date (in_date) and the claim date of the diagnostic code (Appl_date). The date-of-birth field, stored in date format, is merged with the patient's identification code to form a new identification code, which prevents duplicate patient identification codes from affecting the calculation results.
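As a small illustration of merging the date of birth into the identification code, the sketch below concatenates the two values. The separator and date layout are assumptions made for the example, and `make_new_id` is a hypothetical helper, not a function named in the patent.

```python
from datetime import date

def make_new_id(patient_id: str, birthday: date) -> str:
    """Combine the raw ID with the birth date (assumed layout: ID_YYYYMMDD)."""
    return f"{patient_id}_{birthday:%Y%m%d}"

# Two records that share a raw ID but differ in birth date stay distinct.
id_a = make_new_id("A123", date(1950, 1, 2))
id_b = make_new_id("A123", date(1962, 7, 30))
```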
Once the date-type data has been merged, it is immediately passed back to step A006. After the above steps through step A009, steps A013, A014, and A015 are executed: the records are re-sorted by date, and the maximum value (death value) and the minimum value of each patient's diagnosis dates are determined. The minimum is used to determine the date on which the patient was first diagnosed with a given disease and to assign the patient to a group for cohort follow-up. In steps A016 and A017, the maximum is used to determine the patient's last visit date, which is compared with the insurance enrollment date to judge whether the patient died before that visit date, died after it, or has not died.
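The minimum and maximum date determination can be sketched as a per-patient grouping. This is an illustrative simplification: the row layout `(new_id, visit_date)` and the helper `first_and_last_dates` are assumptions, and the comparison with the insurance enrollment date used to judge death is not reproduced here.

```python
from collections import defaultdict
from datetime import date

def first_and_last_dates(rows):
    """Return {new_id: (earliest_date, latest_date)} from (new_id, date) rows."""
    by_patient = defaultdict(list)
    for new_id, visit_date in rows:
        by_patient[new_id].append(visit_date)
    return {pid: (min(dates), max(dates)) for pid, dates in by_patient.items()}

visits = [
    ("p1", date(2003, 5, 1)),
    ("p1", date(2001, 2, 3)),
    ("p2", date(2010, 7, 9)),
]
span = first_and_last_dates(visits)
```

Here the earliest date stands in for the first-diagnosis date and the latest date for the last visit date.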
After the patient's first diagnosis has been determined, previously accepted manual input is applied (step A018). Manual input here means data that can be received from the website system or an embedded application. In this step it refers specifically to three time periods: (1) Exclusion period: the period used to rule out non-first diagnoses. If a patient's diagnosis in this period also appears in the inclusion period, the diagnosis is judged to be a non-first diagnosis, that is, an old diagnosis of a disease the patient already had in the past. (2) Inclusion period: if a diagnosis made in this period did not appear in the exclusion period, the patient is enrolled; if the patient meets specific conditions, the patient enters the follow-up period. (3) Follow-up period: the period during which patients from the inclusion period are followed for a set time to observe whether a true event occurs, or to determine whether the patient dies within the period so that important data such as survival can be calculated. This is also the greatest difference between cross-sectional and longitudinal studies.
After the date input is accepted, the patients are again divided into outpatient (CD) and inpatient (DD) records (step A019). This makes it easier to count each patient's diagnoses during the inclusion period, to determine whether the patient's diagnosis is "confirmed", and to avoid the bias caused by including erroneous diagnoses. The visit dates of outpatient (CD) and inpatient (DD) patients are compared with the manually input dates or time periods. Taking the exclusion date EX as an example (steps A020 and A026): if the visit date is earlier than the upper limit of the set exclusion date EX (steps A026 and A029), the patient is assigned to the exclusion group; if it is later than the set exclusion date EX (steps A021 and A027) and also earlier than the set date of the follow-up period FU, the patient is assigned to the inclusion group (steps A024 and A030); otherwise, in steps A022 and A028, the patient is assigned to the follow-up group (steps A025 and A031).
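The date comparison above amounts to a three-way classification of each visit. The sketch below is one possible reading: the parameter names `exclusion_end` and `follow_up_start` are hypothetical, and the handling of a visit falling exactly on a boundary date is an assumption, since the text does not specify it.

```python
from datetime import date

def classify_visit(visit_date: date, exclusion_end: date, follow_up_start: date) -> str:
    """Assign a visit to the exclusion, inclusion, or follow-up period.

    Assumption: a visit on exclusion_end counts as inclusion, and a visit
    on follow_up_start counts as follow-up.
    """
    if visit_date < exclusion_end:
        return "exclusion"
    if visit_date < follow_up_start:
        return "inclusion"
    return "follow_up"

EX = date(2000, 1, 1)  # hypothetical end of the exclusion period
FU = date(2005, 1, 1)  # hypothetical start of the follow-up period
labels = [classify_visit(d, EX, FU)
          for d in (date(1999, 6, 1), date(2002, 3, 4), date(2008, 9, 9))]
```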
When classification is complete, the clinical medical data collection module A1 of this embodiment enters the first loop of a nested while loop: the inclusion group enters the loop at step A033 and the exclusion group enters the loop at step A032. Here the nested loop means a method for running a longitudinal study over every permutation in the disease matrix. For this calculation, a patient's diagnosis codes are divided into the primary disease i (index disease / primary disease / 1st diagnosis, marked i in the flowchart) and the secondary disease j, a disease occurring after the primary disease (secondary disease / 2nd diagnosis, marked j in the flowchart), and the permutations of diseases are computed from these.
(2) Automated acceleration module
Everything after entry into the nested loop belongs to the automated acceleration module A2 of this embodiment. In the first (outer) loop, the primary disease i starts at the smallest three-digit or five-digit diagnosis code (ICD) and ends at the largest three-digit or five-digit code. When the loop condition for the inclusion and exclusion groups at steps A032 and A033 evaluates to True, that is, when the primary disease i is less than or equal to the set upper limit, the following instructions are executed (step A034). In step A035, the primary disease i of the current iteration is taken as the disease code of the initial diagnosis, and patients with a repeated (old) diagnosis in both the inclusion group and the exclusion group are handled. The principle is to use a left join function to find whether the two groups intersect (step A036), i.e. whether any patient identifier appears in both. Patients removed this way are placed in the Excluded group (step A037) and take no part in subsequent statistics. When the loop advances to the next ICD code (i++) (step A045), the flow from step A034 is executed again. Next, for the inclusion-group patients remaining after old diagnoses are removed, the number of diagnoses of the given code is counted (step A038). If the patient was diagnosed three or more times in the outpatient file (CD), the check is True and the patient is placed in the experimental group (step A040); a patient who does not satisfy step A038 is False and is placed in the control group (step A039). Patients assigned in steps A039 and A040 are then merged with the follow-up-period data (step A041) and enter the follow-up statistics according to the baseline setting (steps A042 and A043). Here the baseline means the date, decided in step A018, that separates the inclusion period from the follow-up period.
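A minimal sketch of one pass of the outer loop (steps A035 to A040) is given below. It replaces the left join with a plain set intersection, which gives the same result for finding shared patient identifiers; the function name, the `(patient_id, icd_code)` record layout, and the `min_visits` parameter are all hypothetical.

```python
from collections import Counter


def split_groups(inclusion_visits, exclusion_visits, code, min_visits=3):
    """Split inclusion-period patients for one primary disease code.

    inclusion_visits / exclusion_visits -- iterables of (patient_id, icd_code)
    taken from the outpatient file (hypothetical layout).
    Returns (experimental, control, excluded) as sets of patient ids.
    """
    # Patients who already carried this code in the exclusion period (old diagnosis).
    old = {pid for pid, c in exclusion_visits if c == code}

    counts = Counter()
    seen = set()
    for pid, c in inclusion_visits:
        seen.add(pid)
        if pid in old:
            continue  # excluded patients take no part in later statistics
        counts[pid] += int(c == code)  # also registers patients with zero matches

    experimental = {pid for pid, n in counts.items() if n >= min_visits}
    control = set(counts) - experimental
    excluded = old & seen
    return experimental, control, excluded
```

The threshold of three outpatient diagnoses follows the text; everything else about the data layout is an assumption made for illustration.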
The flow then enters the second (inner) loop (step A044), which is built around the secondary disease j. Its conditions are set much like those of the first loop for the primary disease i; see steps A032 and A033 above. The variables of the second loop resemble those of the first; the difference is that the second loop handles death and disease-occurrence events.
If the check is True (step A046), the number of occurrences of the secondary disease j in this iteration is counted. The secondary disease j starts at the smallest three-digit or five-digit diagnosis code (ICD) and ends at the largest. When the loop condition for the inclusion and exclusion groups is True, that is, when j is less than or equal to the set upper limit, the following instructions are executed (step A047). Step A047 checks whether the secondary disease j appears during the follow-up period. If it appears one or more times, the check is True (step A049), the event is accumulated under the patient's identifier (step A050), and the date of the secondary disease j is compared with the date of the primary disease i to compute person-years (step A051). If disease j does not occur in the patient within the specified dates, the patient is recorded as having no event and the next patient is processed (step A048).
Once the person-year data are obtained, the system determines whether the patient died after the secondary disease j occurred (step A052). A mechanism that automatically decides whether a patient can be classified as "non-surviving" is built from the three situations in which Taiwan's National Health Insurance Research Database (NHIRD) records loss of eligibility or withdrawal from coverage, plus other special conditions. The method uses five conditions and checks the three NHIRD situations first. The first is death (step A053): the patient is declared dead by a statutory body and, under the rules of the Health Insurance Bureau, is withdrawn from coverage three days later. If yes, the patient is classified as dead (step A058); if no, the next check is applied, and so on. The second is a person missing for a full six months (step A054); if yes, the patient is classified as lost to follow-up (step A059). The third is loss of eligibility for coverage (step A055), such as loss of Republic of China nationality, moving household registration abroad, or expiry of a foreign national's residence period; if yes, the patient is likewise classified as lost to follow-up (step A060). Beyond these three rules, two further situations occur in the NHIRD: military service, and imprisonment or detention for more than two months. Step A056 checks whether a record from before ROC year 90 falls under military service; if yes, the patient is classified as lost to follow-up (step A061). Step A057 checks whether the patient was in prison or a detention center for more than two months; if yes, the patient is classified as dead (step A062); if no, the patient is not "non-surviving" (step A063). Surviving patients do not enter the mortality and survival-curve calculations.
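The five checks of steps A053 to A057 form a simple cascade, sketched below. The record keys are hypothetical field names invented for illustration (the NHIRD does not necessarily expose such flags), and the outcome of each branch follows the step descriptions in the text as written.

```python
def classify_status(record: dict) -> str:
    """Walk the five checks in order and return the first matching outcome.

    All dictionary keys are hypothetical; they stand for the conditions
    tested in steps A053 to A057.
    """
    if record.get("died"):                           # step A053 -> A058
        return "death"
    if record.get("missing_over_6_months"):          # step A054 -> A059
        return "lost_to_follow_up"
    if record.get("lost_eligibility"):               # step A055 -> A060
        return "lost_to_follow_up"
    if record.get("military_service_before_roc90"):  # step A056 -> A061
        return "lost_to_follow_up"
    if record.get("incarcerated_over_2_months"):     # step A057 -> A062, as stated in the text
        return "death"
    return "alive"                                   # step A063: not "non-surviving"
```

Because the checks run in a fixed order, a record that satisfies several conditions is classified by the earliest one.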
Please refer to FIG. 5, a schematic diagram of the automated acceleration module according to an embodiment of the present invention.
One part of the longitudinal method in general clinical research is to examine the risk that one disease state leads to another disease. In this embodiment, the primary disease i and the secondary disease j therefore represent disease states, where the secondary disease j is a subsequent disease arising from the disease state of the primary disease i. The nested loop computes every combination of primary disease i and secondary disease j, giving 16459 (i) x 16459 (j) permutations; this is the disease matrix referred to above.
The primary disease i and the secondary disease j are also ordered in time, for example primary disease i → secondary disease j. If the primary disease i is hypertension and the secondary disease j is diabetes, then i (hypertension) → j (diabetes) and i (diabetes) → j (hypertension) do not mean the same thing. An experiment set up as i (hypertension) → j (diabetes) studies the risk that hypertensive patients develop diabetes, while one set up as i (diabetes) → j (hypertension) studies the risk that diabetic patients develop hypertension. In symbols, injm is not equal to imjn: the two probabilities are not necessarily equal, neither can replace the other, and both risks may exist at the same time. The relationship is also directional, and a conventional calculation of the relationship between i and j cannot reveal the true direction of association (i→j or j→i). For example, a conventional calculation of whether diabetes (i) raises the risk of hypertension (j) may give a hazard ratio greater than 1 with a statistically significant difference, which usually leads to the conclusion that diabetes increases the risk of hypertension. If the calculation of whether hypertension raises the risk of diabetes then gives a hazard ratio of about 1, it remains unclear, from the two results together, how to tell whether diabetes truly increases the risk of hypertension.
In this embodiment, automation is used to follow cohorts for all 16459 (i, the primary disease; 16459 is the total number of ICD-9 diseases) x 16459 (j, the secondary disease) permutations and to compute the directionality of i→j, with the link between disease i and disease j written L(I→J). Below, disease I and disease J, two different diseases, are used as an example; their association can be defined further as follows.
When examining the direction between disease I → disease J and disease J → disease I, the directionality of disease I → disease J can be obtained from Formula 1 below.
Here i is the first disease, j is the second disease, λ is the directional judgment between diseases, l_{i→j} is the value of the relation function for the first disease causing the second, and l_{j→i} is the value of the relation function for the second disease causing the first. If λ > 0, the direction is disease I → disease J; if λ < 0, it is disease J → disease I. These methods also clarify the role each disease plays in risk. In addition, Formula 2 below can be used to judge which disease is the primary disease: Λ_i = Σ_j λ_{i→j} (Formula 2)
Here Λ_i is the sum of the λ values. The larger Λ_i is, the more disease I acts as a primary disease; the smaller it is, the more it acts as a secondary disease. This yields the true direction of association (disease I → disease J or disease J → disease I). As in a social network, λ is calculated from the l values in both directions to give the direction of the link (edge) between two nodes (vertices) and the strength of that direction, while Λ gives the centrality of a node.
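The relationship between the l values, λ, and Λ can be illustrated with a short sketch. Formula 1 is not reproduced in this text, so the code assumes, purely for illustration, that λ for i→j is the difference l(i→j) − l(j→i); this matches the stated sign rule (λ > 0 means I → J) but is an assumption, not the patent's formula. The summation for Λ follows Formula 2. The function name and the matrix layout are hypothetical.

```python
def direction_scores(l):
    """Compute pairwise direction scores and per-disease totals.

    l[i][j] -- relation value for disease i leading to disease j (square matrix).
    ASSUMPTION: lambda(i->j) = l[i][j] - l[j][i]; Formula 1 itself is not
    available in the text, so this difference is only a stand-in.
    Returns (lam, big_lambda) where big_lambda[i] = sum over j of lam[i][j].
    """
    n = len(l)
    lam = [
        [(l[i][j] - l[j][i]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    big_lambda = [sum(row) for row in lam]  # Formula 2
    return lam, big_lambda
```

Under this assumption the λ matrix is antisymmetric, so the Λ values sum to zero and the disease with the largest Λ is the one that most often sits at the origin of a directed link.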
This embodiment provides a new method, the Multiple-Nested Longitudinal Statistic (MNLS), which simplifies the steps of a matrix-form longitudinal study. Its core idea is to use nesting to simplify the longitudinal method and speed up computation while keeping the original statistics correct. The method applies to automated statistics on matrices of 2x2 (m*n) or larger, i.e. multidimensional matrices. The running time depends on the maximum of the m elements: for a 100 (i) x 1000 (j) matrix the method can cut computation time by roughly a factor of one thousand, and for a 16459 (i) x 16459 (j) matrix by a factor of more than sixteen thousand. The method is meant for large combinations. A 1x1 disease combination or a 1x16459 combination, as described above, is not suited to MNLS; a combination of at least 2x2 is needed for any speed-up, so the MNLS mode starts when the manual input (step A018) specifies a 2x2 or larger combination. Once MNLS starts, the first-level nested loop determines whether disease i appears in the inclusion period (step B001). If the check at step B002 is True, the patient is positive (symbol A: index disease positive (1st +)) and is enrolled in the experimental group (Exposure(+)) (step B004); if step B002 is False, the patient is negative (α: index disease negative (1st -)) and is enrolled in the control group (Exposure(-)) (step B003). The groups from steps B003 and B004 enter the second-level nested loops of steps B005 and B006 respectively. This level contains a check of whether disease i equals disease j (steps B007-B008). If disease i equals disease j, the flow re-enters the second-level loop (step B018) and moves on to the next disease j; when m = n, this check removes a large number of meaningless computations. After steps B007 and B008, the flow enters the checks of steps B009 and B010.
Step B009 determines whether a person without disease i develops disease j during the follow-up period, i.e. whether the patient is αnβm (symbol β: secondary disease positive (2nd +)); if not, the patient is αnBm (symbol B: secondary disease negative (2nd -)). Step B010 determines whether a person with disease i develops disease j during the follow-up period, i.e. whether the patient is Anβm; if not, the patient is AnBm. The second-level loop therefore sorts patients into four classes: Anβm (step B012), AnBm (step B014), αnβm (step B011) and αnBm (step B013). The purpose of the four classes is that data in the AnBm and αnBm classes enter the statistics (step B017), while Anβm and αnβm patients are left out of the calculation (steps B015 and B016) and return to the second-level loop through steps B018 and B019. The data on the patients classified at the second level are stored in the storage device 11, so patients need not be reclassified when the loop moves to the next disease j: the AnBm (step B014) and αnBm (step B013) classification from the previous disease j is reused until the second-level loop ends and the first-level loop moves to the next disease i, at which point patients are reclassified (step B001), and so on. The running time of the matrix therefore depends on the maximum of the m elements, that is, on the number of second-level diseases j. The disease-matrix results computed in step B017 are shown in step B020. A large Trajectory database, distinct from the raw data, is built here, and later computations all use the same Trajectory database unless the raw database is replaced. In other words, in this embodiment the plurality of records in the first clinical medical database are converted into a plurality of records in a predetermined format and stored in a second clinical medical database. The format differences can be adjusted to actual needs and are not limited by the present invention.
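The core saving described above, classifying patients once per index disease i and reusing that split for every secondary disease j, can be sketched as follows. This is an illustration only: the function name, the dictionary layout, and the choice to return all four cells of each 2x2 table are assumptions, and the sketch does not reproduce which cells the patent forwards to the statistics in step B017.

```python
def disease_matrix(index_dx, followup_dx, codes_i, codes_j):
    """Build a 2x2 count table for every (i, j) pair with i != j.

    index_dx    -- dict: patient_id -> set of codes seen in the inclusion period
    followup_dx -- dict: patient_id -> set of codes seen in the follow-up period
    (both layouts are hypothetical)
    Returns dict: (i, j) -> (exposed with j, exposed without j,
                             unexposed with j, unexposed without j).
    """
    results = {}
    for i in codes_i:
        # Classify once per index disease i (steps B001 to B004) ...
        exposed = {p for p, dx in index_dx.items() if i in dx}
        unexposed = set(index_dx) - exposed
        for j in codes_j:
            if i == j:  # steps B007-B008: skip the meaningless i == j cell
                continue
            # ... and reuse that split for every secondary disease j.
            a = sum(1 for p in exposed if j in followup_dx.get(p, ()))
            c = sum(1 for p in unexposed if j in followup_dx.get(p, ()))
            results[(i, j)] = (a, len(exposed) - a, c, len(unexposed) - c)
    return results
```

With this structure the exposure split is computed once per i instead of once per (i, j) pair, which is the kind of reuse the paragraph attributes to MNLS.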
In this embodiment, the steps of the core computing main module described above produce a plurality of clinical medical data records and a corresponding MxN disease matrix.
II. Automated regression diagnosis sub-module of the automated statistics module
Please refer to FIG. 7, a schematic diagram of the automated regression diagnosis sub-module of the automated statistics module according to an embodiment of the present invention.
The preceding sections explain that the hazard ratio between disease i and disease j is calculated by regression analysis. Besides testing whether groups differ significantly, regression analysis can also be used for survival analysis, comparing survival rates between groups. The most common model for survival analysis is the Cox proportional hazards model. When the Cox assumption is violated, a correction is needed: confounding factors can be treated as stratification variables, and the usual method is stratified Cox regression. The method in this patent is flexible and lets the user choose between the Cox proportional hazards model and stratified Cox regression. This embodiment uses the more conservative stratified Cox regression; other embodiments may use the former or another method. Whichever method is used, the result of the regression analysis still has to be verified, so this patent designs a system for automated regression verification together with measures to take when verification fails. The procedure for verifying a regression analysis is called regression diagnostics, and the system designed in this patent is called Automatic Regression Diagnostics.
The automated system performs regression diagnostics in four parts: collinearity, independence of errors, tests for normality, and optional tests for normality. The first part is the collinearity diagnosis. The regression results of step A and step B are brought in and the coefficient of determination (R2) is calculated (step D001) to check whether R2 is less than 0.8; too high a correlation coefficient indicates a collinearity problem. If step D001 is True, the flow proceeds to step D002, and so on through step D004. Step D002 calculates the tolerance, with the range set to 0.1 to 1; the larger the value, the less likely collinearity is. If the value falls within the set range, the flow proceeds to step D003. Step D003 calculates the variance inflation factor (VIF), which is the reciprocal of the tolerance, so the smaller the value, the less likely collinearity is; if it is less than 10, the check is True and the flow proceeds to step D004. Step D004 calculates the condition index (CI); if CI is less than 30, the check is True and the collinearity diagnosis ends. If any of steps D001, D002, D003 or D004 is False, the regression result is judged to be collinear (step D006) and the flow moves to the remedies for collinearity.
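The four threshold checks of steps D001 to D004 can be written as one cascade. The sketch below takes the diagnostics as precomputed inputs rather than deriving them from a fitted model; the function name and parameter names are hypothetical, while the thresholds (R2 below 0.8, tolerance between 0.1 and 1, VIF below 10, CI below 30) come from the text.

```python
def collinearity_free(r_squared: float, tolerance: float, condition_index: float) -> bool:
    """Return True only if all four checks pass, in the order D001 to D004."""
    if not r_squared < 0.8:             # step D001
        return False
    if not 0.1 <= tolerance <= 1.0:     # step D002
        return False
    vif = 1.0 / tolerance               # step D003: VIF is the reciprocal of tolerance
    if not vif < 10:
        return False
    return condition_index < 30         # step D004
```

A return value of False corresponds to step D006, where the result is treated as collinear and one of the remedies is applied.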
There are four remedies for collinearity. The first is to choose another regression method (step D007). Taking this patent's default, stratified Cox regression, as an example: if the default method was used and the regression diagnosis finds collinearity, a non-default regression method is offered as the statistical method for the next run.
The second is to use Partial Least Squares regression (PLS) or Principal Components Analysis (PCA) (step D008). PLS builds new latent variables for cases where the predictor matrix has more variables than observations, while PCA looks only for the influential variation within the explanatory variables.
The third is stepwise regression and subset regression (step D009), and the fourth is to increase the number of regressions (step D010).
The second part assesses the independence of the regression using the Durbin-Watson test (DW test) (step D005). It tests whether the autocorrelation is positive or negative at significance level α by comparing the test statistic d with the critical values (dL,α and dU,α). When testing for positive autocorrelation, d < dU,α indicates that the error terms are positively autocorrelated and d > dU,α indicates that they are not; when testing for negative autocorrelation, (4-d) < dU,α indicates that they are negatively autocorrelated and (4-d) > dU,α indicates that they are not. If the Durbin-Watson statistic shows weak independence, the flow enters correction (steps D011 to D017). If it shows strong independence, the independence diagnosis is complete and the flow moves to the test of normality (step D018).
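For reference, the Durbin-Watson statistic d is computed from the regression residuals as the sum of squared successive differences divided by the sum of squared residuals. A minimal sketch follows; the function name is chosen for illustration, and the critical values dL,α and dU,α are not computed here because they come from published tables that depend on sample size and number of predictors.

```python
def durbin_watson(residuals):
    """Durbin-Watson d = sum((e_t - e_{t-1})^2) / sum(e_t^2).

    Assumes at least two residuals and that they are not all zero.
    """
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den
```

The statistic lies between 0 and 4: values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.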
The fifth item is the verification of the normal distribution, which tests whether the regression departs from normality. It has mandatory items and optional items. Mandatory items need no user consent and run as soon as automated regression diagnostics is executed; optional items run only with the user's consent or selection. The mandatory items are common statistical methods, namely steps D018-D020. Step D018 calculates the ratio of the standard deviation (SD) to the interquartile range (IQR): if SD = IQR/1.35, the distribution is normal and the flow proceeds to steps D019 and D020. If the check is False, the distribution may have heavier or lighter tails than normal, and the flow proceeds to step D029. Steps D019 and D020 calculate the skewness and kurtosis; for a normal distribution, both are equal to zero.
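The three mandatory quantities of steps D018 to D020 can be computed with the Python standard library, as sketched below. The function name and the returned keys are hypothetical; the sketch uses population moments and reports excess kurtosis (kurtosis minus 3), which is the convention under which a normal distribution gives zero. How close to 1 or 0 a value must be to pass is not specified in the text, so no pass/fail threshold is applied.

```python
import statistics


def normality_summary(x):
    """Return the SD-to-IQR/1.35 ratio, skewness, and excess kurtosis of x.

    Assumes x has at least two values and is not constant.
    """
    n = len(x)
    mean = statistics.fmean(x)
    sd = statistics.pstdev(x)                  # population standard deviation
    q1, _, q3 = statistics.quantiles(x, n=4)   # quartiles
    m3 = sum((v - mean) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    return {
        "sd_to_scaled_iqr": sd / ((q3 - q1) / 1.35),  # about 1 for normal data
        "skewness": m3 / sd ** 3,                     # 0 for normal data
        "excess_kurtosis": m4 / sd ** 4 - 3.0,        # 0 for normal data
    }
```

In practice these would be computed on the regression residuals, with tolerances around the ideal values chosen by the analyst.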
When steps D001 to D005 and D018 to D020 have all finished and evaluated to True, the regression is a high-confidence result and the flow can jump to step D021, where the user reads the regression data. If the user selects the optional items at step D022, the Anderson-Darling test (A-D test) (step D023), the Shapiro-Wilk test (S-W test) (step D025), the Kolmogorov-Smirnov test (K-S test) (step D026) and the Jarque-Bera test (J-B test) (step D027) are run. After the A-D test, the flow goes to step D024 to count the sample: if the sample size is greater than 50 the K-S test is used (step D026), and if it is less than 50 the S-W test is used (step D025). After step D025 or D026, the Jarque-Bera test of step D027 is run. When steps D023 to D027 have finished and evaluated to True, the regression is a high-confidence result and the flow jumps to step D028, where the user reads the regression data. If any of these checks is False, the flow goes to step D029 and the regression diagnosis is automatically run again.
III. Verification module
Please refer to FIG. 2 and FIG. 6; FIG. 6 is a schematic diagram of the verification module according to an embodiment of the present invention.
In this embodiment, the verification module D includes an internal verification module and an external verification module, described in turn below.
(1) Internal verification sub-module
Verification is an indispensable and important part of this system, since it shows that the data the system derives are accurate and trustworthy. To this end, the verification module D uses the communication device 14 of the server 1 and data mining techniques to collect published papers that used the NHIRD as their study sample, puts them into a specific format, and compiles them into a database for later comparison.
由於依靠資料探勘所形成之資料庫需要專人驗證,並特別編列出一個獨立的小組-「驗證組」對本資料庫之資料進行處理與驗證,此段方法之簡介即會闡述整個資料探勘與驗證組的組織架構與驗證工作的流程經過。 Since the database formed by relying on data exploration needs to be verified by a special person, and an independent group--"verification group" is specially prepared to process and verify the data of this database, the introduction of this method will explain the whole data exploration and verification. The organization of the group and the process of verification work.
驗證工作可以分為五個步驟。一、以資料探勘方式收集論文;二、篩選出合適的論文並渲染成資料庫;三、經專家檢查資料庫是否正確;四、統整搜集的資料,並進行資料分析。 The verification work can be divided into five steps. First, collect papers by means of data exploration; Second, select appropriate papers and render them into a database; 3. Check whether the database is correct by experts; 4. Collect the collected data and analyze the data.
In the first step, the core computation of this system is, as its first project, to present the hazard ratio of one disease given another disease state. To verify this project, papers are therefore screened first on the condition that they study the hazard ratio that a single disease poses for another disease (Hazard ratio of IJ), and the search is limited to longitudinal studies. This embodiment is based on the Taiwan National Health Insurance Research Database (NHIRD). Although the method is designed as a general-purpose system, the data are the medical records of Taiwan residents, so the search targets papers published using the NHIRD. That is, in step C000-1 the NHIRD is selected as the main database, in particular papers published using the one-million-person sample file (LHID). The academic search engines used are PubMed, Google Scholar, Web of Science, Medline, and so on, with journals found through PubMed handled first.
In step C000-2 of this embodiment, the relevant papers are downloaded through the academic journals subscribed to by the academic institution, named by their PMID (the paper ID used in PubMed), and stored in Portable Document Format (PDF). Because the papers found through PubMed represent only a large part of the publications based on the NHIRD, the remaining papers are found with the other academic search engines, and papers that appear in more than one result set are removed. Each downloaded PDF paper is rendered into web page format (html) with the Poppler tool (version 0.42.0, 2016, freedesktop.org) (step C001). Papers stored as html are easier to text-mine; the html papers are processed with pandas (version 0.18.0) running under Python (step C002). Finally, the html papers are rendered into a format that is convenient for Python to process (step C003). The structured papers are then text-mined, and the mining results are stored as a database. In this embodiment, the downloaded and format-converted papers are stored in the database 13.
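The PMID-based naming and the PDF-to-html conversion of step C001 can be sketched as follows. This is only an illustration under stated assumptions: the directory names `papers` and `html` and the function name are hypothetical, and the sketch assumes Poppler's `pdftohtml` command-line tool with its `-noframes` and `-i` options. It only builds the command, so it can be inspected without Poppler installed.

```python
from pathlib import Path
from typing import List


def pdf_to_html_command(pmid: str,
                        pdf_dir: str = "papers",
                        html_dir: str = "html") -> List[str]:
    """Build a pdftohtml command for the paper stored as <pmid>.pdf.

    Assumptions: the directory layout is hypothetical, and pdftohtml
    appends ".html" to the output base name when -noframes is used.
    """
    pdf_path = Path(pdf_dir) / f"{pmid}.pdf"
    html_base = Path(html_dir) / pmid  # no extension on the output name
    return [
        "pdftohtml",
        "-noframes",  # write one html file instead of a frame set
        "-i",         # ignore images; only the text is mined
        pdf_path.as_posix(),
        html_base.as_posix(),
    ]


if __name__ == "__main__":
    # "12345678" is a made-up PMID used only for illustration.
    print(" ".join(pdf_to_html_command("12345678")))
```

The resulting list could be passed to `subprocess.run(command, check=True)`, after which the html file is available for the text-mining stage of step C002.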
The data mined for internal verification must contribute positively to the verification. In this embodiment, therefore, only a small set of data that serves a practical purpose in verification is mined, in five categories: the basic information of the paper, the inclusion and exclusion criteria, the grouping and stratification of patients, the study design, and the statistics. The second step is described below category by category. The first category is the basic information of the paper: step C004 collects the PubMed ID, the paper title, the Digital Object Identifier (DOI), the name of the journal in which the paper was published, and the journal volume and issue (Volume, Issues) and pages (Page).
The second category is the inclusion criteria (step C005) and exclusion criteria (step C006), that is, the restrictions imposed by the specific requirements of a study. For example, in a study of whether diabetic patients are more likely than the general population to develop a subdural hematoma, the inclusion criteria (step C005) are patients found, or not found, to have diabetes during the enrollment period, and the exclusion criteria (step C006) require that patients who had a subdural hematoma during the enrollment period be excluded. A paper may state more than one criterion, or none at all, so the database must handle these cases in subsequent processing.
The third category is the grouping and stratification of patients (step C007). Papers usually divide patients into several groups by age, sex, income, socioeconomic status, and so on, which makes it easy to compare whether the incidence of disease differs significantly between groups and to report descriptive statistics.
The fourth category is the study design (step C008), namely cohort study, case-control study, randomized controlled trial (RCT), meta-analysis, systematic review, and so on. Because the data to be verified in this embodiment are mainly longitudinal, data of that kind are collected and processed first; meta-analyses and systematic reviews may also include NHIRD data.
The fifth category is the statistics (step C009): the crude (Crude) and adjusted (Adjusted) hazard ratio (Hazard Ratio), relative risk (Relative Risk), and odds ratio (Odds Ratio), together with the corresponding P value and 95% confidence interval. Which statistics are collected depends mainly on the study design and the keywords of the data, and secondarily on the grouping and stratification. If the paper is a case-control study, the odds ratio is collected; if it is a cohort study, the relative risk and hazard ratio are collected. After the data obtained by data (or text) mining have been stored as a database, experts must check the data fields and decide whether the data format needs further processing.
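The five categories of mined data and the rule that ties the collected statistic to the study design can be sketched as a small record type. The class name, field names, and dictionary keys below are hypothetical and chosen only for illustration; the embodiment does not specify its database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Which statistics to collect for each study design, following the rule
# stated above (case-control -> odds ratio; cohort -> RR and HR).
STATISTICS_BY_DESIGN: Dict[str, List[str]] = {
    "case-control": ["odds ratio"],
    "cohort": ["relative risk", "hazard ratio"],
}


@dataclass
class PaperRecord:
    """One mined paper; all field names are illustrative assumptions."""
    pmid: str                                                     # basic information (step C004)
    title: str
    inclusion_criteria: List[str] = field(default_factory=list)   # step C005
    exclusion_criteria: List[str] = field(default_factory=list)   # step C006
    strata: List[str] = field(default_factory=list)               # step C007
    study_design: str = "cohort"                                  # step C008
    statistics: Dict[str, float] = field(default_factory=dict)    # step C009


def statistics_to_collect(study_design: str) -> List[str]:
    """Return the statistics to mine for a design, or [] if unknown."""
    return STATISTICS_BY_DESIGN.get(study_design.strip().lower(), [])


if __name__ == "__main__":
    record = PaperRecord(pmid="12345678", title="Example",
                         study_design="case-control")
    print(statistics_to_collect(record.study_design))
```

Empty criteria lists cover the case, noted above, in which a paper states no inclusion or exclusion criteria.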
The third step is for experts to check whether the database is correct. Data (or text) mining identified more than 600 papers written by this institution with the health insurance database as their data source, and a number of papers were then screened out from these by applying the above conditions to their titles and contents. In the second step, through steps C005 to C009, data are collected from the papers according to a fixed database format and stored as a database (step C010). If a paper contains other information, the system also records it and stores it in the "other" field of the database.
Some problems commonly arise during collection. For example, the academic institution may not have purchased access and so cannot download the full text of a paper free of charge, or a paper may be judged not to meet the requirements of this system. In the former case the paper can be downloaded through another academic institution; in the latter it does not need to be collected. The number of papers finally collected is therefore somewhat smaller than the number initially found. In addition, although the Taiwan health insurance bureau is currently introducing ICD-10 coding, it has not yet officially gone live, so this embodiment still collects data by ICD9 diagnosis codes. In the third step, the data assembled into the database amounted to more than a thousand records; expert verification found about a hundred errors, and after correction a total of more than one thousand statistics were obtained.
In the fourth and last step, the statistics in the collected database are regressed against the statistics computed by this embodiment (with the hazard ratio data as the basis), and the coefficient of determination (R²) is calculated. The coefficient of determination (R²) is greater than 0.6. From the verification obtained through this rigorous collection process and regression analysis, it can be seen that the system developed in this embodiment is quite reliable and trustworthy. Paper topics that are not simply the hazard ratio of a single disease for a single disease, such as drugs, surgery, and procedures, are not included in the initial verification plan.
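The coefficient of determination used in the final verification step can be illustrated with a short sketch. It assumes a simple linear regression of the system's hazard ratios on the published ones, in which case R² equals the squared Pearson correlation; the function name and the sample values are hypothetical, and the text does not state which regression model the embodiment uses.

```python
from typing import Sequence


def r_squared(published: Sequence[float], computed: Sequence[float]) -> float:
    """R^2 of a simple linear regression of `computed` on `published`.

    For one predictor, R^2 = Sxy^2 / (Sxx * Syy).
    """
    if len(published) != len(computed) or len(published) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(published)
    mean_x = sum(published) / n
    mean_y = sum(computed) / n
    sxx = sum((x - mean_x) ** 2 for x in published)
    syy = sum((y - mean_y) ** 2 for y in computed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(published, computed))
    if sxx == 0 or syy == 0:
        raise ValueError("a series with zero variance has no defined R^2")
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    # Made-up hazard ratios, for illustration only.
    literature = [1.0, 2.0, 3.0, 4.0]
    system = [1.0, 3.0, 2.0, 4.0]
    print(r_squared(literature, system))
```

With the made-up series shown, Sxx = Syy = 5 and Sxy = 4, so R² = 16/25 = 0.64, which would pass the 0.6 threshold mentioned above.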
(2) External verification sub-module (External Validation)
The database used by the technique described above is the Taiwan National Health Insurance Research Database (NHIRD), which is local Taiwan data, so verification needs NHIRD-based data published in Taiwan; this is internal verification. Because internal verification has its shortcomings, an outside database or outside data are also needed for verification; this is external verification. If the NHIRD used by the above technique were replaced with a database from the United States, the literature mined for internal verification would be papers published from that United States database, and the external sources would be other United States databases, or papers published abroad, that do not derive from that local data.
Here, the local data for external verification come from The Disease Map developed by Taipei Medical University (The Disease Map, http://disease-map.net/). That database uses automated statistics based on the case-control method to compute stratified patient data, in a format similar to the one designed for this technique. The foreign data for external verification come from the well-known Framingham Heart Study (https://www.framinghamheartstudy.org/) and the HuDiNe database developed at Harvard University (http://barabasilab.neu.edu/projects/hudine). HuDiNe is a United States study similar to cohort follow-up, in which the researchers use a novel method to simulate a longitudinal study. The Framingham Heart Study uses the cohort method, adjusts for confounding factors according to each research topic, and is designed by experts. Because it has followed patients for several decades (since 1948), it is regarded as highly credible in the medical community; its drawback is that it is limited to the heart.
The age stratification of the databases provided by the Framingham Heart Study, HuDiNe, and The Disease Map differs from one to another, and differs from the default age stratification of this technique. For example, this technique stratifies age in 20-year bands, four layers in total, whereas The Disease Map stratifies in 10-year bands, ten layers in total. To verify the accuracy of this technique against The Disease Map data, the stratification of this technique must therefore be set to 10-year bands in ten layers before the coefficient of determination (R²) of the regression is calculated. The closer R² is to 1.0, the larger the proportion of the total variation in Yi that the regression model explains, and the closer the data of the two databases are. The verification benchmark of this embodiment takes the R² against the Framingham Heart Study as the primary reference. In actual computation, the R² of the regression between the analysis system of this embodiment and the Framingham Heart Study was greater than 0.6, which indicates that the system developed in this embodiment is quite reliable and trustworthy and is very close to the international standard.
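Aligning the age stratification of two sources before the regression can be sketched as follows. The function is a hypothetical illustration: it assumes ages in whole years, layers numbered from 0, and an open-ended top layer that absorbs all older ages, none of which is specified in the text.

```python
def age_stratum(age: int, band_width: int, layers: int) -> int:
    """Return the 0-based age layer for `age`.

    Assumptions: whole-year ages, layer 0 starts at age 0, and the top
    layer is open-ended (ages beyond the last band fall into it).
    """
    if age < 0 or band_width <= 0 or layers <= 0:
        raise ValueError("age must be >= 0; band_width and layers must be > 0")
    return min(age // band_width, layers - 1)


if __name__ == "__main__":
    age = 45
    # Default stratification of this technique: 20-year bands, 4 layers.
    print(age_stratum(age, band_width=20, layers=4))
    # Stratification matched to The Disease Map: 10-year bands, 10 layers.
    print(age_stratum(age, band_width=10, layers=10))
```

A 45-year-old falls in layer 2 under the 20-year scheme and in layer 4 under the 10-year scheme, which is why both data sets must use the same scheme before their statistics are compared.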
[Display device of this embodiment]
In this embodiment, the display device 12 of the cross-platform clinical medical data analysis and display system further includes a desktop web page display module 121 and a mobile application display module 122. These give the user the best viewing experience when logging in to the cross-platform clinical medical data analysis and display system of the present invention from different devices.
In addition, in this embodiment the web content is divided into a professional version website and a general version website, which are described below. Please refer to FIG. 8 to FIG. 11. FIG. 8 is a flowchart of the server according to an embodiment of the present invention. FIG. 9 is a schematic diagram of the professional version web page according to an embodiment of the present invention. FIG. 10 is a schematic diagram of the general version web page according to an embodiment of the present invention. FIG. 11 is a schematic diagram of the application according to an embodiment of the present invention.
(1) Professional version website (desktop version)
In this embodiment, after integration, statistical processing, and verification in the database 13, the resulting statistics are presented on a website. How the website is presented, how data are transferred (input and output), and how results are shown are described in detail below. The core architecture of the website is built on an object-relational database management system (PostgreSQL, PostgreSQL Global Development Group, version 9.5.2) and a structured query language server (SQL Server 2016, Microsoft). Shiny (RStudio project, © 2014 RStudio, Inc.) provides the front-end interface controls, and the layout is sent to the desktop web page display module 121 and the mobile application display module 122 of the display device 12 of the server 1 for dynamic presentation (as shown in FIG. 8 and FIG. 9).
The cross-platform system of this embodiment presents the results analyzed by the method of this patent according to the user's profession, offering different websites and forms of data presentation. The websites fall into three types: first, pages for general users (or patients); second, pages for insurance practitioners; and third, pages for researchers in medicine, pharmacy, or public health. The different pages differ in display interface, usage, and data presentation.
Because the page for researchers in medicine, pharmacy, or public health is the core technology of this patent, the web pages are developed around the professional version. Blocks F001 to F008 shown in FIG. 9 are front-end interface controls designed with Shiny. Block F001 is the trademark or the name of the website, shown as text or an image.
Block F002 holds the user account (USER ACC.) data. When a user account is created (step E016), the user must choose a usage category from the three described above (general user, insurance industry, or academia). After choosing a category, the user fills in the account name, password, password confirmation, email (for verification or result delivery), highest education, job title, name of the academic institution (academic category), company (insurance or other institution), address, and so on. A general user additionally enters past medical history, past surgical history, past medication history, past trauma, and so on, which are used to estimate risk over the next twelve years. The email address is later used for login, billing, and delivery of results. User profiles (USER PROFILES) are stored in the structured query language server (SQL Server) and the object-relational database management system (PostgreSQL). A user who leaves the front-end page, or who logs out, must log in again before continuing. At login the user enters the account name (or email) and password, which are checked against the stored user profiles. The user's preferences and previously computed results are recorded one by one. The preference record tracks the options chosen in each calculation, counts how often each option is chosen, and stores the counts in the database; every use is accumulated and updated in the database. After login, the most frequently used records pass through the user request (USER REQUEST) and adjust the default values of the options for subsequent processing. Users can also set the options according to their own preferences.
After login, the user profile is called (step E011). When the user starts using the query system (step E006), the query parameters are sent to the server 1 (step E007), and the internal computation is performed with the parameters set by the user (step E008). When the internal computation finishes, the result is recorded in the user profile (step E015). When the user browses the query records (step E014) and opens a previously computed result, the selected record is called again (step E012); the server does not need to recompute it, and the result is displayed on the front-end interface (step E005).
Block F003 is used to select the body system or organ of the disease. Diseases are classified into 25 combinations ordered by the disease classifications of the diagnosis code (ICD) and the Major Diagnostic Category (MDC). The 3-digit code options in the block change dynamically with the selected system, and the 5-digit code options change with the selected 3-digit code or system. If the user selects no system or organ, the options are listed in ascending order of diagnosis code (ICD). Block F003 supports multiple selection: a researcher can, for example, select hypertension and diabetes at the same time to compute statistics for the comorbid state.
Block F004 lets the user select the drug type (Drug Type, coded according to ATC and NHIRD), drug dose (Drug Dose), dosage form (Dosage forms), number of tablets (Tablet), site and route of administration (Pathway), days of use (Day Used), frequency of use, and total dose (Total Dose). Drug type and total dose are required fields. The total dose is always computed, so that its intervals can be calculated and the data grouped. For searches by Taiwan drug codes, the following applies. The Taiwan health insurance bureau has established its own drug classification system, which differs from the internationally recognized Anatomical Therapeutic Chemical Classification System (ATC) coding, and the NHIRD drug data also lack the Defined Daily Dose (DDD) of each drug. If the code supplied by the user is a Taiwan code, the total dose is therefore computed as the single drug dose multiplied by the number of tablets, multiplied by the frequency, multiplied by the number of days. For ATC codes, the calculation follows the WHOCC guidelines. Like block F003, block F004 supports multiple selection: a researcher can select an antihypertensive drug and a hypoglycemic drug at the same time to compute statistics for patients on complex medication. Block F004 also lets the user compute surgery-related statistics, stratified by surgical site, surgical method, surgical devices (DEVICES), and so on. The parameters selected in the front-end interface of block F003 (step E001) are sent through Shiny (step E005) to the panel of block F009.
Block F006 holds the basic settings. The panel has default settings when the user logs in, shown as check marks in the figure. It is divided into four basic grouping options: gender (Gender), age (Age), income (Income), and urbanization level (Urbanization Level). The option values are the defaults of the website; the user can tick different groups as needed and can also enter interval values. The urbanization level is not numeric, so interval input is not offered for it. Block F006 also lets the user adjust the study period (Study period), that is, the inclusion time (Include), the exclusion time (Exclude), and the follow-up or observation time (Follow up). Two study types are offered, cohort study (Cohort Study) and case-control study (Case Control Study), and researchers can choose between them as required; for example, the case-control method is the one most commonly used to study the risk of disease associated with drugs in the NHIRD. The optional functions of the regression diagnosis, that is, the methods of steps C018 to C020, can be selected here. The charts that can be produced are shown in block F006: demographics (Demographic), follow-up statistics (Follow Up), Kaplan-Meier curve (KM curve), quantile-quantile plot (QQ Plot), box plot (Box Plot), residual analysis (Residual Analysis), forest plot (Forest Plot), cumulative incidence (Cumulative incidence), Schoenfeld plot, and so on; there are too many chart types to show them all in the figure. Once selected, the parameters of steps E001, E002, and E003 are submitted with the button in block F007 (step E002) and sent to the server for computation (step E008).
Block F005 is the tab control and the display of the result pages. It can display multiple layers of pages; a total of 8 layers of front-end pages can be called and displayed on the front-end interface (step E005). At login, the first layer shown is the settings page (Setting). The interface of block F009 displays the options selected in blocks F003 and F004, grouped by the type of item selected. As shown in block F002, the diagnosis code (ICD) selections take priority over the selections of block F004, so the selections of block F003 are displayed above those of block F004, with support for displaying multiple selections. Diagnosis codes are sorted by code value, and each is shown with its disease name and the system or organ to which it belongs. The selected parameters (step E001) are passed to step E002 and held there; when block F007 is activated, the parameters in step E002 are passed to step E004, and the parameters of step E001 can also be sent through Shiny to step E005 for display on the front-end interface.
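The total-dose conversion described for Taiwan-coded drugs in block F004 is a plain product of four quantities, which the following sketch makes concrete. The function name, the parameter names, and the units (milligrams, doses per day) are assumptions for illustration; the text gives only the multiplication itself.

```python
def total_dose(single_dose_mg: float,
               tablets_per_dose: float,
               doses_per_day: float,
               days: float) -> float:
    """Total dose = single dose x number of tablets x frequency x days.

    Units are an assumption: milligrams per tablet and doses per day.
    """
    values = (single_dose_mg, tablets_per_dose, doses_per_day, days)
    if any(v < 0 for v in values):
        raise ValueError("all quantities must be non-negative")
    return single_dose_mg * tablets_per_dose * doses_per_day * days


if __name__ == "__main__":
    # Made-up prescription: 500 mg tablets, 1 tablet, 3 times a day, 7 days.
    print(total_dose(500, 1, 3, 7))
```

For the made-up prescription above the result is 10500 mg. ATC-coded drugs would instead follow the WHOCC guidelines mentioned in the text and are not covered by this sketch.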
Block F010 is the second-layer page, the demographic table (Demographic). The table is organized by the groups set by the user, such as gender and age, and shows the statistics of each group: for example, the population and percentage of the exposed group (Exposure positive) and the control group (Exposure negative) in each group, and the P value of each group. If the user needs statistics on drugs or comorbidity history, corresponding columns are added to the table. If the user changes the defaults when ticking options in block F006, the system rearranges the columns according to the options ticked there.
Block F011 is the third-layer page, the follow-up statistics table (Follow Up). Like the demographic table, it is organized by the groups set by the user, such as gender and age, and shows the statistics of events occurring during follow-up in each group: for example, the events (Event, abbreviated EVE in the figure), person-years (Person-year, PY), and incidence rate (Incidence rate, IR) of the exposed group (Exposure positive) and the control group (Exposure negative), together with the incidence rate ratio (Incidence Rate Ratio, IRR), the adjusted hazard ratio, the 95% confidence interval (95% Confidence Interval, 95% CI), and the P value. As above, if the user needs statistics on drugs or comorbidity history, corresponding columns are added to the table.
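The incidence rate and incidence rate ratio shown in the follow-up table follow directly from the event counts and person-years. The sketch below uses hypothetical function names and made-up counts, and it reports the rate per 1,000 person-years, which is an assumption; the text does not state the scaling the embodiment uses.

```python
def incidence_rate(events: int, person_years: float, per: float = 1000.0) -> float:
    """Events per `per` person-years (per=1000 is an assumed convention)."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events / person_years * per


def incidence_rate_ratio(events_exposed: int, py_exposed: float,
                         events_control: int, py_control: float) -> float:
    """IRR = incidence rate of the exposed group / rate of the control group."""
    rate_control = incidence_rate(events_control, py_control)
    if rate_control == 0:
        raise ValueError("control group has no events; IRR is undefined")
    return incidence_rate(events_exposed, py_exposed) / rate_control


if __name__ == "__main__":
    # Made-up counts: 30 events in 10,000 PY (exposed), 15 in 10,000 PY (control).
    print(incidence_rate(30, 10_000))
    print(incidence_rate_ratio(30, 10_000, 15, 10_000))
```

With these made-up counts the exposed rate is about 3 per 1,000 person-years and the IRR is about 2. The adjusted hazard ratio in the same table comes from a regression model and is not reproduced here.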
Block F012 is the fourth-layer page, regression diagnostics (Diagnostic Regression). It displays the results computed by the method (step C), together with the results of statistical methods such as the partial likelihood (Partial likelihood), the score test (Score test) and the Wald test (Wald test). If the user ticks any of the optional methods, namely those of step C023 and steps C025-C027, their results are also shown on this page.
Block F013 is the fifth-layer page, the forest plot (Forest Plot). It shows the number of people in each group, the adjusted hazard ratio, the 95% confidence interval, the P value and similar data. From these data the forest plot derives an overall hazard ratio, which is used to assess whether, across the population as a whole, a given disease or drug raises or lowers the risk of another disease, or raises or lowers mortality.
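One way an overall hazard ratio could be derived from the per-group values on the forest plot is sketched below. The source does not state the pooling method, so fixed-effect inverse-variance pooling on the log scale is assumed here; the function name is hypothetical.

```python
import math


def pooled_hazard_ratio(groups, z=1.96):
    """Pool per-group hazard ratios into one overall estimate.

    `groups` is an iterable of (hr, ci_lower, ci_upper) tuples, one per
    subgroup. Returns (pooled_hr, lower, upper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, lo, hi in groups:
        # Recover the standard error of log(HR) from the width of the interval.
        se = (math.log(hi) - math.log(lo)) / (2 * z)
        weight = 1 / se ** 2
        weighted_sum += weight * math.log(hr)
        weight_total += weight
    log_hr = weighted_sum / weight_total
    se_pooled = math.sqrt(1 / weight_total)
    return (
        math.exp(log_hr),
        math.exp(log_hr - z * se_pooled),
        math.exp(log_hr + z * se_pooled),
    )
```

Pooling two subgroups with the same estimate leaves the hazard ratio unchanged and narrows the confidence interval, which is the behavior expected of a summary row.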
Block F014 is the sixth-layer page, the figure display (Figure). The figures include the survival curve (Kaplan-Meier Curve), dot plot, box plot, residual analysis, cumulative incidence and the residual plot (Schoenfeld Plot). The figures on this page are arranged in group order.
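The survival curve named above is conventionally drawn from the Kaplan-Meier product-limit estimator. The following is a minimal sketch of that estimator, not the system's own code; the function name and input layout are assumptions.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    `times` are follow-up durations; `events` are 1 if the event occurred
    at that time and 0 if the subject was censored. Returns a list of
    (time, survival_probability) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set.
        at_risk -= removed
    return curve
```

Censored subjects lower the number at risk without producing a step in the curve, so only event times appear in the result.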
At the user's request, the contents of the first to sixth layers above can be merged into a Portable Document Format (PDF) file and sent to the email address the user has set; the user may also change that address. In other words, the server 1 can, on demand, provide the user with a professional-version personal disease analysis result through the communication device 14.
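A minimal sketch of the delivery step, assuming the six layers have already been merged into PDF bytes elsewhere. It uses Python's standard `email` package only to build the message; the function name, subject line and file name are hypothetical, and actually sending the message (for example over SMTP) is left out.

```python
from email.message import EmailMessage


def build_report_email(sender, recipient, pdf_bytes, filename="report.pdf"):
    """Build an email carrying the merged analysis report as a PDF attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Disease analysis result"
    msg.set_content("The merged analysis report is attached as a PDF.")
    # Attaching converts the message to multipart/mixed automatically.
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=filename
    )
    return msg
```

Because the recipient is an argument, the same function covers both the address stored in the user settings and an address the user changes at send time.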
Block F015 is the seventh-layer page, the user settings. It shows the user's account name, password, password confirmation, email (for verification or delivery of results), highest level of education, job title, academic institution (for academic users), company (insurer or other organization), address and similar data. For ordinary users it additionally holds past medical history, past surgical history, past medication history, past injuries and similar data. Everything except the account name can be modified by the user; the data are saved in the user profile and stored in the structured query language server (SQL Server) and the object-relational database management system (PostgreSQL).
Block F016 is the eighth-layer page, the About page. It provides, for the user's reference, an introduction to and tutorial for the website, the design and functions of this patent and the website, external links to basic statistics tutorials, and an introduction to the laboratory and the company.
(2) Standard-version website (desktop version)
This patent offers users different websites and ways of presenting data, including a web page for ordinary users (or patients) (step G). The standard-version web page inherits the functions of the professional-user version, but its front-end interface and data display are redesigned to suit general users. The front-end interface follows a step-by-step approach (Step by step): the second interface appears only after the selections on the first are complete, and so on until every interface has been shown. The front end is divided into five main parts: past medical history (block G001), disease treatment (block G005), outcome after treatment (block G009), analysis of current health (block G010) and display of the analysis results (block G011). When creating an account as described above, the user only needs to enter a brief past medical history; the ordinary-user system, by contrast, must record a detailed medical history for use in the subsequent analysis. After logging in, the user arrives at the past-medical-history page of block G001, where the user is asked to select past diseases (block G002). For this selection the system of this embodiment offers ordinary users classifications similar to those of block F003 described above, namely the diagnostic code (ICD), the organ classification and the major diagnostic category (Major Diagnostic Category, MDC); unlike block F003, however, the diagnostic code (ICD) itself is not displayed once the selection is made. Here the display system of this embodiment queries the structured query language server (SQL Server) and the object-relational database management system (PostgreSQL) that store the diagnostic codes (ICD), retrieves only the names associated with the codes, and returns them to block G002 for display on the front-end interface through Shiny. A user who has been relatively healthy in the past can skip this item and go straight to block G005. A user who cannot recall past diseases likewise skips (SKIP) this item and enters block G005.
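The name-only lookup described above can be sketched as follows. An in-memory SQLite database stands in for the SQL Server and PostgreSQL stores, and the table name, column names and organ-system labels are hypothetical; the three rows are ordinary ICD-10 entries used purely as sample data.

```python
import sqlite3


def disease_names(conn, system):
    """Return only the disease names for one organ system.

    The codes are used for ordering but never leave the database layer,
    mirroring a front end that shows names without diagnostic codes.
    """
    rows = conn.execute(
        "SELECT name FROM icd WHERE system = ? ORDER BY code", (system,)
    )
    return [name for (name,) in rows]


# Illustrative data only; the schema is an assumption, not the patent's.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE icd (code TEXT PRIMARY KEY, name TEXT, system TEXT)")
conn.executemany(
    "INSERT INTO icd VALUES (?, ?, ?)",
    [
        ("I10", "Essential (primary) hypertension", "Circulatory"),
        ("I21", "Acute myocardial infarction", "Circulatory"),
        ("K35", "Acute appendicitis", "Digestive"),
    ],
)
```

Selecting only the `name` column keeps the code out of everything the front end receives, which matches the statement that block G002 does not display the diagnostic code.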
After the user has selected a disease name, the display module pops up the window of block G003, in which the user enters, for the disease chosen in block G002, the date on which symptoms first appeared and the date of diagnosis. A user who cannot recall either date skips this item and enters the interface of block G004.
Block G004 lets the user enter the main symptom or chief complaint (Chief Complain) of the disease, its signs (Sign) and any other symptoms (more than one may be entered). A user who cannot recall the symptoms, signs or other symptoms skips this item and enters the interface of block G005.
Block G005 lets the user enter the treatments related to the diseases entered in block G002. The number of fields on the G005 interface grows with the number of diseases in block G002, and the user can also tick treatments against individual symptoms. The treatment options on the G005 interface are medication (block G006), surgery (block G007, covering outpatient and inpatient surgery) and follow-up (block G008). When the user selects medication, the window of block G006 pops up. Like block F004, it lets the user choose the drug class according to the ATC classification or the Taiwanese classification and enter the route of administration, the number of uses per day and the total duration of use; the total dose is then calculated by the system and shown in blocks G005 and G006. When the user selects surgery, the window of block G007 pops up and, depending on the disease and the selected site, offers fields for the name of the operation (for example appendectomy), the surgical approach (conventional or the da Vinci surgical system), the type of anesthesia (for example local or general), the operating time (including anesthesia time) and upload of the pathology report. When the user selects follow-up, the window of block G008 pops up and lets the user choose the follow-up frequency and record findings made during follow-up. If the user chooses the findings option, a window pops up asking the user to tick whether the finding "is related to the diseases listed above". If it is related, the user selects the disease concerned; if not, the user is asked "whether a new disease needs to be added". Choosing to add one brings up block G001 and the steps above are repeated; otherwise the user enters the interface of block G009. Block G009 records the outcome after treatment, with the following options: recovered, died (for users entering another person's data), relapsed, continuing treatment, continuing follow-up, and so on.
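The total-dose conversion mentioned for block G006 can be sketched as a simple product. The source lists the route, the number of uses per day and the duration as inputs; the dose per use and the milligram unit are assumptions added here so that the arithmetic is complete, and the function name is hypothetical.

```python
def total_dose(dose_per_use_mg, uses_per_day, days):
    """Total dose in mg = dose per use x uses per day x days of use."""
    if min(dose_per_use_mg, uses_per_day, days) < 0:
        raise ValueError("inputs must be non-negative")
    return dose_per_use_mg * uses_per_day * days
```

For instance, 500 mg taken three times a day for seven days gives a total of 10,500 mg.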
When an ordinary user logs in again, the user's past medical record, that is, the data entered previously, is shown in text boxes (Text box) on the interfaces of blocks G001, G005 and G009. A user who has not developed any new disease can skip the interfaces of blocks G001, G005 and G009 and go straight to the interface of block G010. A user who needs to add to the past medical history can still enter the interfaces of blocks G001, G005 and G009 through the display module of this embodiment to do so. Once the past medical history on the interfaces of blocks G001, G005 and G009 has been confirmed as correct, the user can enter the interface of block G010, the front end for the analysis, where future health risks are analyzed on the basis of past diseases. Two kinds of analysis are available: one analyzes the future risk of a single disease, the other the future risk of roughly 16,000 diseases (the figure is the number of diagnostic codes (ICD)). In this embodiment the latter is a paid service; in other embodiments the results may be offered in combination with various financial services, and the present invention places no limitation on this.
When the user has finished selecting, the interface of block G010 switches to the interface of block G011 and displays the results of the computation. The results shown in block G011 depend on the selection made in block G010. Because the user is a layperson, only two kinds of data are displayed, such as the risk (adjusted hazard ratio) and the 10-year survival rate, together with a plain-text explanation. A user who wants to see more can press the Expand for more data button; the expanded data include the case fatality rate (Case fatality, the proportion of patients with a given disease who die), the conditional survival probability at a given time point (Conditional Probability) and the cumulative survival probability (Cumulative Survival Probability). If the user chose the risk of a single disease, only the risk and the 10-year survival rate for that disease are shown; for example, a patient who chose to analyze only the risk of developing hypertension in the next 10 years sees only the risk of hypertension and the 10-year survival rate. If the user chose to analyze the future risk of the roughly 16,000 diseases, data for all of those diseases are displayed and can be re-sorted by the magnitude of the risk, the survival rate or other values. A user who wants advice after viewing the results can consult the website's recommendation system. Based on the body system of the disease and the address the patient has set, the recommendation system finds the nearest hospital for the user to consult and helps the user contact a physician in the relevant department and make an appointment. The results of block G011 can also be printed or sent to an email address (for example the user's own or the family physician's), giving the user a standard-version personal disease analysis result.
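The expanded figures of block G011 are related by simple arithmetic, sketched below together with the re-sorting of a multi-disease result list. The function names and the dictionary keys are hypothetical, and any numbers passed in are placeholders rather than results from the system.

```python
def case_fatality(deaths, cases):
    """Proportion of patients with a given disease who die."""
    return deaths / cases


def cumulative_survival(conditional_probs):
    """Running product of per-interval conditional survival probabilities.

    The cumulative probability of surviving through interval k is the
    product of the conditional probabilities of intervals 1..k.
    """
    result = []
    survival = 1.0
    for p in conditional_probs:
        survival *= p
        result.append(survival)
    return result


def sort_results(results, key="hazard_ratio", descending=True):
    """Re-sort per-disease result rows by risk, survival rate or another field."""
    return sorted(results, key=lambda row: row[key], reverse=descending)
```

Sorting by `hazard_ratio` in descending order puts the highest-risk diseases first; passing `key="survival_10y", descending=False` would instead list the lowest survival rates first.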
(3) Mobile-device application
The cross-platform clinical medical data analysis and display system of this embodiment can also provide system web content or an application suited to display on mobile devices, through the mobile-device application display module 122 of the display device 12 of the server 1. The description below concerns the application.
To support smartphones and medical instruments, the mobile-device application display module 122 of this embodiment uses C# (version 6.0, Microsoft), an object-oriented high-level programming language based on the .NET framework, to build mobile applications (mobile application, APP) for mobile devices running iOS, Android, Windows Phone and the like. The application can be downloaded from a digital media online store (such as the iTunes Store for iOS) or come built into the phone, in which case it can be used without downloading. In this embodiment the design of the application follows the website system described earlier and inherits its two major systems, for ordinary users and for professional users. The interface simplifies the layout of the web front end and enlarges the buttons, check boxes, text boxes and fonts so that mobile users can read them and tap them with a finger easily. All interfaces of the application scroll vertically, so that lack of screen space never prevents the user from entering or consulting data, and every interface offers a help function. Moving between interfaces follows the step-by-step approach (Step by step): pressing the Next button moves to the next interface. The application comes in two versions, the ordinary (patient) version and the academic (professional) version, whose interface design and functions are described separately below. Both inherit the account-creation, login and logout interfaces of block F002. Once a user account has been created (step E016), the data are uploaded to the server 1 and stored in the structured query language server (SQL Server) and the object-relational database management system (PostgreSQL), to be checked when the user logs in again. After login the user's profile is retrieved (step E001). When the user starts using the query system (step E006), all selected parameters are sent to the server 1 (step E007), which performs the internal computation according to the parameters the user has set (step E008). When the computation is complete, the server 1 provides the user with an analysis result through the communication device 14.
(4) Mobile-device application (standard version)
Like the standard version of the website, the interface of the standard-version application is divided into five main parts: past medical history (block H001, corresponding to the block G001 interface of the web version), disease treatment (block H002, corresponding to block G005), outcome after treatment (block H003, corresponding to block G009), analysis of current health (block H004, corresponding to block G010) and display of the analysis results (blocks H005 to H006, corresponding to block G011). The interfaces of blocks H001 to H003 retain the functions of the web version's blocks G002 to G004 and G006 to G008. Once data entry is complete, the user enters the interface of block H004 and chooses a computation service. When the internal computation finishes, the result is sent to the user's phone for storage and is also recorded in the user's profile (step E015). When the user browses the records (step E014) and opens a result that has already been computed, the selected record is requested again (step E012); on receiving the request, the server 1 determines that no recomputation is needed and the stored result is displayed on the front-end interface (step E005). Stored results are shown on the front-end interface when the interface of block H005 or block H006 is selected. The interface of block H004 offers two computations: one calculates the user's risk for any single disease, the other the user's risk for all of the more than 16,000 diseases. In both cases, pressing the compute button sends the data entered on the interfaces of blocks H001 and H003, together with the disease range selected in block H004, to the server 1 (step E007) for computation (step E008), and the results are output to the interfaces of blocks H005 and H006. The interface of block H005 shows the risk of a single disease; the interface of block H006 shows the risk of all diseases. The data display scrolls vertically.
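The decision that a previously computed result needs no recomputation can be sketched as a cache keyed on the request parameters. The source does not say how the server 1 makes this decision, so the hashing scheme, the class name and the parameter layout below are all assumptions.

```python
import hashlib
import json


class ResultCache:
    """Return a stored result when the same parameters are requested again."""

    def __init__(self, compute):
        self._compute = compute  # callable that performs the real analysis
        self._store = {}
        self.computations = 0

    @staticmethod
    def _key(params):
        # Sorting the keys makes the hash independent of dictionary order.
        canonical = json.dumps(params, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, params):
        key = self._key(params)
        if key not in self._store:
            self._store[key] = self._compute(params)
            self.computations += 1
        return self._store[key]
```

A second request with the same parameters, even in a different key order, is served from the store and leaves `computations` unchanged.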
(5) Mobile-device application (professional version)
The professional-version application has the same functions as the professional version of the website, but its layout is simplified into four main parts: the tab control panel (Tab Control) (block H008), the logo panel (block H009), the workspace panel (block H010) and the text description area (block H011).
The panel of block H008 is a tab control (Tab Control) that governs the display of the settings and result pages and can show multiple layered pages. To simplify the interface and save space, the block H008 panel inherits the block F005 panel and also merges in the panels and functions of blocks F003-F004 and F006-F007. Block H008 can call up 12 layers of front-end pages, which are displayed in the block H010 panel of the front-end interface. The 12 layers are: the first layer, tutorial (block H012); the second layer, diagnostic code (ICD) entry (block H013); then drug entry (block H014), basic settings and computation mode (block H015), actions (block H016), demographic results (block H017), follow-up and observation results (block H018), regression diagnostics results (block H019), forest plot (block H020), other figures (block H021), user settings (block H022) and About (block H023).
The first-layer tutorial interface (block H012) explains in detail how to use the system; the user may skip this step. The second-layer panel for diagnostic code (ICD) entry (block H013), the third-layer panel for detailed drug entry (block H014) and the fourth-layer panel for basic settings and computation mode (block H015) inherit the functions of blocks F003, F004 and F006 respectively. Panel block H016 inherits only the Generate function of panel block F007. Whenever the user switches between the panels of blocks H012 to H015, the parameters of each panel (steps E001 to E003) are immediately held in the storage device 11 until the Generate function of block H016 is triggered; they are then sent from step E002 to the server 1 for computation (step E008), and the results are returned to the memory of the mobile device. The data are displayed on the interfaces of blocks H017 to H021.
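The hold-then-send behavior described above, in which panel parameters are kept until Generate is triggered, can be sketched as follows. The class, method and parameter names are hypothetical, and the `send` callable merely stands in for the transfer to the server 1.

```python
class ParameterStage:
    """Hold each panel's parameters until Generate is triggered."""

    def __init__(self, send):
        self._send = send  # callable that forwards the payload for computation
        self._staged = {}

    def leave_panel(self, panel, params):
        """Store a copy of a panel's parameters when the user switches away."""
        self._staged[panel] = dict(params)

    def generate(self):
        """Merge all staged parameters, in panel order, and send them once."""
        payload = {
            name: value
            for panel in sorted(self._staged)
            for name, value in self._staged[panel].items()
        }
        return self._send(payload)
```

Nothing reaches `send` while the user moves between panels; a single merged payload is forwarded only when `generate()` is called.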
The demographic results (block H017), follow-up and observation results (block H018), regression diagnostics results (block H019), forest plot (block H020) and other figures (block H021) inherit the layout and functions of blocks F010, F011, F012, F013 and F014 of the professional-version web page respectively. The last two layers of the tab control (Tab control) (blocks H022 to H023) inherit the design and functions of blocks F015 and F016 of the professional-version web page respectively. The data display scrolls vertically.
〔Possible effects of the embodiments〕
In summary, the cross-platform clinical medical data analysis and display system of the present invention performs further analysis and statistics on the data of a clinical medical database while greatly reducing the amount of computation and speeding up processing without loss of accuracy. In addition, professional-version or standard-version analysis results can be obtained from the web pages or applications the system provides, which not only gives users clear and rapid medical analysis information but also lets them see, in real time, how their own health may develop.
The above describes only embodiments of the present invention and is not intended to limit the patent scope of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW106112205A TWI638275B (en) | 2017-04-12 | 2017-04-12 | Cross-platform analyzing and display system of clinical data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW106112205A TWI638275B (en) | 2017-04-12 | 2017-04-12 | Cross-platform analyzing and display system of clinical data |
Publications (2)
Publication Number | Publication Date |
---|---|
TWI638275B TWI638275B (en) | 2018-10-11 |
TW201837745A true TW201837745A (en) | 2018-10-16 |
Family
ID=64797366
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW106112205A TWI638275B (en) | 2017-04-12 | 2017-04-12 | Cross-platform analyzing and display system of clinical data |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI638275B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI782608B (en) * | 2021-06-02 | 2022-11-01 | 美商醫守科技股份有限公司 | Electronic device and method for providing recommended diagnosis |
US11769114B2 (en) | 2020-12-03 | 2023-09-26 | Novartis Ag | Collaboration platform for enabling collaboration on data analysis across multiple disparate databases |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI738159B (en) * | 2019-12-25 | 2021-09-01 | 眾匯智能健康股份有限公司 | Medical advice system and suggestion method thereof |
TWI755995B (en) * | 2020-12-24 | 2022-02-21 | 科智企業股份有限公司 | A method and a system for screening engineering data to obtain features, a method for screening engineering data repeatedly to obtain features, a method for generating predictive models, and a system for characterizing engineering data online |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW200538734A (en) * | 2004-03-12 | 2005-12-01 | Aureon Biosciences Corp | Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition |
US8024128B2 (en) * | 2004-09-07 | 2011-09-20 | Gene Security Network, Inc. | System and method for improving clinical decisions by aggregating, validating and analysing genetic and phenotypic data |
TWI399661B (en) * | 2009-08-21 | 2013-06-21 | A system for analyzing and screening disease related genes using microarray database | |
TW201118773A (en) * | 2009-11-30 | 2011-06-01 | Linkmed Asia Inc | Medical information integrated system and method |
EP2657892A4 (en) * | 2010-10-29 | 2014-10-15 | Obschestvo S Ogranichennoy Otvetstvennostiu Pravovoe Soprovojdenie Bisnesa | Clinical information system |
KR20130056095A (en) * | 2011-11-21 | 2013-05-29 | 경희대학교 산학협력단 | Data processing method and apparatus for clinical decision support system |
-
2017
- 2017-04-12 TW TW106112205A patent/TWI638275B/en not_active IP Right Cessation
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11769114B2 (en) | 2020-12-03 | 2023-09-26 | Novartis Ag | Collaboration platform for enabling collaboration on data analysis across multiple disparate databases |
TWI849361B (en) * | 2020-12-03 | 2024-07-21 | 瑞士商諾華公司 | Platform and method for enabling collaboration on data analysis across disparate databases |
TWI782608B (en) * | 2021-06-02 | 2022-11-01 | 美商醫守科技股份有限公司 | Electronic device and method for providing recommended diagnosis |
Also Published As
Publication number | Publication date |
---|---|
TWI638275B (en) | 2018-10-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MM4A | Annulment or lapse of patent due to non-payment of fees |