WO2021151295A1 - Method, apparatus, computer device, and medium for determining patient treatment plan

Info

Publication number
WO2021151295A1
WO2021151295A1 PCT/CN2020/118873 CN2020118873W WO2021151295A1 WO 2021151295 A1 WO2021151295 A1 WO 2021151295A1 CN 2020118873 W CN2020118873 W CN 2020118873W WO 2021151295 A1 WO2021151295 A1 WO 2021151295A1
Authority
WO
WIPO (PCT)
Prior art keywords
patient
treatment plan
target
preset
data
Prior art date
Application number
PCT/CN2020/118873
Other languages
French (fr)
Chinese (zh)
Inventor
徐卓扬
赵惟
左磊
孙行智
胡岗
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021151295A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • This application relates to the field of digital medicine, and in particular to a method, device, computer equipment, and medium for determining a patient's treatment plan.
  • Deep reinforcement learning is a machine learning method. It learns a mapping from environment states to actions and selects the optimal policy according to the maximum feedback value: the policy chooses the optimal action, the action changes the state, and a delayed feedback value is obtained and used to evaluate the value function. This loop iterates until the learning condition is met, at which point learning terminates.
  • a method for determining a patient's treatment plan including:
  • the target treatment plan of the target patient is obtained by analysis.
  • a device for determining a patient's treatment plan comprising:
  • the creation module is used to create a patient grouping model for processing time series data based on deep reinforcement learning DQN;
  • the training module is used to train the patient grouping model by using sample data marked with grouping results, so that the patient grouping model meets a preset training standard;
  • the input module is used to input target patient data in a preset time period into a patient grouping model that meets the preset training standard, and obtain the target group to which the target patient belongs;
  • the determining module is used to determine the first treatment plan of the target patient based on the characteristics of the population in the target group;
  • the extraction module is used to extract the contraindicated drugs of the target patient according to the target patient data, and to select a second treatment plan containing the contraindicated drugs from the first treatment plan;
  • the analysis module is used to analyze and obtain the target treatment plan of the target patient according to the first treatment plan and the second treatment plan.
  • a computer-readable storage medium on which a computer program is stored, and the program is executed by a processor to implement the following steps:
  • the target treatment plan of the target patient is obtained by analysis.
  • a computer device including a storage medium, a processor, and a computer program stored on the storage medium and running on the processor.
  • when executing the program, the processor performs the following steps:
  • the target treatment plan of the target patient is obtained by analysis.
  • FIG. 1 shows a schematic flowchart of a method for determining a patient's treatment plan provided by an embodiment of the present application
  • FIG. 2 shows a schematic flowchart of another method for determining a patient's treatment plan provided by an embodiment of the present application
  • FIG. 3 shows a network structure diagram of a patient grouping model provided by an embodiment of the present application
  • FIG. 4 shows a schematic structural diagram of a device for determining a patient's treatment plan provided by an embodiment of the present application
  • Fig. 5 shows a schematic structural diagram of another device for determining a patient treatment plan provided by an embodiment of the present application.
  • an embodiment of the present application provides a method for determining a patient's treatment plan. As shown in FIG. 1, the method includes:
  • the purpose is to improve the traditional deep reinforcement learning DQN model by extending it to a time series model and adding an Attention mechanism, and to use the improved DQN model to group patients, so that the model can process time series data and provide interpretability with respect to patient characteristics.
  • the grouping decision rules can be set in advance, the group to which each sample data belongs can be determined based on those rules, and the grouping result can then be marked on the corresponding sample data in the form of a label for use in training.
  • the output of the patient grouping model on the sample data is checked against the labels to determine the training status of the model. If the error between the model output and the labeled result is determined to be small, the patient grouping model can be considered to meet the preset training standard.
  • the preset time period can be set according to actual application requirements.
  • for example, the preset time period can be set to the month preceding and including the current time, and the corresponding historical target patient data is then the one or more follow-up records about the target patient recorded in that period.
  • a single follow-up record cannot fully represent the patient's long-term follow-up status and may easily lead to inaccurate analysis results. Therefore, in this embodiment, in addition to the patient follow-up data at the current moment, all historical patient follow-up data within the preset time period can also be used as input, and the outputs for the individual follow-up records are integrated to determine a final, relatively accurate target grouping result.
  • the Attention mechanism can also be used to explain the contribution, attention coefficient, contribution ratio, etc. of each feature at each time point to the clustering result.
  • based on the characteristics of the population in the target group, patients highly similar to the target patient can be identified, and the first treatment plans available to the target patient can be screened out from the data generated for those patients.
  • the contraindicated drugs of the target patient are extracted first, and the first treatment plans containing those drugs are screened out as the second treatment plan, so that the second treatment plan is not considered when the treatment plan recommendation is finally generated.
  • the target treatment plan of the target patient is obtained by analyzing the first treatment plan and the second treatment plan: the second treatment plan is excluded from the first treatment plan, and the remaining first treatment plan is determined as the target treatment plan of the target patient. This embodiment thus takes drug contraindications into account to ensure the safety of the patient's treatment.
  • the target patient data in the preset time period is input into the patient grouping model that meets the preset training standard to obtain the target grouping result, and the first treatment plan of the target patient is then determined using the characteristics of the population in the target group;
  • the target patient's contraindicated drugs can also be determined based on the target patient's data, so that the second treatment plan containing the contraindicated drugs can be screened from the first treatment plan; finally, the first treatment plan and the second treatment plan are analyzed to obtain the target treatment plan suitable for the target patient.
  • in this way, the patient's treatment plan can be processed digitally, and the calculation of the expected reward value Q is extended to a time series structure that can take more information into account; by integrating artificial intelligence and deep learning algorithms, the analysis result becomes more accurate.
  • the method includes:
  • step 201 of the embodiment may specifically include: splitting the last fully connected layer in the network structure of the deep reinforcement learning DQN into a first fully connected layer, a second recurrent neural network layer, and a third recurrent neural network layer; and constructing the patient grouping model from the deep reinforcement learning DQN with the changed network structure, so that when patient data containing multiple time points is input to the patient grouping model, the first fully connected layer outputs the embedded value corresponding to the patient state at each time point, the second recurrent neural network layer outputs the first degree of attention corresponding to the patient state at each time point, the third recurrent neural network layer outputs the second degree of attention corresponding to the grouping result at each time point, and the expected reward value of the patient data for each preset group is calculated from the embedded value, the first degree of attention, and the second degree of attention.
  • the abstract features extracted by the convolutional layer are divided into three branches; that is, the last fully connected layer in the network structure of the deep reinforcement learning DQN is split into:
  • the first fully connected layer 1, which outputs the embedded value corresponding to the patient state at each time point;
  • the second recurrent neural network layer 2, which is the state value function (value function) and outputs the first degree of attention corresponding to the patient state at each time point;
  • the third recurrent neural network layer 3, which is the action advantage function (advantage function) and outputs the second degree of attention corresponding to the grouping result at each time point.
  • the input to all three branches is the abstract features extracted by the convolutional layer.
  • to monitor the training status of the patient grouping model when training it with the sample data, the group to which each sample data belongs must be marked in advance. This specifically includes: grouping the sample data according to preset grouping decision rules to obtain the grouping result corresponding to each sample data; and marking the sample data based on the grouping result.
  • the preset grouping decision rules can be set according to actual needs.
  • for example, the grouping decision rules can classify patients according to their personal characteristic information combined with their inspection index information.
  • in the group division, patients whose personal characteristic information is highly similar and who have the same examination indicators with the same examination results can be placed in one group.
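The labeling step described above can be sketched as follows. This is only a minimal illustration: the rule key (age band, sex, and identical examination indicator/result pairs) and the field names `age`, `sex`, and `exams` are assumptions chosen for the example, not rules stated in the application.

```python
def label_samples(samples):
    """Assign a group label to each sample; samples with the same rule key share a label."""
    key_to_label = {}
    labels = []
    for s in samples:
        # Hypothetical rule key: 10-year age band, sex, and sorted (indicator, result) pairs.
        key = (s["age"] // 10, s["sex"], tuple(sorted(s["exams"].items())))
        # New keys receive the next unused integer label.
        label = key_to_label.setdefault(key, len(key_to_label))
        labels.append(label)
    return labels


samples = [
    {"age": 52, "sex": "F", "exams": {"blood_glucose": "high", "blood_pressure": "normal"}},
    {"age": 57, "sex": "F", "exams": {"blood_pressure": "normal", "blood_glucose": "high"}},
    {"age": 34, "sex": "M", "exams": {"blood_glucose": "normal", "blood_pressure": "high"}},
]
labels = label_samples(samples)
print(labels)
```

The first two samples fall into the same group because their personal characteristics are close and their examination results match; the third forms its own group.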
  • the sample data is time series data containing the current time point and a preset number of historical time points, and can include patient data information at the current time and historical time.
  • the patient data information can be personal identification information (such as name, gender, age, etc.), treatment plan information (drug combination, medication cycle, dosage, etc.), inspection index information (such as blood sugar, blood pressure, electrocardiogram and other inspection indicators and the corresponding inspection results), etc.;
  • the expected reward value is obtained by calculating, at each time point, the first sum of the first degree of attention and the second degree of attention and the product of the first sum and the embedded value, and then accumulating the products over the current time point and the historical time points.
  • taking the network structure diagram of the patient grouping model shown in FIG. 3 as an example, the current patient state ($s_3$) corresponding to the sample data is input to the patient grouping model together with the patient states at two historical time points ($s_1$, $s_2$).
  • the first fully connected layer then outputs e ($e_1$, $e_2$, $e_3$) at each time point, the second recurrent neural network layer outputs V ($V_1$, $V_2$, $V_3$) at each time point, and the third recurrent neural network layer outputs A ($A_1$, $A_2$, $A_3$) at each time point. Within each time step, V is added to A and the sum is multiplied element-wise with e; the products are then accumulated over the time steps to obtain the Q value ($Q_3$) of the current state.
  • V represents the degree of attention corresponding to the patient state at each time point;
  • A represents the degree of attention corresponding to the grouping result at each time point;
  • e represents the embedded value corresponding to the patient state at each time point.
  • $h_{V1}, h_{V2}, h_{V3} = \text{LSTM-V}(s_1, s_2, s_3)$
  • $h_{A1}, h_{A2}, h_{A3} = \text{LSTM-A}(s_1, s_2, s_3)$
  • $A_1, A_2, A_3 = (W_A h_{A1}, W_A h_{A2}, W_A h_{A3})$
  • $v_1, v_2, v_3 = (W_I s_1, W_I s_2, W_I s_3)$
  • $s_i$, $h_{Vi}$, $w_V$, $h_{Ai}$, $A_i$, $v_i$, $e_i$, and $Q_3$ are vectors;
  • $V_i$ is a scalar;
  • $W_A$, $W_I$, and $W_{II}$ are matrices, and $\odot$ denotes element-wise multiplication.
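The accumulation described above, in which V and A are added within each time step, multiplied element-wise with e, and summed over the time steps, can be sketched in plain Python. The numeric values are made up for illustration, and the assumption that each vector has one entry per preset group is a reading of the text rather than something it states explicitly.

```python
def expected_reward(V, A, e):
    """Accumulate (V_i + A_i) * e_i element-wise over all time steps.

    V: list of scalars, one per time step.
    A, e: lists of equal-length vectors, one per time step
          (assumed here to have one entry per preset group).
    """
    n = len(A[0])
    q = [0.0] * n
    for v_i, a_i, e_i in zip(V, A, e):
        for k in range(n):
            # The scalar V_i is broadcast over the entries of A_i.
            q[k] += (v_i + a_i[k]) * e_i[k]
    return q


# Three time steps (two historical, one current) and two preset groups.
V = [0.5, 1.0, 2.0]
A = [[0.5, -0.5], [1.0, 0.0], [0.0, 1.0]]
e = [[1.0, 2.0], [0.5, 0.5], [1.0, 1.0]]
Q3 = expected_reward(V, A, e)
print(Q3)  # [4.0, 3.5]
```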
  • the model decision can be interpreted as follows: the contribution of each patient characteristic at each time point to the final Q value can be obtained by forward derivation from all the inputs $s_i$.
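One simple way to illustrate this kind of interpretation is to split the Q value for a chosen group into its per-time-step terms and report each term's share. This is a deliberately simplified sketch: it attributes the Q value to time points only, not to individual patient features, and the function name and sample values are hypothetical.

```python
def time_step_contributions(V, A, e, group):
    """Return each time step's term (V_i + A_i[group]) * e_i[group] and its share of the total."""
    terms = [(v + a[group]) * emb[group] for v, a, emb in zip(V, A, e)]
    total = sum(terms)
    # Shares are undefined when the total is zero; report zeros in that case.
    ratios = [t / total for t in terms] if total else [0.0] * len(terms)
    return terms, ratios


V = [0.5, 1.0, 2.0]
A = [[0.5, -0.5], [1.0, 0.0], [0.0, 1.0]]
e = [[1.0, 2.0], [0.5, 0.5], [1.0, 1.0]]
terms, ratios = time_step_contributions(V, A, e, group=0)
print(terms, ratios)  # [1.0, 1.0, 2.0] [0.25, 0.25, 0.5]
```

In this made-up example the current time step accounts for half of the Q value of group 0, and each historical step for a quarter.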
  • each sample data corresponds to a unique label group.
  • the expected reward value corresponding to each preset group will be obtained.
  • the first expected reward value is the expected reward value output for the marked group in the current patient state;
  • the real expected reward value is the largest expected reward value in the next patient state plus the actual reward, and serves as the actual expected reward value of the corresponding marked group.
  • the mean square error loss is calculated from the first expected reward value and the real expected reward value to determine whether the loss function has reached the convergence state. When the loss function reaches the convergence state, it can be determined that the patient grouping model meets the preset training standard.
  • if the loss function has not reached the convergence state, the sample data is used to train the patient grouping model repeatedly until it meets the preset training standard.
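The training criterion above can be sketched as follows. The target is the actual reward plus the largest expected reward value in the next patient state, as the text describes; the discount factor `gamma` (defaulting to 1.0 so that it matches the text), the convergence tolerance, and the window length are assumptions added for the example.

```python
def real_expected_reward(reward, next_q_values, gamma=1.0):
    """Actual reward plus the largest expected reward value in the next patient state."""
    return reward + gamma * max(next_q_values)


def mse_loss(predicted, targets):
    """Mean square error between first expected reward values and real expected reward values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, targets)) / len(predicted)


def has_converged(loss_history, tol=1e-4, window=3):
    """Hypothetical convergence test: the last `window` loss changes are all below `tol`."""
    if len(loss_history) < window + 1:
        return False
    recent = loss_history[-(window + 1):]
    return all(abs(recent[i + 1] - recent[i]) < tol for i in range(window))


target = real_expected_reward(1.0, [0.5, 2.0, 1.5])
loss = mse_loss([2.0, 3.0], [target, 3.0])
print(target, loss)  # 3.0 0.5
```

Training would continue on the sample data while `has_converged` returns `False`.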
  • if the target patient information is time series data, all target patient information at the current time and the historical times needs to be input into the patient grouping model to obtain the grouping result;
  • if the target patient information is not time series data, only the target patient information at the current time needs to be input into the patient grouping model, and the parameter values corresponding to the historical time points in the patient grouping model are set to 0 to obtain the grouping result.
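One possible reading of setting the historical values to 0 is to zero-fill the missing historical time steps of the input sequence, with the current state placed last. The sketch below assumes that reading; the function name and the fixed sequence length are hypothetical.

```python
def pad_history(states, n_steps):
    """Left-pad a list of state vectors with zero vectors to exactly n_steps entries.

    The most recent state stays last; extra history beyond n_steps is dropped.
    """
    if not states:
        raise ValueError("at least the current patient state is required")
    states = states[-n_steps:]
    dim = len(states[0])
    padding = [[0.0] * dim for _ in range(n_steps - len(states))]
    return padding + [list(s) for s in states]


# Only the current state is available; two historical steps are zero-filled.
padded = pad_history([[1.0, 2.0]], n_steps=3)
print(padded)  # [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0]]
```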
  • step 206 of the embodiment may specifically include: extracting historical patient follow-up data and current patient follow-up data of the target patient within a preset time period; inputting the historical patient follow-up data and the current patient follow-up data into the patient grouping model that meets the preset training standard to obtain the expected reward value corresponding to each preset group; and determining the preset group with the largest expected reward value as the target group corresponding to the target patient.
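The final selection in this step is an argmax over the expected reward values. A minimal sketch, assuming the model output is a list with one expected reward value per preset group:

```python
def select_target_group(expected_rewards):
    """Return the index of the preset group with the largest expected reward value."""
    return max(range(len(expected_rewards)), key=lambda k: expected_rewards[k])


target_group = select_target_group([0.1, 0.7, 0.3])
print(target_group)  # 1
```

If two groups tie, this sketch returns the one with the lower index; the application does not say how ties are handled.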
  • step 207 of the embodiment may specifically include: screening, in the target group and based on the target patient data, first patients whose population characteristic similarity to the target patient is greater than a first preset threshold, where the population characteristics include at least condition information and personal information; extracting the treatment plan corresponding to each first patient and the score value of the treatment plan with respect to the treatment effect, and determining the treatment plan whose score value is greater than a second preset threshold as the first treatment plan; or obtaining a preset treatment plan created according to the characteristics of the target group, and determining the preset treatment plan as the first treatment plan.
  • the target group contains the data information of multiple sample patients.
  • the data information may also include score information about the treatment effect and treatment plan information such as the medication combination, medication cycle, and dosage. The first preset threshold and the second preset threshold are both values greater than 0 and less than or equal to 1, and their specific values can be set according to the application scenario. The closer the first preset threshold is to 1, the higher the feature similarity between the selected first patients and the target patient; the closer the second preset threshold is to 1, the better the patient-reported treatment effect of the selected first treatment plan.
  • the target patient’s personal identity information, inspection index information, diagnosis result information and other multi-dimensional feature information can be extracted from the target patient’s information in advance.
  • the first patients whose matching degree with the feature information of the target patient is greater than the first preset threshold are selected from the group, the treatment plans whose treatment effect score for those first patients is greater than the second preset threshold are then extracted, and those treatment plans are determined as the first treatment plan.
  • for example, the first patients whose feature similarity to the target patient is greater than the first preset threshold are selected in the target group according to the target patient data, and include four first patients A, B, C, and D, among which:
  • the medication combination corresponding to the first patient A is a+c+d;
  • the medication combination corresponding to the first patient B is a+c+e;
  • the medication combination corresponding to the first patient C is a+b+c;
  • the medication combination corresponding to the first patient D is a+c+d.
  • the score values of the three plans regarding the treatment effect can then be obtained, for example:
  • the score value corresponding to the treatment plan a+c+d is 0.75;
  • the score value corresponding to the treatment plan a+b+e is 0.91;
  • the score value corresponding to the treatment plan a+b+c is 0.88.
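The two-threshold screening can be sketched as below. The Jaccard similarity over feature sets, the feature names, and the threshold values 0.6 and 0.8 are all assumptions made for the example; the application does not specify how similarity is computed or what thresholds are used.

```python
def jaccard(a, b):
    """Hypothetical similarity measure between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_first_plans(target_features, group_patients, sim_threshold, score_threshold):
    """Collect plans of sufficiently similar patients whose effect score passes the threshold."""
    plans = set()
    for patient in group_patients:
        similar = jaccard(target_features, patient["features"]) > sim_threshold
        effective = patient["score"] > score_threshold
        if similar and effective:
            plans.add(patient["plan"])
    return plans


target = {"diabetes", "age_50s", "female"}
group = [
    {"features": {"diabetes", "age_50s", "female"}, "plan": "a+c+d", "score": 0.75},
    {"features": {"diabetes", "age_50s", "male"}, "plan": "a+b+e", "score": 0.91},
    {"features": {"diabetes", "age_50s", "female", "hypertension"}, "plan": "a+b+c", "score": 0.88},
]
first_plans = select_first_plans(target, group, sim_threshold=0.6, score_threshold=0.8)
print(first_plans)  # {'a+b+c'}
```

Here the first patient is similar enough but the plan's score is too low, the second has a high score but is not similar enough, and only the third passes both thresholds.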
  • the preset treatment plan corresponding to each target group may be determined in advance according to the characteristics of the population in the target group and the diagnosis result of the physician. For example, if the commonly used treatment options for the target group include A and B, treatment options A and B can be directly determined as the preset treatment options corresponding to the target group.
  • treatment plans A and B can be determined as the first treatment plan corresponding to the target patient.
  • step 208 of the embodiment may specifically include: determining, according to the drug contraindication data, the first contraindicated drug that is unsuitable for the population type to which the target patient belongs; determining, based on the drug allergy history in the target patient's data, the second contraindicated drug to which the target patient has an allergic reaction; and determining each first treatment plan containing the first contraindicated drug and/or the second contraindicated drug as the second treatment plan.
  • for example, when the target patient is a pregnant woman, the first contraindicated drugs of the target patient may be the drugs forbidden for pregnant women; when the target patient belongs to a penicillin-allergic population, penicillin drugs can be determined as the second contraindicated drugs of the target patient.
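The determination of the second treatment plan can be sketched as a lookup followed by a set intersection. The contraindication table, the population type label, and the encoding of a plan as drug names joined by `+` are hypothetical choices for the example.

```python
def drugs_in(plan):
    """Split a plan such as 'a+c+d' into its set of drug names."""
    return set(plan.split("+"))


def select_second_plans(first_plans, population_types, allergy_history, contraindication_table):
    """Return the first treatment plans that contain any contraindicated drug."""
    # First contraindicated drugs: unsuitable for the patient's population type(s).
    first_contra = set()
    for ptype in population_types:
        first_contra |= contraindication_table.get(ptype, set())
    # Second contraindicated drugs: taken from the patient's drug allergy history.
    second_contra = set(allergy_history)
    banned = first_contra | second_contra
    return {plan for plan in first_plans if drugs_in(plan) & banned}


table = {"pregnant": {"d"}}  # hypothetical drug contraindication data
second_plans = select_second_plans(
    {"a+c+d", "a+b+c", "a+b+e"}, ["pregnant"], ["e"], table
)
print(sorted(second_plans))  # ['a+b+e', 'a+c+d']
```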
  • step 209 of the embodiment may specifically include: excluding the second treatment plan from the first treatment plan to obtain the target treatment plan.
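This exclusion amounts to a set difference that preserves the order of the first treatment plans. A minimal sketch with made-up plan names:

```python
def exclude_second_plans(first_plans, second_plans):
    """Remove every second treatment plan from the first treatment plans, keeping order."""
    excluded = set(second_plans)
    return [plan for plan in first_plans if plan not in excluded]


target_plans = exclude_second_plans(["a+c+d", "a+b+e", "a+b+c"], {"a+c+d", "a+b+e"})
print(target_plans)  # ['a+b+c']
```

If every first treatment plan is excluded, the result is an empty list; the application does not describe how that case is handled.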
  • an interpretable network structure for the deep reinforcement learning DQN model is proposed to create a patient grouping model for processing time series data, and sample data is then used to train the patient grouping model until it meets the preset training standard. The target patient data in the preset time period is then input into the patient grouping model that meets the preset training standard to obtain the target grouping result, and the first treatment plan of the target patient is determined using the characteristics of the population in the target group. To enhance the safety of diagnosis, the target patient's contraindicated drugs can also be determined based on the target patient's data, so that the second treatment plan containing the contraindicated drugs can be screened from the first treatment plan; finally, the first treatment plan and the second treatment plan are analyzed to obtain the target treatment plan suitable for the target patient.
  • in this way, the patient's treatment plan can be processed digitally, and the calculation of the expected reward value Q is extended to a time series structure that can take more information into account; by integrating artificial intelligence and deep learning algorithms, the analysis result becomes more accurate.
  • the Attention mechanism is added in the process of calculating the expected reward value, which can achieve a certain degree of interpretability.
  • an embodiment of the present application provides a device for determining a patient's treatment plan.
  • the device includes: a creation module 31, a training module 32, an input module 33, a determination module 34, an extraction module 35, and an analysis module 36.
  • the creation module 31 can be used to create a patient grouping model for processing time series data based on deep reinforcement learning DQN;
  • the training module 32 can be used to train the patient clustering model by using the sample data with marked clustering results, so that the patient clustering model meets the preset training standard;
  • the input module 33 can be used to input target patient data within a preset time period into a patient grouping model that meets the preset training standard, and obtain the target grouping result;
  • the determining module 34 can be used to determine the first treatment plan of the target patient based on the characteristics of the population in the target group;
  • the extraction module 35 can be used to extract the contraindicated drugs of the target patient based on the target patient's data, and screen out the second treatment plan containing the contraindicated drugs from the first treatment plan;
  • the analysis module 36 can be used to analyze and obtain the target treatment plan of the target patient according to the first treatment plan and the second treatment plan.
  • the creation module 31 may specifically include: a splitting unit 311 and a construction unit 312;
  • the splitting unit 311 can be used to split the last fully connected layer in the network structure of the deep reinforcement learning DQN into a first fully connected layer, a second recurrent neural network layer, and a third recurrent neural network layer;
  • the construction unit 312 can be used to construct the patient grouping model from the deep reinforcement learning DQN with the changed network structure, so that when patient data containing multiple time points is input to the patient grouping model, the first fully connected layer outputs the embedded value corresponding to the patient state at each time point, the second recurrent neural network layer outputs the first degree of attention corresponding to the patient state at each time point, the third recurrent neural network layer outputs the second degree of attention corresponding to the grouping result at each time point, and the expected reward value of the patient data for each preset group is calculated from the embedded value, the first degree of attention, and the second degree of attention.
  • the training module 32 may specifically include: a first input unit 321, a first extraction unit 322, a calculation unit 323, and a training unit 324;
  • the first input unit 321 can be used to input the sample data at the current time point and the historical time points into the patient grouping model to obtain, for each sample data, the expected reward value corresponding to each of a preset number of groups; the expected reward value is obtained by calculating, at each time point, the first sum of the first degree of attention and the second degree of attention and the product of the first sum and the embedded value, and then accumulating the products over the current time point and the historical time points;
  • the first extraction unit 322 may be used to extract the label group corresponding to the sample data, and determine the first expected reward value corresponding to the output of the label group as the training output result of the patient grouping model;
  • the calculation unit 323 can be used to calculate the mean square error loss between the first expected reward value and the real expected reward value. If it is determined that the loss function reaches the convergence state based on the mean square error loss, it is determined that the patient grouping model meets the preset training standard;
  • the training unit 324 can be used to repeatedly train the patient clustering model by using the sample data if it is determined that the loss function has not reached the convergence state, so that the patient clustering model meets the preset training standard.
  • the input module 33 may specifically include: a second extraction unit 331, a second input unit 332, and a first determination unit 333;
  • the second extraction unit 331 can be used to extract historical patient follow-up data and current patient follow-up data of the target patient within a preset time period;
  • the second input unit 332 can be used to input historical patient follow-up data and current patient follow-up data into a patient grouping model that meets the preset training standards to obtain the expected reward value corresponding to each preset group;
  • the first determining unit 333 may be used to determine the preset group with the largest expected reward value as the target group corresponding to the target patient.
  • the determining module 34 may specifically include: a screening unit 341 and a second determining unit 342;
  • the screening unit 341 can be used to screen the first patients whose population characteristics similarity to the target patient is greater than a first preset threshold in the target group based on the target patient data, and the population characteristics include at least medical condition information and personal information;
  • the second determining unit 342 may be used to extract the treatment plan corresponding to the first patient and the score value of the treatment plan with respect to the treatment effect, and determine the treatment plan with the score value greater than the second preset threshold as the first treatment plan; or
  • the second determining unit 342 may also be used to obtain a preset treatment plan created according to the characteristics of the target group, and determine the preset treatment plan as the first treatment plan.
  • the extraction module 35 may specifically include: a third determination unit 351;
  • the third determining unit 351 can be used to determine, according to the drug contraindication data, the first contraindicated drug that is unsuitable for the population type to which the target patient belongs;
  • the third determining unit 351 can also be used to determine the second contraindicated drug for which the target patient has an allergic reaction based on the drug allergy history in the target patient's data;
  • the third determining unit 351 may also be used to determine the first treatment plan including the first contraindication drug and/or the second contraindication drug as the second treatment plan.
  • the analysis module 36 may specifically include: a rejection unit 361;
  • the rejection unit 361 can be used to remove the second treatment plan from the first treatment plan to obtain the target treatment plan.
  • an embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may include non-volatile and/or volatile memory.
  • a computer program is stored thereon, and when the program is executed by the processor, the method for determining the patient's treatment plan as shown in FIG. 1 and FIG. 2 is realized.
  • the technical solution of the present application can be embodied in the form of a software product.
  • the software product can be stored in a non-volatile storage medium (which can be a CD-ROM, U disk, mobile hard disk, etc.) and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods in each implementation scenario of the present application.
  • an embodiment of the present application also provides a computer device, which may be a personal computer, a server, a network device, etc. The physical device includes a storage medium and a processor; the storage medium is used to store a computer program and may include non-volatile and/or volatile memory; the processor is used to execute the computer program to implement the method for determining the patient's treatment plan shown in FIG. 1 and FIG. 2.
  • the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, a sensor, an audio circuit, a Wi-Fi module, and so on.
  • the user interface may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a USB interface, a card reader interface, and the like.
  • the network interface may optionally include a standard wired interface, a wireless interface (such as a Bluetooth interface or a Wi-Fi interface), and so on.
  • the computer device structure provided in this embodiment does not constitute a limitation on the physical device, which may include more or fewer components, combine certain components, or use a different arrangement of components.
  • the non-volatile readable storage medium may also include an operating system and a network communication module.
  • the operating system is a program that manages the hardware and software resources of the physical device and supports the operation of information processing programs and other software and/or programs.
  • the network communication module is used to implement communication between various components in the non-volatile readable storage medium, and communication with other hardware and software in the physical device.
  • the target patient data within the preset time period are input into the patient grouping model that meets the preset training standard to obtain the target grouping result, and the first treatment plan of the target patient can then be determined using the population characteristics of the target group;
  • the target patient's contraindicated drugs can also be determined based on the target patient's data, so that the second treatment plan containing the contraindicated drugs can be screened out from the first treatment plan; finally, the first treatment plan and the second treatment plan can be analyzed to obtain the target treatment plan suitable for the target patient.
  • the digital processing of the patient's treatment plan can be realized, and the calculation of the expected reward value Q can be extended to a time series structure that takes more information into account; by integrating artificial intelligence and deep learning algorithms, the analysis result can be made more accurate.
  • the Attention mechanism is added in the process of calculating the expected reward value, which can achieve a certain degree of interpretability.

Abstract

Provided are a method, apparatus, and computer device for determining a patient treatment plan, which can solve the problem of insufficiently accurate results when generating a patient treatment plan online. The method comprises: on the basis of the deep reinforcement learning DQN, creating a patient grouping model used for processing time series data (101); using sample data labeled with grouping results to train the patient grouping model so that the patient grouping model satisfies a preset training standard (102); inputting target patient data within a preset time period into the patient grouping model that satisfies the preset training standard to obtain a target group to which the target patient belongs (103); determining a first treatment plan of the target patient on the basis of the features of the population in the target group (104); extracting contraindicated drugs of the target patient according to the target patient data and, from the first treatment plan, screening out a second treatment plan containing the contraindicated drugs (105); and, according to the first treatment plan and the second treatment plan, analyzing and obtaining a target treatment plan for the target patient (106).

Description

Method, apparatus, computer device, and medium for determining a patient treatment plan
This application claims priority to Chinese patent application No. CN202010602269.2, entitled "Method, apparatus, and computer device for determining a patient treatment plan", filed with the Chinese Patent Office on June 29, 2020, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of digital healthcare, and in particular to a method, apparatus, computer device, and medium for determining a patient treatment plan.
Background
Deep reinforcement learning is a machine learning method that learns a mapping from environment states to actions. It selects the optimal policy according to the maximum feedback value, searches the policy for the optimal action, obtains a delayed feedback value from the resulting change of state, evaluates the function, and iterates until the learning condition is met, at which point learning terminates.
With the development of science and technology, deep reinforcement learning has gradually been applied to various fields, and existing work has already used it for patient diagnosis. The inventors realized that methods using deep reinforcement learning for patient diagnosis often have the following shortcomings: 1. In a patient diagnosis scenario, it matters which features a diagnostic decision focuses on and how much each feature contributes to the outcome, yet current models are difficult to interpret, so this information cannot be made transparent. 2. Current models often take only a patient's single follow-up record as input, but a single follow-up can hardly represent the patient's long-term follow-up status in full, so the analysis results are not accurate enough.
Summary of the invention
According to one aspect of the present application, a method for determining a patient treatment plan is provided, the method including:
creating, based on the deep reinforcement learning DQN, a patient grouping model for processing time series data;
training the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard;
inputting target patient data within a preset time period into the patient grouping model that meets the preset training standard, to obtain the target group to which the target patient belongs;
determining a first treatment plan of the target patient based on the population characteristics of the target group;
extracting the contraindicated drugs of the target patient according to the target patient data, and screening out, from the first treatment plan, a second treatment plan containing the contraindicated drugs;
obtaining a target treatment plan of the target patient by analysis according to the first treatment plan and the second treatment plan.
According to another aspect of the present application, an apparatus for determining a patient treatment plan is provided, the apparatus including:
a creation module, configured to create, based on the deep reinforcement learning DQN, a patient grouping model for processing time series data;
a training module, configured to train the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard;
an input module, configured to input target patient data within a preset time period into the patient grouping model that meets the preset training standard, to obtain the target group to which the target patient belongs;
a determining module, configured to determine a first treatment plan of the target patient based on the population characteristics of the target group;
an extraction module, configured to extract the contraindicated drugs of the target patient according to the target patient data, and to screen out, from the first treatment plan, a second treatment plan containing the contraindicated drugs;
an analysis module, configured to obtain a target treatment plan of the target patient by analysis according to the first treatment plan and the second treatment plan.
According to another aspect of the present application, a computer-readable storage medium is provided, on which a computer program is stored, and the program, when executed by a processor, implements the following steps:
creating, based on the deep reinforcement learning DQN, a patient grouping model for processing time series data;
training the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard;
inputting target patient data within a preset time period into the patient grouping model that meets the preset training standard, to obtain the target group to which the target patient belongs;
determining a first treatment plan of the target patient based on the population characteristics of the target group;
extracting the contraindicated drugs of the target patient according to the target patient data, and screening out, from the first treatment plan, a second treatment plan containing the contraindicated drugs;
obtaining a target treatment plan of the target patient by analysis according to the first treatment plan and the second treatment plan.
According to yet another aspect of the present application, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, the processor implementing the following steps when executing the program:
creating, based on the deep reinforcement learning DQN, a patient grouping model for processing time series data;
training the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard;
inputting target patient data within a preset time period into the patient grouping model that meets the preset training standard, to obtain the target group to which the target patient belongs;
determining a first treatment plan of the target patient based on the population characteristics of the target group;
extracting the contraindicated drugs of the target patient according to the target patient data, and screening out, from the first treatment plan, a second treatment plan containing the contraindicated drugs;
obtaining a target treatment plan of the target patient by analysis according to the first treatment plan and the second treatment plan.
Description of the drawings
The drawings described here are used to provide a further understanding of the present application and constitute a part of it. The exemplary embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of it. In the drawings:
FIG. 1 shows a schematic flowchart of a method for determining a patient treatment plan provided by an embodiment of the present application;
FIG. 2 shows a schematic flowchart of another method for determining a patient treatment plan provided by an embodiment of the present application;
FIG. 3 shows a network structure diagram of a patient grouping model provided by an embodiment of the present application;
FIG. 4 shows a schematic structural diagram of an apparatus for determining a patient treatment plan provided by an embodiment of the present application;
FIG. 5 shows a schematic structural diagram of another apparatus for determining a patient treatment plan provided by an embodiment of the present application.
Detailed description
The present application is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments can be combined with each other.
To address the weak interpretability of feature contributions and the insufficient accuracy of analysis results when deep reinforcement learning is applied to patient diagnosis, an embodiment of the present application provides a method for determining a patient treatment plan. As shown in FIG. 1, the method includes:
101. Create a patient grouping model for processing time series data based on the deep reinforcement learning DQN.
This embodiment aims to improve the traditional deep reinforcement learning DQN model by extending it into a time series model and adding an Attention mechanism. The improved DQN model is then used to group patients, so that it can process time series data and make the patient features interpretable.
102. Train the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard.
In a specific application scenario, grouping decision rules can be set in advance and used to determine the group to which each piece of sample data belongs. The grouping result is then attached to the corresponding sample data as a label and serves as a verification reference: the output of the patient grouping model for the sample data is checked against it to judge the training state of the model. If the error between the model output and the labeled result is small, the patient grouping model can be judged to meet the preset training standard.
103. Input the target patient data within the preset time period into the patient grouping model that meets the preset training standard, to obtain the target group to which the target patient belongs.
The preset time period can be set according to actual application requirements. For example, it can be set to the month preceding and including the current moment, in which case the corresponding historical target patient data are the one or more follow-up records of the target patient recorded within that period.
For this embodiment, in a specific application scenario, a single follow-up record used as input can hardly represent the patient's long-term follow-up status in full, which easily leads to inaccurate analysis results. Therefore, in addition to the patient follow-up data at the current moment, all historical patient follow-up data within the preset time period can also be used as input, and the outputs for the individual follow-up records are integrated to determine a relatively more accurate final target grouping result. In addition, the Attention mechanism can be used to explain the contribution, attention coefficient, contribution ratio, and so on of each feature at each time point to the grouping result.
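The selection of follow-up records within the preset time period can be sketched as follows. This is a minimal illustration, not the application's implementation: the record layout (a dictionary with a `time` field), the function name `select_followups`, and the 30-day window are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

def select_followups(records, now, window_days=30):
    """Return records with now - window_days <= time <= now, oldest first.

    The dict layout and the 30-day default are illustrative assumptions.
    """
    start = now - timedelta(days=window_days)
    in_window = [r for r in records if start <= r["time"] <= now]
    # Sort chronologically so the result can be fed to a time series model.
    return sorted(in_window, key=lambda r: r["time"])

# Hypothetical follow-up records of one target patient.
records = [
    {"time": datetime(2020, 5, 1), "glucose": 7.9},
    {"time": datetime(2020, 6, 28), "glucose": 6.8},
    {"time": datetime(2020, 6, 10), "glucose": 7.2},
]
window = select_followups(records, now=datetime(2020, 6, 29))
```

The resulting chronological list plays the role of the multi-time-point input described above, with the last element as the current follow-up.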
104. Determine a first treatment plan of the target patient based on the population characteristics of the target group.
In a specific application scenario, after the target patient data have been assigned to a group, patients whose population characteristics are highly similar to those of the target patient can be identified from the population information of that group, and the first treatment plan available for the target patient to choose from can be screened out based on the treatment plans already generated for those patients.
105. Extract the contraindicated drugs of the target patient according to the target patient data, and screen out, from the first treatment plan, a second treatment plan containing the contraindicated drugs.
For this embodiment, in a specific application scenario, different patients may have different contraindicated drugs. The contraindicated drugs of the target patient should therefore be extracted first, and the second treatment plan containing those drugs screened out from the first treatment plan, so that the second treatment plan is not considered when the treatment plan recommendation is finally generated.
106. Obtain the target treatment plan of the target patient by analysis according to the first treatment plan and the second treatment plan.
For this embodiment, in a specific application scenario, after the first treatment plan and the second treatment plan have been determined, the second treatment plan is removed from the first treatment plan, and what remains of the first treatment plan is determined as the target treatment plan of the target patient. Because drug contraindications are taken into account, this embodiment can ensure the safety of the patient's treatment.
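Steps 105 and 106 amount to a set-based filter, which can be sketched as below. The plan layout (a dictionary with `name` and `drugs`), the function name `split_plans`, and the drug names are hypothetical and serve only to illustrate the filtering.

```python
def split_plans(first_plans, contraindicated):
    """Split candidate plans into (second_plans, target_plans).

    second_plans contain at least one contraindicated drug;
    target_plans are the first plans that remain after removing them.
    """
    banned = set(contraindicated)
    second_plans = [p for p in first_plans if banned & set(p["drugs"])]
    target_plans = [p for p in first_plans if not (banned & set(p["drugs"]))]
    return second_plans, target_plans

# Hypothetical candidate plans and contraindicated drugs.
first_plans = [
    {"name": "plan-1", "drugs": ["drug_a", "drug_b"]},
    {"name": "plan-2", "drugs": ["drug_c"]},
    {"name": "plan-3", "drugs": ["drug_b", "drug_d"]},
]
second, target = split_plans(first_plans, contraindicated=["drug_b"])
```

Here the contraindicated list would be the union of the drugs unsuitable for the patient's population type and the drugs in the patient's allergy history.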
With the method for determining a patient treatment plan in this embodiment, an improved network structure of the deep reinforcement learning model DQN is proposed in order to create a patient grouping model for processing time series data, and the model is then trained with sample data until it meets the preset training standard. The target patient data within the preset time period are then input into the patient grouping model that meets the preset training standard to obtain the target grouping result, and the population characteristics of the target group are used to determine the first treatment plan of the target patient. To further enhance diagnostic safety, the contraindicated drugs of the target patient can be determined from the target patient data, so that the second treatment plan containing the contraindicated drugs can be screened out from the first treatment plan. Finally, the first treatment plan and the second treatment plan are analyzed to obtain the target treatment plan suitable for the target patient. In addition, the present application digitizes the processing of patient treatment plans and extends the calculation of the expected reward value Q into a time series structure that can take more information into account; by incorporating artificial intelligence and deep learning algorithms, the analysis results can be made more accurate.
Further, as a refinement and extension of the specific implementation of the above embodiment, and in order to fully explain the implementation process of this embodiment, another method for determining a patient treatment plan is provided. As shown in FIG. 2, the method includes:
201. Create a patient grouping model for processing time series data based on the deep reinforcement learning DQN.
For this embodiment, in a specific application scenario, step 201 may specifically include: splitting the last fully connected layer in the network structure of the deep reinforcement learning DQN into a first fully connected layer, a second recurrent neural network layer, and a third recurrent neural network layer; and constructing the patient grouping model from the DQN with the modified network structure, so that when patient data containing multiple time points are input into the patient grouping model, the first fully connected layer outputs the embedded value of the patient state at each time point, the second recurrent neural network layer outputs the first degree of attention for the patient state at each time point, the third recurrent neural network layer outputs the second degree of attention for the grouping result at each time point, and the expected reward value of the patient data for each preset group is calculated from the embedded value, the first degree of attention, and the second degree of attention.
For example, in the network structure diagram of the patient grouping model shown in FIG. 3, the abstract features extracted by the convolutional layer are split into three branches; that is, the last fully connected layer in the network structure of the deep reinforcement learning DQN is split into the first fully connected layer 1, the second recurrent neural network layer 2, and the third recurrent neural network layer 3. The first fully connected layer 1 outputs the embedded value of the patient state at each time point. The second recurrent neural network layer 2 is the state value function (value function) and outputs the first degree of attention for the patient state at each time point. The third recurrent neural network layer 3 is the action advantage function (advantage function) and outputs the second degree of attention for the grouping result at each time point.
In a specific application scenario, so that the training state of the patient grouping model can be monitored while it is trained with sample data, the sample data must first be labeled with the group to which they belong. This specifically includes: grouping the sample data according to preset grouping decision rules to obtain the grouping result for each piece of sample data; and labeling the sample data based on the grouping results.
The preset grouping decision rules can be set according to actual needs. For example, the rules can divide patients according to their personal characteristic information combined with their examination indicator information. When dividing groups, patients whose personal characteristic information is highly similar and who have the same examination indicators and the same examination results can be placed in one group.
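One possible grouping decision rule of this kind can be sketched as follows. The source does not define how similarity of personal characteristics is measured, so this sketch assumes, purely for illustration, that two patients are similar when they share the same sex and the same 10-year age band; the field names and the helper names `group_key` and `label_samples` are likewise hypothetical.

```python
def group_key(patient, age_band=10):
    """Build a hashable grouping key for one patient.

    Assumption: 'similar personal characteristics' means same sex and same
    age band; the examination indicators and results must match exactly.
    """
    checks = tuple(sorted(patient["checks"].items()))
    return (patient["sex"], patient["age"] // age_band, checks)

def label_samples(patients):
    """Assign an integer group label to each patient, in order of first appearance."""
    key_to_label = {}
    labels = []
    for p in patients:
        label = key_to_label.setdefault(group_key(p), len(key_to_label))
        labels.append(label)
    return labels

# Hypothetical sample data.
patients = [
    {"sex": "F", "age": 52, "checks": {"glucose": "high"}},
    {"sex": "F", "age": 58, "checks": {"glucose": "high"}},
    {"sex": "M", "age": 52, "checks": {"glucose": "high"}},
]
labels = label_samples(patients)
```

The labels produced this way would serve as the verification reference against which the model output is compared during training.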
202. Input the sample data at the current time point and the historical time points into the patient grouping model to obtain a preset number of groups and the expected reward value of each piece of sample data for each group.
The sample data are time series data containing the current time point and a preset number of historical time points, and may include patient data at the current moment and at historical moments. The patient data may be personal identity information (such as name, sex, and age), treatment plan information (drug combination, medication cycle, dosage, etc.), examination indicator information (such as blood glucose, blood pressure, and electrocardiogram indicators and the corresponding examination results), and so on. The expected reward value is obtained by computing, for each time point, the first sum of the first degree of attention and the second degree of attention and the product of that first sum with the embedded value, and then accumulating the products over the current time point and the historical time points.
For example, in the network structure diagram of the patient grouping model shown in FIG. 3, the current patient state ($s_3$) corresponding to the sample data is input into the patient grouping model together with the patient states at two historical time points ($s_1$, $s_2$). After the fully connected layer and the two recurrent neural network layers of the model, one obtains $e$ ($e_1$, $e_2$, $e_3$) for each time point from the first fully connected layer, $V$ ($V_1$, $V_2$, $V_3$) for each time point from the second recurrent neural network layer, and $A$ ($A_1$, $A_2$, $A_3$) for each time point from the third recurrent neural network layer. $V$ and $A$ of the same time step are then added, the sum is multiplied element-wise by $e$, and the products are accumulated to compute the Q value of the current state ($Q_3$). Here $V$ denotes the degree of attention to the patient state at each time point, $A$ denotes the degree of attention to the grouping result at each time point, and $e$ denotes the embedded representation of the patient state. The calculation formulas for each layer are:
$$h_{V1}, h_{V2}, h_{V3} = \mathrm{LSTM\text{-}V}(s_1, s_2, s_3)$$
[Formula image PCTCN2020118873-appb-000001, not reproduced in this text; by its position it gives $V_1, V_2, V_3$ in terms of $h_{V1}, h_{V2}, h_{V3}$ and $w_V$.]
$$h_{A1}, h_{A2}, h_{A3} = \mathrm{LSTM\text{-}A}(s_1, s_2, s_3)$$
$$A_1, A_2, A_3 = (W_A h_{A1},\; W_A h_{A2},\; W_A h_{A3})$$
$$v_1, v_2, v_3 = (W_I s_1,\; W_I s_2,\; W_I s_3)$$
$$e_1, e_2, e_3 = (W_{II} v_1,\; W_{II} v_2,\; W_{II} v_3)$$
$$Q_3 = e_1 \odot (V_1 + A_1) + e_2 \odot (V_2 + A_2) + e_3 \odot (V_3 + A_3)$$
where $s_i$, $h_{Vi}$, $w_V$, $h_{Ai}$, $A_i$, $v_i$, $e_i$, and $Q_3$ are vectors, $V_i$ is a scalar, $W_A$, $W_I$, and $W_{II}$ are matrices, and $\odot$ denotes element-wise multiplication.
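The final accumulation step of these formulas can be sketched in plain Python as follows. The sketch takes the attention values $V_i$ and $A_i$ as given inputs rather than computing them with LSTM layers, and all numeric values, matrix sizes, and the helper names `matvec` and `q_value` are illustrative assumptions.

```python
def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def q_value(states, V, A, W_I, W_II):
    """Q = sum_i e_i * (V_i + A_i) element-wise, with e_i = W_II (W_I s_i).

    states: one feature vector s_i per time point
    V:      one scalar V_i per time point (assumed given, e.g. from LSTM-V)
    A:      one vector A_i per time point, one entry per group (assumed given)
    """
    n_groups = len(W_II)
    Q = [0.0] * n_groups
    for s_i, V_i, A_i in zip(states, V, A):
        e_i = matvec(W_II, matvec(W_I, s_i))   # embedding of the patient state
        for k in range(n_groups):
            Q[k] += e_i[k] * (V_i + A_i[k])    # attention-weighted embedding
    return Q

# Toy example: 3 time points, 2 features, 2 groups.
W_I = [[1, 0], [0, 1]]
W_II = [[1, 2], [3, 4]]
states = [[1, 0], [0, 1], [1, 1]]
V = [0.5, 1.0, 0.0]
A = [[0.5, 0.0], [0.0, 1.0], [1.0, 2.0]]
Q3 = q_value(states, V, A, W_I, W_II)
```

In this toy setting the group with the largest entry of `Q3` would be taken as the grouping result for the current state.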
It should be noted that the present application also incorporates an Attention mechanism, which makes the patient features interpretable. The model decision can be interpreted as follows: from all input $s_i$, the contribution of each patient feature at each time point to the final Q value can be derived by forward computation.
According to the calculation formula of the expected reward value ($Q_3$):
$$Q_3 = e_1 \odot (V_1 + A_1) + e_2 \odot (V_2 + A_2) + e_3 \odot (V_3 + A_3) = W_{II} W_I s_1 \odot (V_1 + A_1) + W_{II} W_I s_2 \odot (V_2 + A_2) + W_{II} W_I s_3 \odot (V_3 + A_3)$$
It can be seen that the importance of the $j$-th feature at the $i$-th time point to the $k$-th Q value is:
$$w(i,j,k) = (V_i + A_i[k]) \cdot (W_{II}[k] \cdot W_I[j]) \cdot s_i[j]$$
where $(V_i + A_i[k]) \cdot (W_{II}[k] \cdot W_I[j])$ is the contribution coefficient, that is, the degree of attention.
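The contribution formula can be checked numerically with a short sketch. It reads $W_{II}[k]$ as row $k$ of $W_{II}$ and $W_I[j]$ as column $j$ of $W_I$, which is an interpretation of the notation rather than something the source states explicitly; the toy values and the function name `contribution` are assumptions.

```python
def contribution(i, j, k, states, V, A, W_I, W_II):
    """w(i, j, k) = (V_i + A_i[k]) * (W_II[k] . W_I[:, j]) * s_i[j]."""
    # Dot product of row k of W_II with column j of W_I.
    path = sum(W_II[k][h] * W_I[h][j] for h in range(len(W_I)))
    return (V[i] + A[i][k]) * path * states[i][j]

# Toy example: 3 time points, 2 features, 2 groups.
W_I = [[1, 0], [0, 1]]
W_II = [[1, 2], [3, 4]]
states = [[1, 0], [0, 1], [1, 1]]
V = [0.5, 1.0, 0.0]
A = [[0.5, 0.0], [0.0, 1.0], [1.0, 2.0]]

# Summing the contributions over all time points and features
# recovers the k-th entry of the Q value.
totals = [
    sum(contribution(i, j, k, states, V, A, W_I, W_II)
        for i in range(len(states)) for j in range(len(states[0])))
    for k in range(len(W_II))
]
```

Because the contributions add up to the Q value itself, each $w(i,j,k)$ can be reported as the share of a given feature at a given time point in the score of group $k$.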
203. Extract the labeled group corresponding to the sample data, and determine the first expected reward value output for the labeled group as the training output result of the patient grouping model.
For this embodiment, in a specific application scenario, each piece of sample data corresponds to exactly one labeled group. When the sample data are input into the patient grouping model, the expected reward value for each preset group is obtained. To verify the training progress of the patient grouping model, only the first expected reward value output for the labeled group needs to be extracted and determined as the training output result of the patient grouping model.
204. Calculate the mean squared error loss between the first expected reward value and the true expected reward value; if the loss function is determined from the mean squared error loss to have converged, determine that the patient grouping model meets the preset training standard.
Here, the first expected reward value is the expected reward value output for the labeled group in the current patient state. The true expected reward value is obtained by adding the reward actually received to the largest expected reward value in the next patient state; the result is the true expected reward value for the labeled group.
For this embodiment, in a specific application scenario, after the first expected reward value is extracted, the mean squared error loss is calculated from the first expected reward value and the true expected reward value, and it is then determined whether the loss function has converged. Once the loss function has converged, the patient grouping model can be determined to meet the preset training standard.
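The loss calculation in step 204 can be sketched as follows. This is an illustrative sketch only: the discount factor `gamma` (defaulting to 1.0, which reproduces the plain sum described in the text) and the convergence test based on a tolerance `tol` over a `window` of recent losses are assumptions, since the text does not specify how convergence is judged.

```python
def true_expected_reward(reward, next_q_values, gamma=1.0):
    """Actual reward plus the largest expected reward value in the next state.

    gamma is an assumed discount factor; gamma=1.0 matches the plain sum
    described in the text.
    """
    return reward + gamma * max(next_q_values)


def mse_loss(predicted, targets):
    """Mean squared error between predicted and true expected reward values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, targets)) / len(predicted)


def has_converged(loss_history, tol=1e-4, window=3):
    """Hypothetical criterion: the last `window` loss changes are all below tol."""
    if len(loss_history) < window + 1:
        return False
    recent = loss_history[-(window + 1):]
    return all(abs(a - b) < tol for a, b in zip(recent, recent[1:]))


# Example: one sample whose labeled group produced Q = 1.0 in the current state.
target = true_expected_reward(reward=1.0, next_q_values=[0.2, 0.5])
loss = mse_loss([1.0, 2.0], [target, 2.0])
```

If `has_converged` returns `False`, training would simply be repeated on the sample data, as described in step 205.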
205. If it is determined that the loss function has not converged, repeat the training of the patient grouping model with the sample data until the patient grouping model meets the preset training standard.
Correspondingly, if the loss function has not converged, the patient grouping model has not yet been trained successfully, and the above training steps should be repeated with the sample data until the patient grouping model meets the preset training standard.
206. Input the target patient data within a preset time period into the patient grouping model that meets the preset training standard, and obtain the target group to which the target patient belongs.
When the target patient information is time series data, all target patient information for the current time and the historical times is input into the patient grouping model to obtain the grouping result. When the target patient information is not time series data, only the target patient information for the current time is input into the patient grouping model, and the parameter values corresponding to the historical time points in the patient grouping model are set to 0, after which the grouping result can be obtained.
For this embodiment, in a specific application scenario, when the target patient information is time series data, step 206 may specifically include: extracting the historical follow-up data and the current follow-up data of the target patient within the preset time period; inputting the historical follow-up data and the current follow-up data into the patient grouping model that meets the preset training standard to obtain the expected reward value for each preset group; and determining the preset group with the largest expected reward value as the target group of the target patient.
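Step 206 can be sketched in two small helper functions. The sketch is only one possible reading: it interprets "setting the values for historical time points to 0" as left-padding the input sequence with zero vectors, and the function names, the `n_steps` parameter, and the group labels are hypothetical.

```python
def build_model_input(current, history=None, n_steps=3):
    """Assemble a fixed-length sequence of patient feature vectors.

    Missing historical time points are filled with zero vectors
    (an assumed interpretation of the non-time-series case).
    """
    keep = n_steps - 1
    history = list(history or [])[-keep:] if keep > 0 else []
    zero = [0.0] * len(current)
    padding = [list(zero) for _ in range(keep - len(history))]
    return padding + [list(h) for h in history] + [list(current)]


def select_target_group(expected_rewards):
    """Return the preset group with the largest expected reward value."""
    return max(expected_rewards, key=expected_rewards.get)


# Non-time-series patient: only the current visit is available.
padded = build_model_input([1.0, 2.0])
# Time-series patient: two historical visits plus the current one.
full = build_model_input([3.0], history=[[1.0], [2.0]])
# Expected reward values per preset group (made-up numbers).
group = select_target_group({"group_1": 0.2, "group_2": 0.9, "group_3": 0.5})
```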
207. Determine a first treatment plan for the target patient based on the population features within the target group.
For this embodiment, in a specific application scenario, in order to determine the first treatment plan of the target patient, step 207 may specifically include: screening, within the target group and according to the target patient data, first patients whose population-feature similarity to the target patient is greater than a first preset threshold, the population features including at least condition information and personal information; extracting the treatment plans of the first patients and the treatment-effect score of each plan, and determining the treatment plans whose score is greater than a second preset threshold as the first treatment plan; or obtaining a preset treatment plan created according to the population features of the target group, and determining the preset treatment plan as the first treatment plan.
The target group contains the data of multiple sample patients. Besides multi-dimensional feature information such as the sample patients' personal identity information, examination indicators, and diagnosis results, this data may also include treatment-effect scores and treatment plan information such as drug combination, medication cycle, and dosage. The first preset threshold and the second preset threshold are both values greater than 0 and less than or equal to 1, and their specific values can be set according to the application scenario. The closer the first preset threshold is to 1, the more similar the selected first patients are to the target patient; the closer the second preset threshold is to 1, the better the patient-reported treatment effect of the selected first treatment plan.
In a specific application scenario, after the target patient has been grouped, multi-dimensional feature information such as the target patient's personal identity information, examination indicators, and diagnosis results can first be extracted from the target patient information. First patients whose feature information matches that of the target patient to a degree greater than the first preset threshold are then selected from the target group, the treatment plans of these first patients whose treatment-effect score is greater than the second preset threshold are extracted, and those treatment plans are determined as the first treatment plan.
For example, suppose that screening the target group according to the target patient data yields four first patients, A, B, C, and D, whose population-feature similarity to the target patient is greater than the first preset threshold. The drug combination of first patient A is a+c+d, that of first patient B is a+c+e, that of first patient C is a+b+c, and that of first patient D is a+c+d. Counting these gives three distinct treatment plans: a+c+d, a+c+e, and a+b+c. The treatment-effect scores of the three plans are then obtained, for example 0.75 for a+c+d, 0.91 for a+c+e, and 0.88 for a+b+c. If the second preset threshold is set to 0.85, the selected first treatment plans are a+c+e and a+b+c.
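The screening in step 207 can be sketched with the numbers from the example. The similarity values, the first threshold of 0.8, the extra patient E and its plan b+d are invented for illustration; the text does not say how similarity is computed, so it is treated here as a precomputed number.

```python
def select_first_plans(candidates, plan_scores, sim_threshold, score_threshold):
    """Pick treatment plans used by sufficiently similar patients and
    scoring above the treatment-effect threshold.

    candidates: list of (patient_id, similarity_to_target, plan) tuples.
    plan_scores: mapping from plan to its treatment-effect score.
    """
    # Distinct plans of the "first patients" (similarity above the first threshold).
    plans = {plan for _, sim, plan in candidates if sim > sim_threshold}
    # Keep only the plans whose score exceeds the second threshold.
    return sorted(p for p in plans if plan_scores.get(p, 0.0) > score_threshold)


candidates = [
    ("A", 0.92, "a+c+d"),
    ("B", 0.90, "a+c+e"),
    ("C", 0.88, "a+b+c"),
    ("D", 0.95, "a+c+d"),
    ("E", 0.40, "b+d"),   # hypothetical dissimilar patient, filtered out
]
plan_scores = {"a+c+d": 0.75, "a+c+e": 0.91, "a+b+c": 0.88, "b+d": 0.99}

first_plans = select_first_plans(candidates, plan_scores,
                                 sim_threshold=0.8, score_threshold=0.85)
```

Note that plan b+d is excluded despite its high score, because patient E does not pass the similarity threshold.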
Correspondingly, as another option in this embodiment, a preset treatment plan can be determined in advance for each target group according to the population features of that group and the physicians' diagnosis results. For example, if the patients in a target group are children, the physician's diagnosis is disease a, and the commonly used treatment plans are A and B, then treatment plans A and B can be set directly as the preset treatment plans of that target group. When the target patient is determined to belong to that target group, treatment plans A and B can be determined as the first treatment plan of the target patient.
208. Extract the contraindicated drugs of the target patient according to the target patient data, and screen out from the first treatment plan a second treatment plan that contains the contraindicated drugs.
For this embodiment, in a specific application scenario, in order to determine the second treatment plan containing drugs contraindicated for the target patient, step 208 may specifically include: determining, according to drug contraindication data, the first contraindicated drugs that are unsuitable for the population type to which the target patient belongs; determining, according to the drug allergy history in the target patient data, the second contraindicated drugs to which the target patient has an allergic reaction; and determining any first treatment plan that contains a first contraindicated drug and/or a second contraindicated drug as the second treatment plan.
For example, when the target patient is pregnant, the first contraindicated drugs may be the drugs prohibited during pregnancy; when the target patient is allergic to penicillin, penicillin-class drugs can be determined as the second contraindicated drugs of the target patient.
209. Analyze the first treatment plan and the second treatment plan to obtain the target treatment plan of the target patient.
For this embodiment, in a specific application scenario, step 209 may specifically include: removing the second treatment plan from the first treatment plan to obtain the target treatment plan.
For example, removing the second treatment plans that are prohibited for the patient's population from the first treatment plans that can treat the target patient's disease leaves the treatment plans that are safe for that population, and these plans can then be used to treat the disease.
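Steps 208 and 209 together amount to a set-based filter, sketched below. The data layout (a patient dictionary with `population_types` and `drug_allergies`, plans as tuples of drug names, and a rule table keyed by population type) and all drug names other than penicillin are assumptions made for illustration.

```python
def contraindicated_drugs(patient, population_rules):
    """Union of population-based and allergy-based contraindicated drugs."""
    by_population = set()
    for tag in patient.get("population_types", []):
        by_population |= population_rules.get(tag, set())   # first contraindicated drugs
    by_allergy = set(patient.get("drug_allergies", []))      # second contraindicated drugs
    return by_population | by_allergy


def split_plans(first_plans, banned):
    """Return (second_plans, target_plans).

    second_plans contain at least one contraindicated drug; target_plans are
    the first plans that remain after the second plans are removed.
    """
    second = [plan for plan in first_plans if set(plan) & banned]
    target = [plan for plan in first_plans if plan not in second]
    return second, target


# Hypothetical rule table and patient record.
population_rules = {"pregnant": {"drug_x"}}
patient = {"population_types": ["pregnant"], "drug_allergies": ["penicillin"]}

first_plans = [("penicillin", "drug_y"), ("drug_y", "drug_z"), ("drug_x",)]
banned = contraindicated_drugs(patient, population_rules)
second_plans, target_plans = split_plans(first_plans, banned)
```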
With the above method for determining a patient treatment plan, an interpretable network structure for the deep reinforcement learning model DQN is proposed in order to create a patient grouping model for processing time series data, and the patient grouping model is then trained with sample data until it meets the preset training standard. The target patient data within a preset time period is then input into the patient grouping model that meets the preset training standard to obtain the target grouping result, and the population features within the target group are used to determine the first treatment plan of the target patient. To further improve the safety of diagnosis, the contraindicated drugs of the target patient can be determined from the target patient data, so that the second treatment plan containing contraindicated drugs can be screened out from the first treatment plan. Finally, the first treatment plan and the second treatment plan are analyzed to obtain a target treatment plan suitable for the target patient. In addition, this application digitizes the handling of patient treatment plans and extends the calculation of the expected reward value Q into a time series structure, so that more information can be taken into account; by incorporating artificial intelligence and deep learning algorithms, the analysis results can be made more accurate. Furthermore, an Attention mechanism is added to the calculation of the expected reward value, which provides a certain degree of interpretability.
Further, as a concrete implementation of the methods shown in FIG. 1 and FIG. 2, an embodiment of this application provides an apparatus for determining a patient treatment plan. As shown in FIG. 4, the apparatus includes: a creation module 31, a training module 32, an input module 33, a determination module 34, an extraction module 35, and an analysis module 36.
The creation module 31 may be used to create, based on deep reinforcement learning DQN, a patient grouping model for processing time series data;
The training module 32 may be used to train the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets the preset training standard;
The input module 33 may be used to input target patient data within a preset time period into the patient grouping model that meets the preset training standard, and obtain the target grouping result;
The determination module 34 may be used to determine the first treatment plan of the target patient based on the population features within the target group to which the target patient belongs;
The extraction module 35 may be used to extract the contraindicated drugs of the target patient according to the target patient data, and screen out from the first treatment plan a second treatment plan that contains the contraindicated drugs;
The analysis module 36 may be used to analyze the first treatment plan and the second treatment plan to obtain the target treatment plan of the target patient.
In a specific application scenario, in order to create the patient grouping model for processing time series data, as shown in FIG. 5, the creation module 31 may specifically include: a splitting unit 311 and a construction unit 312;
The splitting unit 311 may be used to split the last fully connected layer in the network structure of the deep reinforcement learning DQN into a first fully connected layer, a second recurrent neural network layer, and a third recurrent neural network layer;
The construction unit 312 may be used to construct the patient grouping model from the deep reinforcement learning DQN with the modified network structure, so that when patient data containing multiple time points is input into the patient grouping model, the first fully connected layer outputs the embedding value of the patient state at each time point, the second recurrent neural network layer outputs the first degree of attention for the patient state at each time point, the third recurrent neural network layer outputs the second degree of attention for the grouping result at each time point, and the expected reward value of the patient data for each preset group is calculated from the embedding value, the first degree of attention, and the second degree of attention.
Correspondingly, in order to train a patient grouping model that meets the preset training standard, as shown in FIG. 5, the training module 32 may specifically include: a first input unit 321, a first extraction unit 322, a calculation unit 323, and a training unit 324;
The first input unit 321 may be used to input the sample data at the current time point and the historical time points into the patient grouping model, and obtain a preset number of groups together with the expected reward value of each piece of sample data for each group, where the expected reward value is obtained by computing, at each time point, the first sum of the first degree of attention and the second degree of attention and the product of that first sum and the embedding value, and then accumulating the products over the current time point and the historical time points;
The first extraction unit 322 may be used to extract the labeled group corresponding to the sample data, and determine the first expected reward value output for the labeled group as the training output of the patient grouping model;
The calculation unit 323 may be used to calculate the mean squared error loss between the first expected reward value and the true expected reward value, and, if the loss function is determined from the mean squared error loss to have converged, determine that the patient grouping model meets the preset training standard;
The training unit 324 may be used to repeat the training of the patient grouping model with the sample data if the loss function is determined not to have converged, until the patient grouping model meets the preset training standard.
In a specific application scenario, in order to determine the target group to which the target patient belongs, as shown in FIG. 5, the input module 33 may specifically include: a second extraction unit 331, a second input unit 332, and a first determination unit 333;
The second extraction unit 331 may be used to extract the historical follow-up data and the current follow-up data of the target patient within the preset time period;
The second input unit 332 may be used to input the historical follow-up data and the current follow-up data into the patient grouping model that meets the preset training standard, and obtain the expected reward value for each preset group;
The first determination unit 333 may be used to determine the preset group with the largest expected reward value as the target group of the target patient.
In a specific application scenario, in order to determine the first treatment plan of the target patient based on the target grouping result, as shown in FIG. 5, the determination module 34 may specifically include: a screening unit 341 and a second determination unit 342;
The screening unit 341 may be used to screen, within the target group and according to the target patient data, first patients whose population-feature similarity to the target patient is greater than the first preset threshold, the population features including at least condition information and personal information;
The second determination unit 342 may be used to extract the treatment plans of the first patients and the treatment-effect score of each plan, and determine the treatment plans whose score is greater than the second preset threshold as the first treatment plan; or
The second determination unit 342 may also be used to obtain a preset treatment plan created according to the population features of the target group, and determine the preset treatment plan as the first treatment plan.
In a specific application scenario, in order to screen out from the first treatment plan the second treatment plan containing drugs contraindicated for the target patient, as shown in FIG. 5, the extraction module 35 may specifically include: a third determination unit 351;
The third determination unit 351 may be used to determine, according to drug contraindication data, the first contraindicated drugs that are unsuitable for the population type to which the target patient belongs;
The third determination unit 351 may also be used to determine, according to the drug allergy history in the target patient data, the second contraindicated drugs to which the target patient has an allergic reaction;
The third determination unit 351 may also be used to determine any first treatment plan that contains a first contraindicated drug and/or a second contraindicated drug as the second treatment plan.
Correspondingly, in order to obtain the target treatment plan of the target patient, as shown in FIG. 5, the analysis module 36 may specifically include: a removal unit 361;
The removal unit 361 may be used to remove the second treatment plan from the first treatment plan to obtain the target treatment plan.
It should be noted that, for other descriptions of the functional units of the apparatus for determining a patient treatment plan provided in this embodiment, reference may be made to the corresponding descriptions of FIG. 1 and FIG. 2, which are not repeated here.
Based on the methods shown in FIG. 1 and FIG. 2, an embodiment of this application correspondingly provides a computer-readable storage medium. The computer-readable storage medium may include non-volatile and/or volatile memory and stores a computer program which, when executed by a processor, implements the method for determining a patient treatment plan shown in FIG. 1 and FIG. 2.
Based on this understanding, the technical solution of this application can be embodied in the form of a software product. The software product can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) and includes a number of instructions that cause a computer device (such as a personal computer, a server, or a network device) to execute the methods of the implementation scenarios of this application.
Based on the methods shown in FIG. 1 and FIG. 2 and the virtual apparatus embodiments shown in FIG. 4 and FIG. 5, and in order to achieve the above objectives, an embodiment of this application further provides a computer device, which may be a personal computer, a server, a network device, or the like. This physical device includes a storage medium and a processor; the storage medium is used to store a computer program and may include non-volatile and/or volatile memory; the processor is used to execute the computer program to implement the method for determining a patient treatment plan shown in FIG. 1 and FIG. 2.
Optionally, the computer device may further include a user interface, a network interface, a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a Wi-Fi module, and so on. The user interface may include a display and an input unit such as a keyboard; optionally, the user interface may also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface and a wireless interface (such as a Bluetooth interface or a Wi-Fi interface).
Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or arrange the components differently.
The non-volatile readable storage medium may further include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the physical device and supports the running of the information processing program and other software and/or programs. The network communication module is used to implement communication among the components within the non-volatile readable storage medium, as well as communication with other hardware and software in the physical device.
From the description of the above embodiments, those skilled in the art will understand that this application proposes an interpretable network structure for the deep reinforcement learning model DQN in order to create a patient grouping model for processing time series data, and then trains the patient grouping model with sample data until it meets the preset training standard. The target patient data within a preset time period is then input into the patient grouping model that meets the preset training standard to obtain the target grouping result, and the population features within the target group are used to determine the first treatment plan of the target patient. To further improve the safety of diagnosis, the contraindicated drugs of the target patient can be determined from the target patient data, so that the second treatment plan containing contraindicated drugs can be screened out from the first treatment plan. Finally, the first treatment plan and the second treatment plan are analyzed to obtain a target treatment plan suitable for the target patient. In addition, this application digitizes the handling of patient treatment plans and extends the calculation of the expected reward value Q into a time series structure, so that more information can be taken into account; by incorporating artificial intelligence and deep learning algorithms, the analysis results can be made more accurate. Furthermore, an Attention mechanism is added to the calculation of the expected reward value, which provides a certain degree of interpretability.
Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and that the modules or processes in the drawings are not necessarily required to implement this application. Those skilled in the art will also understand that the modules of the apparatus in an implementation scenario may be distributed in the apparatus as described for that scenario, or may be changed accordingly and located in one or more apparatuses different from that scenario. The modules of the above implementation scenarios may be combined into one module or further split into multiple sub-modules.
The serial numbers used in this application are for description only and do not indicate the relative merits of the implementation scenarios. The above discloses only a few specific implementation scenarios of this application; however, this application is not limited to them, and any variation that a person skilled in the art can conceive of shall fall within the protection scope of this application.

Claims (20)

  1. A method for determining a patient treatment plan, comprising:
    creating, based on deep reinforcement learning DQN, a patient grouping model for processing time series data;
    training the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard;
    inputting target patient data within a preset time period into the patient grouping model that meets the preset training standard, and obtaining a target group to which a target patient belongs;
    determining a first treatment plan of the target patient based on population features within the target group;
    extracting contraindicated drugs of the target patient according to the target patient data, and screening out from the first treatment plan a second treatment plan that contains the contraindicated drugs;
    analyzing the first treatment plan and the second treatment plan to obtain a target treatment plan of the target patient.
  2. The method according to claim 1, wherein creating, based on the deep reinforcement learning DQN, the patient grouping model for processing time-series data specifically comprises:
    splitting the last fully connected layer in the network structure of the deep reinforcement learning DQN into a first fully connected layer, a second recurrent neural network layer, and a third recurrent neural network layer;
    constructing the patient grouping model using the deep reinforcement learning DQN with the modified network structure, so that when patient data containing multiple time points is input into the patient grouping model, the first fully connected layer outputs an embedding value of the patient state at each time point, the second recurrent neural network layer outputs a first attention degree for the patient state at each time point, the third recurrent neural network layer outputs a second attention degree for the grouping result at each time point, and an expected reward value of the patient data for each preset group is calculated based on the embedding values, the first attention degrees, and the second attention degrees.
  3. The method according to claim 2, wherein the sample data is time-series data containing a current time point and a preset number of historical time points;
    training the patient grouping model with the sample data labeled with grouping results, so that the patient grouping model meets the preset training standard, specifically comprises:
    inputting the sample data at the current time point and the historical time points into the patient grouping model to obtain a preset number of groups and an expected reward value of each piece of sample data for each group, wherein the expected reward value is obtained by calculating, for each time point, a first sum of the first attention degree and the second attention degree and the product of the first sum and the embedding value, and then accumulating the products over the current time point and the historical time points;
    extracting the labeled group corresponding to the sample data, and determining a first expected reward value output for the labeled group as the training output of the patient grouping model;
    calculating a mean squared error loss between the first expected reward value and a true expected reward value, and if it is determined based on the mean squared error loss that the loss function has converged, determining that the patient grouping model meets the preset training standard;
    if it is determined that the loss function has not converged, repeatedly training the patient grouping model with the sample data, so that the patient grouping model meets the preset training standard.
  4. The method according to claim 3, wherein inputting the target patient data from the preset time period into the patient grouping model that meets the preset training standard, to obtain the target group to which the target patient belongs, specifically comprises:
    extracting historical patient follow-up data and current patient follow-up data of the target patient within the preset time period;
    inputting the historical patient follow-up data and the current patient follow-up data into the patient grouping model that meets the preset training standard, to obtain an expected reward value for each preset group;
    determining the preset group with the largest expected reward value as the target group of the target patient.
  5. The method according to claim 4, wherein determining the first treatment plan for the target patient based on the population characteristics within the target group specifically comprises:
    screening, in the target group according to the target patient data, for first patients whose population-characteristic similarity to the target patient is greater than a first preset threshold, the population characteristics including at least condition information and personal information;
    extracting the treatment plans of the first patients and a score value of each treatment plan with respect to treatment effect, and determining a treatment plan whose score value is greater than a second preset threshold as the first treatment plan; or
    obtaining a preset treatment plan created according to the population characteristics of the target group, and determining the preset treatment plan as the first treatment plan.
  6. The method according to claim 5, wherein extracting the contraindicated drugs of the target patient based on the target patient data, and screening out, from the first treatment plan, the second treatment plan that contains the contraindicated drugs, specifically comprises:
    determining, according to medication contraindication data, first contraindicated drugs that are unsuitable for the population type to which the target patient belongs;
    determining, according to the drug allergy history in the target patient data, second contraindicated drugs to which the target patient has an allergic reaction;
    determining a first treatment plan that contains the first contraindicated drugs and/or the second contraindicated drugs as the second treatment plan.
  7. The method according to claim 6, wherein obtaining, by analysis based on the first treatment plan and the second treatment plan, the target treatment plan for the target patient specifically comprises:
    removing the second treatment plan from the first treatment plan to obtain the target treatment plan.
  8. An apparatus for determining a patient treatment plan, comprising:
    a creation module, configured to create, based on a deep reinforcement learning DQN, a patient grouping model for processing time-series data;
    a training module, configured to train the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard;
    an input module, configured to input target patient data from a preset time period into the patient grouping model that meets the preset training standard, to obtain a target group to which the target patient belongs;
    a determination module, configured to determine a first treatment plan for the target patient based on population characteristics within the target group;
    an extraction module, configured to extract contraindicated drugs of the target patient based on the target patient data, and to screen out, from the first treatment plan, a second treatment plan that contains the contraindicated drugs;
    an analysis module, configured to obtain, by analysis based on the first treatment plan and the second treatment plan, a target treatment plan for the target patient.
  9. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the following steps:
    creating, based on a deep reinforcement learning DQN, a patient grouping model for processing time-series data;
    training the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard;
    inputting target patient data from a preset time period into the patient grouping model that meets the preset training standard, to obtain a target group to which the target patient belongs;
    determining a first treatment plan for the target patient based on population characteristics within the target group;
    extracting contraindicated drugs of the target patient based on the target patient data, and screening out, from the first treatment plan, a second treatment plan that contains the contraindicated drugs;
    obtaining, by analysis based on the first treatment plan and the second treatment plan, a target treatment plan for the target patient.
  10. The computer-readable storage medium according to claim 9, wherein creating, based on the deep reinforcement learning DQN, the patient grouping model for processing time-series data specifically comprises:
    splitting the last fully connected layer in the network structure of the deep reinforcement learning DQN into a first fully connected layer, a second recurrent neural network layer, and a third recurrent neural network layer;
    constructing the patient grouping model using the deep reinforcement learning DQN with the modified network structure, so that when patient data containing multiple time points is input into the patient grouping model, the first fully connected layer outputs an embedding value of the patient state at each time point, the second recurrent neural network layer outputs a first attention degree for the patient state at each time point, the third recurrent neural network layer outputs a second attention degree for the grouping result at each time point, and an expected reward value of the patient data for each preset group is calculated based on the embedding values, the first attention degrees, and the second attention degrees.
  11. The computer-readable storage medium according to claim 10, wherein the sample data is time-series data containing a current time point and a preset number of historical time points;
    training the patient grouping model with the sample data labeled with grouping results, so that the patient grouping model meets the preset training standard, specifically comprises:
    inputting the sample data at the current time point and the historical time points into the patient grouping model to obtain a preset number of groups and an expected reward value of each piece of sample data for each group, wherein the expected reward value is obtained by calculating, for each time point, a first sum of the first attention degree and the second attention degree and the product of the first sum and the embedding value, and then accumulating the products over the current time point and the historical time points;
    extracting the labeled group corresponding to the sample data, and determining a first expected reward value output for the labeled group as the training output of the patient grouping model;
    calculating a mean squared error loss between the first expected reward value and a true expected reward value, and if it is determined based on the mean squared error loss that the loss function has converged, determining that the patient grouping model meets the preset training standard;
    if it is determined that the loss function has not converged, repeatedly training the patient grouping model with the sample data, so that the patient grouping model meets the preset training standard.
  12. The computer-readable storage medium according to claim 11, wherein inputting the target patient data from the preset time period into the patient grouping model that meets the preset training standard, to obtain the target group to which the target patient belongs, specifically comprises:
    extracting historical patient follow-up data and current patient follow-up data of the target patient within the preset time period;
    inputting the historical patient follow-up data and the current patient follow-up data into the patient grouping model that meets the preset training standard, to obtain an expected reward value for each preset group;
    determining the preset group with the largest expected reward value as the target group of the target patient.
  13. The computer-readable storage medium according to claim 12, wherein determining the first treatment plan for the target patient based on the population characteristics within the target group specifically comprises:
    screening, in the target group according to the target patient data, for first patients whose population-characteristic similarity to the target patient is greater than a first preset threshold, the population characteristics including at least condition information and personal information;
    extracting the treatment plans of the first patients and a score value of each treatment plan with respect to treatment effect, and determining a treatment plan whose score value is greater than a second preset threshold as the first treatment plan; or
    obtaining a preset treatment plan created according to the population characteristics of the target group, and determining the preset treatment plan as the first treatment plan.
  14. A computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor, when executing the program, implements the following steps:
    creating, based on a deep reinforcement learning DQN, a patient grouping model for processing time-series data;
    training the patient grouping model with sample data labeled with grouping results, so that the patient grouping model meets a preset training standard;
    inputting target patient data from a preset time period into the patient grouping model that meets the preset training standard, to obtain a target group to which the target patient belongs;
    determining a first treatment plan for the target patient based on population characteristics within the target group;
    extracting contraindicated drugs of the target patient based on the target patient data, and screening out, from the first treatment plan, a second treatment plan that contains the contraindicated drugs;
    obtaining, by analysis based on the first treatment plan and the second treatment plan, a target treatment plan for the target patient.
  15. The computer device according to claim 14, wherein creating, based on the deep reinforcement learning DQN, the patient grouping model for processing time-series data specifically comprises:
    splitting the last fully connected layer in the network structure of the deep reinforcement learning DQN into a first fully connected layer, a second recurrent neural network layer, and a third recurrent neural network layer;
    constructing the patient grouping model using the deep reinforcement learning DQN with the modified network structure, so that when patient data containing multiple time points is input into the patient grouping model, the first fully connected layer outputs an embedding value of the patient state at each time point, the second recurrent neural network layer outputs a first attention degree for the patient state at each time point, the third recurrent neural network layer outputs a second attention degree for the grouping result at each time point, and an expected reward value of the patient data for each preset group is calculated based on the embedding values, the first attention degrees, and the second attention degrees.
  16. The computer device according to claim 15, wherein the sample data is time-series data containing a current time point and a preset number of historical time points;
    training the patient grouping model with the sample data labeled with grouping results, so that the patient grouping model meets the preset training standard, specifically comprises:
    inputting the sample data at the current time point and the historical time points into the patient grouping model to obtain a preset number of groups and an expected reward value of each piece of sample data for each group, wherein the expected reward value is obtained by calculating, for each time point, a first sum of the first attention degree and the second attention degree and the product of the first sum and the embedding value, and then accumulating the products over the current time point and the historical time points;
    extracting the labeled group corresponding to the sample data, and determining a first expected reward value output for the labeled group as the training output of the patient grouping model;
    calculating a mean squared error loss between the first expected reward value and a true expected reward value, and if it is determined based on the mean squared error loss that the loss function has converged, determining that the patient grouping model meets the preset training standard;
    if it is determined that the loss function has not converged, repeatedly training the patient grouping model with the sample data, so that the patient grouping model meets the preset training standard.
  17. The computer device according to claim 16, wherein inputting the target patient data from the preset time period into the patient grouping model that meets the preset training standard, to obtain the target group to which the target patient belongs, specifically comprises:
    extracting historical patient follow-up data and current patient follow-up data of the target patient within the preset time period;
    inputting the historical patient follow-up data and the current patient follow-up data into the patient grouping model that meets the preset training standard, to obtain an expected reward value for each preset group;
    determining the preset group with the largest expected reward value as the target group of the target patient.
  18. The computer device according to claim 17, wherein determining the first treatment plan for the target patient based on the population characteristics within the target group specifically comprises:
    screening, in the target group according to the target patient data, for first patients whose population-characteristic similarity to the target patient is greater than a first preset threshold, the population characteristics including at least condition information and personal information;
    extracting the treatment plans of the first patients and a score value of each treatment plan with respect to treatment effect, and determining a treatment plan whose score value is greater than a second preset threshold as the first treatment plan; or
    obtaining a preset treatment plan created according to the population characteristics of the target group, and determining the preset treatment plan as the first treatment plan.
  19. The computer device according to claim 18, wherein extracting the contraindicated drugs of the target patient based on the target patient data, and screening out, from the first treatment plan, the second treatment plan that contains the contraindicated drugs, specifically comprises:
    determining, according to medication contraindication data, first contraindicated drugs that are unsuitable for the population type to which the target patient belongs;
    determining, according to the drug allergy history in the target patient data, second contraindicated drugs to which the target patient has an allergic reaction;
    determining a first treatment plan that contains the first contraindicated drugs and/or the second contraindicated drugs as the second treatment plan.
  20. The computer device according to claim 19, wherein obtaining, by analysis based on the first treatment plan and the second treatment plan, the target treatment plan for the target patient specifically comprises:
    removing the second treatment plan from the first treatment plan to obtain the target treatment plan.
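
As an illustration only, the following minimal Python sketch shows one possible reading of three steps recited in the claims: accumulating, over all time points, the product of the summed attention degrees and the embedding value to get an expected reward per group; choosing the group with the largest expected reward; and removing treatment plans that contain contraindicated drugs. The claims do not specify tensor shapes or data structures, so every name here (`expected_rewards`, `select_group`, `filter_plans`), the assumption of one scalar embedding value per time point and group, and all sample data are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def expected_rewards(
    embeddings: Sequence[Sequence[float]],
    state_attention: Sequence[float],
    group_attention: Sequence[float],
) -> List[float]:
    """Accumulate (a1_t + a2_t) * e_t over all time points, per group.

    Assumption: embeddings[t][g] is a scalar embedding value for group g at
    time point t, and each attention sequence holds one weight per time point.
    """
    num_groups = len(embeddings[0])
    totals = [0.0] * num_groups
    for e_t, a1, a2 in zip(embeddings, state_attention, group_attention):
        weight = a1 + a2  # the "first sum" of the two attention degrees
        for g in range(num_groups):
            totals[g] += weight * e_t[g]
    return totals


def select_group(rewards: Sequence[float]) -> int:
    """Return the index of the group with the largest expected reward."""
    return max(range(len(rewards)), key=lambda g: rewards[g])


def filter_plans(
    first_plans: Dict[str, Set[str]],
    population_contraindications: Set[str],
    allergy_contraindications: Set[str],
) -> Dict[str, Set[str]]:
    """Remove every plan that contains at least one contraindicated drug."""
    contraindicated = population_contraindications | allergy_contraindications
    # "Second" plans: first plans that share a drug with the contraindicated set.
    second_plans = {
        name for name, drugs in first_plans.items() if drugs & contraindicated
    }
    return {
        name: drugs
        for name, drugs in first_plans.items()
        if name not in second_plans
    }


# Hypothetical data: 3 time points (two historical, one current), 2 groups.
rewards = expected_rewards(
    embeddings=[[1.0, 2.0], [2.0, 1.0], [3.0, 0.5]],
    state_attention=[0.25, 0.25, 0.5],
    group_attention=[0.25, 0.25, 0.5],
)
target_group = select_group(rewards)

# Hypothetical plans, keyed by name, each mapped to the drugs it contains.
plans = {"plan_a": {"drug_x", "drug_y"}, "plan_b": {"drug_z"}}
target_plans = filter_plans(plans, {"drug_x"}, set())
```

In this toy run the first group accumulates the larger reward and is selected, and only `plan_b` survives the filter because `plan_a` contains the contraindicated `drug_x`. Representing each plan as a set of drug names keeps the contraindication check a simple set intersection; a real system would likely need richer plan and drug records.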
PCT/CN2020/118873 2020-06-29 2020-09-29 Method, apparatus, computer device, and medium for determining patient treatment plan WO2021151295A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010602269.2 2020-06-29
CN202010602269.2A CN111785366B (en) 2020-06-29 2020-06-29 Patient treatment scheme determination method and device and computer equipment

Publications (1)

Publication Number Publication Date
WO2021151295A1 (en) 2021-08-05

Family

ID=72760241

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118873 WO2021151295A1 (en) 2020-06-29 2020-09-29 Method, apparatus, computer device, and medium for determining patient treatment plan

Country Status (2)

Country Link
CN (1) CN111785366B (en)
WO (1) WO2021151295A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990430B (en) * 2021-02-08 2021-12-03 辽宁工业大学 Group division method and system based on long-time and short-time memory network
CN113011102B (en) * 2021-04-01 2022-05-24 河北工业大学 Multi-time-sequence-based Attention-LSTM penicillin fermentation process fault prediction method
CN113255735B (en) * 2021-04-29 2024-04-09 平安科技(深圳)有限公司 Method and device for determining medication scheme of patient
CN113782192A (en) * 2021-09-30 2021-12-10 平安科技(深圳)有限公司 Grouping model construction method based on causal inference and medical data processing method
CN115423054B (en) * 2022-11-07 2023-04-07 北京智精灵科技有限公司 Uncertain training and exciting method and system based on personality characteristics of cognitive disorder patient

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107851464A (en) * 2015-08-17 2018-03-27 西门子保健有限责任公司 For carrying out the method and system of progression of disease modeling and therapy optimization for individual patient
CN108511056A (en) * 2018-02-09 2018-09-07 上海长江科技发展有限公司 Therapeutic scheme based on patients with cerebral apoplexy similarity analysis recommends method and system
CN109859851A (en) * 2018-12-27 2019-06-07 平安科技(深圳)有限公司 A kind of therapeutic scheme recommended method and device
US20190385738A1 (en) * 2018-06-19 2019-12-19 Siemens Healthcare Gmbh Characterization of amount of training for an input to a machine-learned network
CN110826624A (en) * 2019-11-05 2020-02-21 电子科技大学 Time series classification method based on deep reinforcement learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019025270A1 (en) * 2017-08-01 2019-02-07 Siemens Healthcare Gmbh Non-invasive assessment and therapy guidance for coronary artery disease in diffuse and tandem lesions
CN109817329B (en) * 2019-01-21 2021-06-29 暗物智能科技(广州)有限公司 Medical inquiry dialogue system and reinforcement learning method applied to same

Also Published As

Publication number Publication date
CN111785366B (en) 2023-05-26
CN111785366A (en) 2020-10-16

Similar Documents

Publication Publication Date Title
WO2021151295A1 (en) Method, apparatus, computer device, and medium for determining patient treatment plan
Hung et al. Comparing deep neural network and other machine learning algorithms for stroke prediction in a large-scale population-based electronic medical claims database
Ambekar et al. Disease risk prediction by using convolutional neural network
WO2020181805A1 (en) Diabetes prediction method and apparatus, storage medium, and computer device
US20220254493A1 (en) Chronic disease prediction system based on multi-task learning model
EP2997514A1 (en) Context-aware prediction in medical systems
CN108461110B (en) Medical information processing method, device and equipment
JP2018060529A (en) Method and apparatus of context-based patient similarity
WO2020224433A1 (en) Target object attribute prediction method based on machine learning and related device
CN115050442B (en) Disease category data reporting method and device based on mining clustering algorithm and storage medium
CN113724815A (en) Information pushing method and device based on decision grouping model
CN109698018A (en) Medical text handling method, device, computer equipment and storage medium
CN112447270A (en) Medication recommendation method, device, equipment and storage medium
CN110897634A (en) Electrocardiosignal generation method based on generation countermeasure network
CN114416967A (en) Method, device and equipment for intelligently recommending doctors and storage medium
CN114783580A (en) Medical data quality evaluation method and system
CN114191665A (en) Method and device for classifying man-machine asynchronous phenomena in mechanical ventilation process
CN112071431B (en) Clinical path automatic generation method and system based on deep learning and knowledge graph
WO2023240837A1 (en) Service package generation method, apparatus and device based on patient data, and storage medium
CN113782146B (en) Artificial intelligence-based general medicine recommendation method, device, equipment and medium
CN110010231A (en) A kind of data processing system and computer readable storage medium
Sinha et al. Automated detection of coronary artery disease using machine learning algorithm
CN111816276A (en) Method and device for recommending education courses, computer equipment and storage medium
CN113808731A (en) Intelligent medical diagnosis system and method
CN110164523A (en) A kind of intelligent health analysis method and system with intelligence function

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20917112; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 20917112; Country of ref document: EP; Kind code of ref document: A1