CN110211701B - Model generation method, data processing method and corresponding device - Google Patents

Model generation method, data processing method and corresponding device

Info

Publication number
CN110211701B
CN110211701B (application CN201910520846.0A)
Authority
CN
China
Prior art keywords
user
intervention
physiological characteristics
time point
survival state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910520846.0A
Other languages
Chinese (zh)
Other versions
CN110211701A (en)
Inventor
戴松世
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201910520846.0A (patent CN110211701B)
Publication of CN110211701A
Application granted
Publication of CN110211701B
Legal status: Active

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

An object of the embodiments of the present application is to provide a model generation method, a data processing method, and corresponding apparatus. The model generation method includes: obtaining, for each of a plurality of users, the physiological characteristics at a preset time point, the survival-state duration corresponding to those physiological characteristics, and the intervention strategy adopted at that time point, where the survival-state duration is the length of the interval between the preset time point and the time point at which the user's survival state changes; and training a survival-state model that takes the physiological characteristics at the preset time point and the intervention strategy adopted at that time point as inputs and the survival-state duration as output, to obtain a trained survival-state model.

Description

Model generation method, data processing method and corresponding device
Technical Field
The application relates to the technical field of artificial intelligence application, in particular to a model generation method, a data processing method and a corresponding device.
Background
In existing schemes for evaluating survival state, the mortality rate associated with various physiological characteristics is calculated simply by tallying users' survival outcomes under those characteristics, and a user's current survival state is then characterized by matching the user's current physiological characteristics to the corresponding mortality rate. Characterizing a user's current survival state by mortality rate alone is, however, too coarse.
Disclosure of Invention
An object of the embodiments of the present application is to provide a model generation method, a data processing method, and corresponding apparatus, to address the problem that existing survival-state assessment schemes characterize a user's current survival state by mortality rate alone, which is too coarse.
In order to achieve the above object, the present application provides the following technical solutions:
In a first aspect: the present application provides a model generation method, including the following steps: acquiring, for each of a plurality of users, the physiological characteristics at a preset time point, the survival-state duration corresponding to those physiological characteristics, and the intervention strategy adopted at that time point, where the survival-state duration is the length of time between the preset time point and the time point at which the user's survival state changes; and training a survival-state model with the physiological characteristics at the preset time point and the intervention strategy adopted at that time point as inputs and the survival-state duration as output, to obtain a trained survival-state model.
With the method designed above, a survival-state model is trained with historical users' physiological characteristics at a preset time point and the intervention strategy adopted at that point as inputs and the survival-state duration as output. The model can reflect the individual condition of a specific patient, namely how far that patient currently is from the endpoint of death or from recovery and discharge, which overcomes the coarseness of characterizing survival state by mortality rate alone and allows the patient's current survival state to be predicted more comprehensively.
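As a minimal illustrative sketch of this training step (not part of the patent text): each training row pairs a user's physiological characteristics at a preset time point and the intervention strategy then adopted with the observed survival-state duration. The patent does not fix a model family, so a hand-rolled 1-nearest-neighbour regressor over invented feature values stands in for the survival-state model here.

```python
def train_survival_model(rows):
    """rows: list of (features, strategy_id, duration_hours) tuples.

    Returns a predict(features, strategy_id) callable: a 1-nearest-neighbour
    stand-in for whatever survival-state model is actually trained.
    """
    memory = [((tuple(f), s), d) for f, s, d in rows]

    def predict(features, strategy_id):
        # Squared distance over physiological features; the strategy must
        # match exactly (a simplifying assumption of this sketch).
        def dist(key):
            f, s = key
            if s != strategy_id:
                return float("inf")
            return sum((a - b) ** 2 for a, b in zip(f, features))

        _, best_duration = min(memory, key=lambda kv: dist(kv[0]))
        return best_duration

    return predict

# Toy rows: (heart_rate, lactate) features, a strategy id, and hours until
# the survival state changed (all numbers invented for illustration).
rows = [((110, 4.0), 1, 72), ((85, 1.2), 1, 24), ((110, 4.1), 2, 48)]
survival_model = train_survival_model(rows)
```

A real implementation would substitute any regressor fitted on the HIS/LIS-derived data set described later in the first embodiment.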
In an optional implementation of the first aspect, after obtaining the trained survival-state model, the method further includes: acquiring, for each of a plurality of users having the same physiological state at the preset time point, the plurality of physiological characteristics, the intervention strategy each user adopted at the preset time point, the survival-state duration corresponding to each user, and the influence value of the adopted intervention strategy on the degree of change of each user's physiological characteristics; and training a reinforcement learning model with the physiological characteristics, the adopted intervention strategies, and the corresponding survival-state durations as inputs and the influence value of the adopted intervention strategy on the degree of change of each user's physiological characteristics as output, to obtain a trained reinforcement learning model.
With this design, the trained reinforcement learning model can predict the influence that the intervention strategy adopted at each time point has on the user's physiological characteristics; in subsequent applications, a beneficial intervention strategy can then be selected on the basis of that influence, making the strategy adopted for the user more accurate and reliable.
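A minimal sketch of this second training step, under an assumed tabular stand-in: despite its name, the model is fitted from labelled history, with (physiological state, strategy, duration) samples labelled by the influence value the strategy had. The state keys, strategy ids, and influence numbers below are invented for illustration.

```python
from collections import defaultdict

def train_influence_model(samples):
    """samples: list of (state_key, strategy_id, duration, influence_value).

    Returns predict(state_key, strategy_id): the average influence value
    observed for that (state, strategy) pair, a tabular stand-in for the
    patent's (unspecified) reinforcement learning model.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for state, strategy, _duration, influence in samples:
        sums[(state, strategy)] += influence
        counts[(state, strategy)] += 1

    def predict(state, strategy):
        key = (state, strategy)
        if counts[key] == 0:
            return 0.0  # unseen (state, strategy) pair: neutral influence
        return sums[key] / counts[key]

    return predict

samples = [("septic_shock", 1, 48, 0.8), ("septic_shock", 1, 40, 0.6),
           ("septic_shock", 2, 48, -0.2)]
influence_model = train_influence_model(samples)
```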
In an optional implementation of the first aspect, after obtaining the trained survival-state model, the method further includes: training a scheme selection model with each user's physiological characteristics at the preset time point, the survival-state duration corresponding to those characteristics, and the influence value of the intervention strategy adopted at the preset time point on the degree of change of the user's physiological characteristics as inputs, and the intervention strategy each user actually adopted at the preset time point as output, to obtain a trained scheme selection model.
With this design, the trained scheme selection model can output a suggested scheme directly, saving the time otherwise spent judging and selecting a scheme.
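A hedged sketch of the scheme selection training step. The patent names no classifier, so a majority vote per discretised feature bucket serves as a placeholder; note that the placeholder ignores the duration and influence inputs a real model would also use. All names and numbers are illustrative.

```python
from collections import Counter, defaultdict

def train_scheme_selector(rows, bucket):
    """rows: (features, duration, influence, adopted_strategy) tuples.
    bucket: function mapping features to a coarse lookup key."""
    votes = defaultdict(Counter)
    for features, _duration, _influence, strategy in rows:
        votes[bucket(features)][strategy] += 1

    def predict(features, duration, influence):
        # Placeholder keys only on the feature bucket; a real scheme
        # selection model would also consume duration and influence.
        counter = votes.get(bucket(features))
        if not counter:
            return None  # no training data for this bucket
        return counter.most_common(1)[0][0]

    return predict

bucket = lambda f: f[0] // 10  # crude bucketing on the first feature
rows = [((112, 4.0), 72, 0.8, 1), ((115, 3.8), 60, 0.7, 1),
        ((84, 1.2), 24, 0.1, 2)]
selector = train_scheme_selector(rows, bucket)
```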
In a second aspect: the present application provides a data processing method that processes data using the survival-state model and reinforcement learning model trained in the first aspect. The method includes: acquiring the user's current physiological characteristics and a plurality of candidate intervention strategies; inputting the current physiological characteristics together with each candidate strategy into the survival-state model to obtain the survival-state duration corresponding to each strategy; inputting the current physiological characteristics, the candidate strategies, and the duration corresponding to each strategy into the reinforcement learning model to obtain the influence value of each strategy on the user's survival-state duration; and determining the suggested intervention strategy according to the influence values.
With this design, inputting the user's physiological characteristics into the trained survival-state and reinforcement learning models yields the intervention strategy currently recommended for the user, so that the strategy adopted is the most favourable of the candidate strategies for this patient, providing a degree of assurance.
In an optional implementation of the second aspect, determining the suggested intervention strategy according to the influence values includes: in response to a user's operation instruction, selecting the suggested intervention strategy from the candidate strategies according to the influence values.
In an optional implementation of the second aspect, determining the suggested intervention strategy according to the influence values includes: sorting the influence values in descending order, and taking the intervention strategy corresponding to the top-ranked influence value as the suggested strategy.
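The second-aspect flow, including the descending sort of the optional implementation above, can be sketched end to end. The two trained models are replaced by stand-in lambdas; strategy ids, durations, and influence values are invented for illustration.

```python
def suggest_strategy(features, candidates, survival_model, influence_model):
    """Return (best_strategy, durations, influences) for candidate strategies."""
    # Survival-state model: duration for each candidate strategy.
    durations = {s: survival_model(features, s) for s in candidates}
    # Reinforcement learning model: influence value for each strategy.
    influences = {s: influence_model(features, s, durations[s])
                  for s in candidates}
    # Descending sort; the top-ranked strategy is the suggestion.
    ranked = sorted(candidates, key=lambda s: influences[s], reverse=True)
    return ranked[0], durations, influences

# Stand-in models for illustration only; real ones come from training.
survival_model = lambda features, s: {1: 72, 2: 48, 3: 24}[s]
influence_model = lambda features, s, d: {1: 0.4, 2: 0.9, 3: -0.1}[s]

best, durations, influences = suggest_strategy(
    (110, 4.0), [1, 2, 3], survival_model, influence_model)
```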
In a third aspect: the present application provides a data processing method that processes data using the scheme selection model trained in the first aspect, together with the survival-state model and reinforcement learning model. The method includes: acquiring the user's current physiological characteristics and a plurality of candidate intervention strategies; inputting the current physiological characteristics together with each candidate strategy into the survival-state model to obtain the survival-state duration corresponding to each strategy; inputting the current physiological characteristics, the candidate strategies, and the duration corresponding to each strategy into the reinforcement learning model to obtain the influence value of each strategy on the user's survival-state duration; and inputting the current physiological characteristics, the duration corresponding to each strategy, and each strategy's influence value into the scheme selection model to obtain the intervention strategy, selected from the candidate strategies, that the user is suggested to adopt currently.
With the method designed above, the scheme selection model directly outputs the intervention strategy the user is suggested to adopt currently, so that as soon as new user data are available the currently optimal intervention strategy can be given immediately, saving the time otherwise spent judging and selecting a strategy. This is especially important in critical-care rescue, where every minute and second counts.
In a fourth aspect: the present application provides a data processing method, including: acquiring the user's current physiological characteristics and a plurality of candidate intervention strategies; inputting the current physiological characteristics and the candidate strategies into a pre-trained survival-state model to obtain the survival-state duration corresponding to each strategy, where the survival-state duration is the length of time between the current time point and the time point at which the user's survival state changes; inputting the current physiological characteristics, the candidate strategies, and the duration corresponding to each strategy into a pre-trained reinforcement learning model to obtain the influence value of each strategy on the user's survival-state duration; and inputting the current physiological characteristics, the duration corresponding to each strategy, and each strategy's influence value into a pre-trained scheme selection model to obtain the intervention strategy, selected from the candidate strategies, that the user is suggested to adopt currently.
With the method designed above, the scheme selection model directly outputs the intervention strategy the user is suggested to adopt currently, so that as soon as new user data are available the currently optimal intervention strategy can be given immediately, saving the time otherwise spent judging and selecting a strategy. This is especially important in critical-care rescue, where every minute and second counts.
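The full three-model chain of the third and fourth aspects can be sketched in the same way, with all three pre-trained models replaced by illustrative stand-ins (a real deployment would load the models produced by the first-aspect training).

```python
def run_pipeline(features, candidates,
                 survival_model, influence_model, scheme_model):
    """Chain: survival-state model -> RL model -> scheme selection model."""
    durations = {s: survival_model(features, s) for s in candidates}
    influences = {s: influence_model(features, s, durations[s])
                  for s in candidates}
    # The scheme selection model emits the suggested strategy directly.
    return scheme_model(features, durations, influences)

# Stand-in models; the numeric relationships are invented for illustration.
survival_model = lambda f, s: 24 * s          # hours until state change
influence_model = lambda f, s, d: 1.0 / s      # influence on duration
scheme_model = lambda f, durations, influences: max(influences,
                                                    key=influences.get)

chosen = run_pipeline((120, 3.5), [1, 2, 3],
                      survival_model, influence_model, scheme_model)
```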
In a fifth aspect: the present application provides a model generation apparatus, comprising: an acquisition module, configured to acquire, for each of a plurality of users, the physiological characteristics at a preset time point, the survival-state duration corresponding to those physiological characteristics, and the intervention strategy adopted at that time point, where the survival-state duration is the length of time between the preset time point and the time point at which the user's survival state changes; and a training module, configured to train a survival-state model with the physiological characteristics at the preset time point and the intervention strategy adopted at that time point as inputs and the survival-state duration as output. The acquisition module is further configured to obtain the trained survival-state model after the training module has trained it.
The apparatus designed above trains a survival-state model with historical users' physiological characteristics at a preset time point and the intervention strategy adopted at that point as inputs and the survival-state duration as output. The model can reflect the individual condition of a specific patient, namely how far that patient currently is from the endpoint of death or from recovery and discharge, which overcomes the coarseness of characterizing survival state by mortality rate alone and allows the patient's current survival state to be predicted more comprehensively.
In an optional implementation of the fifth aspect, after the trained survival-state model is obtained, the acquisition module is further configured to acquire, for each of a plurality of users having the same physiological characteristics at the preset time point, the user's physiological characteristics, the intervention strategy the user adopted at the preset time point, the survival-state duration corresponding to the user, and the influence value of the adopted intervention strategy on the degree of change of the user's physiological characteristics. The training module is further configured to train a reinforcement learning model with the users' physiological characteristics, the intervention strategy each user adopted at the preset time point, and each user's corresponding survival-state duration as inputs, and the influence value of the adopted intervention strategy on the degree of change of each user's physiological characteristics as output. The acquisition module is configured to obtain the trained reinforcement learning model after the training module has trained it.
In an optional implementation of the fifth aspect, the training module is further configured to train a scheme selection model with each user's physiological characteristics at the preset time point, the survival-state duration corresponding to those characteristics, and the influence value of the intervention strategy adopted at the preset time point on the degree of change of the user's physiological characteristics as inputs, and the intervention strategy each user adopted at the preset time point as output. The acquisition module is configured to obtain the trained scheme selection model after the training module has trained it.
In a sixth aspect: the present application provides a data processing apparatus that processes data using the survival-state model and reinforcement learning model trained in the first aspect. The apparatus comprises: an acquisition module, configured to acquire the user's current physiological characteristics and a plurality of candidate intervention strategies; an input module, configured to input the user's current physiological characteristics together with each candidate intervention strategy into the survival-state model, the acquisition module being further configured to obtain the survival-state duration corresponding to each intervention strategy thereafter; the input module being further configured to input the user's current physiological characteristics, the candidate intervention strategies, and the survival-state duration corresponding to each strategy into the reinforcement learning model, the acquisition module being further configured to obtain the influence value of each intervention strategy on the user's survival-state duration thereafter; and a determining module, configured to determine the suggested intervention strategy according to the influence values.
With the apparatus designed above, inputting the user's physiological characteristics into the trained survival-state and reinforcement learning models yields the intervention strategy currently recommended for the user, so that the strategy adopted is the most favourable of the candidate strategies for this patient, providing a degree of assurance.
In a seventh aspect: the present application provides a data processing apparatus, comprising: an acquisition module, configured to acquire the user's current physiological characteristics and a plurality of candidate intervention strategies; an input module, configured to input the user's current physiological characteristics together with each candidate intervention strategy into a pre-trained survival-state model, the acquisition module being further configured to obtain the survival-state duration corresponding to each intervention strategy thereafter, where the survival-state duration is the length of time between the current time point and the time point at which the user's survival state changes; the input module being further configured to input the user's current physiological characteristics, the candidate intervention strategies, and the survival-state duration corresponding to each strategy into a pre-trained reinforcement learning model, the acquisition module being further configured to obtain the influence value of each intervention strategy on the user's survival-state duration thereafter; and the input module being further configured to input the user's current physiological characteristics, the survival-state duration corresponding to each strategy, and each strategy's influence value on the user's survival-state duration into a pre-trained scheme selection model, the acquisition module being further configured to obtain the intervention strategy, selected from the candidate strategies, that the user is suggested to adopt currently.
With the apparatus designed above, the scheme selection model directly outputs the intervention strategy the user is suggested to adopt currently, so that as soon as new user data are available the currently optimal intervention strategy can be given immediately, saving the time otherwise spent judging and selecting a strategy. This is especially important in critical-care rescue, where every minute and second counts.
In an eighth aspect: the present application further provides an electronic device, comprising a processor and a memory connected to the processor, the memory storing machine-readable instructions executable by the processor. When the electronic device runs, the processor executes the machine-readable instructions to perform the method of any one of the first to fourth aspects, or of any optional implementation thereof.
In a ninth aspect: the present application provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, performs the method of any one of the first to fourth aspects, or of any optional implementation thereof.
In a tenth aspect: the present application provides a computer program product which, when run on a computer, causes the computer to perform the method of any one of the first to fourth aspects, or of any optional implementation thereof.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should therefore not be regarded as limiting its scope; those skilled in the art can derive other related drawings from these drawings without inventive effort.
Fig. 1 is a first flowchart of a model generation method according to a first embodiment of the present application;
Fig. 2 is a second flowchart of the model generation method according to the first embodiment of the present application;
Fig. 3 is a third flowchart of the model generation method according to the first embodiment of the present application;
Fig. 4 is a schematic flowchart of a data processing method according to a second embodiment of the present application;
Fig. 5 is a schematic flowchart of a data processing method according to a third embodiment of the present application;
Fig. 6 is a schematic flowchart of a data processing method according to a fourth embodiment of the present application;
Fig. 7 is a schematic structural diagram of a model generation apparatus according to a fifth embodiment of the present application;
Fig. 8 is a schematic structural diagram of a data processing apparatus according to a sixth embodiment of the present application;
Fig. 9 is a schematic structural diagram of a data processing apparatus according to a seventh embodiment of the present application;
Fig. 10 is a schematic structural diagram of a data processing apparatus according to an eighth embodiment of the present application;
Fig. 11 is a schematic structural diagram of an electronic device according to a ninth embodiment of the present application.
Detailed Description
To facilitate understanding by those skilled in the art, the words in the embodiments of the application are explained and illustrated below.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, or the part of it that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application.
First embodiment
As shown in fig. 1, the present application provides a model generation method, including:
step S100: the method comprises the steps of obtaining physiological characteristics of a preset time point of each user in a plurality of users, survival state duration corresponding to the physiological characteristics of the preset time point and an intervention strategy adopted by the preset time point, wherein the survival state duration is the duration of the preset time point and the time point of change of the survival state of the user.
Step S102: and training the survival state model by taking the physiological characteristics of the preset time point and the intervention strategy adopted by the preset time point as input quantities and taking the survival state duration as output quantities to obtain the trained survival state model.
In S100, the physiological characteristics of each user at the preset time point are a plurality of physiological characteristics corresponding to some time point during the day for that user. The survival-state duration is the length of time between the preset time point and the time point at which the user's survival state changes. A change of survival state means, for example, that a user who is a patient is discharged or dies, so the survival-state duration is the time remaining, from the time point corresponding to the patient's physiological characteristics, until discharge or until death. An intervention strategy is a means of intervening in a patient's physiological characteristics, for example a fluid replacement strategy: different fluid replacement options exist for different types of disease at different stages of disease progression.
Step S100 supplies the training data. The following example takes the users to be patients; the specific implementation is as follows:
All data were obtained from Hospital Information Systems (HIS) and Laboratory Information Systems (LIS) and were summarized and categorized by patient number. The resulting data set can be expressed as: C = {S_i}, i = 1, 2, ..., n, corresponding in turn to each patient's data set;
S_i = {D_t}, t = t_1, t_2, ..., t_k, ..., corresponding in turn to the patient's data set at each time point;
D_t = {v_l}, l = l_1, l_2, ..., l_j, ..., corresponding in turn to the value of each feature item of the patient at that time point, wherein the values of the feature items at a time point include the physiological characteristics of the preset time point in S100 and the intervention strategy adopted at the preset time point.
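The nested layout C = {S_i}, S_i = {D_t}, D_t = {v_l} can be sketched as plain dictionaries; the patient IDs, time stamps, and feature names below are hypothetical illustrations, not from the patent:

```python
# Hypothetical illustration of the data set layout C = {S_i}:
# each patient maps to per-time-point records D_t of feature values v_l.
C = {
    "patient_001": {                  # S_1: one patient's data set
        "2019-06-17T08:00": {         # D_t: record at one time point
            "heart_rate": 88,         # v_l: physiological feature values
            "systolic_bp": 121,
            "fluid_given_ml": 250,    # intervention adopted at this time
        },
        "2019-06-17T09:00": {
            "heart_rate": 92,
            "systolic_bp": 118,
            "fluid_given_ml": 0,
        },
    },
}

n_patients = len(C)                   # size of C
n_timepoints = len(C["patient_001"])  # size of S_1
```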
Preprocessing the patient data includes: unifying the range and units of the same kind of data; aggregating all of the patient's data by time point in units of one hour; where a time point has a plurality of values, merging them according to the nature of the data; and where missing values exist, deciding according to the nature of the data whether to interpolate them using an unsupervised clustering technique.
The specific implementation process of interpolating data items whose missing rate is not more than 85% using the unsupervised clustering technique is as follows:
The first step: let the data for patient i be represented as S_i = t × v, where t = {t_1, t_2, ..., t_k, ...} corresponds to the sampling time points and v = {v_1, v_2, ..., v_l, ...} corresponds to the values of the feature items.
The second step: the values of a feature item can be divided into a collected part and a missing part, v = {(v_e, t_e), (v_m, t_m)}, where (v_e, t_e) represents the collected part of the feature item and (v_m, t_m) represents the missing part.
The third step: perform unsupervised cluster analysis on (v_e, t_e) to obtain a cluster model, Model = KNN(v_e, t_e); the missing feature values can then be calculated as v_m = Model(t_m).
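The three interpolation steps above can be sketched as follows; a hand-rolled k-nearest-neighbour average over time stands in for the KNN cluster model, and all data values are hypothetical:

```python
# Sketch of the third step: estimate missing feature values v_m from the
# collected (t_e, v_e) pairs. A simple k-nearest-neighbour average over
# time stands in for the unsupervised KNN model in the text.

def knn_interpolate(observed, missing_times, k=2):
    """observed: list of (time, value) pairs; returns {time: estimate}."""
    estimates = {}
    for tm in missing_times:
        # the k observed points closest in time to the missing point
        nearest = sorted(observed, key=lambda tv: abs(tv[0] - tm))[:k]
        estimates[tm] = sum(v for _, v in nearest) / len(nearest)
    return estimates

observed = [(1, 80.0), (2, 84.0), (4, 90.0)]  # (t_e, v_e) pairs
filled = knn_interpolate(observed, missing_times=[3], k=2)
```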
Since the patient data is a historical data set, the time length from each time point to the patient's discharge or death is definite but irregularly distributed. Assuming the survival duration is measured day by day, it is expressed as the time length between the date of the preset time point and the date on which the user's survival state changes. The survival states are therefore defined as s = {s_i}, i = 1, 2, ..., 10, where: s_1 means the patient may be discharged within 3 days; s_2, discharged within 3-10 days; s_3, discharged within 10-30 days; s_4, discharged within 30-90 days; s_5, discharged after 90 days; s_6, the patient will die after 90 days; s_7, die within 30-90 days; s_8, die within 10-30 days; s_9, die within 3-10 days; s_10, die within 3 days.
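The ten buckets can be expressed as a small mapping from the number of days until the state change and the outcome to a bucket index; the function name and signature are illustrative assumptions, not from the patent:

```python
# Hedged sketch of the ten survival-state buckets s1..s10 defined above:
# discharge buckets count up from s1, death buckets count down from s10,
# over the shared day thresholds 3 / 10 / 30 / 90.

def survival_state(days, discharged):
    """Map (days until state change, discharged?) to a bucket index 1..10."""
    bounds = [3, 10, 30, 90]              # day thresholds for both outcomes
    idx = sum(days > b for b in bounds)   # 0..4: which interval days falls in
    return 1 + idx if discharged else 10 - idx

state1 = survival_state(2, discharged=True)    # discharged within 3 days
state2 = survival_state(45, discharged=False)  # dies within 30-90 days
```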
On the basis of the foregoing, step S102 is executed to start the model training phase, which is as follows:
The patient data sets are divided into two groups: C = {C_train, C_test}, where C_train is the training group and C_test is the test group. Both groups contain the physiological characteristics of each patient at a preset time point, the intervention strategy adopted at the preset time point, and the patient's survival state duration at the preset time point.
The physiological characteristics of each patient at a preset time point and the intervention strategy at the preset time point are used as input, and the survival state duration at the preset time point is used as output. A Deep Belief Network (DBN) is used to train on the training group's data to obtain a training result, the training result is then verified against the test group's data, and finally the survival state model is obtained.
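A minimal sketch of this training stage, under stated assumptions: the records below are hypothetical, the data are split into C_train and C_test, and a 1-nearest-neighbour classifier stands in for the deep belief network (the patent's actual DBN training is not reproduced here):

```python
# Stand-in for the training stage: split records into training and test
# groups, fit a classifier mapping (physiological features + intervention)
# to the survival-state bucket, and score it on the held-out group.

records = [
    # (features: [heart_rate, systolic_bp, fluid_ml], survival state s_i)
    ([88, 121, 250], 1),
    ([92, 118, 0], 2),
    ([110, 95, 500], 9),
    ([105, 99, 500], 9),
]
train, test = records[:3], records[3:]  # C = {C_train, C_test}

def predict(features):
    # 1-nearest-neighbour by squared Euclidean distance over the training group
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    return min(train, key=dist)[1]

accuracy = sum(predict(f) == s for f, s in test) / len(test)
```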
According to the method designed by this scheme, a survival state model is obtained by training with the physiological characteristics of historical users at preset time points and the intervention strategies adopted at those time points as input, and the survival state duration as output. The model reflects the individual state of a specific patient, namely both the current time distance to the death end point and the current time distance to recovery and discharge, solving the problem that describing survival state by mortality rate alone is too coarse and allowing the patient's current survival state to be predicted more comprehensively.
In an alternative embodiment of the first aspect, after obtaining the trained survival state model at S102, as shown in fig. 2, the method further includes:
s104: obtaining, for each of a plurality of users having the same physiological characteristics, the physiological characteristics at a preset time point, the intervention strategy adopted at the preset time point, the survival state duration corresponding to each user, and the influence value of the adopted intervention strategy on the degree of change of each user's physiological characteristics.
S106: and training the reinforcement learning model by taking the multiple physiological characteristics of the multiple users with the same physiological characteristics at the preset time point, the intervention strategy adopted by each user at the preset time point and the corresponding survival state duration of each user as input quantities and taking the influence value of the adopted intervention strategy on the change degree of the physiological characteristics of each user as output quantities to obtain the reinforcement learning model after training.
In S104, the physiological characteristics of each of the multiple users having the same physiological characteristics at the preset time point, the intervention strategy adopted by each user at the preset time point, and the survival state duration corresponding to each user are all contained in the patient data set. After the intervention strategy adopted by each user at the preset time point is obtained, a fluid replacement strategy is defined as A = {a_i}, i = 1, 2, ..., 9, where: a_1, under-replacement value greater than 2000 ml; a_2, under-replacement value between 1000 and 2000 ml; a_3, under-replacement value between 500 and 1000 ml; a_4, under-replacement value between 200 and 500 ml; a_5, between an under-replacement value of 200 ml and an over-replacement value of 200 ml; a_6, over-replacement value between 200 and 500 ml; a_7, over-replacement value between 500 and 1000 ml; a_8, over-replacement value between 1000 and 2000 ml; a_9, over-replacement value greater than 2000 ml.
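The nine fluid replacement buckets can be sketched as a discretization of a net fluid balance in ml; the sign convention (negative for under-replacement, positive for over-replacement) and the function name are assumptions for illustration:

```python
# Sketch of mapping a net fluid balance (ml; negative = under-replacement,
# positive = over-replacement) onto the nine buckets a1..a9 above.

def fluid_action(balance_ml):
    """Return the bucket index 1..9 for a net fluid balance in ml."""
    edges = [-2000, -1000, -500, -200, 200, 500, 1000, 2000]
    # count how many bucket edges the balance exceeds
    return 1 + sum(balance_ml > e for e in edges)

a = fluid_action(-1500)  # 1000-2000 ml under-replacement -> a2
b = fluid_action(0)      # within +/- 200 ml              -> a5
c = fluid_action(2500)   # more than 2000 ml over         -> a9
```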
The influence value of the adopted intervention strategy on the degree of change of each user's physiological characteristics can be obtained as follows: ExMCTS(v_i, v) = c_n × Q_n(v_i)/N(v_i) + c_e × Q_e(v_i)/N(v_i) + c × N(v_i)/N(v), where v = (s × a) represents the whole space of possible states and fluid replacement strategies; v_i = (s_k × a_n) represents a particular fluid replacement strategy selection in a particular survival state; N(v_i) is the total number of times the strategy node v_i is selected; N(v) is the total number of times all parent nodes of the strategy node v_i are selected; c_n, c_e and c are factor weights composed of the various rewards and penalties; Q_n(v_i) is the reward/penalty value of selecting fluid replacement strategy a_k in state (s_i × a_k), equivalent to a first-level MDP, considering only the reward/penalty within the next fluid replacement step; Q_e(v_i) is the back-propagated reward/penalty value (influence value) of the final outcome at the strategy node v_i = (s_i × a_k). Accordingly, Q_n(v_i)/N(v_i) is the reward/penalty value of the fluid replacement strategy adopted in the current state, representing whether the next stage brought by the strategy is an improvement of the physiological characteristics, a deterioration, or maintenance of the survival state: for example, an improvement is given a positive score and a deterioration a negative score, with larger scores for greater improvement or deterioration; Q_e(v_i)/N(v_i) represents the reward/penalty contributed by the final outcome to the fluid replacement strategy, and the larger outcome weight in the present invention reflects its emphasis on the final outcome; N(v_i)/N(v) represents an additive penalty on frequency of use.
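The ExMCTS score reconstructed above can be sketched directly from the formula; the weights c_n, c_e and c, and all counts, are hypothetical, with a negative c expressing the frequency-of-use penalty:

```python
# Hedged sketch of the ExMCTS node score:
#   ExMCTS(v_i, v) = c_n * Q_n(v_i)/N(v_i) + c_e * Q_e(v_i)/N(v_i)
#                  + c * N(v_i)/N(v)
# Default weights are illustrative: c_e > c_n emphasises the final outcome,
# and the negative c penalises frequently used nodes.

def exmcts_score(q_next, q_end, n_node, n_parent, c_n=1.0, c_e=2.0, c=-0.5):
    """Score a (state, fluid-strategy) node v_i.

    q_next:   accumulated next-step reward/penalty Q_n(v_i)
    q_end:    back-propagated final-outcome reward Q_e(v_i)
    n_node:   times v_i was selected, N(v_i)
    n_parent: total selections of v_i's parent nodes, N(v)
    """
    return (c_n * q_next / n_node
            + c_e * q_end / n_node
            + c * n_node / n_parent)

score = exmcts_score(q_next=4.0, q_end=6.0, n_node=4, n_parent=40)
```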
From the foregoing, the influence value of an intervention strategy on the degree of change of each user's physiological characteristics may be a positive or a negative score: a positive value represents a beneficial influence of the intervention strategy on the user's physiological characteristics, a negative value a harmful influence, and the magnitude of the score represents the degree of influence.
On this basis, the physiological characteristics of the multiple users having the same physiological characteristics at the preset time point, the intervention strategy adopted by each user at the preset time point, and the survival state duration corresponding to each user are used as input quantities, the influence value of the adopted intervention strategy on the degree of change of each user's physiological characteristics is used as the output quantity, and the reinforcement learning model is trained to obtain the trained reinforcement learning model. The reinforcement learning model can be trained by Monte Carlo tree search.
According to the method designed by the scheme, the trained reinforcement learning model can predict the influence of the intervention strategy adopted by the user at each time node on the physiological characteristics of the user, and then a beneficial intervention strategy can be selected according to the influence in subsequent application, so that the intervention strategy adopted by the user is more accurate and reliable.
In an alternative embodiment of the first aspect, after obtaining the trained survival state model at S102, as shown in fig. 3, the method further includes:
s108: training the scheme selection model by taking the physiological characteristics of each user at a preset time point, the survival state duration corresponding to the physiological characteristics at the preset time point, and the influence value of the intervention strategy adopted at the preset time point on the degree of change of each user's physiological characteristics as input quantities, and taking the intervention strategy adopted by each user at the preset time point as the output quantity, to obtain the trained scheme selection model.
The training process of S108 is consistent with the training process of the survival state model: the deep belief network is likewise used for training, differing only in the input and output data, and it is not described again here.
Second embodiment
As shown in fig. 4, the present application provides a data processing method applied to a server, for performing data processing by using a survival state model and a reinforcement learning model obtained by training in the first embodiment, the method includes:
step S200: the method comprises the steps of obtaining current physiological characteristics of a user and a plurality of intervention strategies to be selected.
Step S202: and inputting the current physiological characteristics of the user and each intervention strategy in the plurality of intervention strategies to be selected into the survival state model to obtain the survival state duration corresponding to each intervention strategy.
Step S204: inputting the current physiological characteristics of the user, the plurality of intervention strategies to be selected, and the survival state duration corresponding to each intervention strategy into the reinforcement learning model to obtain the influence value of each intervention strategy on the survival state duration of the user.
Step S206: and determining the suggested intervention strategy according to the influence value.
In step S200, the current physiological characteristics of the user may be detected by physiological characteristic detection devices and detection means, and the plurality of intervention strategies to be selected may be the fluid replacement strategies A = {a_i}, i = 1, 2, ..., 9 defined in the first embodiment.
Based on the description of S200, step S202 is executed: the current physiological characteristics of the user and each of the plurality of intervention strategies to be selected are input into the survival state model, which is the survival state model trained in the first embodiment; the model automatically outputs the survival state duration of the user under each intervention strategy. The survival state duration may differ from one intervention strategy to another.
On this basis, S204 is executed: the current physiological characteristics of the user, the plurality of intervention strategies to be selected, and the survival state duration corresponding to each intervention strategy are input into the reinforcement learning model, which automatically outputs the influence value of each intervention strategy on the user's corresponding survival state duration. These influence values may likewise differ between strategies. S206 then determines the suggested intervention strategy for the user according to the influence values.
Step S206, determining the suggested intervention strategy according to the influence value, may proceed as follows. First, in response to an operation instruction of a user, an intervention strategy recommended for adoption is selected from the plurality of candidate intervention strategies according to the influence values. This can be understood as follows: after the server executes S204 and outputs the influence value of each intervention strategy on the user's survival state duration, a doctor observes these influence values on the server, judges which strategy to select according to them, and operates the server to confirm the intervention strategy selected from the candidates.
Secondly, after the server obtains the influence value of each intervention strategy on the user's survival state duration, a positive value indicates a beneficial influence on the patient's physiological characteristics, a negative value a harmful one, and the magnitude indicates the degree of influence. The influence values can therefore be sorted by magnitude; for example, if the influence values of 5 intervention strategies are 1.7, 2, -1.1, 1.2 and -2.2 respectively, the sequence from largest to smallest is 2, 1.7, 1.2, -1.1, -2.2. The intervention strategy corresponding to the top-ranked influence value is determined as the intervention strategy suggested for the user at this moment, i.e. in this example the intervention strategy with influence value 2.
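The ranking step can be sketched directly with the example influence values given above; the strategy labels are hypothetical:

```python
# Sort candidate intervention strategies by influence value and recommend
# the top-ranked one, using the worked example's five values.

impacts = {"a1": 1.7, "a2": 2.0, "a3": -1.1, "a4": 1.2, "a5": -2.2}

ranked = sorted(impacts, key=impacts.get, reverse=True)  # largest first
recommended = ranked[0]
```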
According to the method designed by this scheme, the intervention strategy recommended for the user at this moment can be obtained by inputting the user's physiological characteristics into the trained survival state model and reinforcement learning model, so that the adopted intervention strategy is the best of the candidate strategies for the patient, providing a certain guarantee.
Third embodiment
The application provides a data processing method, which is applied to a server and performs data processing by using a survival state model, a reinforcement learning model and a scheme selection model obtained by training in a first embodiment, as shown in fig. 5, the method includes:
s300: the method comprises the steps of obtaining current physiological characteristics of a user and a plurality of intervention strategies to be selected.
S302: and inputting the current physiological characteristics of the user and each intervention strategy in the plurality of intervention strategies to be selected into the survival state model to obtain the survival state duration corresponding to each intervention strategy.
S304: inputting the current physiological characteristics of the user, the plurality of intervention strategies to be selected, and the survival state duration corresponding to each intervention strategy into the reinforcement learning model to obtain the influence value of each intervention strategy on the survival state duration of the user.
S306: inputting the current physiological characteristics of the user, the survival state duration of the user corresponding to each intervention strategy, and the influence value of each intervention strategy on the user's survival state duration into the scheme selection model to obtain the intervention strategy, selected from the candidate intervention strategies, that is suggested for current adoption by the user.
S300 to S304 above are consistent with the implementation in the second embodiment and are not described again here. S306 embodies that the outputs of the survival state model and the reinforcement learning model, combined with the user's current physiological characteristics, serve as the inputs of the scheme selection model, so that the intervention strategy suggested for the user can be output directly by the scheme selection model. Thus, as soon as new user data is formed, the current optimal intervention strategy can be given immediately, which is particularly important in critical care rescue, where every minute and second is precious.
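The three-model chain of S300 to S306 can be sketched with trivial stand-in callables; only the data flow between the models reflects the text, and all function bodies and numbers are hypothetical:

```python
# Data flow of S300-S306: survival state model -> reinforcement learning
# model -> scheme selection model. The three models are replaced by
# trivial stand-ins; real models would be the trained networks.

def survival_model(features, strategy):          # stand-in for S302
    return 30 - 5 * strategy                     # hypothetical duration

def rl_model(features, strategy, duration):      # stand-in for S304
    return duration / 10.0 - strategy            # hypothetical impact

def scheme_model(features, durations, impacts):  # stand-in for S306
    return max(impacts, key=impacts.get)         # recommend best strategy

features = [88, 121]    # current physiological characteristics (S300)
candidates = [1, 2, 3]  # intervention strategies to be selected

durations = {a: survival_model(features, a) for a in candidates}
impacts = {a: rl_model(features, a, durations[a]) for a in candidates}
recommended = scheme_model(features, durations, impacts)
```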
Fourth embodiment
As shown in fig. 6, the present application provides a data processing method applied to a server, including:
s400: obtaining a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected.
S402: inputting a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected into a pre-trained survival state model, and obtaining the survival state duration corresponding to each intervention strategy, wherein the survival state duration is the duration between the current time point and the time point of the change of the survival state of the user;
s404: inputting the current physiological characteristics of a user, a plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into a pre-trained reinforcement learning model to obtain the influence value of each intervention strategy on the survival state duration of the user;
s406: inputting the current physiological characteristics of the user, the survival state duration of the user corresponding to each intervention strategy and the influence value of each intervention strategy on the survival state duration of the user into a pre-trained scheme selection model, and obtaining the intervention strategy selected from a plurality of intervention strategies to be selected and suggested to be currently adopted by the user.
The pre-trained survival state model, the pre-trained reinforcement learning model, and the pre-trained scheme selection model in S400 to S406 are all represented as the trained survival state model, the reinforcement learning model, and the scheme selection model in the first embodiment, and the training process is consistent with the training process in the first embodiment and is not described herein again. The processes executed in S400 to S406 are also the same as the execution process in the third embodiment, and are not described herein again.
Fifth embodiment
Fig. 7 shows a schematic block diagram of the model generation apparatus 5 provided in the present application, and it should be understood that the apparatus corresponds to the above-mentioned method embodiments of fig. 1 to 3, and can perform the steps involved in the method in the first embodiment, and the specific functions of the apparatus can be referred to the description above, and the detailed description is appropriately omitted here to avoid redundancy. The device includes at least one software function that can be stored in memory in the form of software or firmware (firmware) or solidified in the Operating System (OS) of the device. Specifically, the apparatus includes: an obtaining module 500, configured to obtain a physiological characteristic of each user at a preset time point in a plurality of users, a survival state duration corresponding to the physiological characteristic of the preset time point, and an intervention policy adopted at the preset time point, where the survival state duration is a duration between the preset time point and a time point at which a survival state of the user changes; a training module 502, configured to train a survival state model by using the physiological characteristics at a preset time point and an intervention strategy adopted at the preset time point as input quantities and using a survival state duration as an output quantity; the obtaining module 500 is further configured to obtain the trained survival state model after the training module 502 trains the survival state model.
The device designed by this scheme takes the physiological characteristics of historical users at preset time points and the intervention strategies adopted at those time points as input, and the survival state duration as output, to train and obtain the survival state model. The model reflects the individual state of a specific patient, namely the time distance to the death end point and the time distance to recovery and discharge, solving the problem that describing survival state by mortality rate alone is too coarse and allowing the patient's current survival state to be predicted more comprehensively.
In an optional implementation manner of the fifth embodiment, after obtaining the trained survival state model, the obtaining module 500 is further configured to obtain a physiological characteristic of each of a plurality of users having the same physiological characteristic at a preset time point, an intervention strategy adopted by each user at the preset time point, a survival state duration corresponding to each user, and an influence value of the adopted intervention strategy on a change degree of the physiological characteristic of each user. The training module 502 is further configured to train the reinforcement learning model by using the physiological characteristics of the multiple users with the same physiological characteristics at the preset time point, the intervention strategy adopted by each user at the preset time point, and the survival state duration corresponding to each user as input quantities, and using an influence value of the adopted intervention strategy on the change degree of the physiological characteristics of each user as an output quantity. The obtaining module 500 obtains the trained reinforcement learning model after the training module 502 trains the reinforcement learning model.
In an optional implementation manner of the fifth embodiment, the training module 502 is further configured to train the scenario selection model by using, as input quantities, the physiological characteristic of each of the multiple users at a preset time point, the survival state duration corresponding to the physiological characteristic at the preset time point, and an influence value of an intervention strategy adopted at the preset time point on the change degree of the physiological characteristic of each user, and using the intervention strategy adopted at the preset time point by each user as an output quantity. The obtaining module 500 obtains the trained scheme selection model after the training module 502 trains the scheme selection model.
Sixth embodiment
Fig. 8 shows a schematic block diagram of a data processing device 6 provided in the present application, and it should be understood that the device corresponds to the above-mentioned embodiment of the method in fig. 4, and can perform the steps involved in the method in the second embodiment, and the specific functions of the device can be referred to the description above, and the detailed description is appropriately omitted here to avoid redundancy. The device includes at least one software function that can be stored in memory in the form of software or firmware (firmware) or solidified in the Operating System (OS) of the device. Specifically, the apparatus includes: an obtaining module 600, configured to obtain a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected; an input module 602, configured to input, into the survival status model, a current physiological characteristic of the user and each of the plurality of intervention strategies to be selected; the obtaining module 600 is further configured to obtain a survival state duration corresponding to each intervention policy after the input module 602 inputs the current physiological characteristics of the user and each intervention policy of the plurality of intervention policies to be selected into the survival state model; the input module 602 is further configured to input the current physiological characteristics of the user, a plurality of intervention strategies to be selected, and a survival state duration corresponding to each intervention strategy into the reinforcement learning model; the obtaining module 600 is further configured to, after the input module 602 inputs the multiple intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the reinforcement learning model, obtain an influence value of each intervention strategy on the survival state duration of the 
user; a determining module 604, configured to determine the suggested intervention strategy according to the impact value.
With the device designed by this scheme, the intervention strategy recommended for the user at this moment can be obtained by inputting the user's physiological characteristics into the trained survival state model and reinforcement learning model, so that the adopted intervention strategy is the best of the candidate strategies for the patient, providing a certain guarantee.
Seventh embodiment
Fig. 9 shows a schematic block diagram of the data processing device 7 provided in the present application, and it should be understood that the device corresponds to the above-mentioned embodiment of the method in fig. 5, and can perform the steps involved in the method in the third embodiment, and the specific functions of the device can be referred to the description above, and the detailed description is appropriately omitted here to avoid redundancy. The device includes at least one software function that can be stored in memory in the form of software or firmware (firmware) or solidified in the Operating System (OS) of the device. Specifically, the apparatus includes: an obtaining module 700, configured to obtain a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected; an input module 702, configured to input the current physiological characteristics of the user and each intervention policy of the plurality of intervention policies to be selected into the survival status model; the obtaining module 700 is further configured to obtain a survival state duration corresponding to each intervention policy after the input module 702 inputs the current physiological characteristics of the user and each intervention policy of the plurality of intervention policies to be selected into the survival state model; the input module 702 is further configured to input the current physiological characteristics of the user, a plurality of intervention strategies to be selected, and a survival state duration corresponding to each intervention strategy into the reinforcement learning model; the obtaining module 700 is further configured to, after the input module 702 inputs the multiple intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the reinforcement learning model, obtain an influence value of each intervention strategy on the survival state 
duration of the user; the input module 702 is further configured to input the current physiological characteristics of the user, the survival state duration of the user corresponding to each intervention strategy, and the influence value of each intervention strategy on the user's survival state duration into the scheme selection model; the obtaining module 700 is further configured to obtain the intervention strategy, selected from the plurality of candidate intervention strategies, that is suggested for current adoption by the user.
With the device designed by this scheme, the intervention strategy suggested for current adoption by the user can be output directly by the scheme selection model, so that as soon as new user data is formed, the current optimal intervention strategy can be given immediately, saving the time of judging and selecting a strategy, which is particularly important in critical care rescue, where every minute and second is precious.
Eighth embodiment
Fig. 10 shows a schematic block diagram of a data processing device 8 provided in the present application. It should be understood that the device corresponds to the method embodiment of fig. 6 above and can execute the steps involved in the method of the fourth embodiment; for the specific functions of the device, reference may be made to the description above, and a detailed description is appropriately omitted here to avoid redundancy. The device includes at least one software functional module that can be stored in memory in the form of software or firmware, or solidified in the operating system (OS) of the device. Specifically, the apparatus includes: an obtaining module 800, configured to obtain a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected; an input module 802, configured to input the current physiological characteristics of the user and each intervention strategy of the plurality of intervention strategies to be selected into a pre-trained survival state model; the obtaining module 800 is further configured to obtain the survival state duration corresponding to each intervention strategy after the input module 802 inputs the current physiological characteristics of the user and each intervention strategy into the pre-trained survival state model; the input module 802 is further configured to input the current physiological characteristics of the user, the plurality of intervention strategies to be selected, and the survival state duration corresponding to each intervention strategy into a pre-trained reinforcement learning model; the obtaining module 800 is further configured to obtain an influence value of each intervention strategy on the survival state duration of the user after the input module 802 inputs the plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the reinforcement learning model; the input module 802 is further configured to input the current physiological characteristics of the user, the survival state duration of the user corresponding to each intervention strategy, and the influence value of each intervention strategy on the survival state duration of the user into a pre-trained scheme selection model; and the obtaining module 800 is further configured to obtain an intervention strategy that is selected from the plurality of intervention strategies to be selected and is suggested to be currently adopted by the user.
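The obtain→input→obtain flow of modules 800 and 802 above amounts to a three-stage inference pipeline: survival state model, then reinforcement learning model, then scheme selection model. A minimal sketch of that pipeline follows; the type aliases, feature names, strategy labels, and toy models are hypothetical stand-ins, since the application does not prescribe concrete model architectures:

```python
from typing import Callable, Dict, List

# Hypothetical signatures for the three pre-trained models of the embodiment;
# in a real device these would be loaded, trained models.
SurvivalModel = Callable[[Dict[str, float], str], float]
InfluenceModel = Callable[[Dict[str, float], str, float], float]
SelectionModel = Callable[[Dict[str, float], Dict[str, float], Dict[str, float]], str]

def recommend_intervention(
    features: Dict[str, float],    # current physiological characteristics
    strategies: List[str],         # intervention strategies to be selected
    survival_model: SurvivalModel,
    rl_model: InfluenceModel,
    selection_model: SelectionModel,
) -> str:
    # Stage 1: survival state duration predicted for each candidate strategy.
    durations = {s: survival_model(features, s) for s in strategies}
    # Stage 2: influence value of each strategy on the survival state duration.
    influences = {s: rl_model(features, s, durations[s]) for s in strategies}
    # Stage 3: the scheme selection model picks the strategy suggested to the user.
    return selection_model(features, durations, influences)

# Toy models for illustration only (not the application's trained models).
toy_survival = lambda f, s: f["heart_rate"] / 10 + (5.0 if s == "dialysis" else 2.0)
toy_rl = lambda f, s, d: d - 3.0
toy_select = lambda f, d, i: max(i, key=i.get)

best = recommend_intervention(
    {"heart_rate": 80.0}, ["dialysis", "medication"],
    toy_survival, toy_rl, toy_select,
)
print(best)
```

In the device, each stage would be backed by the pre-trained survival state, reinforcement learning, and scheme selection models rather than the toy callables used here.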
Ninth embodiment
As shown in fig. 11, the present application provides an electronic device including a processor 901 and a memory 902 connected to the processor 901, the memory 902 storing, in a storage medium 903, instructions executable by the processor 901. When the electronic device is running, the processor 901 executes the instructions in the storage medium 903 to perform the method in any one of the first to fourth embodiments or any optional implementation thereof.
The present application further provides a storage medium 903 on which a computer program is stored; when executed by a processor, the computer program performs the method in any one of the first to fourth embodiments or any optional implementation thereof.
The storage medium 903 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, a magnetic disk, or an optical disk.
The present application further provides a computer program product which, when run on a computer, causes the computer to perform the method in any one of the first to fourth embodiments or any optional implementation thereof.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
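The model generation method recited in claim 1 below trains a survival state model with the physiological characteristics at a preset time point plus the adopted intervention strategy as input quantities and the survival state duration as the output quantity. A minimal sketch of that supervised setup, using an ordinary least-squares fit with invented feature names and data purely for illustration (the application does not specify the model family or the features):

```python
import numpy as np

# Hypothetical training records: each row is the physiological characteristics
# of one user at a preset time point plus a one-hot code of the intervention
# strategy adopted at that time point.
X = np.array([
    # heart_rate, creatinine, strategy_a, strategy_b
    [80.0, 1.1, 1.0, 0.0],
    [95.0, 2.3, 0.0, 1.0],
    [72.0, 0.9, 1.0, 0.0],
    [88.0, 1.8, 0.0, 1.0],
])
# Label: the survival state duration, i.e. the time between the preset time
# point and the time point at which the user's survival state changes
# (discharge or death).
y = np.array([14.0, 6.0, 20.0, 9.0])

# Stand-in "survival state model": a least-squares linear fit.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict_duration(features_and_strategy):
    """Predicted survival state duration for one feature-plus-strategy row."""
    return float(np.asarray(features_and_strategy) @ coef)

# The 4x4 toy system is exactly determined, so this reproduces the label 14.0.
print(round(predict_duration([80.0, 1.1, 1.0, 0.0]), 1))
```

Repeating the same pattern with the influence value as the label gives the reinforcement-learning stage of claim 1, and with the adopted strategy as the label gives the scheme selection model of claim 2.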

Claims (14)

1. A method of model generation, the method comprising:
acquiring physiological characteristics of each user of a plurality of users at a preset time point, a survival state duration corresponding to the physiological characteristics at the preset time point, and an intervention strategy adopted at the preset time point, wherein the survival state duration is the duration between the preset time point and the time point at which the survival state of the user changes; the physiological characteristics of each user at the preset time point are expressed as a plurality of physiological characteristics corresponding to any time point within each user's day, and the survival state change is expressed as discharge or death of the user;
training a survival state model by taking the physiological characteristics at the preset time point and the intervention strategy adopted at the preset time point as input quantities and taking the survival state duration as an output quantity, to obtain the trained survival state model;
after obtaining the trained survival state model, the method further comprises:
acquiring the physiological characteristics of each user of the plurality of users having the same physiological characteristics at the preset time point, the intervention strategy adopted by each user at the preset time point, the survival state duration corresponding to each user, and the influence value of the adopted intervention strategy on the degree of change of the physiological characteristics of each user;
and training the reinforcement learning model by taking, as input quantities, the physiological characteristics of the plurality of users having the same physiological characteristics at the preset time point, the intervention strategy adopted by each user at the preset time point, and the survival state duration corresponding to each user, and taking, as an output quantity, the influence value of the adopted intervention strategy on the degree of change of the physiological characteristics of each user, to obtain the trained reinforcement learning model.
2. The method of claim 1, wherein after obtaining the trained survival model, the method further comprises:
and training the scheme selection model by taking, as input quantities, the physiological characteristics of each user at the preset time point, the survival state duration corresponding to the physiological characteristics at the preset time point, and the influence value of the intervention strategy adopted at the preset time point on the degree of change of the physiological characteristics of each user, and taking, as an output quantity, the intervention strategy adopted by each user at the preset time point, to obtain the trained scheme selection model.
3. A data processing method, wherein the survival state model trained by the method of claim 1 and the reinforcement learning model are used for data processing, and the method comprises:
acquiring current physiological characteristics of a user and a plurality of intervention strategies to be selected;
inputting the current physiological characteristics of the user and each intervention strategy in a plurality of intervention strategies to be selected into the survival state model to obtain the survival state duration corresponding to each intervention strategy;
inputting the current physiological characteristics of the user, a plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the reinforcement learning model to obtain the influence value of each intervention strategy on the survival state duration of the user;
and determining an intervention strategy which is suggested to be adopted according to the influence value.
4. The method of claim 3, wherein the determining, according to the influence value, an intervention strategy suggested to be adopted comprises:
and responding to an operation instruction of a user, and selecting an intervention strategy suggested to be adopted from a plurality of intervention strategies to be selected according to the influence value.
5. The method of claim 3, wherein the determining, according to the influence value, an intervention strategy suggested to be adopted comprises:
and sorting the influence values in descending order, and determining the intervention strategy corresponding to the highest-ranked influence value as the intervention strategy suggested to be adopted.
6. A data processing method, wherein the survival state model, the reinforcement learning model and the scheme selection model trained by the method of claim 2 are used for data processing, and the method comprises:
acquiring current physiological characteristics of a user and a plurality of intervention strategies to be selected;
inputting the current physiological characteristics of the user and each intervention strategy in a plurality of intervention strategies to be selected into the survival state model to obtain the survival state duration corresponding to each intervention strategy;
inputting the current physiological characteristics of the user, a plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the reinforcement learning model to obtain the influence value of each intervention strategy on the survival state duration of the user;
and inputting the current physiological characteristics of the user, the survival state duration of the user corresponding to each intervention strategy, and the influence value of each intervention strategy on the survival state duration of the user into the scheme selection model, to obtain the intervention strategy that is selected from the plurality of intervention strategies to be selected and is suggested to be currently adopted by the user.
7. A method of data processing, the method comprising:
acquiring a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected;
inputting the current plurality of physiological characteristics of the user and the plurality of intervention strategies to be selected into a pre-trained survival state model to obtain the survival state duration corresponding to each intervention strategy, wherein the survival state duration is the duration between the current time point and the time point at which the survival state of the user changes; the physiological characteristics of each user at a preset time point are expressed as a plurality of physiological characteristics corresponding to any time point within each user's day, and the survival state change is expressed as discharge or death of the user;
inputting the current physiological characteristics of the user, a plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into a pre-trained reinforcement learning model to obtain the influence value of each intervention strategy on the survival state duration of the user;
and inputting the current physiological characteristics of the user, the survival state duration of the user corresponding to each intervention strategy, and the influence value of each intervention strategy on the survival state duration of the user into a pre-trained scheme selection model, to obtain the intervention strategy that is selected from the plurality of intervention strategies to be selected and is suggested to be currently adopted by the user.
8. An apparatus for model generation, the apparatus comprising:
an acquisition module, configured to acquire the physiological characteristics of each user of a plurality of users at a preset time point, the survival state duration corresponding to the physiological characteristics at the preset time point, and the intervention strategy adopted at the preset time point, wherein the survival state duration is the duration between the preset time point and the time point at which the survival state of the user changes; the physiological characteristics of each user at the preset time point are expressed as a plurality of physiological characteristics corresponding to any time point within each user's day, and the survival state change is expressed as discharge or death of the user;
the training module is used for training the survival state model by taking the physiological characteristics of the preset time point and the intervention strategy adopted by the preset time point as input quantities and taking the survival state duration as an output quantity;
the acquisition module is further configured to acquire the trained survival state model after the training module trains the survival state model;
the acquisition module is configured to acquire, after obtaining the trained survival state model, the physiological characteristics of each user of a plurality of users having the same physiological characteristics at a preset time point, the intervention strategy adopted by each user at the preset time point, the survival state duration corresponding to each user, and the influence value of the adopted intervention strategy on the degree of change of the physiological characteristics of each user;
the training module is further used for training the reinforcement learning model by taking the physiological characteristics of a plurality of users with the same physiological characteristics at a preset time point, the intervention strategy adopted by each user at the preset time point and the corresponding survival state duration of each user as input quantities, and taking the influence value of the adopted intervention strategy on the change degree of the physiological characteristics of each user as output quantities;
the acquisition module is used for acquiring the reinforcement learning model after the reinforcement learning model is trained by the training module.
9. The device according to claim 8, wherein the training module is further configured to train the scheme selection model by taking, as input quantities, the physiological characteristics of each user of the plurality of users at a preset time point, the survival state duration corresponding to the physiological characteristics at the preset time point, and the influence value of the intervention strategy adopted at the preset time point on the degree of change of the physiological characteristics of each user, and taking, as an output quantity, the intervention strategy adopted by each user at the preset time point;
and the acquisition module is used for acquiring the trained scheme selection model after the training module trains the scheme selection model.
10. A data processing apparatus, wherein the survival state model and the reinforcement learning model trained by the method of claim 1 are used for data processing, the apparatus comprising:
the acquisition module is used for acquiring a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected;
the input module is used for inputting the current physiological characteristics of the user and each intervention strategy in a plurality of intervention strategies to be selected into the survival state model;
the obtaining module is further configured to obtain a survival state duration corresponding to each intervention strategy after the input module inputs each intervention strategy of the current physiological characteristics of the user and the plurality of intervention strategies to be selected into the survival state model;
the input module is further used for inputting a plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the reinforcement learning model;
the obtaining module is further configured to obtain an influence value of each intervention strategy on the survival state duration of the user after the input module inputs the current physiological characteristics of the user, the plurality of intervention strategies to be selected, and the survival state duration corresponding to each intervention strategy into the reinforcement learning model;
and the determining module is used for determining the suggested intervention strategy according to the influence value.
11. The apparatus of claim 10, wherein the determining, by the determining module, of the suggested intervention strategy according to the influence value comprises:
and responding to an operation instruction of a user, and selecting an intervention strategy suggested to be adopted from a plurality of intervention strategies to be selected according to the influence value.
12. The apparatus of claim 10, wherein the determining, by the determining module, of the suggested intervention strategy according to the influence value comprises:
and sorting the influence values in descending order, and determining the intervention strategy corresponding to the highest-ranked influence value as the intervention strategy suggested to be adopted.
13. A data processing apparatus, wherein the survival state model, the reinforcement learning model and the scheme selection model trained by the method of claim 2 are used for data processing, and the apparatus comprises:
the acquisition module is used for acquiring a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected;
the input module is used for inputting the current physiological characteristics of the user and each intervention strategy in the plurality of intervention strategies to be selected into the survival state model;
the obtaining module is further configured to obtain a survival state duration corresponding to each intervention strategy after the input module inputs each intervention strategy of the current physiological characteristics of the user and the plurality of intervention strategies to be selected into the survival state model;
the input module is further used for inputting the current physiological characteristics of the user, a plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the reinforcement learning model;
the obtaining module is further configured to obtain an influence value of each intervention strategy on the survival state duration of the user after the input module inputs the plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the reinforcement learning model;
the input module is further configured to input the current physiological characteristics of the user, the survival state duration of the user corresponding to each intervention strategy, and the influence value of each intervention strategy on the survival state duration of the user into the scheme selection model;
the obtaining module is further configured to obtain an intervention strategy selected from the multiple intervention strategies to be selected and suggested to be currently adopted by the user.
14. A data processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a plurality of current physiological characteristics of a user and a plurality of intervention strategies to be selected;
the input module is used for inputting the current physiological characteristics of the user and each intervention strategy in the plurality of intervention strategies to be selected into a pre-trained survival state model;
the obtaining module is further configured to obtain a survival state duration corresponding to each intervention strategy after the input module inputs each intervention strategy of the current physiological characteristics of the user and the plurality of intervention strategies to be selected into a pre-trained survival state model, where the survival state duration is a duration between a current time point and a time point at which the survival state of the user changes;
the input module is further used for inputting the current physiological characteristics of the user, a plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into a pre-trained reinforcement learning model;
the obtaining module is further configured to obtain an influence value of each intervention strategy on the survival state duration of the user after the input module inputs the plurality of intervention strategies to be selected and the survival state duration corresponding to each intervention strategy into the pre-trained reinforcement learning model;
the input module is further configured to input the current physiological characteristics of the user, the survival state duration of the user corresponding to each intervention strategy, and the influence value of each intervention strategy on the survival state duration of the user into a pre-trained scheme selection model;
the acquisition module is further configured to acquire an intervention strategy that is selected from the plurality of intervention strategies to be selected and is suggested to be currently adopted by the user; the physiological characteristics of each user at the preset time point are expressed as a plurality of physiological characteristics corresponding to any time point within each user's day, and the survival state change is expressed as discharge or death of the user.
CN201910520846.0A 2019-06-17 2019-06-17 Model generation method, data processing method and corresponding device Active CN110211701B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910520846.0A CN110211701B (en) 2019-06-17 2019-06-17 Model generation method, data processing method and corresponding device

Publications (2)

Publication Number Publication Date
CN110211701A CN110211701A (en) 2019-09-06
CN110211701B true CN110211701B (en) 2021-05-25

Family

ID=67792959

Country Status (1)

Country Link
CN (1) CN110211701B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915560A (en) * 2015-06-11 2015-09-16 万达信息股份有限公司 Method for disease diagnosis and treatment scheme based on generalized neural network clustering
CN105993016A (en) * 2014-02-04 2016-10-05 奥普蒂马塔公司 Method and system for prediction of medical treatment effect
CN107346369A (en) * 2017-05-11 2017-11-14 北京紫宸正阳科技有限公司 A kind of medical information processing method and device
CN109378065A (en) * 2018-10-30 2019-02-22 医渡云(北京)技术有限公司 Medical data processing method and processing device, storage medium, electronic equipment
CN109659033A (en) * 2018-12-18 2019-04-19 浙江大学 A kind of chronic disease change of illness state event prediction device based on Recognition with Recurrent Neural Network

Similar Documents

Publication Publication Date Title
Zhang ATTAIN: Attention-based time-aware LSTM networks for disease progression modeling.
JP6909078B2 (en) Disease onset prediction device, disease onset prediction method and program
CN110334843B (en) Time-varying attention improved Bi-LSTM hospitalization and hospitalization behavior prediction method and device
US9715657B2 (en) Information processing apparatus, generating method, medical diagnosis support apparatus, and medical diagnosis support method
US20140279754A1 (en) Self-evolving predictive model
CN111612278A (en) Life state prediction method and device, electronic equipment and storage medium
JP2020036633A (en) Abnormality determination program, abnormality determination method and abnormality determination device
WO2020172607A1 (en) Systems and methods for using deep learning to generate acuity scores for critically ill or injured patients
CN111640512A (en) Kidney replacement therapy starting strategy evaluation method and device and electronic equipment
Foucher et al. A semi‐Markov model for multistate and interval‐censored data with multiple terminal events. Application in renal transplantation
US20180004903A1 (en) Analysis system and analysis method
Srivastav et al. Predictive Machine Learning Approaches for Chronic Kidney Disease
Shaghaghi et al. evision: Influenza forecasting using cdc, who, and google trends data
CN110211701B (en) Model generation method, data processing method and corresponding device
EP4093270A1 (en) Method and system for personalized prediction of infection and sepsis
CN109994211B (en) Modeling method for chronic kidney disease worsening risk based on EHR data
Stival et al. Doubly-online changepoint detection for monitoring health status during sports activities
Agrawal et al. Lung transplant outcome prediction using unos data
Lin et al. A particle swarm optimization based classifier for liver disorders classification
CN115240843A (en) Fairness prediction system based on structure causal model
CN113782192A (en) Grouping model construction method based on causal inference and medical data processing method
CN113940640A (en) Cardiovascular disease risk control method, system and storage medium
Lopatka et al. Classification and Prediction of Diabetes Disease Using Modified k-neighbors Method
CN105912832A (en) Chronological change prediction system
Gatti Graphical models for continuous time inference and decision making

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant