WO2023284321A1 - Method and device for predicting survival hazard ratio

Publication number
WO2023284321A1
WO2023284321A1 PCT/CN2022/081403 CN2022081403W WO2023284321A1 WO 2023284321 A1 WO2023284321 A1 WO 2023284321A1 CN 2022081403 W CN2022081403 W CN 2022081403W WO 2023284321 A1 WO2023284321 A1 WO 2023284321A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
data
predicted
survival
network
Application number
PCT/CN2022/081403
Other languages
French (fr)
Chinese (zh)
Inventor
乔楠
林歆远
徐迟
Original Assignee
华为云计算技术有限公司
Application filed by 华为云计算技术有限公司
Publication of WO2023284321A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular to a method and device for predicting a survival risk ratio (hazard ratio, HR).
  • Survival analysis refers to a series of statistical methods used to explore the time at which a target event occurs, for example, survival time analysis of cancer patients or failure time analysis of equipment.
  • An analysis model can be established based on data obtained from prior investigation or experiments. Based on one or more characteristic variables that affect the occurrence of the target event, the analysis model can predict the effect of those characteristic variables on the survival curve of the target event, thereby achieving survival analysis of the target event.
  • One example of such an analysis model is the Cox proportional hazards regression model (Cox proportional hazards model, coxPH).
  • the risk of the target event occurring at different times may reflect the survival curve for the observed events.
  • the ending event of the observation event is the target event.
  • The coxPH model can be expressed as formula (1):
    $$h(t) = h_0(t)\,\exp(b_1 x_1 + b_2 x_2 + \dots + b_p x_p) \tag{1}$$
  • t is the survival time.
  • h(t) is the risk function of the target event, which represents the death risk of the target event when the survival time is t.
  • h_0(t) represents the base risk function, which is usually determined in advance through the survival curves of a large number of samples.
  • x_1, x_2, ..., x_p represent p covariates, that is, characteristic variables that affect the target event to be predicted.
  • b_1, b_2, ..., b_p represent the regression coefficient of each covariate.
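  • As a minimal sketch of formula (1), the hazard at one time point can be computed from a baseline value and a linear combination of covariates. The function name `cox_hazard` and all numbers below are illustrative assumptions, not values from the application.

```python
import math

def cox_hazard(baseline_hazard, coefficients, covariates):
    """Evaluate h(t) = h0(t) * exp(b1*x1 + ... + bp*xp) for one value of h0(t)."""
    if len(coefficients) != len(covariates):
        raise ValueError("need exactly one regression coefficient per covariate")
    # Linear predictor b1*x1 + ... + bp*xp
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_hazard * math.exp(linear_predictor)

# Hypothetical example: h0(t) = 0.01 and two covariates.
h = cox_hazard(0.01, [0.5, -0.2], [1.0, 2.0])
```

  • With all coefficients equal to zero the function returns the baseline value unchanged, which matches the role of h_0(t) as the hazard of a reference sample.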
  • The coxPH model is a linear model; that is, it can only be used to analyze data in which the input features and the learning objective (i.e., the risk of occurrence of the target event) are linearly related.
  • In practice, however, the characteristic variables usually affect the occurrence of the target event nonlinearly. The linear coxPH model therefore cannot accurately perform survival analysis on the target event, and how to improve the accuracy of survival analysis is a technical problem to be solved urgently in the prior art.
  • the present application provides a method and device for predicting survival risk rate, which can improve the accuracy of survival analysis.
  • the present application provides a method for predicting a survival risk rate, the method comprising: acquiring data of a sample to be predicted.
  • the data of the sample to be predicted is input into the preset model, and the data of the sample to be predicted is processed by the preset model to obtain the survival risk rate HR used to represent the survival risk of the sample to be predicted.
  • The preset model includes a gating network and a plurality of expert networks. The gating network is used to determine the weight coefficient corresponding to each expert network according to the data of the sample to be predicted, and the survival risk rate output by the preset model is the result obtained by weighting and summing the output values of the multiple expert networks according to the weight coefficient corresponding to each expert network.
  • Since the preset model includes multiple expert networks and a gating network used to determine the weight coefficients of the expert networks, the preset model can integrate the results output by the multiple expert networks according to the data of the sample to be predicted. Therefore, the accuracy of the survival risk rate predicted by the preset model is higher, and the accuracy of the survival curve determined based on the survival risk rate is also higher. Moreover, the preset model can be trained based on an end-to-end training method.
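  • The weighted summation described above can be sketched as follows. This is only an illustration under stated assumptions: the gating network is taken to be a linear layer followed by a softmax, and each expert is a single linear unit with an exponential output, whereas the application describes experts built from RFCNs. The names `predict_hazard_ratio`, `gate_params` and `expert_params` are hypothetical.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict_hazard_ratio(x, gate_params, expert_params):
    """Weighted sum of expert outputs, with weights produced by the gating network."""
    # Assumption: one gating score per expert, normalised with softmax.
    weights = softmax([linear(w, b, x) for w, b in gate_params])
    # Assumption: each toy expert outputs exp(linear score).
    outputs = [math.exp(linear(w, b, x)) for w, b in expert_params]
    hr = sum(g * o for g, o in zip(weights, outputs))
    return hr, weights

x = [1.0, 2.0]
gate_params = [([0.0, 0.0], 0.0), ([0.0, 0.0], 0.0)]    # equal gating scores
expert_params = [([0.1, 0.0], 0.0), ([0.0, 0.0], 0.0)]  # two toy experts
hr, weights = predict_hazard_ratio(x, gate_params, expert_params)
```

  • Because the weights depend on the input x, different samples can rely on different experts, which is the property the paragraph above attributes to the gating network.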
  • the above method further includes: determining the risk function of the sample to be predicted based on the above survival risk rate and the baseline risk function, and the risk function is used to indicate the survival rate of the sample to be predicted at different times.
  • The survival risk rate is the survival risk rate of the sample to be predicted, obtained after the above preset model processes the data of the sample to be predicted.
  • In this way, survival analysis of the sample to be predicted is realized. Since the survival risk rate predicted by the method provided by this application is highly accurate, the risk function determined from it, which indicates the survival rate of the sample to be predicted at different times, is also relatively accurate.
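  • One way to combine a predicted survival risk rate with a baseline function is sketched below. It assumes the proportional-hazards relationship of the coxPH model, under which the hazard is scaled by HR and the survival function becomes S(t) = S_0(t)^HR. The function names and the baseline values are hypothetical.

```python
def hazard_curve(baseline_hazard, hazard_ratio):
    """h(t) = h0(t) * HR at each tabulated time point."""
    return [h0 * hazard_ratio for h0 in baseline_hazard]

def survival_curve(baseline_survival, hazard_ratio):
    """Under proportional hazards, S(t) = S0(t) ** HR at each time point."""
    return [s0 ** hazard_ratio for s0 in baseline_survival]

# Hypothetical baseline survival at three time points and a predicted HR of 2.
baseline_survival = [1.0, 0.9, 0.81]
predicted = survival_curve(baseline_survival, 2.0)
```

  • An HR above 1 pushes the curve below the baseline and an HR below 1 lifts it above the baseline, so a more accurate HR directly yields a more accurate curve.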
  • Any expert network among the plurality of expert networks in the above preset model includes at least one candidate residual fully connected neural network (RFCN), and the output value of any expert network is the output value that satisfies a preset condition among the output values of the at least one candidate RFCN.
  • the data of the samples to be predicted include non-Euclidean data.
  • Non-Euclidean data is data that is arranged irregularly. In practical applications, the amount of non-Euclidean data is huge and its structure is complex. Usually, the relationship between the non-Euclidean data in the data of the sample to be predicted and the survival rate of the sample to be predicted is nonlinear. Through this possible design, the method provided in the embodiment of the present application can process and analyze sample data that includes non-Euclidean data.
  • The above method further includes: based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, explaining the preset model to obtain the impact of different characteristic data in the data of the sample to be predicted on the survival risk rate.
  • When the sample to be predicted is a patient and the data of the sample is the patient's case data, explaining the preset model in this way includes: based on the patient's case data and the patient's survival risk rate, explaining the preset model to obtain the impact of different characteristic data in the patient's case data on the patient's survival risk rate.
  • When the sample to be predicted is equipment and the data of the sample is the data of the equipment, explaining the preset model in this way includes: based on the data of the equipment and the survival risk rate of the equipment, explaining the preset model to obtain the impact of different characteristic data in the equipment data on the survival risk rate of the equipment.
  • Domain experts can guide practice based on the impact of different features on the survival risk rate of the sample. For example, for a patient, clinicians can adjust the patient's clinical treatment plan based on the impact of different treatment data in the patient's case data on the patient's survival risk rate. For another example, for equipment, engineers can improve and optimize the equipment based on the impact of different characteristic data of the equipment on the survival risk rate of the equipment.
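  • The text above does not say which explanation technique is used. As a generic illustration only, and not the method of the application, the sketch below measures how much the predicted survival risk rate changes when one feature at a time is replaced by a reference value. The model `toy_predict` and the feature names are hypothetical.

```python
import math

def feature_impacts(predict, sample, reference):
    """Change in the predicted risk rate when each feature is reset to a reference value."""
    base = predict(sample)
    impacts = {}
    for name in sample:
        perturbed = dict(sample)
        perturbed[name] = reference[name]
        # Positive impact: the feature's actual value raises the predicted risk.
        impacts[name] = base - predict(perturbed)
    return impacts

def toy_predict(s):
    # Hypothetical stand-in for the preset model.
    return math.exp(0.8 * s["tumor_size"] - 0.5 * s["treated"])

impacts = feature_impacts(
    toy_predict,
    sample={"tumor_size": 2.0, "treated": 1.0},
    reference={"tumor_size": 0.0, "treated": 0.0},
)
```

  • In this toy case the size feature has a positive impact and the treatment feature a negative one, which is the kind of per-feature signal a clinician or engineer could act on.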
  • the above method further includes: using training sample data to train an initial model to obtain a preset model.
  • the initial model includes an initial gating network and multiple initial expert networks.
  • the training of the initial model by using the data of the training samples includes: inputting the data of the training samples into the initial gating network and multiple initial expert networks in the initial model.
  • the weight coefficient of each initial expert network is obtained according to the initial gating network, and the output values of multiple initial expert networks are weighted and summed according to the corresponding weight coefficient of each initial expert network to obtain the predicted survival risk rate of the training sample.
  • a loss function is determined based on the predicted survival hazard rates of the training samples and the survival data of the training samples.
  • the network parameters of an initial gating network and multiple initial expert networks are tuned based on a loss function.
  • the survival data of the training sample includes the time when the training sample is observed, and the survival status of the training sample at this time.
  • the time for observing the training sample may be the survival time of the training sample, or any time after the initial event of the training sample occurs and before the ending event occurs.
  • the start event and the end event of the training sample are related to the application scenario of the preset model trained by the training sample. For example, when the survival risk rate predicted by the preset model is used to study the efficacy of anticancer drugs, the initial event of the training sample can be that the patient starts to take the anticancer drug, and the final event can be the death of the patient.
  • For another example, the initial event of the training sample can be the operation on the patient, and the final event can be the death of the patient.
  • For another example, when the survival risk rate predicted by the preset model is used to study the life of a device, the initial event of the training sample can be the delivery of the device/part, and the final event can be the failure of the device, and so on.
  • the survival state of the training sample includes two states of survival and death of the training sample. In this way, through the two possible designs, the preset model used in predicting the survival risk rate provided by the present application can be obtained through end-to-end training.
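  • The application does not name the loss function at this point. A common choice for this kind of training is the negative Cox partial log-likelihood, sketched below as an assumption. Here `log_risks` are hypothetical model outputs (the logarithm of the predicted survival risk rate), `times` are the observation times, and `events` is 1 for death and 0 for survival at that time.

```python
import math

def cox_partial_nll(log_risks, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events."""
    total, n_events = 0.0, 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # samples still alive only contribute through the risk sets
        # Risk set: every sample still under observation at time t_i.
        risk_set = [math.exp(r) for r, t in zip(log_risks, times) if t >= t_i]
        total += log_risks[i] - math.log(sum(risk_set))
        n_events += 1
    return -total / max(n_events, 1)

times = [5.0, 3.0, 8.0]
events = [1, 1, 0]
good = cox_partial_nll([0.0, 1.0, -1.0], times, events)   # earliest death ranked riskiest
bad = cox_partial_nll([0.0, -1.0, 1.0], times, events)    # ranking reversed
```

  • The loss is lower when samples that die earlier receive higher predicted risk, so minimising it with gradient descent adjusts the gating network and the expert networks together, in line with the end-to-end training described above.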
  • the present application provides a device for predicting survival risk.
  • the device for predicting the survival risk rate is used to implement any one of the methods provided in the first aspect above.
  • the present application may divide the device for predicting survival risk into functional modules according to any one of the methods provided in the first aspect above.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the present application may divide the apparatus for predicting the survival risk rate into an acquisition unit, a processing unit, and the like according to functions.
  • The device for predicting the survival risk rate includes one or more processors and a transmission interface. The one or more processors receive or send data through the transmission interface, and the one or more processors are configured to invoke program instructions stored in the memory, so that the apparatus for predicting the survival risk rate executes any method provided in the first aspect and any possible design thereof.
  • The present application provides a computer-readable storage medium. The computer-readable storage medium includes program instructions, and when the program instructions are run on a computer or a processor, the computer or the processor executes any method provided by any possible implementation of the first aspect.
  • The present application provides a computer program product which, when running on a device for predicting survival risk, causes any method provided by any possible implementation of the first aspect to be executed.
  • Any of the devices, computer storage media, or computer program products provided above for predicting the survival risk rate can be applied to the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods, which will not be repeated here.
  • the name of the above-mentioned device for predicting the survival risk rate does not constitute a limitation on the device or functional module itself, and in actual implementation, these devices or functional modules may appear with other names. As long as the functions of each device or functional module are similar to those of the present application, they fall within the scope of the claims of the present application and their equivalent technologies.
  • Fig. 1 is a schematic diagram of a survival curve
  • FIG. 2 is a schematic structural diagram of a prediction device provided in an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a training method for a preset model provided in an embodiment of the present application
  • Fig. 4 is a schematic structural diagram of an initial model provided in the embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an expert network provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a method for predicting survival risk provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a method for explaining a preset model provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the survival curves of samples in each group after a preset model provided in the embodiment of the present application groups the samples in the sample set;
  • Fig. 9 is a histogram of results indicating the consistency of the model after internal verification and external verification of the model obtained from the sample training of hospital A based on the method provided by the embodiment of the present application and the existing method;
  • FIG. 10 is a schematic structural diagram of a device for predicting survival risk provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a signal carrying medium for carrying a computer program product provided by an embodiment of the present application.
  • The survival curve refers to the curve of the survival rate of the observed sample over time. As opposed to death, survival may refer to a living being staying alive. As opposed to relapse or progression of disease, survival may refer to a patient's disease being in remission. As opposed to failure of a device/system/part, survival may refer to the normal functioning of the device/system/part. As opposed to the loss of customers, survival may refer to the normal retention of customers.
  • the survival curve can be used to reflect the recurrence of the disease after being cured, or the survival curve can be used to reflect the failure of the equipment/parts from the factory.
  • FIG. 1 shows a schematic diagram of a survival curve.
  • the horizontal axis can represent the observation time
  • the vertical axis can represent the survival rate of the observed samples.
  • the curve of the survival rate of 1000 samples changing with time may be the survival curve 10 shown in FIG. 1 .
  • On the first day, the survival rate of the 1000 samples is 90%.
  • On the second day, the survival rate of the samples drops to 45%, that is, relative to the samples that survived the first day, the survival rate on the second day is 50%.
  • On the third day, the survival rate of the samples drops to 20%, that is, relative to the samples that survived the second day, the survival rate on the third day is approximately 45%, and so on.
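  • The arithmetic in this example can be checked with a short sketch. It reads the figures as cumulative survival rates of 90%, 45% and 20% on days one to three, which is an interpretation of the text rather than something it states outright. The function name is hypothetical.

```python
def conditional_survival(curve):
    """Survival rate at each step relative to the samples alive at the previous step."""
    previous = 1.0  # every sample is alive before the first observation
    result = []
    for rate in curve:
        result.append(rate / previous)
        previous = rate
    return result

# Cumulative survival rates on day 1, day 2 and day 3.
rates = conditional_survival([0.90, 0.45, 0.20])
# rates is approximately [0.90, 0.50, 0.44]
```

  • Multiplying the conditional rates back together recovers the cumulative curve, which is why either form can be used to describe the same survival curve.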
  • the survival curve of the sample is the curve of the survival probability of the sample changing with time.
  • the probability of survival for a patient is 0.3.
  • the probability of survival was 0.5.
  • the probability of survival is 0.8, and so on.
  • Survival time refers to the time elapsed from the starting event of the observed target to the occurrence of the ending event.
  • the ending event of the observation target is the target event mentioned above.
  • the starting event of the observation target may be the operation on the patient, and the outcome event of the observation target may be the death of the patient.
  • the period from the operation to the patient's death can be called the postoperative survival time of the patient.
  • the starting event of the observation target may be the completion of the production of the equipment/parts
  • the ending event of the observation target may be the failure of the equipment/parts.
  • the period from the completion of the production of the equipment/part to the failure of the equipment/part can be called the survival time of the equipment/part.
  • the survival hazard rate is the probability of death of a sample within a unit of time. That is, the survival hazard ratio of the sample is used to express the survival risk of the sample.
  • exp(b) is the survival risk rate. It should be understood that the higher the survival risk rate of a sample, the higher the mortality rate of the sample and the lower its survival rate.
  • Truncated data can also be called time-to-event data, which is data used to indicate whether an event occurs at a certain time.
  • the patient's relapse and the time of relapse can be called truncated data.
  • Survival analysis refers to a family of statistical methods used to explore the timing of an event of interest. For example, explore the probability of occurrence of a target event at a certain time.
  • the survival data is generally truncated data, for example, data including time and whether a target event occurs at this time point. It should be understood that the time mentioned here may be a survival time or any observation time, which is not limited.
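  • A record of this kind pairs a time with an indicator of whether the target event has occurred. The sketch below shows one possible in-memory representation. The class name `SurvivalRecord`, its fields and the helper `at_risk` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SurvivalRecord:
    time: float   # survival time, or any observation time
    event: bool   # True if the target event had occurred by `time`

def at_risk(records, t):
    """Records still under observation at time t."""
    return [r for r in records if r.time >= t]

records = [
    SurvivalRecord(time=5.0, event=True),    # target event observed at t = 5
    SurvivalRecord(time=3.0, event=True),    # target event observed at t = 3
    SurvivalRecord(time=8.0, event=False),   # no event yet when last observed at t = 8
]
still_observed = at_risk(records, 4.0)
```

  • Keeping the event flag separate from the time lets an analysis use samples whose target event has not yet happened instead of discarding them.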
  • the method of survival analysis can be applied but not limited to the following real scenarios:
  • prognosis is the prediction of the development process and consequences of a certain disease. According to whether treatment is received during the occurrence or development of the disease, the prognosis can be divided into natural prognosis and treatment prognosis.
  • Non-Euclidean data (non-Euclidean space data)
  • Non-Euclidean data is data that is not arranged neatly or regularly. In a sample composed of non-Euclidean data, the order or position of the data does not affect the characteristics of the sample.
  • Non-Euclidean data exists in many fields, for example, social network data in the field of social science, sensor networks in the field of communication technology, regulatory networks in the field of genomics, or mesh surfaces in computer graphics.
  • Residual network (ResNet)
  • Residual fully-connected neural network (RFCN)
  • ResNet is a kind of neural network.
  • ResNet includes skip connections, also called shortcut connections. These connections allow data passed between network layers to skip some layers, which avoids network degradation and vanishing gradients in deep neural networks. They can also improve the training speed of the network and allow the network to have a very large number of layers.
  • A deep neural network with many layers is better suited to processing data with complex structures.
  • RFCN is a neural network that is based on fully connected layers and introduces skip connections or shortcut connections.
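  • A single residual fully connected block can be sketched in plain Python as follows. The ReLU activation, the equal input and output widths, and the function names are assumptions made for illustration. The application does not specify the layer layout here.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def dense(weights, bias, x):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def residual_fc_block(x, weights, bias):
    """y = x + relu(W x + b): a fully connected layer wrapped in a skip connection."""
    fx = relu(dense(weights, bias, x))
    # The skip connection adds the untouched input back to the layer output.
    return [xi + fi for xi, fi in zip(x, fx)]

x = [1.0, -2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
y = residual_fc_block(x, identity, [0.0, 0.0])
passthrough = residual_fc_block(x, zeros, [0.0, 0.0])
```

  • With all-zero weights the block returns its input unchanged, which illustrates why stacking many such blocks does not degrade the signal the way a plain deep stack can.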
  • Words such as “exemplary” or “for example” are used to present examples, illustrations, or explanations. Any embodiment or design scheme described as “exemplary” or “for example” in the embodiments of the present application shall not be interpreted as being more preferred or more advantageous than other embodiments or design schemes. Rather, the use of words such as “exemplary” or “for example” is intended to present related concepts in a concrete manner.
  • The terms “first” and “second” are used for description purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Thus, a feature defined as “first” or “second” may explicitly or implicitly include one or more of these features. In the description of the present application, unless otherwise specified, "plurality" means two or more.
  • determining B according to A does not mean determining B only according to A, and B may also be determined according to A and/or other information.
  • Multiple different weak models can be pre-trained based on the same sample set, and then integrated and fused to obtain an integrated model with comparatively better accuracy and generalization ability, such as the coxPH integrated model.
  • the process of integrating and merging models is usually complicated, and this way of obtaining an integrated model is not an end-to-end way of obtaining a model.
  • the integrated model since the integrated model includes multiple weak models, the differences among the multiple weak models will affect the interpretation of the integrated model to a certain extent.
  • The embodiment of the present application provides a method for predicting the survival risk rate, which can predict the survival risk rate of the sample to be predicted based on a preset model obtained by training in advance. Based on the survival risk rate and the baseline risk function, the risk function reflecting the survival curve of the sample to be predicted can be determined, thereby realizing the survival analysis of the sample to be predicted.
  • the survival risk rate of the sample to be predicted is used to represent the survival risk of the sample to be predicted.
  • The above preset model includes a gating network and multiple expert networks.
  • The gating network is used to obtain the weight coefficient corresponding to each expert network according to the data of the sample to be predicted.
  • The survival risk rate of the sample to be predicted is obtained by weighting and summing the output values of the above-mentioned multiple expert networks according to the weight coefficient corresponding to each expert network in the preset model.
  • the preset model provided in the embodiment of the present application can be trained based on an end-to-end method, and the preset model can be regarded as an integrated model after integration and fusion of multiple expert networks according to the weight coefficients generated by the gating network. Therefore, the accuracy of the survival risk rate of the sample to be predicted based on the prediction of the preset model is relatively high, thereby improving the accuracy of the survival analysis of the sample to be predicted based on the risk rate.
  • the specific training method of the preset model can refer to the description below, and will not be repeated here.
  • The expert network in the above preset model can be implemented by RFCN, so that the preset model can be trained based on non-Euclidean data with nonlinear characteristics. Since the amount of non-Euclidean data in real scenarios is very large and its structure is complex, a preset model trained based on non-Euclidean data has stronger learning ability and higher prediction accuracy.
  • the embodiment of the present application also provides a device for predicting survival risk (hereinafter referred to as the predicting device).
  • the predicting device may be any computing device with computing capability or a computing device set composed of multiple computing devices.
  • the predicting device may be a computing device such as a notebook computer or a desktop computer, and the predicting device may also be a server or a collection of servers.
  • the above-mentioned preset model may be preset in the predicting device.
  • the preset model may be stored in the prediction device in the form of an application program.
  • Alternatively, the above-mentioned preset model may not be preset in the predicting device. For example, the predicting device may call the preset model deployed on the cloud through an application programming interface (API).
  • FIG. 2 shows a schematic structural diagram of a prediction device provided by an embodiment of the present application.
  • the prediction device 20 includes a processor 21 , a main memory (main memory) 22 , a storage medium 23 , a communication interface 24 and a bus 25 .
  • the processor 21 , the main memory 22 , the storage medium 23 and the communication interface 24 may be connected through a bus 25 .
  • The processor 21 is the control center of the prediction device 20 and can be a general-purpose central processing unit (CPU). The processor 21 can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, a graphics processing unit (GPU), a neural network processing unit (NPU), a tensor processing unit (TPU), an artificial intelligence chip, or the like.
  • the processor 21 may include one or more CPUs, such as CPU 0 and CPU 1 shown in FIG. 2 .
  • the present application does not limit the number of processor cores in each processor.
  • the main memory 22 is used to store program instructions, and the processor 21 can execute the program instructions in the main memory 22 to implement the method for predicting the survival risk rate provided by the embodiment of the present application.
  • the main memory 22 may exist independently of the processor 21 .
  • the main memory 22 can be connected with the processor 21 through the bus 25, and is used for storing data, instructions or program codes.
  • When the processor 21 invokes and executes the instructions or program codes stored in the main memory 22, the method for predicting the survival risk rate provided by the embodiment of the present application can be realized.
  • Alternatively, the main memory 22 may be integrated with the processor 21.
  • Storage medium 23 may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory.
  • The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • Volatile memory can be random access memory (RAM), which acts as external cache memory.
  • Many forms of RAM are available, for example static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM).
  • The storage medium 23 may be used to store the training sample data in the embodiment of the present application.
  • The communication interface 24 is used to connect the prediction device 20 with other devices (such as terminals) through a communication network. The communication network can be Ethernet, a radio access network (RAN), a wireless local area network (WLAN), etc.
  • the communication interface 24 may include a receiving unit for receiving data, and a sending unit for sending data.
  • The bus 25 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on. For ease of representation, only one thick line is used in FIG. 2 , but it does not mean that there is only one bus or one type of bus.
  • The structure shown in FIG. 2 does not constitute a limitation on the prediction device 20.
  • The prediction device 20 may include more or fewer components than those shown in FIG. 2, may combine certain components, or may use a different arrangement of components.
  • the embodiment of the present application also provides a system for predicting the survival risk rate (hereinafter referred to as the prediction system).
  • The prediction system may include a terminal and a server, which are connected and communicate in a wired or wireless manner.
  • the preset model mentioned above is preset in the server.
  • The terminal can be used to receive the data of the sample to be predicted that is input by the user.
  • the server can be used to receive the data of the sample to be predicted from the terminal, and return the prediction result to the terminal after processing the received data of the sample to be predicted.
  • the terminal may be a terminal device such as a mobile phone, a notebook computer, or a desktop computer, which is not limited in this embodiment of the present application.
  • the embodiment of the present application also provides a training device for a preset model (hereinafter referred to as the training device), and the training device may be any computing device with computing capability.
  • For the hardware description of the training device, reference may be made to the hardware description of the prediction device above, and details are not repeated here.
  • the training device may be the same device as the prediction device described above, or may be a different device, which is not limited in this embodiment of the present application.
  • FIG. 3 shows a schematic flowchart of a training method for a preset model provided by an embodiment of the present application.
  • the method may be performed by the training device described above.
  • the method can include:
  • the training sample set includes data of multiple training samples and survival data of each training sample.
  • the data of the training sample can be the case data of the patient.
  • the data of the training sample can be any data related to the device, such as attribute data of the device, production data of the device, and so on.
  • the data of each training sample may include data of multiple features of the training sample, and the data of the features included in the data of each training sample may include non-Euclidean data. It can be understood that the number of features included in each training sample may be the same or different.
  • For any one feature, the number of training samples in the training sample set that include the feature is greater than a first threshold.
  • the embodiment of the present application does not specifically limit the value of the first threshold. In this way, it can be ensured that a sufficient number of training samples include any one feature, so that the contribution of any one feature to the preset model determined when interpreting the trained preset model is more accurate.
  • Table 1 shows an example of a training sample set.
  • the training sample set includes data of n training samples, and the n training samples are respectively training sample 1, training sample 2, training sample 3, . . . , and training sample n.
  • Each training sample includes data of m features, and the m features are feature 1, feature 2, feature 3, . . . , and feature m. Wherein, both n and m are positive integers.
  • the features included in the training samples are related to the application scenarios of the preset model trained by the training samples.
  • The feature data of a training sample may include multiple kinds of data among the following: the patient's basic data (including the patient's age, body mass index (BMI), etc.); the patient's blood test data (including routine blood data, blood cell data, liver function data, kidney function data, etc.); the patient's vital sign data (including body temperature, pulse, heart rate, blood pressure, respiration, blood oxygen, etc.); and the patient's treatment records (including the drug name, drug type and drug dosage during drug treatment, as well as plasma therapy, oxygen therapy, etc.).
  • The feature data of a training sample may include multiple kinds of data among the material data of the part, the process data for producing the part, and the delivery time of the part.
  • Each training sample has its own unique survival data.
  • The survival data is truncated (censored) data, i.e., time-to-event data.
  • survival data 1 may be the survival data of training sample 1 in Table 1
  • survival data 2 may be the survival data of training sample 2 in Table 1
  • survival data 3 may be the survival data of training sample 3 in Table 1
  • survival data n can be the survival data of the training sample n in Table 1.
  • For example, survival data 1 indicates that the ending event of training sample 1 occurred on the 10th day, that is, the event status value is "True/1".
  • Survival data 2 indicates that, as of the 14th day, the ending event of training sample 2 had not occurred, that is, the event status value is "False/0".
  • the training device can obtain the training sample set from an external storage device.
  • the training sample set is pre-stored in the external storage device.
  • the training device may also receive the training sample set from other devices through a communication interface (such as the communication interface 24 shown in FIG. 2 ). Wherein, the training sample set is pre-stored in the other device.
  • the training sample set acquired by the training device may be a preprocessed training sample set or a non-preprocessed training sample set.
  • the training device can preprocess the training sample set after obtaining the training sample set.
  • The specific content of the preprocessing is not limited.
  • The preprocessing of the training sample set may be: deleting abnormal training samples from the training sample set (for example, training samples whose number of features is less than the first threshold); deleting abnormal feature data from a training sample (for example, a patient height of 10 m in a case sample); deleting, from all training samples in the training sample set, features whose missing count is greater than a second threshold (that is, features that are absent from more than the second threshold number of training samples in the training sample set); or normalizing the feature data; and so on. This is not specifically limited in this embodiment of the present application.
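  • The preprocessing options listed above can be sketched as follows. This is a minimal illustration, not the method of the embodiment: each sample is assumed to be a dictionary mapping feature names to numeric values, and the function name `preprocess`, the parameters `min_features`, `max_missing` and `valid_ranges`, the order of the steps, and the choice of min-max normalization are all assumptions made for the example.

```python
def preprocess(samples, min_features, max_missing, valid_ranges):
    """Apply the four filtering/normalization steps to a list of feature dicts."""
    # 1. Delete abnormal feature values (outside a plausible range).
    cleaned = []
    for sample in samples:
        kept = {}
        for name, value in sample.items():
            low, high = valid_ranges.get(name, (float("-inf"), float("inf")))
            if low <= value <= high:
                kept[name] = value
        cleaned.append(kept)
    # 2. Delete abnormal samples that contain too few features.
    cleaned = [s for s in cleaned if len(s) >= min_features]
    # 3. Delete features that are missing from more than max_missing samples.
    names = {name for s in cleaned for name in s}
    dropped = {n for n in names if sum(n not in s for s in cleaned) > max_missing}
    cleaned = [{n: v for n, v in s.items() if n not in dropped} for s in cleaned]
    # 4. Min-max normalize each remaining feature to the range [0, 1].
    for name in names - dropped:
        values = [s[name] for s in cleaned if name in s]
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # avoid dividing by zero for constant features
        for s in cleaned:
            if name in s:
                s[name] = (s[name] - low) / span
    return cleaned


# Hypothetical case samples; the 10 m height mirrors the example in the text.
samples = [
    {"age": 60, "height_m": 1.7, "bmi": 24.0},
    {"age": 45, "height_m": 10.0, "bmi": 30.0},
    {"age": 70},
    {"age": 50, "height_m": 1.6, "bmi": 22.0, "rare": 1.0},
]
result = preprocess(samples, min_features=2, max_missing=1,
                    valid_ranges={"height_m": (0.3, 2.5)})
```

  • In this sketch the implausible height is dropped from the second sample, the third sample is removed for having too few features, and the feature "rare" is removed because it is missing from too many of the remaining samples.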
  • the training device may iteratively train the initial model based on the acquired data of the training samples in the training sample set, so as to obtain the preset model.
  • the initial model may be a model pre-designed by the designer, and the initial model is designed as a model for predicting the survival risk rate of the sample.
  • the initial model may include a gating network and multiple expert networks.
  • the gating network can be, for example, a neural network classifier
  • the expert network can be, for example, RFCN.
  • the gating network is used to obtain the weight coefficient corresponding to each expert network according to the received training samples.
  • Multiple expert networks are respectively used to learn the training samples to output their own learning results.
  • The learning results of the multiple expert networks are weighted and summed according to the weight coefficients obtained by the gating network to obtain the output value (or output result) of the current model. The output value is the predicted value obtained after the current model learns the training sample, that is, the survival risk rate of the training sample predicted by the current model.
  • the gating network can learn the type of the received training samples, and then assign corresponding weights to each expert network according to the learned type.
  • the gating network is a network that learns and classifies the training samples autonomously, and the number of types after the gating network classifies the training samples is equal to the number of expert networks included in the initial model.
  • weights of the plurality of expert networks determined by the gating network may be normalized by a preset function to obtain weight coefficients corresponding to the plurality of expert networks. Wherein, the sum of the normalized weight coefficients is 1.
  • the weights of the plurality of expert networks determined by the gating network can be exponentially normalized through the softmax function.
  • the process of exponentially normalizing multiple data through the softmax function will not be described in detail.
  • The learning results of the expert networks are multiplied by the weight coefficients of the corresponding expert networks and summed (i.e., a weighted sum) to obtain the predicted value output by the current model after learning the received training sample.
  • This process can be represented by the following formula (2):
  • $$F(x) = \sum_{i=1}^{N} \mathrm{Softmax}\big(G(x), \tau\big)_i \, f_i(x) \qquad (2)$$
  • Each expert network is used to learn the training samples and predict the survival risk rate of the training samples.
  • Here, x represents the training sample, N is the number of expert networks, and i denotes the i-th expert network among the N expert networks.
  • F(x) represents the survival risk rate (that is, the predicted value) output by the model after learning the training sample x.
  • G(x) represents the weights of the N expert networks output by the gating network.
  • τ represents the temperature coefficient, which indicates the smoothness of the normalized result when Softmax exponentially normalizes the multiple weights, and is usually preset.
  • Softmax(G(x), τ)_i represents the weight coefficient of the i-th expert network obtained after exponentially normalizing the weights G(x) output by the gating network, and f_i(x) represents the result obtained by the i-th expert network after learning the training sample x.
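  • The weighted sum described above can be sketched in a few lines. This is only an illustration under stated assumptions: the temperature coefficient is assumed to divide the raw gating weights before exponentiation (a common convention, not confirmed by the text), the function names `softmax` and `mixture_output` are hypothetical, and the numeric values are made up.

```python
import math


def softmax(weights, temperature=1.0):
    """Exponentially normalize raw gating weights; the results sum to 1."""
    scaled = [w / temperature for w in weights]  # assumed role of the temperature
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_output(gate_weights, expert_outputs, temperature=1.0):
    """Weighted sum of expert outputs using the normalized weight coefficients."""
    coeffs = softmax(gate_weights, temperature)
    return sum(c * f for c, f in zip(coeffs, expert_outputs))


# Hypothetical values for one training sample and two expert networks.
gate_weights = [2.0, 0.5]    # raw weights output by the gating network
expert_outputs = [1.2, 0.4]  # learning results of the two expert networks
coeffs = softmax(gate_weights, temperature=1.0)
prediction = mixture_output(gate_weights, expert_outputs, temperature=1.0)
smoother = softmax(gate_weights, temperature=10.0)
```

  • Under this assumption, a larger temperature makes the weight coefficients more uniform, so the prediction moves toward the plain average of the expert outputs.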
  • the initial model 40 includes a gating network 41 and two expert networks, and the two expert networks are respectively an expert network 421 and an expert network 422 .
  • the expert network 421 performs learning processing on the training sample 1 to obtain the result 1
  • the expert network 422 performs learning processing on the training sample 1 to obtain the result 2.
  • the gating network 41 can output weight 1 for the expert network 421 and output weight 2 for the expert network 422 based on the learned type of the training sample 1 . Then, the softmax function exponentially normalizes the two weights output by the gating network to obtain the weight coefficient 1 of the expert network 421 and the weight coefficient 2 of the expert network 422 .
  • The predicted value output by the initial model after learning training sample 1 is obtained by adding the product of weight coefficient 1 and result 1 output by the expert network 421 to the product of weight coefficient 2 and result 2 output by the expert network 422. This predicted value is the survival risk rate of training sample 1 predicted by the initial model.
  • any expert network among the plurality of expert networks in the above initial model may include at least one candidate RFCN.
  • the output result of the candidate RFCN after learning and processing the training samples is the output result of any expert network after learning and processing the training samples.
  • Any one of the expert networks may also include an evaluation module. The evaluation module is used to evaluate the results obtained after each candidate RFCN learns the training samples, and to take the result that satisfies a preset condition as the output of that expert network.
  • The output result "satisfying the preset condition" may be the output result that is closest to the sample label value among the output results of the multiple candidate RFCNs. In this way, the accuracy of model prediction can be improved.
  • For example, the evaluation module can calculate the loss function of each candidate RFCN based on the result obtained after each candidate RFCN in the expert network learns training sample a (that is, the predicted value of training sample a output by each candidate RFCN) and the survival data of training sample a. Then, the output result of the candidate RFCN with the smallest loss function (the smallest loss function means that the predicted value is closest to the real value) is used as the output value of the expert network.
  • the embodiment of the present application does not specifically limit the specific implementation manner of evaluating the result with the best performance from the learning results of multiple candidate RFCNs.
  • the network structures of multiple candidate RFCNs in the same expert network are different.
  • The network structures of the candidate RFCNs may differ in which network structures or layers are skipped by their skip connections (shortcut connections), which is not limited in this embodiment of the present application.
  • the candidate RFCN sets included in each expert network are different from each other.
  • expert network 1 may include candidate RFCN 1, candidate RFCN 2, and candidate RFCN 3.
  • Expert network 2 may include candidate RFCN 1 and candidate RFCN 2.
  • Expert network 3 may include candidate RFCN 3 and candidate RFCN 4.
  • FIG. 5 shows a schematic structural diagram of an expert network provided by an embodiment of the present application.
  • the expert network 421 includes three candidate RFCNs, which are candidate RFCN 511 , candidate RFCN 512 and candidate RFCN 513 .
  • the expert network 421 also includes an evaluation module 52 .
  • the candidate RFCN 511 can learn and process the training sample 1 to obtain the result 1.
  • the candidate RFCN 512 can learn and process the training sample 1 to obtain the result 2
  • the candidate RFCN 513 can learn and process the training sample 1 to obtain the result 3.
  • the evaluation module 52 may evaluate the result 1, the result 2 and the result 3, and determine the result with the best performance. For example, the evaluation module 52 determines that the result with the best performance is the result 2, and the expert network 421 outputs the result 2.
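  • The selection performed by the evaluation module can be sketched as picking the candidate with the smallest loss. The function name `select_candidate_output` and the numeric values are hypothetical, and the per-candidate losses are assumed to have been computed already from each candidate's predicted value and the survival data of the training sample.

```python
def select_candidate_output(candidate_outputs, candidate_losses):
    """Return (index, output) of the candidate whose loss is smallest."""
    best = min(range(len(candidate_losses)), key=candidate_losses.__getitem__)
    return best, candidate_outputs[best]


# Hypothetical results 1, 2 and 3 from three candidate networks and their losses.
outputs = [0.8, 1.1, 1.6]
losses = [0.42, 0.17, 0.63]
best_index, expert_output = select_candidate_output(outputs, losses)
```

  • With these made-up numbers the second candidate has the smallest loss, so its result becomes the output of the expert network, matching the example in which result 2 is selected.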
  • The training device iteratively trains the initial model with the structure described above based on the acquired training samples to obtain the preset model. This process can be described as follows:
  • the training device inputs the training sample 1 in the training sample set into the model to be trained.
  • the model to be trained is the initial model described above.
  • each expert network in the model to be trained can perform learning processing on the training sample 1 and output respective learning results.
  • the learning result output by each expert network is the predicted value output by each expert network after learning the training sample 1.
  • the gating network in the model to be trained learns and classifies the training sample 1, and outputs the weight corresponding to each expert network based on the learned type.
  • the training device performs normalization processing on the weights corresponding to each expert network, so as to determine the weight coefficients corresponding to each expert network.
  • When the model to be trained is the initial model, that is, during the first training of the initial model by the training device, the gating network may, after learning training sample 1, randomly output the weight corresponding to each expert network.
  • the expert network with the largest weight can be regarded as the expert network corresponding to the type of the current training sample 1 .
  • The training device computes the weighted sum of the predicted values output by the multiple expert networks according to the weight coefficient of each expert network, so as to obtain the predicted value of training sample 1 output by the model to be trained.
  • The predicted value of training sample 1 is the survival risk rate of training sample 1 predicted by the model to be trained.
  • The training device can calculate the loss function based on the predicted value output by the model to be trained and the survival data of training sample 1 (that is, the real value, or the label value of the training sample). Because the survival data is truncated (censored) data, in this embodiment of the present application the loss function for such data may optionally be calculated based on a negative log-likelihood (NLL) score.
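  • One common negative log-likelihood for this kind of time-to-event data is the Cox negative log partial likelihood; the text does not say which NLL form the embodiment uses, so the following is a sketch under that assumption. The function name is hypothetical, and the model output is assumed to be a log-risk score per sample.

```python
import math


def cox_negative_log_likelihood(log_risks, times, events):
    """Negative log partial likelihood for right-censored survival data.

    log_risks: predicted log-risk score per sample (assumed model output)
    times:     observed time per sample
    events:    1 if the ending event occurred, 0 if it was not observed
    """
    loss = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # samples without an event only appear in risk sets
        # Risk set: samples still under observation at time t_i.
        risk_set = [log_risks[j] for j, t_j in enumerate(times) if t_j >= t_i]
        peak = max(risk_set)  # log-sum-exp with the maximum factored out
        log_sum = peak + math.log(sum(math.exp(r - peak) for r in risk_set))
        loss -= log_risks[i] - log_sum
        n_events += 1
    return loss / max(n_events, 1)


# Hypothetical data: event on day 10, no event by day 14, event on day 6.
times = [10, 14, 6]
events = [1, 0, 1]
good = cox_negative_log_likelihood([0.5, -1.0, 2.0], times, events)
bad = cox_negative_log_likelihood([0.5, 2.0, -1.0], times, events)
single = cox_negative_log_likelihood([0.3], [5], [1])
```

  • The loss is lower when samples whose events occur earlier receive higher risk scores, which is why the first set of scores gives a smaller value than the second.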
  • The training device may calculate the loss function of the model to be trained based on the predicted value output by the model to be trained and the survival data of training sample 1.
  • The loss function of the model to be trained is then backpropagated, and the network parameters of each expert network are adjusted according to the weight coefficient of each expert network.
  • The adjustment amount of the network parameters of an expert network is directly proportional to the weight coefficient of that expert network. For example, the network parameters of an expert network with a large weight coefficient are adjusted by a relatively large amount, and those of an expert network with a small weight coefficient are adjusted by a small amount.
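  • The proportionality between the adjustment amount and the weight coefficient follows from the chain rule applied to the weighted sum of expert outputs. The short derivation below is a sketch: $w_i$ denotes the weight coefficient of the i-th expert network, $\theta_i$ its network parameters, $L$ the loss of the whole model, and $\eta$ a learning rate; the symbols $w_i$, $\theta_i$ and $\eta$ and the plain gradient-descent update are assumptions introduced for illustration.

```latex
% Model output as a weighted sum of expert outputs
F(x) = \sum_{i=1}^{N} w_i \, f_i(x; \theta_i)

% w_i depends only on the gating network, not on \theta_i, so by the chain rule
\frac{\partial L}{\partial \theta_i}
  = \frac{\partial L}{\partial F} \cdot w_i \cdot
    \frac{\partial f_i(x; \theta_i)}{\partial \theta_i}

% One gradient-descent step therefore scales with w_i
\Delta \theta_i
  = -\eta \, w_i \, \frac{\partial L}{\partial F} \,
    \frac{\partial f_i(x; \theta_i)}{\partial \theta_i}
```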
  • the training device may also calculate the loss function corresponding to each expert network based on the output value of each expert network and the survival data of the training sample 1 .
  • Based on the loss functions of the expert networks, the parameters of the gating network are adjusted so that the next time the gating network receives a training sample with the same or similar characteristics as training sample 1, it assigns a larger weight to the expert network with the smallest loss function. That expert network is then dedicated, in the subsequent training process, to learning training samples whose characteristics are the same as or similar to those of training sample 1.
  • In this way, each expert network learns only one class of samples with the same or similar characteristics. It should be understood that, because the output value of an expert network with a large weight accounts for a large proportion of the output value of the model to be trained, the adjustment made to that expert network's parameters based on the loss function of the model to be trained is relatively large, so an expert network with a large weight learns more of the features of the training sample.
  • At this point, one round of training of the model to be trained with training sample 1 is complete.
  • The training device can then input training sample 2 into the updated model to be trained and, following the training process described for training sample 1, complete one round of training of the updated model with training sample 2.
  • When the gating network assigns weights to the expert networks after learning training sample 2, it can refer to the classification learned from training sample 1 and assign a larger weight to the expert network corresponding to the type of training sample 2.
  • the training device may execute the above process multiple times based on the training samples in the training sample set to implement iterative training of the initial model.
  • the preset model provided by the embodiment of the present application is obtained.
  • the gating network in the preset model is used to classify samples.
  • the expert network in the preset model is used to predict the survival risk rate of different types of samples. It can be understood that the frame structure of the preset model is the same as the frame structure of the above-mentioned initial model.
  • The preset model trained by the method described in S101-S102 above can process a sample to be predicted to predict its survival risk rate. The survival curve of the sample to be predicted can then be determined according to this survival risk rate, thereby realizing survival analysis of the sample to be predicted.
  • FIG. 6 shows a schematic flowchart of a method for predicting survival risk provided by an embodiment of the present application.
  • The method can be executed by the prediction device shown in FIG. 2, and the preset model trained by the method described in S101-S102 is preset in the prediction device.
  • the method can include:
  • the detailed description of the prediction device acquiring the data of the sample to be predicted can refer to the description of the training device acquiring the training sample in S101 above, which will not be repeated here.
  • S202 Process the data of the sample to be predicted by using a preset model to obtain the survival risk rate of the sample to be predicted.
  • the prediction device may input the acquired data of the sample to be predicted into a preset model, and process the data of the sample to be predicted through the preset model to obtain the survival risk rate of the sample to be predicted.
  • the survival risk rate of the sample to be predicted can be used to determine the survival curve of the sample to be predicted, so that the survival analysis of the sample to be predicted can be performed.
  • the preset model processes the data of the above-mentioned samples to be predicted to obtain the survival risk rate of the samples to be predicted.
  • The risk function of the sample to be predicted can be determined based on the survival risk rate of the sample to be predicted that is predicted by the preset model and the above-mentioned formula (1).
  • The feature data of the sample to be predicted can be used as the covariates x_1, x_2, ..., x_p in formula (1), and exp(b_1), exp(b_2), ..., exp(b_p) are the survival risk rates of the sample to be predicted that are predicted by the preset model.
  • the risk function can reflect the survival curve of the sample to be predicted. For example, if the risk value of the sample to be predicted is high at a certain time, it means that the survival rate of the sample to be predicted at this time is low.
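  • The relationship between the risk function and the survival curve can be sketched numerically. The sketch assumes a proportional-hazards form, in which the hazard at each time equals a baseline hazard multiplied by a sample-specific hazard ratio, and uses the standard identity that the survival probability is the exponential of the negative cumulative hazard. The function name, the baseline hazard values and the two hazard ratios are hypothetical.

```python
import math


def survival_curve(baseline_hazard, hazard_ratio, step=1.0):
    """Return the survival probability at the end of each time interval.

    baseline_hazard: baseline hazard per interval (assumed known)
    hazard_ratio:    sample-specific multiplier of the baseline hazard
    step:            length of each time interval
    """
    curve = []
    cumulative = 0.0
    for h0 in baseline_hazard:
        cumulative += h0 * hazard_ratio * step  # accumulate the hazard
        curve.append(math.exp(-cumulative))     # S(t) = exp(-H(t))
    return curve


baseline = [0.01, 0.02, 0.02, 0.05]
low_risk = survival_curve(baseline, hazard_ratio=0.5)
high_risk = survival_curve(baseline, hazard_ratio=2.0)
```

  • A sample with a higher hazard ratio has a survival curve that lies below that of a lower-risk sample at every time point, which is the sense in which a high risk value corresponds to a low survival rate.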
  • The preset model used to predict the sample to be predicted is obtained by training on non-Euclidean data, and the preset model is equivalent to an integration and fusion of multiple expert networks. Therefore, the survival risk rate of the sample to be predicted that is predicted by the method provided by the embodiment of the present application is relatively accurate. Accordingly, the accuracy of the risk function of the sample to be predicted that is determined from the survival risk rate is improved, and the survival curve of the sample to be predicted can be accurately reflected.
  • Each training sample used to train the preset model includes many features, and the contributions of the individual features to the predicted value output by the trained preset model are not exactly the same. Therefore, in practical applications, if the contribution of each feature in the training sample to the predicted value output by the preset model can be determined, the degree of influence of each feature on the occurrence of the target event can be determined, for example, the degree of impact of different treatments on the survival time of patients. Based on the degree of influence of each feature on the occurrence of target events, the optimization and improvement of samples in real scenarios can then be guided.
  • the embodiment of the present application can determine the contribution of each feature in the sample to the predicted value of the sample by interpreting the preset model.
  • the embodiment of the present application may also analyze the cause of the predicted value of the sample by explaining the sample.
  • the method for interpreting the preset model described in the embodiments of the present application, or the method for interpreting the sample can be executed by any device that has computing power and is preset with the preset model described above.
  • the embodiments of the present application are described below by taking the method of interpreting the preset model and samples performed by the prediction device as an example.
  • The interpretation of the preset model by the prediction device may include at least one of: explaining the preset model itself, explaining an expert network in the preset model, or explaining the gating network in the preset model.
  • the prediction device can obtain a bee swarm diagram (beeswarm) for explaining the preset model according to the preset model and a plurality of training samples.
  • the bee colony diagram is used to show the contribution of each feature in the sample to the predicted value output by the preset model.
  • The predicting device may input multiple training samples into the preset model to obtain the predicted value corresponding to each training sample. Then, the prediction device may draw a bee colony diagram based on the feature data of the training samples and their corresponding predicted values. The predicting device may draw the bee colony diagram based on a SHAP (SHapley Additive exPlanations) value method.
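  • To illustrate what a per-feature contribution means, the sketch below computes exact Shapley values for a toy model by enumerating all feature coalitions. It assumes that the value method named above is Shapley-based, and it is not the procedure of the embodiment: features outside a coalition are replaced by a baseline value, `toy_model` is a hypothetical stand-in for the preset model, and exact enumeration is practical only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, sample, baseline):
    """Exact Shapley contribution of each feature to predict(sample).

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(sample)

    def value(coalition):
        mixed = [sample[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions


def toy_model(features):
    # Hypothetical stand-in for the preset model's predicted value.
    return 2.0 * features[0] - 1.0 * features[1] + 0.5 * features[0] * features[2]


sample = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, sample, baseline)
```

  • The contributions sum to the difference between the prediction for the sample and the prediction for the baseline. Plotting such contributions for many samples, one row per feature and one point per sample, gives the kind of diagram described above.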
  • FIG. 7 shows a schematic diagram of a method for explaining a preset model provided by an embodiment of the present application.
  • A frame diagram of the preset model may be displayed on the interface 70 on the display screen of the prediction device.
  • the interface 70 can be a model interpretation interface in the user interface of the preset model
  • The frame diagram on the interface 70 includes interface buttons for the gating network 71 and the two expert networks (the expert network 711 and the expert network 712) in the preset model.
  • The samples to be input can be selected on the sample input interface, for example from local storage, and are input into the preset model after confirmation.
  • the display screen of the prediction device can display a bee colony diagram for explaining the preset model, such as the bee colony diagram shown in (b) in Figure 7 .
  • the abscissa of the bee colony diagram is used to represent the contribution of the feature to the predicted value output by the preset model.
  • Taking feature 9 as an example: when the feature value of feature 9 is large, the contribution of feature 9 to the predicted value output by the preset model is positive, and the larger the feature value (that is, the darker the gray), the greater the contribution (the larger the positive value, the greater the contribution). Conversely, when the feature value of feature 9 is small, the contribution of feature 9 to the predicted value output by the preset model is negative, and the smaller the feature value (that is, the lighter the gray), the smaller the contribution (the larger the absolute value of the negative value, the smaller the contribution).
  • After operating the "Input" button to input samples, the user can click the "gating network 71" button on the interface 70 and then click the "Output" button to obtain the contribution of the sample features to the output value of the gating network.
  • Similarly, after operating the "Input" button to input samples, the user can click the button corresponding to any expert network on the interface 70 and then click the "Output" button to obtain the contribution of the sample features to the output value of that expert network.
  • For example, suppose the data of the multiple samples used to explain the preset model is the case sample data of multiple cancer patients. If model interpretation determines that using a certain drug treatment (the use of the drug treatment being a feature of the sample) makes a large contribution to reducing the survival risk rate of cancer patients, this indicates that the drug treatment can improve the survival rate of cancer patients. In this way, clinicians can be guided in the medication of cancer patients.
  • the predicting device can obtain a bee colony diagram for explaining any sample according to the any sample and a preset model.
  • the bee colony diagram is used to show the contribution of each feature in any sample to the predicted value of the sample output by the preset model, so that the cause of the predicted value of any sample can be analyzed.
  • the predicting device may input the sample to be explained into a preset model, so as to obtain the predicted value of the sample to be explained. Then, the predicting device may draw a bee colony diagram based on the characteristic data of the sample to be explained and the predicted value of the sample to be explained.
  • the predicted value of the sample to be explained is 24.1.
  • the arrows in the black area point to the direction in which the predicted value increases, and the arrows in the white area point to the direction in which the predicted value decreases.
  • the feature LSTAT contributes the most to improving the predicted value of the sample to be explained (that is, the longest black bar)
  • the value of the feature RM contributes the most to reducing the predicted value of the sample to be explained (that is, the longest white bar shown).
  • the degree of influence of different characteristics in a single sample on the survival risk rate of the single sample can be determined, and then relevant guidance can be given to the sample.
  • the single sample is a part
  • the material of the sample is material a
  • it will make a greater contribution to increasing the survival risk rate of the part, which means that the part manufactured based on material a
  • the survival rate is low, that is, the life of the part is the shortest. This will guide the manufacturer to avoid using material a to make the part.
  • the gating network in the preset model trained in the embodiments of the present application is essentially a classifier. Therefore, the embodiment of the present application can also classify the samples in the sample set based on the gating network in the above preset model. In this way, the samples in the sample set can be divided into multiple groups according to their types, that is, the samples in any group are samples of the same type. For example, a gating network can divide a sample of electronic medical records of patients into a sample of males and a sample of females.
  • the survival curve corresponding to each group of samples can be drawn. It can be understood that the survival data of each sample in each group of samples here is known. Through this method, the comparative analysis of the survival curves of different types of samples is realized.
  • FIG. 8 shows a schematic diagram of survival curves of samples in each group after samples in a sample set are grouped by a preset model provided by an embodiment of the present application.
  • the gating network in the preset model provided by the embodiment of the present application can divide the patient's electronic medical record sample into sample group 1 and sample group 2
  • sample group 1 includes 150 samples
  • sample group 2 includes 27 samples
  • the survival curves of sample group 1 and sample group 2 can be drawn in the same coordinate system based on their survival data.
  • survival curve 1 shown in FIG. 8 is the survival curve of sample group 1
  • survival curve 2 is the survival curve of sample group 2. In this way, the difference in survival rate between sample group 1 and sample group 2 at the same time can be seen intuitively from the figure.
  • experts in the field (i.e., clinicians)
  • the common features may be a decisive factor affecting the survival rate of the group of samples.
  • clinicians can be guided to adjust the patient's treatment plan.
  • sample set 1 includes 177 samples
  • sample set 2 includes 106 samples
  • sample set 3 includes 102 samples.
  • the quality of sample set 1 is higher than that of sample set 2
  • the quality of sample set 2 is higher than that of sample set 3.
  • high sample-set quality can mean, for example, that the samples in the sample set have few missing features, that the number of features is large, or that many samples have an observed outcome event (i.e., patient death/recovery).
  • sample set 1 is used as the training sample set; preset model 1 is obtained by training based on the method described in S101-S102 above, model 2 is obtained by training based on the existing coxPH method, and model 3 is obtained by training based on the DeepSurv method.
  • sample set 2 and sample set 3 are used as verification sample sets to verify preset model 1, model 2 and model 3.
  • Table 4 shows the concordance index (C-index) of preset model 1, model 2 and model 3 after verification with the same verification samples. It should be understood that the C-index is used to evaluate the predictive ability of a model. It can be seen that, on the same verification samples, the C-index of preset model 1 obtained by the method provided in the embodiment of the present application is higher than that of model 2 trained by the existing coxPH method, and higher than that of model 3 trained by the existing DeepSurv method.
  • Example 2 A progression prediction model for clinical disease A
  • hospital A has recorded clinical data for 2700 patients, ie hospital A includes 2700 samples.
  • hospital B has recorded clinical data of 1400 patients, that is, hospital B includes 1400 samples.
  • the samples of hospital A are used as training samples, and the preset model is obtained by training through the method described in S101-S102 above; based on the 10-fold cross-validation method, a part of the samples of hospital A is used to perform internal validation of the model, and external validation of the model is performed based on the samples of hospital B.
  • 10-fold cross-validation refers to: dividing the sample set into 10 groups, using 9 groups of samples as training samples to train the model, and using the remaining group of samples as verification samples to test and verify the model trained on those 9 groups.
  • this process is repeated 10 times to ensure that each group of samples has been used as the verification samples to test and verify the model. The results of the 10 verifications are then averaged to obtain the result of 10-fold cross-validation.
  • FIG. 9 shows a bar graph indicating the consistency of the model after internal verification and external verification of the model obtained from hospital A sample training based on the method provided by the embodiment of the present application and the existing method.
  • the checkered column indicates the C-index of the model trained on the samples of hospital A with the existing DeepSurv method after 10-fold cross-validation, and the C-index of that model after external validation with the samples of hospital B.
  • the striped column indicates the C-index of the model trained on the samples of hospital A with the existing coxnet method (the coxnet method is an improved variant of the coxPH method) after 10-fold cross-validation, and the C-index of that model after external validation with the samples of hospital B.
  • the white column indicates the C-index of the model trained on the samples of hospital A with the method provided in the embodiment of this application after 10-fold cross-validation, and the C-index of that model after external validation with the samples of hospital B.
  • the C-index of the preset model trained by the method provided in the embodiment of the present application is higher than the C-index of the models trained by the existing DeepSurv method and coxnet method.
  • the accuracy of the survival risk rate predicted by the method is higher, which in turn improves the accuracy of the survival curve determined based on the survival hazard ratio.
  • because the preset model used in the method of the embodiment of the present application can be obtained through end-to-end training, it is convenient to explain the model at different levels (global and local), and the contribution of the sample features to the predicted value of the sample, obtained from the explanation, can then be used to guide improvement of the sample in real scenarios.
  • the embodiment of the present application can divide the device for predicting the survival risk rate into functional modules according to the above method example; for example, each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the embodiment of the present application is schematic, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 10 shows a schematic structural diagram of an apparatus 100 for predicting survival risk provided by an embodiment of the present application.
  • the apparatus 100 for predicting the survival risk rate may be used to implement the above-mentioned method for predicting the survival risk rate, for example, to perform the method shown in FIG. 6.
  • the apparatus 100 for predicting the survival risk rate may include an acquisition unit 101 and a processing unit 102.
  • the obtaining unit 101 is configured to obtain data of samples to be predicted.
  • the processing unit 102 is configured to input the data of the sample to be predicted into a preset model, and process the data of the sample to be predicted through the preset model to obtain a survival risk rate representing the survival risk of the sample to be predicted.
  • the preset model includes a gating network and a plurality of expert networks; the gating network is used to determine the weight coefficient corresponding to each expert network according to the data of the sample to be predicted, and the survival risk rate is the result obtained by weighting and summing the output values of the plurality of expert networks according to the weight coefficient corresponding to each expert network.
  • the obtaining unit 101 may be used to execute S201, and the processing unit 102 may be used to execute S202.
  • the apparatus 100 for predicting the survival risk rate further includes: a determination unit 103, configured to determine the risk function of the sample to be predicted based on the survival risk rate of the sample to be predicted and the baseline risk function, wherein the risk function of the sample to be predicted is used to indicate the survival rate of the sample to be predicted at different times.
  • any one of the plurality of expert networks included in the prediction model includes at least one candidate RFCN, and the output value of that expert network is the output value satisfying a preset condition among the output values of the at least one candidate RFCN.
  • the data of the samples to be predicted include non-Euclidean data.
  • the apparatus 100 for predicting the survival risk rate further includes: an interpretation unit 104, configured to explain the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the effect of different feature data in the data of the sample to be predicted on the survival risk rate.
  • the interpretation unit 104 is specifically configured to: interpret the preset model based on the case data of the patient and the survival risk rate of the patient, so as to obtain the impact of different feature data in the case data of the patient on the survival risk rate of the patient.
  • the interpretation unit 104 is specifically configured to: interpret the preset model based on the data of the device and the survival risk rate of the device, so as to obtain the impact of different feature data in the data of the device on the survival risk rate of the device.
  • the function realized by the acquisition unit 101 in the apparatus 100 for predicting the survival risk rate can be realized through the communication interface 24 in FIG. 2, and the functions realized by the processing unit 102, the determination unit 103 and the interpretation unit 104 can be realized by the processor 11 in FIG. 2 executing the program code in the main memory 22 in FIG. 2.
  • Fig. 11 shows a schematic structural diagram of a signal-carrying medium for carrying a computer program product provided by an embodiment of the present application.
  • the signal-carrying medium is used for storing a computer program product or a computer program for executing a computer process on a computing device.
  • signal-bearing medium 110 may include one or more program instructions that, when executed by one or more processors, may provide the functions or portions of the functions described above with respect to FIG. 6 .
  • one or more features referred to in S201 - S202 in FIG. 6 may be undertaken by one or more instructions associated with the signal bearing medium 110 .
  • the program instructions in FIG. 11 also describe example instructions.
  • signal bearing medium 110 may comprise computer readable medium 111 such as, but not limited to, a hard drive, compact disc (CD), digital video disc (DVD), digital tape, memory, read-only memory (ROM) or random access memory (RAM), and so on.
  • signal bearing medium 110 may comprise computer recordable media 112 such as, but not limited to, memory, read/write (R/W) CDs, R/W DVDs, and the like.
  • signal bearing medium 110 may include communication media 113 such as, but not limited to, digital and/or analog communication media (e.g., fiber optic cables, waveguides, wired communication links, wireless communication links, etc.).
  • the signal bearing medium 110 may be conveyed by a wireless form of communication medium 113 (e.g., a wireless communication medium conforming to the IEEE 802.11 standard or other transmission protocol).
  • One or more program instructions may be, for example, computer-executable instructions or logic-implementing instructions.
  • an apparatus for predicting survival risk such as that described with respect to FIG. Instructions provide various operations, functions, or actions.
  • all or part of them may be implemented by software, hardware, firmware or any combination thereof.
  • when implemented using a software program, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the processes or functions according to the embodiments of the present application are generated in whole or in part when the computer instructions are executed on a computer.
  • a computer can be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • Computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium, e.g.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or may contain one or more data storage devices such as servers and data centers that can be integrated with the medium.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (solid state disk, SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present application relates to the technical field of artificial intelligence, and discloses a method and device for predicting a survival hazard ratio (HR). The method comprises: obtaining data of a sample to be predicted; and inputting the data of said sample into a preset model, and processing the data of said sample by means of the preset model to obtain a survival HR for representing a survival hazard of said sample. The preset model comprises a gating network and a plurality of expert networks, the gating network is used for determining, according to the data of said sample, a weight coefficient corresponding to each expert network, and the survival HR is a result obtained by performing weighted summation on output values of the plurality of expert networks according to the weight coefficient corresponding to each expert network.

Description

Method and device for predicting survival risk rate
Technical field
The present application relates to the technical field of artificial intelligence, and in particular to a method and device for predicting a survival risk rate (hazard ratio, HR).
Background
Survival analysis refers to a series of statistical methods used to study the time at which a target event occurs. Examples include survival time analysis of cancer patients and failure time analysis of equipment.
Usually, when performing survival analysis on a target event, an analysis model can be established based on data obtained from prior investigation or experiments. The analysis model can then be used to predict, from one or more feature variables that affect the occurrence of the target event, the effect of those feature variables on the survival curve of the target event, thereby realizing survival analysis of the target event. For example, a Cox proportional hazards regression model (Cox proportional hazards model, coxPH) can be established, and one or more feature variables that affect the occurrence of the target event can be input into the model to predict the risk of the target event occurring at different times. It should be understood that the risk of the target event occurring at different times reflects the survival curve of the observed event. Here, the outcome event of the observed event is the target event. The coxPH model can be expressed as formula (1):
Formula (1)   $h(t) = h_0(t) \times \exp(b_1 x_1 + b_2 x_2 + \dots + b_p x_p)$
Here, $t$ is the survival time, and $h(t)$ is the risk function of the target event, which represents the death risk of the target event at survival time $t$. $h_0(t)$ represents the baseline risk function, which is usually determined in advance from the survival curves of a large number of samples. $x_1, x_2, \dots, x_p$ represent the $p$ covariates, that is, the feature variables that affect the target event to be predicted, and $b_1, b_2, \dots, b_p$ represent the regression coefficient of each covariate. It can be seen that the coxPH model is a linear model; that is, it can only be used to analyze data in which the input features and the learning target (i.e., the risk of occurrence of the target event) are linearly related.
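As a minimal sketch of formula (1), the following Python function evaluates the coxPH risk at one time point from a baseline risk value, the regression coefficients and the covariates. The function name `cox_hazard` and all numeric values are hypothetical and chosen only for illustration; they do not come from the application.

```python
import math


def cox_hazard(baseline_hazard, coefficients, covariates):
    """Evaluate formula (1): h(t) = h0(t) * exp(b1*x1 + ... + bp*xp).

    baseline_hazard: value of h0(t) at the time point of interest
    coefficients:    regression coefficients b1..bp
    covariates:      covariate values x1..xp of one sample
    """
    if len(coefficients) != len(covariates):
        raise ValueError("coefficients and covariates must have the same length")
    # The covariates enter only through this linear combination,
    # which is why coxPH is described above as a linear model.
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_hazard * math.exp(linear_predictor)


# Hypothetical values, for illustration only.
h = cox_hazard(baseline_hazard=0.02, coefficients=[0.5, -0.3], covariates=[1.0, 2.0])
```

The factor `math.exp(linear_predictor)` plays the role of the hazard ratio relative to the baseline; the method of this application replaces the linear combination with the output of a neural model.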
However, in practical applications, the feature variables that affect the occurrence of a target event usually influence it nonlinearly; that is, the relationship between those feature variables and the occurrence of the target event is usually nonlinear. Therefore, the linear coxPH model cannot perform survival analysis on the target event accurately. How to improve the accuracy of survival analysis is thus a technical problem to be solved urgently in the prior art.
Summary of the invention
The present application provides a method and device for predicting a survival risk rate, which can improve the accuracy of survival analysis.
To achieve the above objective, the present application provides the following technical solutions:
In a first aspect, the present application provides a method for predicting a survival risk rate. The method includes: acquiring data of a sample to be predicted; and inputting the data of the sample to be predicted into a preset model, and processing the data through the preset model to obtain a survival risk rate HR that represents the survival risk of the sample to be predicted. The preset model includes a gating network and a plurality of expert networks. The gating network is used to determine the weight coefficient corresponding to each expert network according to the data of the sample to be predicted, and the survival risk rate output by the preset model is the result obtained by weighting and summing the output values of the plurality of expert networks according to the weight coefficient corresponding to each expert network.
With the method provided in this application, because the preset model includes a plurality of expert networks and a gating network used to determine the weight coefficients of the expert networks, the preset model can integrate the outputs of the plurality of expert networks according to the data of the sample to be predicted. Therefore, the survival risk rate predicted by the preset model is more accurate, and the survival curve determined based on the survival risk rate is also more accurate. In addition, the preset model can be trained with an end-to-end training method.
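The weighted summation performed by the preset model can be sketched as follows. This is an illustrative sketch only: it assumes that the gating network produces one raw score per expert network and that these scores are normalized with a softmax to obtain the weight coefficients, which this passage does not specify. The names `softmax` and `predict_hazard_ratio` are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw gating scores into weight coefficients that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_hazard_ratio(gating_scores, expert_outputs):
    """Combine expert outputs by a weighted sum.

    gating_scores:  one raw score per expert network, computed by the gating
                    network from the data of the sample (assumed given here)
    expert_outputs: one output value per expert network for the same sample
    """
    if len(gating_scores) != len(expert_outputs):
        raise ValueError("need exactly one gating score per expert output")
    weights = softmax(gating_scores)
    hazard_ratio = sum(w * y for w, y in zip(weights, expert_outputs))
    return hazard_ratio, weights


# Two experts with equal gating scores: the result is the plain average.
hr, weights = predict_hazard_ratio([0.0, 0.0], [1.0, 3.0])
```

Because the weights depend on the sample, different samples can rely on different experts, which is what lets the combined model capture relationships that a single linear model cannot.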
In a possible design, the above method further includes: determining the risk function of the sample to be predicted based on the above survival risk rate and a baseline risk function, where the risk function is used to indicate the survival rate of the sample to be predicted at different times.
The survival risk rate here is the survival risk rate of the sample to be predicted, as predicted by the above prediction model after processing the sample. In this way, this possible design realizes survival analysis of the sample to be predicted. Because the survival risk rate predicted by the method provided in this application is highly accurate, the risk function that is determined from it, and that indicates the survival rate of the sample to be predicted at different times, is also relatively accurate.
In another possible design, any one of the plurality of expert networks in the above preset model includes at least one candidate residual fully connected neural network (RFCN), and the output value of that expert network is the output value that satisfies a preset condition among the output values of the at least one candidate RFCN.
In this possible design, taking the learning result of the candidate RFCN that satisfies the preset condition, among the results of the multiple candidate RFCNs in an expert network, as the output of that expert network embodies the idea of selecting the best, and can thereby improve the prediction accuracy for the sample to be predicted.
In another possible design, the data of the sample to be predicted includes non-Euclidean data.
Non-Euclidean data is data that is not arranged neatly or regularly. In practical applications, non-Euclidean data is massive in volume and complex in structure. Usually, the relationship between the non-Euclidean data in the data of the sample to be predicted and the survival rate of the sample is nonlinear. With this possible design, the method provided in the embodiments of the present application can process and analyze data of samples to be predicted that includes non-Euclidean data.
In another possible design, the above method further includes: explaining the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the influence of different feature data in the data of the sample to be predicted on the survival risk rate.
In another possible design, when the data of the sample to be predicted is case data of a patient, explaining the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the influence of different feature data in the data of the sample to be predicted on the survival risk rate, includes: explaining the preset model based on the patient's case data and the patient's survival risk rate, so as to obtain the influence of different feature data in the patient's case data on the patient's survival risk rate.
In another possible design, when the data of the sample to be predicted is data of a device, explaining the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the influence of different feature data in the data of the sample to be predicted on the survival risk rate, includes: explaining the preset model based on the data of the device and the survival risk rate of the device, so as to obtain the influence of different feature data in the data of the device on the survival risk rate of the device.
With these possible designs, based on the influence of different feature data in the data of the sample to be predicted on the survival risk rate, as obtained by the method provided in this application, domain experts can use the degree to which different features affect the sample's survival risk rate to guide practice. For example, for a patient, a clinician can adjust the patient's clinical treatment plan based on the influence of different treatment data in the patient's case data on the patient's survival risk rate. As another example, for a device, an engineer can improve and optimize the device based on the influence of different feature data of the device on the device's survival risk rate.
In another possible design, the above method further includes: training an initial model with data of training samples to obtain the preset model. The initial model includes an initial gating network and a plurality of initial expert networks.
In another possible design, training the initial model with the data of the training samples includes: inputting the data of the training samples into the initial gating network and the plurality of initial expert networks in the initial model; obtaining the weight coefficient of each initial expert network according to the initial gating network, and weighting and summing the output values of the plurality of initial expert networks according to the weight coefficient corresponding to each initial expert network, to obtain the predicted survival risk rate of the training samples; determining a loss function based on the predicted survival risk rate of the training samples and the survival data of the training samples; and adjusting the network parameters of the initial gating network and the plurality of initial expert networks based on the loss function.
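One way to picture how a survival risk rate and a baseline risk function yield survival rates over time is sketched below. It rests on assumptions that this passage does not state: the predicted value is treated as a multiplier on the baseline risk, the baseline risk is given as one value per unit time interval, and the survival rate is computed as the exponential of the negative cumulative risk. The function name `survival_curve` and the numbers are hypothetical.

```python
import math


def survival_curve(hazard_ratio, baseline_hazards):
    """Return the survival rate after each time interval.

    hazard_ratio:     predicted survival risk rate of one sample
    baseline_hazards: baseline risk h0 for consecutive unit-length intervals
    """
    cumulative_hazard = 0.0
    curve = []
    for h0 in baseline_hazards:
        # Assumed proportional-hazards form: h(t) = HR * h0(t).
        cumulative_hazard += hazard_ratio * h0
        # Survival rate S(t) = exp(-cumulative hazard up to t).
        curve.append(math.exp(-cumulative_hazard))
    return curve


# Hypothetical example: a sample at twice the baseline risk over three intervals.
curve = survival_curve(2.0, [0.1, 0.1, 0.2])
```

Under these assumptions, a larger survival risk rate pulls the whole curve downward, which matches the reading that a higher predicted risk corresponds to a lower survival rate at every time.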
The survival data of a training sample includes the time at which the training sample was observed and the survival status of the training sample at that time. The observation time may be the survival time of the training sample, or any time after the start event of the training sample has occurred and before its outcome event occurs. The start event and outcome event of a training sample depend on the application scenario of the preset model trained on that sample. For example, when the survival risk rate predicted by the preset model is used to study the efficacy of an anticancer drug, the start event may be the patient beginning to take the drug and the outcome event may be the patient's death. When the predicted survival risk rate is used to study postoperative survival, the start event may be the patient's surgery and the outcome event may be the patient's death. When the predicted survival risk rate is used to study equipment life, the start event may be the device or part leaving the factory and the outcome event may be the failure of the device, and so on. The survival status of a training sample is one of two states: alive or dead. In this way, with the above two possible designs, the preset model used by the present application to predict the survival risk rate can be obtained through end-to-end training.
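The passage above states only that the loss function is determined from the predicted survival risk rate and the survival data; it does not name a specific loss. As a hedged illustration, the sketch below uses the negative Cox partial log-likelihood, a common choice for this kind of model, which is an assumption and not necessarily the loss used by the application. The function name and argument layout are hypothetical.

```python
import math

def cox_neg_log_partial_likelihood(risk_rates, times, events):
    """Average negative Cox partial log-likelihood (assumed loss, not from the source).

    risk_rates: predicted survival risk rates, one positive number per sample
    times:      observation time of each sample
    events:     1 if the outcome event occurred at that time, else 0
    """
    loss = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # samples without an observed event only appear in risk sets
        # Risk set: every sample still under observation at time t_i.
        at_risk = sum(r for r, t in zip(risk_rates, times) if t >= t_i)
        loss -= math.log(risk_rates[i]) - math.log(at_risk)
        n_events += 1
    return loss / n_events if n_events else 0.0

times = [1, 2, 3]
events = [1, 1, 0]
# Higher predicted risk for the sample that fails first should give a lower loss.
good = cox_neg_log_partial_likelihood([2.0, 1.0, 1.0], times, events)
bad = cox_neg_log_partial_likelihood([1.0, 1.0, 2.0], times, events)
```

In an end-to-end setup, a loss of this kind would be minimized with respect to the parameters of both the gating network and the expert networks, since the predicted risk rate depends on all of them.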
In a second aspect, the present application provides a device for predicting a survival risk rate.
In a possible design, the device for predicting the survival risk rate is configured to perform any one of the methods provided in the first aspect. The device may be divided into functional modules according to any one of the methods provided in the first aspect. For example, one functional module may be defined for each function, or two or more functions may be integrated into one processing module. As an example, the device may be divided by function into an acquisition unit, a processing unit, and so on. For the possible technical solutions implemented by these functional modules and their beneficial effects, refer to the technical solutions provided in the first aspect or its corresponding possible designs; details are not repeated here.
In another possible design, the device for predicting the survival risk rate includes one or more processors and a transmission interface. The one or more processors receive or send data through the transmission interface, and are configured to invoke program instructions stored in a memory so that the device performs any one of the methods provided in the first aspect and any of its possible designs.
In a third aspect, the present application provides a computer-readable storage medium that includes program instructions. When the program instructions run on a computer or a processor, they cause the computer or the processor to perform any one of the methods provided by any possible implementation of the first aspect.
In a fourth aspect, the present application provides a computer program product that, when run on a device for predicting a survival risk rate, causes any one of the methods provided by any possible implementation of the first aspect to be performed.
It can be understood that any of the devices for predicting the survival risk rate, computer storage media, or computer program products provided above can be applied to the corresponding method provided above. For the beneficial effects they can achieve, refer to the beneficial effects of the corresponding method; details are not repeated here.
In this application, the name of the device for predicting the survival risk rate does not limit the device or its functional modules; in an actual implementation, these devices or functional modules may appear under other names. As long as the functions of each device or functional module are similar to those in this application, they fall within the scope of the claims of this application and their equivalents.
Description of drawings
FIG. 1 is a schematic diagram of a survival curve;
FIG. 2 is a schematic structural diagram of a prediction device provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a training method for a preset model provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an initial model provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an expert network provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of a method for predicting a survival risk rate provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a method for interpreting a preset model provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the survival curves of each group of samples after a preset model provided by an embodiment of the present application divides the samples in a sample set into groups;
FIG. 9 is a bar chart of model concordance results after internal and external validation of models trained on samples from hospital A, using the method provided by an embodiment of the present application and existing methods;
FIG. 10 is a schematic structural diagram of a device for predicting a survival risk rate provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a signal-bearing medium for carrying a computer program product provided by an embodiment of the present application.
Detailed description
To make the embodiments of the present application easier to understand, some terms and techniques involved in the embodiments are described below:
1) Survival curve
A survival curve shows how the survival rate of the observed samples changes over time. Relative to death, survival may mean that an organism remains alive. Relative to recurrence or progression of a disease, survival may mean that the patient's condition is in remission. Relative to the failure (or fault) of a device, system, or part, survival may mean that it is operating normally. Relative to customer churn, survival may mean that the customer relationship is still being maintained.
In practice, a survival curve can be used to reflect the recurrence of a disease after it has been cured, or the failure of equipment or parts after they leave the factory, and so on.
Take 1000 observed samples, with the observation time measured in days, as an example, and refer to FIG. 1, which is a schematic diagram of a survival curve. As shown in FIG. 1, the horizontal axis represents the observation time and the vertical axis represents the survival rate of the observed samples. The curve of the survival rate of the 1000 samples over time may be the survival curve 10 shown in FIG. 1. On the first day, the survival rate of the 1000 samples is 90%. On the second day, the survival rate drops by 45%; that is, taking the samples that survived the first day as the base, the survival rate on the second day is 50%. On the third day, the survival rate drops by 20%; that is, taking the samples that survived the second day as the base, the survival rate on the third day is 45%, and so on.
In addition, for a single sample, the survival curve is the curve of that sample's survival probability over time.
For example, a patient's survival probability is 0.3 on the first day after surgery, 0.5 on the second day, 0.8 on the third day, and so on.
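As a rough illustration of how a group-level survival curve like the one described for FIG. 1 can be tabulated, the following sketch computes the fraction of samples still alive at the end of each day. It assumes every sample is observed for the whole period (no truncated observations), and the function name and input format are hypothetical.

```python
def survival_curve(event_days, n_samples, horizon):
    """Fraction of samples still alive at the end of each day 1..horizon.

    event_days: the day on which the outcome event occurred, one entry per
                sample that experienced it (hypothetical input format)
    """
    curve = []
    for day in range(1, horizon + 1):
        dead = sum(1 for d in event_days if d <= day)
        curve.append((n_samples - dead) / n_samples)
    return curve

# 1000 samples: 100 outcome events on day 1 and 450 more on day 2.
events = [1] * 100 + [2] * 450
curve = survival_curve(events, n_samples=1000, horizon=2)
# Day-2 survival relative to the samples that were still alive after day 1.
conditional_day2 = curve[1] / curve[0]
```

With these illustrative inputs the curve is 90% after day 1 and 45% after day 2, and the day-2 rate measured against the day-1 survivors is 50%, which matches the first two days of the example above.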
2) Survival time
Survival time is the time elapsed from the start event of an observation target to the occurrence of its outcome event. The outcome event of the observation target is the target event described above.
For example, if the observation target is a patient's postoperative survival, the start event may be the surgery performed on the patient and the outcome event may be the patient's death. In this case, the period from the surgery to the patient's death is the patient's postoperative survival time.
As another example, if the observation target is the service life of a device or part, the start event may be the completion of its production and the outcome event may be its failure. In this case, the period from the completion of production to the failure is the survival time of the device or part.
3) Survival risk rate
The survival risk rate is the likelihood that a sample dies within a unit of time. In other words, the survival risk rate of a sample expresses the survival risk of that sample.
In the risk function expressed by formula 1 above, exp(b) is the survival risk rate. It should be understood that the higher the survival risk rate of a sample, the higher its mortality and therefore the lower its survival rate.
4) Truncated data
Truncated data, also called time-to-event data, indicates whether an event has occurred at a given time.
For example, if a patient's disease recurs one year after surgery, the fact of the recurrence together with the time of the recurrence can be called truncated data.
Truncated data therefore has two dimensions: a time dimension and an event dimension. In the time dimension, truncated data contains a continuous observation time (time). In the event dimension, truncated data contains a discrete event state. The event state takes one of two values: the event has occurred (event=1) or the event has not occurred (event=0).
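The two dimensions described above map naturally onto a small record type. The sketch below is one possible representation; the class name `SurvivalRecord` and the choice of days as the time unit are illustrative assumptions and do not come from the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SurvivalRecord:
    """One item of truncated (time-to-event) data; the name is hypothetical."""
    time: float  # continuous observation time, here in days since the start event
    event: int   # 1 = the event had occurred at `time`, 0 = it had not

    def __post_init__(self):
        if self.time < 0:
            raise ValueError("time must be non-negative")
        if self.event not in (0, 1):
            raise ValueError("event must be 0 or 1")

# A patient whose disease recurred one year after surgery.
relapsed = SurvivalRecord(time=365.0, event=1)
# A patient observed for 200 days with no recurrence so far.
in_remission = SurvivalRecord(time=200.0, event=0)
```

A record with event=0 still carries information: it says the sample survived at least until `time`, which is why such records are kept in the training data.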
5) Survival analysis
Survival analysis refers to a family of statistical methods used to study the time at which a target event occurs, for example the probability that the target event occurs at a given time.
When performing survival analysis on a target event, an analysis model is usually built from the feature data that affect the occurrence of the target event in multiple known samples from an experiment (or survey), together with the survival data of those samples. The analysis model then predicts the risk function h(t) of the target event for a sample to be predicted, and h(t) can be used to determine the risk of the target event occurring at different times. The survival data is generally truncated data, for example data that includes a time and whether the target event had occurred at that time. It should be understood that the time here may be a survival time or any observation time, which is not limited.
Survival analysis can be applied to, but is not limited to, the following real-world scenarios:
A. Healthcare: survival analysis of the course of a disease enables prognosis analysis. Prognosis is the prediction of the course and outcome of a disease. Depending on whether treatment is received during the onset or progression of the disease, prognosis can be divided into natural prognosis and treatment prognosis.
B. Urban construction: survival analysis of urban rail equipment makes it possible to predict the risk rate of future equipment failures. Similarly, survival analysis of urban water supply pipelines makes it possible to predict the risk rate of pipe bursts, and so on.
C. Financial services: survival analysis of consumer installment plans makes it possible to predict the risk rate of installment defaults.
6) Non-Euclidean data (non-euclidean space data)
Non-Euclidean data is data that is not arranged neatly or in a regular pattern. In a sample composed of non-Euclidean data, the order or position of the data does not affect the characteristics of the sample.
In practice, non-Euclidean data exists in many fields, for example social network data in the social sciences, sensor networks in communication technology, regulatory networks in genomics, and mesh surfaces in computer graphics.
It can be understood that non-Euclidean data in real scenarios is very large in volume and complex in structure.
7) Residual network (ResNet) and residual fully-connected neural network (RFCN)
ResNet is a type of neural network that includes skip connections (also called shortcut connections). These connections let data passed between network layers skip some layers, which helps avoid the degradation and vanishing-gradient problems of deep neural networks, speeds up training, and allows the network to be made very deep.
It should be understood that a deep neural network with many layers is better suited to processing structurally complex data.
An RFCN is a neural network that uses fully connected layers as its basic unit and adds skip connections (shortcut connections).
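To make the idea of a fully connected block with a skip connection concrete, here is a minimal pure-Python sketch. The specific layout (two fully connected layers, ReLU activations, input added back before the final activation) is an assumption borrowed from common residual designs; the expert network of the application (FIG. 5) may be structured differently.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(weights, bias, v):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def residual_fc_block(v, w1, b1, w2, b2):
    """Computes relu(v + F(v)), where F is two fully connected layers."""
    hidden = relu(linear(w1, b1, v))
    out = linear(w2, b2, hidden)
    # Skip connection: the block input bypasses both layers and is added back.
    return relu([o + x for o, x in zip(out, v)])

# With all-zero weights F(v) is zero, so the block passes its input through.
x = [1.0, 2.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]
y = residual_fc_block(x, zeros, [0.0, 0.0], zeros, [0.0, 0.0])
```

The pass-through behavior in the usage example is the point of the skip connection: a block only has to learn a correction to its input, which is what makes stacking many such blocks practical.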
8) Other terms
In the embodiments of the present application, words such as "exemplary" or "for example" are used to present an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as preferred or advantageous over other embodiments or designs. Rather, such words are intended to present the related concepts in a concrete manner.
In the embodiments of the present application, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance, or as implicitly specifying the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, unless otherwise specified, "multiple" means two or more.
It should also be understood that the term "and/or" as used herein refers to and covers any and all possible combinations of one or more of the associated listed items. The term "and/or" describes a relationship between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" in this application generally indicates an "or" relationship between the objects before and after it.
It should be understood that determining B according to A does not mean that B is determined according to A only; B may also be determined according to A and/or other information.
It should be understood that "one embodiment", "an embodiment", and "a possible implementation" mentioned throughout the specification mean that a particular feature, structure, or characteristic related to the embodiment or implementation is included in at least one embodiment of the present application. Therefore, occurrences of "in one embodiment", "in an embodiment", or "a possible implementation" throughout the specification do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
When the relationship between the feature variables that affect the target event and the occurrence of the target event is nonlinear, one possible way to implement survival analysis of the target event is to build a deep survival analysis (deepsurv) model with a multi-layer perceptron (MLP). However, the feedforward fully connected neural network of an MLP ignores the hierarchical relationships between networks, and an overly deep feedforward fully connected neural network is prone to vanishing gradients. As a result, the prediction accuracy of the deep survival analysis model is not high.
In another possible implementation, multiple different weak models (for example, multiple coxPH models) can be pre-trained on the same sample set and then combined into an integrated (ensemble) model, such as a coxPH integrated model, whose accuracy and generalization ability are both better than those of the weak models. However, the process of integrating models is usually complex, and obtaining an integrated model in this way is not an end-to-end way of obtaining a model. In addition, because the integrated model contains multiple weak models, the differences between those weak models affect the interpretability of the integrated model to some extent.
Based on this, an embodiment of the present application provides a method for predicting a survival risk rate. The method predicts the survival risk rate of a sample to be predicted using a preset model obtained by training in advance. The survival risk rate expresses the survival risk of the sample to be predicted. Based on this risk rate and a baseline risk function, the risk function that reflects the survival curve of the sample to be predicted can be determined, thereby achieving survival analysis of the sample to be predicted.
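The step from a predicted risk rate and a baseline risk function to a survival curve can be sketched as follows. This assumes the proportional-hazards form h(t) = h0(t) · exp(b), with exp(b) being the survival risk rate, which is how formula 1 is commonly written; formula 1 itself is outside this passage, so treat that form as an assumption. The baseline values are made up for illustration.

```python
import math

def hazard(baseline_hazard, risk_rate):
    """h(t) = h0(t) * risk_rate at each time step (assumed proportional form)."""
    return [h0 * risk_rate for h0 in baseline_hazard]

def survival_from_hazard(hazards, dt=1.0):
    """S(t) = exp(-cumulative hazard), using a simple step-wise sum."""
    cumulative = 0.0
    curve = []
    for h in hazards:
        cumulative += h * dt
        curve.append(math.exp(-cumulative))
    return curve

baseline = [0.01, 0.02, 0.03]  # hypothetical baseline risk h0(t) per day
low_risk = survival_from_hazard(hazard(baseline, risk_rate=0.5))
high_risk = survival_from_hazard(hazard(baseline, risk_rate=2.0))
```

Under this assumption, a sample with a higher predicted risk rate has a lower survival curve at every time step, which is consistent with the statement that a higher survival risk rate means a lower survival rate.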
The preset model includes a gating network and multiple expert networks. The gating network is used to obtain, from the sample to be predicted, the weight coefficient corresponding to each expert network. The survival risk rate of the sample to be predicted is the result of a weighted sum of the output values of the multiple expert networks, using the weight coefficient corresponding to each expert network in the preset model.
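The forward pass described above can be sketched in a few lines. The gating network and expert networks are stood in for by plain Python callables, and normalizing the gate scores with a softmax is an assumption made here for illustration; the passage only says that the gating network yields one weight coefficient per expert.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_risk_rate(x, gate, experts):
    """Weighted sum of expert outputs.

    gate:    callable mapping a sample to one score per expert (hypothetical)
    experts: list of callables, each mapping a sample to a single number
    """
    weights = softmax(gate(x))                 # one weight coefficient per expert
    outputs = [expert(x) for expert in experts]
    return sum(w * o for w, o in zip(weights, outputs))

# Toy stand-ins: a gate that scores both experts equally, and two constant experts.
sample = [0.2, 0.7]
toy_gate = lambda x: [0.0, 0.0]
toy_experts = [lambda x: 1.0, lambda x: 3.0]
risk = predict_risk_rate(sample, toy_gate, toy_experts)
```

Because the weights are computed from the sample itself, different samples can rely on different experts, which distinguishes this arrangement from averaging a fixed set of separately trained models.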
The preset model provided by the embodiments of the present application can be trained in an end-to-end manner, and it can be regarded as an integrated (ensemble) model in which multiple expert networks are combined according to the weight coefficients produced by the gating network. Therefore, the survival risk rate predicted for a sample by the preset model is relatively accurate, which in turn improves the accuracy of survival analysis performed on the sample based on that risk rate. The specific training method of the preset model is described below and is not detailed here.
In addition, the expert networks in the preset model may be implemented with RFCNs, so that the preset model can be trained on non-Euclidean data with nonlinear characteristics. Because non-Euclidean data in real scenarios is very large in volume and complex in structure, a preset model trained on non-Euclidean data has stronger learning ability and higher prediction accuracy.
An embodiment of the present application also provides a device for predicting a survival risk rate (hereinafter, the prediction device). The prediction device may be any computing device with computing capability, or a set of multiple computing devices. For example, the prediction device may be a computing device such as a notebook computer or a desktop computer, or it may be a server or a set of servers.
It should be noted that the preset model described above may be preset in the prediction device. As an example, the preset model may be stored in the prediction device in the form of an application program. In other embodiments, the preset model may not be preset in the prediction device; for example, the prediction device may invoke the preset model deployed on the cloud through an application programming interface (API) call.
Refer to FIG. 2, which is a schematic structural diagram of a prediction device provided by an embodiment of the present application. As shown in FIG. 2, the prediction device 20 includes a processor 21, a main memory 22, a storage medium 23, a communication interface 24, and a bus 25. The processor 21, the main memory 22, the storage medium 23, and the communication interface 24 may be connected through the bus 25.
The processor 21 is the control center of the prediction device 20. It may be a general-purpose central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, a graphics processing unit (GPU), a neural processing unit (NPU), a tensor processing unit (TPU), an artificial intelligence chip, or the like.
As an example, the processor 21 may include one or more CPUs, such as CPU 0 and CPU 1 shown in FIG. 2. In addition, the present application does not limit the number of processor cores in each processor.
The main memory 22 is used to store program instructions. By executing the program instructions in the main memory 22, the processor 21 can implement the method for predicting a survival risk rate provided by the embodiments of the present application.
In a possible implementation, the main memory 22 may exist independently of the processor 21. The main memory 22 may be connected to the processor 21 through the bus 25 and is used to store data, instructions, or program code. When the processor 21 invokes and executes the instructions or program code stored in the main memory 22, the method for predicting a survival risk rate provided by the embodiments of the present application can be implemented.
In another possible implementation, the main memory 22 may also be integrated with the processor 21.
The storage medium 23 may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). As an example, the storage medium 23 may be used to store the training sample data in the embodiments of the present application.
The communication interface 24 connects the prediction device 20 to other devices (such as terminals) through a communication network, which may be Ethernet, a radio access network (RAN), a wireless local area network (WLAN), or the like. The communication interface 24 may include a receiving unit for receiving data and a sending unit for sending data.
The bus 25 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is used in FIG. 2, but this does not mean that there is only one bus or one type of bus.
It should be pointed out that the structure shown in FIG. 2 does not limit the prediction device 20. The prediction device 20 may include more or fewer components than those shown in FIG. 2, combine certain components, or use a different arrangement of components.
It should be noted that, when the prediction device is a server, an embodiment of this application further provides a system for predicting a survival hazard ratio (hereinafter, the prediction system). The prediction system may include a terminal and a server that communicate over a wired or wireless connection. The preset model described above is preset in the server.
The terminal may be configured to receive data of a sample to be predicted that is input by a user. The server may be configured to receive the data of the sample to be predicted from the terminal and, after processing that data, return a prediction result to the terminal.
Optionally, the terminal may be a terminal device such as a mobile phone, a notebook computer, or a desktop computer. This is not limited in the embodiments of this application.
An embodiment of this application further provides a training device for the preset model (hereinafter, the training device). The training device may be any computing device with computing capability. For its hardware, refer to the hardware description of the prediction device above; details are not repeated here.
It should be understood that the training device and the prediction device may be the same device or different devices. This is not limited in the embodiments of this application.
The method provided by the embodiments of this application is described in detail below with reference to the accompanying drawings.
The training method for the preset model is described first.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of a training method for the preset model according to an embodiment of this application. The method may be performed by the training device described above and may include the following steps:
S101. Obtain a training sample set.
Here, the training sample set includes data of a plurality of training samples and survival data of each training sample.
For example, if a training sample is a patient, the data of the training sample may be the case data of that patient. As another example, if a training sample is a piece of equipment, the data of the training sample may be any data related to that equipment, such as its attribute data or its production data.
The data of each training sample may include data of a plurality of features of the training sample, and that feature data may include non-Euclidean data. It can be understood that the number of features included in each training sample may be the same or different.
Optionally, for any feature of the training samples, the number of training samples in the training sample set that include that feature is greater than a first threshold. The value of the first threshold is not specifically limited in the embodiments of this application. This ensures that enough training samples include the feature, so that the contribution of the feature to the preset model, as determined when the trained preset model is interpreted, is more accurate.
Table 1 shows an example of a training sample set. As shown in Table 1, the training sample set includes data of n training samples, namely training sample 1, training sample 2, training sample 3, ..., and training sample n. Each training sample includes data of m features, namely feature 1, feature 2, feature 3, ..., and feature m, where n and m are both positive integers.
Table 1

| | Feature 1 | Feature 2 | Feature 3 | … | Feature m |
|---|---|---|---|---|---|
| Training sample 1 | 0.74 | 0.31 | 0.20 | … | 0.58 |
| Training sample 2 | 0.08 | 0.34 | 0.20 | … | 0.12 |
| Training sample 3 | 0.49 | 0.74 | 0.18 | … | 0.78 |
| Training sample n | 0.78 | 0.84 | 0.56 | … | 0.62 |
It should be understood that the features included in a training sample depend on the application scenario of the preset model trained with that sample.
For example, if the preset model is applied to survival analysis in a hospital scenario, the feature data of a training sample may include several of the following: the patient's basic data (including the patient's age, body mass index (BMI), and so on), the patient's blood test data (including routine blood data, blood cell data, liver function data, kidney function data, and so on), the patient's vital sign data (including body temperature, pulse, heart rate, blood pressure, respiration, blood oxygen, and so on), and the patient's treatment records (including the drug name, drug type, and drug dosage for drug treatment, as well as plasma therapy, oxygen therapy, and so on).
As another example, if the preset model is applied to survival analysis of part quality in an industrial scenario, the feature data of a training sample may include several of the following: the material data of the part, the process data for producing the part, the delivery time of the part, and so on.
As can be seen, a training sample may contain hundreds of kinds of features, and the data of these features includes non-Euclidean data.
In addition, the survival data of each training sample serves as the ground-truth value (also called the label value) when the loss function is computed during training of the preset model. Each training sample has exactly one piece of survival data, and the survival data is truncated (censored) data, that is, time-to-event data.
For example, Table 2 shows a set of survival data. Survival data 1 may be the survival data of training sample 1 in Table 1, survival data 2 may be the survival data of training sample 2 in Table 1, survival data 3 may be the survival data of training sample 3 in Table 1, ..., and survival data n may be the survival data of training sample n in Table 1.
Taking survival data 1 as an example, survival data 1 records that the outcome event of training sample 1 had occurred by day 10, so the event status value is "True/1". Taking survival data 2 as an example, survival data 2 records that the outcome event of training sample 2 had not occurred by day 14, so the event status value is "False/0". The remaining entries are read in the same way.
Table 2

| | Time (days) | Event |
|---|---|---|
| Survival data 1 | 10 | True (1) |
| Survival data 2 | 14 | False (0) |
| Survival data 3 | 9 | False (0) |
| Survival data n | 20 | True (1) |
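As an illustration only, the rows of Table 2 can be held as (time, event) pairs. The sketch below assumes a NumPy structured array with hypothetical field names `time` and `event`; the application does not prescribe any storage format.

```python
import numpy as np

# Rows of Table 2: (time in days, event flag).
# True/1 means the outcome event was observed; False/0 means it had not
# occurred by that day, so the record is censored.
survival = np.array(
    [(10, True), (14, False), (9, False), (20, True)],
    dtype=[("time", "f8"), ("event", "?")],
)

observed = survival[survival["event"]]    # samples whose event occurred
censored = survival[~survival["event"]]   # samples still event-free at last follow-up

print("event times:", observed["time"])
print("censoring times:", censored["time"])
```

Keeping the event flag next to the time is what lets a loss function treat censored samples differently from samples whose event was observed.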
Optionally, the training device may obtain the training sample set from an external storage device in which the training sample set is pre-stored.
Optionally, the training device may instead receive the training sample set from another device through a communication interface (for example, the communication interface 24 shown in FIG. 2). The training sample set is pre-stored in that other device.
It should be noted that the training sample set obtained by the training device may or may not have been preprocessed.
If the training sample set obtained by the training device has not been preprocessed, the training device may preprocess it after obtaining it. The specific content of the preprocessing is not limited in the embodiments of this application.
As an example, preprocessing the training sample set may include deleting abnormal training samples from the set (for example, samples whose number of features is less than the first threshold), deleting abnormal feature data from a training sample (for example, a case sample in which the patient's height is recorded as 10 m), deleting features whose missing-value count across all training samples exceeds a second threshold (that is, features that more than the second-threshold number of training samples in the set do not include), or normalizing the feature data. This is not specifically limited in the embodiments of this application.
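The preprocessing options listed above can be sketched as follows. This is a minimal sketch under stated assumptions: missing values are encoded as NaN, the thresholds are passed in as the hypothetical parameters `max_missing_samples` and `min_features`, the steps run in the order shown, and min-max scaling stands in for the unspecified normalization.

```python
import numpy as np

def preprocess(X, max_missing_samples, min_features):
    """X: (n_samples, n_features) array; NaN marks a missing feature value."""
    X = np.asarray(X, dtype=float)

    # Drop features that are missing from more than `max_missing_samples` samples.
    missing_per_feature = np.isnan(X).sum(axis=0)
    keep_cols = missing_per_feature <= max_missing_samples
    X = X[:, keep_cols]

    # Drop samples that have fewer than `min_features` features present.
    present_per_sample = (~np.isnan(X)).sum(axis=1)
    keep_rows = present_per_sample >= min_features
    X = X[keep_rows]

    # Min-max normalise each remaining feature to [0, 1], ignoring NaNs.
    lo = np.nanmin(X, axis=0)
    hi = np.nanmax(X, axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return (X - lo) / span, keep_rows, keep_cols

raw = [[1.0, np.nan, 10.0],
       [2.0, np.nan, 20.0],
       [3.0, 5.0, np.nan],
       [np.nan, np.nan, np.nan]]
clean, kept_rows, kept_cols = preprocess(raw, max_missing_samples=2, min_features=1)
print(clean)
```

Returning the row and column masks makes it possible to apply the same selection to the survival data and to samples seen at prediction time.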
S102. Train an initial model with the data of the training samples in the training sample set to obtain the preset model.
Specifically, the training device may iteratively train the initial model based on the data of the training samples in the obtained training sample set, thereby obtaining the preset model. Here, the initial model may be a model designed in advance by a designer for predicting the survival hazard ratio of a sample.
The initial model may include a gating network and a plurality of expert networks. The gating network may be, for example, a neural network classifier, and each expert network may be, for example, an RFCN.
In each training pass, the gating network obtains the weight coefficient of each expert network from the received training sample, and the expert networks each learn from the training sample and output their own results. The results of the expert networks are then summed, weighted by the weight coefficients obtained by the gating network, to give the output value (also called the output result) of the current model. This output value is the predicted value that the current model obtains for the training sample, that is, the survival hazard ratio of the training sample as predicted by the current model.
Specifically, the gating network may learn the type of the received training sample and then assign a weight to each expert network according to the learned type. Here, the gating network is a network that learns from the training samples on its own and classifies them. The number of types into which the gating network classifies the training samples equals the number of expert networks included in the initial model.
As can be seen, when the gating network assigns a weight to each expert network, each type of training sample corresponds to one expert network.
Further, the weights of the expert networks determined by the gating network may be normalized by a preset function to obtain the weight coefficients of the expert networks. The normalized weight coefficients sum to 1.
For example, the weights of the expert networks determined by the gating network may be exponentially normalized by a softmax function. The exponential normalization performed by the softmax function is not described in detail here.
In this way, the result output by each expert network is multiplied by that expert network's weight coefficient and the products are summed (a weighted sum), which gives the predicted value that the current model obtains for the received training sample. This process is expressed by formula (2) below. Each expert network learns from the training sample and predicts the survival hazard ratio of that training sample.
Formula (2):
$$F(x) = \sum_{i=1}^{N} \mathrm{Softmax}(G(x), \tau)_i \cdot f_i(x)$$
Here, $x$ denotes the training sample, $N$ is the number of expert networks, and $i$ denotes the $i$-th of the $N$ expert networks. $F(x)$ denotes the survival hazard ratio (that is, the predicted value) that the model outputs for the training sample $x$. $G(x)$ denotes the weights of the $N$ expert networks output by the gating network. $\tau$ denotes the temperature coefficient, which controls how smooth the result is when Softmax exponentially normalizes the weights, and is usually set in advance. $\mathrm{Softmax}(G(x), \tau)_i$ denotes the weight coefficient of the $i$-th expert network, obtained by exponentially normalizing the weights $G(x)$ output by the gating network. $f_i(x)$ denotes the result of the $i$-th expert network for the training sample $x$.
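The weighted sum described by formula (2) can be sketched numerically as below. The sketch assumes the common convention that the temperature divides the gate outputs before exponentiation; the application does not state how τ enters the softmax, and the function names are hypothetical.

```python
import numpy as np

def softmax_with_temperature(gate_weights, tau=1.0):
    """Exponentially normalise gate outputs; a larger tau gives a smoother result."""
    z = np.asarray(gate_weights, dtype=float) / tau
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()           # coefficients sum to 1

def mixture_output(gate_weights, expert_outputs, tau=1.0):
    """F(x) = sum_i Softmax(G(x), tau)_i * f_i(x)."""
    w = softmax_with_temperature(gate_weights, tau)
    return float(np.dot(w, expert_outputs)), w

# Two experts with equal gate weights: the output is the plain average.
F, w = mixture_output([0.0, 0.0], [1.0, 3.0])
print(F, w)
```

With a small τ the coefficients concentrate on the expert with the largest gate weight; with a large τ they approach a uniform average over all experts.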
As an example, suppose the training sample set used to train the initial model contains two types of training samples (for example, male training samples and female training samples). Referring to FIG. 4, FIG. 4 is a schematic structural diagram of an initial model according to an embodiment of this application. As shown in FIG. 4, the initial model 40 includes a gating network 41 and two expert networks, namely expert network 421 and expert network 422.
After the initial model 40 receives training sample 1 as input, expert network 421 learns from training sample 1 and produces result 1, and expert network 422 learns from training sample 1 and produces result 2.
After the gating network 41 learns from training sample 1, it may, based on the learned type of training sample 1, output weight 1 for expert network 421 and weight 2 for expert network 422. The softmax function then exponentially normalizes the two weights output by the gating network to obtain weight coefficient 1 for expert network 421 and weight coefficient 2 for expert network 422.
The initial model then adds the product of weight coefficient 1 and result 1 output by expert network 421 to the product of weight coefficient 2 and result 2 output by expert network 422. The sum is the predicted value that the initial model outputs for training sample 1, that is, the survival hazard ratio of training sample 1 as predicted by the initial model.
It should be noted that any one of the expert networks in the initial model may include at least one candidate RFCN.
When an expert network includes one candidate RFCN, the result that the candidate RFCN outputs for a training sample is the result that the expert network outputs for that training sample.
When an expert network includes a plurality of candidate RFCNs, the expert network further includes an evaluation module. The evaluation module evaluates the result that each candidate RFCN obtains for the training sample and takes the result that meets a preset condition as the output of the expert network. The result that "meets the preset condition" may be the one, among the results output by the candidate RFCNs, that is closest to the label value of the sample. This can improve the prediction accuracy of the model.
As an example, for any expert network, the evaluation module may compute a loss function for each candidate RFCN based on the result that the candidate RFCN obtains for training sample a (that is, the predicted value of training sample a output by that candidate RFCN) and the survival data of training sample a. The result output by the candidate RFCN with the smallest loss (the smallest loss means that the predicted value is closest to the ground-truth value) is then taken as the output value of the expert network.
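The selection performed by the evaluation module can be sketched as an argmin over per-candidate losses. The sketch is illustrative only: `select_expert_output` and `loss_fn` are hypothetical names, and the squared-error loss against a scalar label in the usage lines is a toy stand-in, because the application leaves the concrete evaluation criterion open.

```python
def select_expert_output(candidate_outputs, label, loss_fn):
    """Return the candidate output with the smallest loss, its index, and all losses."""
    losses = [loss_fn(out, label) for out in candidate_outputs]
    best = min(range(len(losses)), key=losses.__getitem__)
    return candidate_outputs[best], best, losses

# Toy usage: three candidate RFCN outputs scored against a scalar stand-in label.
toy_loss = lambda pred, y: (pred - y) ** 2
output, index, losses = select_expert_output([0.2, 0.9, 0.5], 1.0, toy_loss)
print(output, index)
```

Passing the loss in as a callable keeps the selection logic independent of whichever survival loss is actually used.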
The embodiments of this application do not specifically limit how the best-performing result is selected from the learning results of the candidate RFCNs.
It should be understood that the candidate RFCNs in the same expert network all have different network structures. The difference may be, for example, in the structure or number of layers that the skip connections or shortcut connections of the candidate RFCNs bypass. This is not limited in the embodiments of this application.
It should also be understood that, among the expert networks included in the initial model, the set of candidate RFCNs in each expert network differs from the set in every other expert network. Optionally, these sets of candidate RFCNs may overlap.
Taking an initial model with three expert networks as an example, expert network 1 may include candidate RFCN 1, candidate RFCN 2, and candidate RFCN 3; expert network 2 may include candidate RFCN 1 and candidate RFCN 2; and expert network 3 may include candidate RFCN 3 and candidate RFCN 4.
As an example, referring to FIG. 5, FIG. 5 is a schematic structural diagram of an expert network according to an embodiment of this application. As shown in FIG. 5, expert network 421 includes three candidate RFCNs, namely candidate RFCN 511, candidate RFCN 512, and candidate RFCN 513. Expert network 421 further includes an evaluation module 52.
As shown in FIG. 5, when expert network 421 receives training sample 1, candidate RFCN 511 may learn from training sample 1 to obtain result 1. Similarly, candidate RFCN 512 may learn from training sample 1 to obtain result 2, and candidate RFCN 513 may learn from training sample 1 to obtain result 3.
The evaluation module 52 may then evaluate result 1, result 2, and result 3 and determine the best-performing result. For example, if the evaluation module 52 determines that result 2 performs best, expert network 421 outputs result 2.
In this way, the training device iteratively trains the initial model with the structure described above based on the obtained training samples to obtain the preset model. This process can be described as follows:
The training device inputs training sample 1 from the training sample set into the model to be trained. When the training device inputs a training sample into the model to be trained for the first time, the model to be trained is the initial model described above.
After the model to be trained receives training sample 1 from the training device, each expert network in the model to be trained may learn from training sample 1 and output its own learning result. The learning result output by each expert network is the predicted value that the expert network outputs for training sample 1.
The gating network in the model to be trained learns from and classifies training sample 1, and outputs a weight for each expert network based on the learned type. The training device then normalizes the weights of the expert networks to determine the weight coefficient of each expert network.
It can be understood that, when the model to be trained is the initial model (that is, in the first training pass of the initial model), the gating network, after learning from training sample 1, may output the weight of each expert network at random according to what it has learned. The expert network with the largest weight can be regarded as the expert network corresponding to the type of the current training sample 1.
Next, the training device multiplies the predicted value output by each expert network by that expert network's weight coefficient and sums the products to obtain the predicted value of training sample 1 output by the model to be trained. It should be understood that this predicted value is the survival hazard ratio of training sample 1 as predicted by the model to be trained.
The training device may then compute a loss function based on the predicted value output by the model to be trained and the survival data of training sample 1 (that is, the ground-truth value, also called the label value of the training sample). Because the survival data is truncated (censored) data, the embodiments of this application may optionally compute the loss function for such data based on a negative log-likelihood (NLL) score.
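One widely used NLL-style loss for censored data is the Cox negative partial log-likelihood, sketched below. It is shown here as an assumption: the application only says the loss may be based on an NLL score and does not name this formulation. The function name `cox_nll` and the use of log-risk scores (the logarithm of the predicted hazard ratio) as input are likewise illustrative choices.

```python
import numpy as np

def cox_nll(log_risk, time, event):
    """Average Cox negative partial log-likelihood over the observed events.

    log_risk: predicted log hazard ratio per sample
    time:     follow-up time per sample
    event:    True if the outcome event was observed, False if censored
    """
    log_risk = np.asarray(log_risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    # at_risk[i, j] is True when sample j is still under observation at time[i].
    at_risk = time[None, :] >= time[:, None]

    # Log of the summed risk over each risk set (shifted for stability).
    m = log_risk.max()
    log_denom = m + np.log((np.exp(log_risk - m)[None, :] * at_risk).sum(axis=1))

    # Only samples with an observed event contribute a likelihood term;
    # censored samples enter through the risk sets only.
    n_events = max(int(event.sum()), 1)
    return -np.sum((log_risk - log_denom)[event]) / n_events

# Survival data from Table 2 with uninformative (all-zero) predictions.
loss = cox_nll([0.0, 0.0, 0.0, 0.0], [10, 14, 9, 20], [True, False, False, True])
print(loss)
```

The loss falls when samples whose event occurs early receive a higher predicted risk than the samples that are still at risk at that time.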
It should be understood that the training device may compute the loss function of the model to be trained from the predicted value output by the model to be trained and the survival data of training sample 1. The loss of the model to be trained is back-propagated, and the network parameters of each expert network are adjusted according to that expert network's weight coefficient. It can be understood that the amount by which the network parameters of an expert network are adjusted is proportional to the weight coefficient of that expert network. For example, an expert network with a large weight coefficient has its network parameters adjusted by a larger amount, and an expert network with a small weight coefficient has its network parameters adjusted by a smaller amount.
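The proportionality between the parameter adjustment and the weight coefficient follows from the chain rule applied to the weighted sum of formula (2). In the sketch below, $L$ is the loss of the model to be trained, $w_i = \mathrm{Softmax}(G(x), \tau)_i$ is the weight coefficient of the $i$-th expert network, and $\theta_i$ is an assumed symbol for the parameters of the $i$-th expert network (the application does not name them). It also assumes plain gradient-based updates, in which the adjustment is proportional to the gradient.

```latex
F(x) = \sum_{i=1}^{N} w_i \, f_i(x; \theta_i)
\quad\Longrightarrow\quad
\frac{\partial L}{\partial \theta_i}
  = \frac{\partial L}{\partial F} \cdot \frac{\partial F}{\partial \theta_i}
  = \frac{\partial L}{\partial F} \cdot w_i \cdot \frac{\partial f_i(x; \theta_i)}{\partial \theta_i}
```

Since $w_i$ does not depend on $\theta_i$, it appears as a plain scaling factor on the gradient of the $i$-th expert network, so an expert network with a larger weight coefficient receives a proportionally larger update.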
It should also be understood that the training device may further compute a loss function for each expert network based on the output value of that expert network and the survival data of training sample 1. The parameters of the gating network are then adjusted based on the expert network with the smallest loss and the weight coefficient that the gating network assigned to that expert network. As a result, the next time the gating network receives a training sample with the same or similar features as training sample 1, it assigns a larger weight to the expert network with the smallest loss, so that in subsequent training this expert network specializes in learning from training samples with the same or similar features as training sample 1. Through repeated learning, each expert network comes to learn from only one class of samples with the same or similar features. It should be understood that, because the output value of an expert network with a large weight makes up a large share of the output value of the model to be trained, the adjustment applied to that expert network's parameters based on the loss function of the model to be trained is also relatively large. In effect, an expert network with a large weight learns more of the features of the training sample.
In this way, once the network parameters of the model to be trained have been adjusted using the loss function computed from the predicted value that the model to be trained obtained for training sample 1, one round of training of the model to be trained with training sample 1 is complete.
The training device may then input training sample 2 into the updated model to be trained and, following the training process described for training sample 1, complete one round of training of the updated model to be trained with training sample 2.
It should be noted that, when the gating network assigns a weight to each expert network after learning from training sample 2, it may draw on the classification learned from training sample 1 and assign a larger weight to the expert network corresponding to the type of training sample 2.
Similarly, the training device may repeat the above process multiple times with the training samples in the training sample set to iteratively train the initial model. When training converges, the preset model provided by the embodiments of this application is obtained. The gating network in the preset model is used to classify samples, and the expert networks in the preset model are used to predict the survival hazard ratio for samples of different types. It can be understood that the preset model has the same framework structure as the initial model described above.
Processing a sample to be predicted with the preset model trained by the method of S101-S102 yields the predicted survival hazard ratio of that sample. The survival curve of the sample can then be determined from its survival hazard ratio, which realizes survival analysis of the sample to be predicted.
Referring to FIG. 6, FIG. 6 is a schematic flowchart of a method for predicting a survival hazard ratio according to an embodiment of this application. The method may be performed by the prediction device shown in FIG. 2, in which the preset model trained by the method of S101-S102 is preset. The method may include the following steps:
S201. Obtain data of a sample to be predicted.
For details of how the prediction device obtains the data of the sample to be predicted, refer to the description in S101 of how the training device obtains the training samples; details are not repeated here.
S202. Process the data of the sample to be predicted with the preset model to obtain the survival hazard ratio of the sample to be predicted.
Specifically, the prediction device may input the obtained data of the sample to be predicted into the preset model and process the data with the preset model to obtain the survival hazard ratio of the sample to be predicted.
The survival hazard ratio of the sample to be predicted may be used to determine the survival curve of that sample, which makes survival analysis of the sample possible.
For the process by which the preset model processes the data of the sample to be predicted to obtain its survival hazard ratio, refer to the description in S102 of how the model to be trained processes training sample 1 to obtain the predicted value of training sample 1; details are not repeated here.
In this way, the hazard function of the sample to be predicted can be determined from the survival hazard ratio predicted by the preset model and formula (1) described above.
It should be understood that the feature data of the sample to be predicted serves as the covariates $x_1, x_2, \ldots, x_p$ in formula (1), and $\exp b_1 \cdot \exp b_2 \cdot \ldots \cdot \exp b_p$ is the survival hazard ratio of the sample to be predicted, as predicted by the preset model.
Once the hazard function of the sample to be predicted is determined, it reflects the survival curve of the sample. For example, if the hazard value of the sample is high at a given time, the survival rate of the sample at that time is low.
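The relationship between the predicted hazard ratio, the hazard function, and the survival curve can be sketched as follows. This is a minimal illustration that assumes formula (1) has the usual proportional-hazards form, in which the hazard function is the baseline hazard multiplied by the hazard ratio, and that time is discretized on a uniform grid. The baseline values and the function names `hazard` and `survival_curve` are hypothetical.

```python
import math

def hazard(baseline_hazard, hazard_ratio):
    """h(t | x) = h0(t) * hazard_ratio at each time point (assumed form of formula (1))."""
    return [h0 * hazard_ratio for h0 in baseline_hazard]

def survival_curve(baseline_hazard, hazard_ratio, dt=1.0):
    """S(t | x) = exp(-hazard_ratio * cumulative baseline hazard) on a uniform time grid."""
    curve, cumulative = [], 0.0
    for h0 in baseline_hazard:
        cumulative += h0 * dt
        curve.append(math.exp(-hazard_ratio * cumulative))
    return curve

# Hypothetical baseline hazard per time step; not taken from the application.
h0 = [0.01, 0.02, 0.03, 0.04]
low_risk = survival_curve(h0, hazard_ratio=0.5)
high_risk = survival_curve(h0, hazard_ratio=2.0)
```

Under these assumptions, a sample with a larger hazard ratio has a lower survival rate at every time point, which matches the qualitative statement above.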
In this way, in the method for predicting a survival hazard ratio provided by the embodiments of this application, the preset model used for prediction is trained on non-Euclidean data and is in effect an integrated fusion of multiple expert networks. The survival hazard ratio predicted by this method is therefore relatively accurate. This improves the accuracy of the hazard function determined from the survival hazard ratio, so that the survival curve of the sample to be predicted is reflected accurately.
In addition, as can be seen from the model training method above, each training sample used to train the preset model includes many features, and these features contribute differently to the predicted value output by the trained preset model. In practice, therefore, if the contribution of each feature in a training sample to the predicted value output by the preset model can be determined, the degree to which each feature influences the occurrence of the target event can also be determined, for example, the degree to which different treatments influence patient survival time. The degree of influence of each feature on the occurrence of the target event can then guide the optimization and improvement of samples in real scenarios.
To this end, the embodiments of this application may interpret the preset model to determine the contribution of each feature in a sample to the predicted value of that sample. Alternatively, the embodiments of this application may interpret a sample to analyze what led to the predicted value of that sample. The method for interpreting the preset model and the method for interpreting a sample described in the embodiments of this application may both be executed by any device that has computing capability and in which the preset model described above is preconfigured. To simplify the description, the following uses the prediction device as the executor of both methods.
Interpreting the preset model by the prediction device may include one or more of the following: interpreting the preset model as a whole, interpreting an expert network in the preset model, or interpreting the gating network in the preset model.
Taking interpretation of the preset model as a whole as an example, the prediction device may obtain a beeswarm plot for interpreting the preset model based on the preset model and multiple training samples. The beeswarm plot shows the contribution of each feature in the samples to the predicted value output by the preset model.
Specifically, the prediction device may input the multiple training samples into the preset model one by one to obtain the predicted value corresponding to each training sample. The prediction device may then draw the beeswarm plot based on the feature data of the training samples and their corresponding predicted values. The prediction device may draw the beeswarm plot based on the SHAP value (shap value) method. The specific implementation of the SHAP value method is not described in detail in the embodiments of this application.
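Because the text leaves the SHAP value computation unspecified, the following is only a minimal sketch of the classical Shapley value that SHAP-style methods approximate, not the procedure used in this application or the API of any particular library. It assumes that a feature outside a coalition is replaced by a baseline value. The functions `shapley_values` and `toy_model`, and all numbers, are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, sample, baseline):
    """Exact Shapley values of `predict` at `sample`, relative to `baseline`.

    Features outside a coalition take their baseline value. The cost grows
    as 2**n, so this is only practical for a handful of features.
    """
    n = len(sample)

    def value(coalition):
        mixed = [sample[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        contributions.append(phi)
    return contributions

def toy_model(x):
    # Hypothetical stand-in for the preset model.
    return 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[0] * x[2]

sample = [1.0, 3.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, sample, baseline)
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline. A beeswarm plot would place one point per sample and feature at the horizontal position given by such a contribution, shaded by the feature value.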
Referring to FIG. 7, FIG. 7 is a schematic diagram of a method for interpreting the preset model provided by an embodiment of this application. As shown in FIG. 7, a framework diagram of the preset model may be displayed on interface 70 on the display screen of the prediction device. It should be understood that interface 70 may be the model interpretation interface in the user-side interface of the preset model. The framework diagram on interface 70 includes buttons for the gating network 71 and for two expert networks (expert network 711 and expert network 712) in the preset model.
As shown in (a) of FIG. 7, after the user clicks the "Input" button on interface 70 with the mouse, the user can select the samples to be input on the sample input interface and, after confirming, input multiple locally stored training samples into the preset model. Then, after the user clicks the "Output" button on interface 70, the display screen of the prediction device shows the beeswarm plot for interpreting the preset model, for example, the beeswarm plot shown in (b) of FIG. 7.
As shown in (b) of FIG. 7, in the beeswarm plot displayed on interface 71, a darker gray indicates a larger feature value, and a lighter gray indicates a smaller feature value. The horizontal axis of the beeswarm plot represents the contribution of a feature to the predicted value output by the preset model.
It can be seen that, for feature 1, when the feature value is large, the contribution of feature 1 to the predicted value output by the preset model is negative, and the larger the feature value (the darker the gray), the smaller the contribution (the larger the absolute value of the negative value, the smaller the contribution). Conversely, when the feature value of feature 1 is small, its contribution to the predicted value is positive, and the smaller the feature value (the lighter the gray), the larger the contribution (the larger the positive value, the larger the contribution).
Similarly, for feature 9, when the feature value is large, the contribution of feature 9 to the predicted value output by the preset model is positive, and the larger the feature value (the darker the gray), the larger the contribution (the larger the positive value, the larger the contribution). Conversely, when the feature value of feature 9 is small, its contribution to the predicted value is negative, and the smaller the feature value (the lighter the gray), the smaller the contribution (the larger the absolute value of the negative value, the smaller the contribution).
It can be understood that, to interpret the gating network in the preset model, the user can input samples by using the "Input" button, click the "Gating network 71" button on interface 70, and then click the "Output" button. This yields a plot showing the contribution of the sample features to the output value of the gating network. Similarly, to interpret any expert network in the preset model, the user can input samples by using the "Input" button, click the button corresponding to that expert network on interface 70, and then click the "Output" button. This yields a plot showing the contribution of the sample features to the output value of that expert network.
In this way, interpreting the preset model with multiple samples determines the degree to which each feature of a sample influences the output value of the preset model, which can provide guidance in real scenarios.
For example, suppose the samples used to interpret the preset model are case data of multiple cancer patients. If model interpretation shows that treatment with a certain drug (where use of the drug is one feature of the sample) contributes substantially to lowering the survival hazard ratio of cancer patients, this indicates that the drug treatment can improve the survival rate of cancer patients. This can guide clinicians in prescribing medication for cancer patients.
In addition, when any single sample needs to be interpreted, the prediction device may obtain a plot for interpreting that sample based on the sample and the preset model. The plot shows the contribution of each feature in the sample to the predicted value that the preset model outputs for the sample, so that what led to the predicted value can be analyzed.
Specifically, the prediction device may input the sample to be interpreted into the preset model to obtain its predicted value. The prediction device may then draw the plot based on the feature data of the sample to be interpreted and its predicted value.
As an example, take the sample shown in Table 3 as the sample to be interpreted:
Table 3
Feature | PFS | Age | RM | NOX | RAD | LSTAT
Sample to be interpreted | 15.3 | 65.2 | 6.575 | 0.538 | 1 | 4.98
Referring to FIG. 7, the user clicks the "Input" button on interface 70 with the mouse to input the sample to be interpreted shown in Table 3 into the preset model. Then, after the user clicks the "Output" button on interface 70, the display screen of the prediction device shows a sample interpretation plot for the sample, for example, the plot shown in (c) of FIG. 7.
As shown in (c) of FIG. 7, in the sample interpretation plot displayed on interface 72, the predicted value of the sample to be interpreted is 24.1. The arrows in the black region point in the direction of an increasing predicted value, and the arrows in the white region point in the direction of a decreasing predicted value. It can be seen that the feature LSTAT contributes most to raising the predicted value of the sample (the longest black bar), and the value of the feature RM contributes most to lowering the predicted value of the sample (the longest white bar).
In this way, interpreting a single sample determines the degree to which different features of that sample influence its survival hazard ratio, which can provide guidance for that sample. For example, suppose the sample is a part. If interpreting the part with the preset model shows that the part being made of material a contributes substantially to raising its survival hazard ratio, this indicates that parts made of material a have a low survival rate, that is, a short service life. This can guide the manufacturer to avoid using material a to make the part.
It can be seen that interpreting the model or a sample can guide the optimization and improvement of samples in real scenarios at different levels.
In some other embodiments, because the gating network in the preset model trained in the embodiments of this application is essentially a classifier, the samples in a sample set may also be classified by using the gating network in the preset model. The samples in the sample set are thereby divided by type into multiple groups, so that the samples in any one group are of the same type. For example, the gating network may divide patients' electronic medical record samples into male samples and female samples.
Based on the resulting groups of samples, the survival curve of each group can be drawn. It can be understood that the survival data of every sample in each group is known here. This method enables comparative analysis of the survival curves of samples of different types.
As an example, referring to FIG. 8, FIG. 8 is a schematic diagram of the survival curves of the sample groups obtained after the preset model provided by an embodiment of this application groups the samples in a sample set. Assume that the samples are patients' electronic medical record samples and that there are 177 of them. After the gating network in the preset model divides the samples into sample group 1 and sample group 2, if sample group 1 includes 150 samples and sample group 2 includes 27 samples, the survival curves of the two groups can be drawn in the same coordinate system based on their survival data. In FIG. 8, survival curve 1 is the survival curve of sample group 1, and survival curve 2 is the survival curve of sample group 2. The difference in survival rate between the two groups at the same time can be read directly from the figure.
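The text does not state which estimator is used to draw each group's survival curve from known survival data. As a hedged sketch, the following assumes the Kaplan-Meier estimator, a common choice, and assumes that each sample carries a group label assigned by the gating network. The function `kaplan_meier`, the field layout, and all data values are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  observed time for each sample
    events: 1 if the outcome event was observed, 0 if the sample was censored
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival, curve, i = 1.0, [], 0
    while i < len(order):
        t = order[i][0]
        deaths = removed = 0
        # Collect all samples that share the same observed time.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical samples: (time, event, group label chosen by the gating network).
samples = [
    (5, 1, 0), (8, 1, 0), (12, 0, 0), (15, 1, 0),
    (20, 1, 1), (25, 0, 1), (30, 1, 1),
]
groups = {}
for time, event, gate_label in samples:
    groups.setdefault(gate_label, ([], []))
    groups[gate_label][0].append(time)
    groups[gate_label][1].append(event)
curves = {label: kaplan_meier(t, e) for label, (t, e) in groups.items()}
```

Plotting the step functions in `curves` on shared axes gives the kind of side-by-side comparison that FIG. 8 describes.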
Comparing the survival curves of samples of different types reveals the differences between them. Domain experts can then search for and analyze the features shared by the samples of each type to determine what drives the survival rate in the survival curve, and the findings can be used to guide practice.
Taking the survival curves of sample group 1 and sample group 2 shown in FIG. 8 as an example, the survival rate of survival curve 1 (sample group 1) is lower overall than that of survival curve 2 (sample group 2). Domain experts (here, clinicians) can use professional medical analysis to find the features shared within each group of samples. Such a shared feature may be a decisive factor affecting the survival rate of that group. Based on the analysis results, clinicians can be guided to adjust patients' treatment plans.
To further illustrate the concordance of the preset model in the method provided by the embodiments of this application, specific examples are described below:
Example 1. Prediction model for the efficacy of lung cancer drug A
Specifically, taking pre-collected clinical efficacy data of lung cancer drug A for 385 patients as an example, the 385 samples are divided into three sample sets. Sample set 1 includes 177 samples, sample set 2 includes 106 samples, and sample set 3 includes 102 samples. The quality of sample set 1 is higher than that of sample set 2, and the quality of sample set 2 is higher than that of sample set 3. Here, higher quality may mean, for example, that the samples in the set have fewer missing features, that they have more features, or that more samples have an observed outcome event (that is, patient death or recovery).
Next, sample set 1 is used as the training sample set. Preset model 1 is trained with the method described in S101-S102 above, model 2 is trained with the existing coxPH method, and model 3 is trained with the DeepSurv method.
Then, sample set 2 and sample set 3 are used as validation sample sets to validate preset model 1, model 2, and model 3.
Table 4 shows the concordance index (C-index) of preset model 1, model 2, and model 3 after validation on the same validation samples. It should be understood that the C-index is used to evaluate the predictive ability of a model. It can be seen that, on the same validation samples, the C-index of preset model 1 trained with the method provided by the embodiments of this application is higher than the C-index of model 2 trained with the existing coxPH method and higher than the C-index of model 3 trained with the existing DeepSurv method.
Table 4
Model | Sample set 2 | Sample set 2 + sample set 3
Preset model 1 (method of this application) | 0.6665 | 0.5793
Model 2 (coxPH method) | 0.5448 | 0.4938
Model 3 (DeepSurv method) | 0.6148 | 0.5630
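For readers unfamiliar with the metric in Table 4, the following is a minimal sketch of one common definition of the C-index (Harrell's), in which a pair of samples is comparable when the sample with the shorter observed time had its outcome event observed. The application does not state which variant was used, so this is an assumption. The function name and the data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: the fraction of comparable pairs in which the sample
    with the shorter observed time also has the higher predicted risk.
    Tied risk scores count as half concordant.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable only if sample i's event was observed before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical validation data: observed time, event flag, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.5, 0.1])
inverted = concordance_index(times, events, [0.1, 0.5, 0.7, 0.9])
constant = concordance_index(times, events, [0.3, 0.3, 0.3, 0.3])
```

A value of 1 means the predicted risk ranks every comparable pair correctly, and a value near 0.5 is no better than a constant prediction. The sketch raises a division error when no pair is comparable, which a production implementation would need to handle.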
Example 2. Progression prediction model for clinical disease A
In this example, hospital A has recorded clinical data of 2700 patients, that is, hospital A provides 2700 samples. Hospital B has recorded clinical data of 1400 patients, that is, hospital B provides 1400 samples.
The samples of hospital A are used as training samples, and the preset model is trained with the method described in S101-S102 above. The model is validated internally with part of the hospital A samples by 10-fold cross-validation, and validated externally with the hospital B samples.
Here, 10-fold cross-validation means the following: the sample set is divided into 10 groups, 9 of which are used as training samples to train a model, and the remaining group is used as validation samples to test the model trained on those 9 groups. This process is repeated 10 times so that every group serves once as the validation samples. The results of the 10 validations are then averaged to obtain the 10-fold cross-validation result.
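The procedure just described can be sketched as follows. This is an illustrative outline only: `train_fn` and `score_fn` are hypothetical callbacks standing in for training the model on the selected samples and computing a score such as the C-index on the held-out group, and the shuffling seed is an arbitrary choice.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) so that every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def cross_validate(n_samples, train_fn, score_fn, k=10):
    """Train on k-1 groups, score on the held-out group, and average the k scores."""
    scores = []
    for train_idx, val_idx in k_fold_indices(n_samples, k):
        model = train_fn(train_idx)
        scores.append(score_fn(model, val_idx))
    return sum(scores) / len(scores)

# With the 2700 hospital A samples, each of the 10 groups holds 270 samples.
splits = list(k_fold_indices(2700, k=10))
```

The external validation on hospital B is a separate step: a model trained on hospital A is scored once on all hospital B samples, without any further splitting.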
Referring to FIG. 9, FIG. 9 is a bar chart of model concordance after internal and external validation of the models trained on the hospital A samples with the method provided by the embodiments of this application and with existing methods.
As shown in FIG. 9, the checkered bars represent the C-index of the model trained on the hospital A samples with the existing DeepSurv method, both after 10-fold cross-validation and after external validation on the hospital B samples. The striped bars represent the same two C-index values for the model trained on the hospital A samples with the existing coxnet method (an improved variant of the coxPH method). The white bars represent the same two C-index values for the model trained on the hospital A samples with the method provided by the embodiments of this application.
It can be seen that, with the same training samples and the same validation samples, the C-index of the preset model trained with the method provided by the embodiments of this application is higher than the C-index of the models trained with the existing DeepSurv and coxnet methods.
In summary, in the method for predicting a survival hazard ratio provided by the embodiments of this application, the sample to be predicted is predicted with a preset model that includes a gating network and multiple expert networks. The survival hazard ratio predicted by the method is therefore more accurate, which in turn improves the accuracy of the survival curve determined from the survival hazard ratio.
In addition, because the preset model used in the method of the embodiments of this application can be trained end to end, the model is easy to interpret at different levels (as a whole and in part). The contribution of sample features to the predicted value, obtained through interpretation, can then guide the optimization and improvement of samples in real scenarios.
The foregoing mainly describes the solutions provided by the embodiments of this application from the perspective of the method. To implement the above functions, the apparatus includes hardware structures and/or software modules corresponding to each function. A person skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
In the embodiments of this application, the apparatus for predicting a survival hazard ratio may be divided into functional modules according to the above method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is illustrative and is merely a division by logical function. Other divisions are possible in actual implementation.
FIG. 10 is a schematic structural diagram of an apparatus 100 for predicting a survival hazard ratio provided by an embodiment of this application. The apparatus 100 may be used to perform the above method for predicting a survival hazard ratio, for example, the method shown in FIG. 6. The apparatus 100 may include an obtaining unit 101 and a processing unit 102.
The obtaining unit 101 is configured to obtain data of a sample to be predicted. The processing unit 102 is configured to input the data of the sample to be predicted into the preset model and process the data through the preset model to obtain a survival hazard ratio that represents the survival risk of the sample. The preset model includes a gating network and multiple expert networks. The gating network is used to determine, from the data of the sample to be predicted, the weight coefficient corresponding to each expert network. The survival hazard ratio is the result of a weighted sum of the output values of the expert networks, using the weight coefficient corresponding to each expert network.
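The weighted-sum structure described for the processing unit 102 can be illustrated with a small sketch. Several details here are assumptions made only for illustration: the gate is modeled as a softmax over linear scores, and each expert is a plain linear function, whereas the experts in this application are built from candidate networks whose internals are not reproduced here. All parameter values are hypothetical.

```python
import math

def softmax(logits):
    """Normalize scores into weights that are positive and sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict_hazard_ratio(x, gate_params, expert_params):
    """Weighted sum of expert outputs, with weights produced by the gate."""
    gate_weights = softmax([linear(w, b, x) for w, b in gate_params])
    expert_outputs = [linear(w, b, x) for w, b in expert_params]
    ratio = sum(g * o for g, o in zip(gate_weights, expert_outputs))
    return ratio, gate_weights

# Hypothetical parameters for a gate and two experts over two features.
gate_params = [([1.0, 0.0], 0.0), ([-1.0, 0.0], 0.0)]
expert_params = [([0.5, 0.5], 0.0), ([0.0, 1.0], 1.0)]
ratio, weights = predict_hazard_ratio([0.0, 2.0], gate_params, expert_params)
```

In this sketch, taking the index of the largest gate weight would give the group assignment used when the gating network acts as a classifier.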
作为示例,结合图6,获取单元101可以用于执行S201,处理单元102可以用于执行S202。As an example, with reference to FIG. 6 , the obtaining unit 101 may be used to execute S201, and the processing unit 102 may be used to execute S202.
可选的,预测生存风险率的装置100还包括:确定单元103,用于基于待预测样本的生存风险率和基准风险函数,确定待预测样本的风险函数,其中,待预测样本的风险函数用于指示待预测样本在不同时间的生存率。Optionally, the apparatus 100 for predicting the survival risk rate further includes: a determination unit 103, configured to determine the risk function of the sample to be predicted based on the survival risk rate of the sample to be predicted and the baseline risk function, wherein the risk function of the sample to be predicted is used Indicates the survival rate of the sample to be predicted at different times.
Optionally, any one of the plurality of expert networks included in the above preset model includes at least one candidate RFCN, and the output value of that expert network is the output value, among the output values of the at least one candidate RFCN, that satisfies a preset condition.
Optionally, the data of the sample to be predicted includes non-Euclidean data.
Optionally, the apparatus 100 for predicting a survival risk rate further includes an interpretation unit 104, configured to interpret the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the influence of different feature data in the data of the sample to be predicted on the survival risk rate.
Optionally, when the data of the sample to be predicted is case data of a patient, the interpretation unit 104 is specifically configured to interpret the preset model based on the case data of the patient and the survival risk rate of the patient, so as to obtain the influence of different feature data in the case data of the patient on the survival risk rate of the patient.
Optionally, when the data of the sample to be predicted is data of a device, the interpretation unit 104 is specifically configured to interpret the preset model based on the data of the device and the survival risk rate of the device, so as to obtain the influence of different feature data in the data of the device on the survival risk rate of the device.
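The kind of output the interpretation unit 104 produces, namely a per-feature influence on the survival risk rate, can be illustrated with a simple occlusion-style attribution. This is a stand-in technique chosen for brevity, not the interpretation method of the embodiments; `occlusion_attribution`, `toy_model` and the reference values are hypothetical.

```python
from typing import Callable, List, Sequence


def occlusion_attribution(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    reference: Sequence[float],
) -> List[float]:
    """Change in predicted survival risk rate when each feature is replaced
    by its reference value, one feature at a time."""
    base = model(x)
    effects = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = reference[i]
        effects.append(base - model(perturbed))
    return effects


# Hypothetical stand-in for the preset model: the third feature is ignored.
def toy_model(x: Sequence[float]) -> float:
    return 1.0 + 0.5 * x[0] - 0.2 * x[1]


effects = occlusion_attribution(toy_model, [2.0, 1.0, 3.0], [0.0, 0.0, 0.0])
```

A positive entry means the feature value raises the predicted survival risk rate relative to the reference, a negative entry means it lowers it, and zero means the model ignores that feature.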
For a specific description of the foregoing optional manners, reference may be made to the foregoing method embodiments; details are not repeated here. In addition, for the explanation and beneficial effects of any of the apparatuses 100 for predicting a survival risk rate provided above, reference may be made to the corresponding method embodiments above; details are not repeated here.
As an example, with reference to FIG. 2, the function of the acquisition unit 101 in the apparatus 100 for predicting a survival risk rate may be implemented by the communication interface 24 in FIG. 2, and the functions of the processing unit 102, the determination unit 103 and the interpretation unit 104 may be implemented by the processor 11 in FIG. 2 executing the program code in the main memory 22 in FIG. 2.
FIG. 11 is a schematic structural diagram of a signal-bearing medium for carrying a computer program product provided by an embodiment of the present application. The signal-bearing medium is used to store the computer program product or to store a computer program for executing a computer process on a computing device.
As shown in FIG. 11, the signal-bearing medium 110 may include one or more program instructions that, when executed by one or more processors, may provide all or part of the functions described above with respect to FIG. 6. Thus, for example, one or more features of S201 to S202 in FIG. 6 may be undertaken by one or more instructions associated with the signal-bearing medium 110. In addition, the program instructions in FIG. 11 also describe example instructions.
In some examples, the signal-bearing medium 110 may include a computer-readable medium 111, such as, but not limited to, a hard disk drive, a compact disc (CD), a digital video disc (DVD), a digital tape, a memory, a read-only memory (ROM), or a random access memory (RAM).
In some implementations, the signal-bearing medium 110 may include a computer-recordable medium 112, such as, but not limited to, a memory, a read/write (R/W) CD, or an R/W DVD.
In some implementations, the signal-bearing medium 110 may include a communication medium 113, such as, but not limited to, a digital and/or analog communication medium (for example, a fiber-optic cable, a waveguide, a wired communication link, or a wireless communication link).
The signal-bearing medium 110 may be conveyed by a wireless form of the communication medium 113 (for example, a wireless communication medium conforming to the IEEE 802.11 standard or another transmission protocol). The one or more program instructions may be, for example, computer-executable instructions or logic-implementing instructions.
In some examples, an apparatus for predicting a survival risk rate, such as that described with respect to FIG. 6, may be configured to provide various operations, functions, or actions in response to one or more program instructions conveyed through the computer-readable medium 111, the computer-recordable medium 112, and/or the communication medium 113.
It should be understood that the arrangements described here are for purposes of example only. Accordingly, those skilled in the art will appreciate that other arrangements and other elements (for example, machines, interfaces, functions, sequences, and groups of functions) can be used instead, and that some elements may be omitted altogether depending on the desired result. In addition, many of the described elements are functional entities that may be implemented as discrete or distributed components, or in conjunction with other components in any suitable combination and location.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using a software program, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer-executable instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (19)

  1. A method for predicting a survival risk rate, characterized by comprising:
    acquiring data of a sample to be predicted;
    inputting the data of the sample to be predicted into a preset model, and processing the data of the sample to be predicted through the preset model to obtain a survival risk rate (HR) of the sample to be predicted, wherein the survival risk rate is used to represent the survival risk of the sample to be predicted;
    wherein the preset model comprises a gating network and a plurality of expert networks, the gating network is used to determine, according to the data of the sample to be predicted, a weight coefficient corresponding to each expert network, and the survival risk rate is the result of a weighted sum of the output values of the plurality of expert networks using the weight coefficient corresponding to each expert network.
  2. The method according to claim 1, characterized in that the method further comprises:
    determining a risk function of the sample to be predicted based on the survival risk rate and a baseline risk function, wherein the risk function is used to indicate the survival rate of the sample to be predicted at different times.
  3. The method according to claim 1 or 2, characterized in that any one of the plurality of expert networks comprises at least one candidate residual fully connected neural network (RFCN), and the output value of that expert network is the output value, among the output values of the at least one candidate RFCN, that satisfies a preset condition.
  4. The method according to any one of claims 1-3, characterized in that the data of the sample to be predicted comprises non-Euclidean data.
  5. The method according to any one of claims 1-4, characterized in that the method further comprises:
    interpreting the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the influence of different feature data in the data of the sample to be predicted on the survival risk rate.
  6. The method according to claim 5, characterized in that, when the data of the sample to be predicted is case data of a patient, the interpreting of the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the influence of different feature data in the data of the sample to be predicted on the survival risk rate, comprises:
    interpreting the preset model based on the case data of the patient and the survival risk rate of the patient, so as to obtain the influence of different feature data in the case data of the patient on the survival risk rate of the patient.
  7. The method according to claim 5, characterized in that, when the data of the sample to be predicted is data of a device, the interpreting of the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the influence of different feature data in the data of the sample to be predicted on the survival risk rate, comprises:
    interpreting the preset model based on the data of the device and the survival risk rate of the device, so as to obtain the influence of different feature data in the data of the device on the survival risk rate of the device.
  8. The method according to any one of claims 1-7, characterized in that the method further comprises:
    training an initial model with data of training samples to obtain the preset model, wherein the initial model comprises an initial gating network and a plurality of initial expert networks.
  9. The method according to claim 8, characterized in that the training of the initial model with the data of the training samples comprises:
    inputting the data of the training samples into the initial gating network and the plurality of initial expert networks in the initial model;
    obtaining a weight coefficient of each initial expert network according to the initial gating network, and computing a weighted sum of the output values of the plurality of initial expert networks using the weight coefficient corresponding to each initial expert network, to obtain a predicted survival risk rate of the training samples;
    determining a loss function based on the predicted survival risk rate of the training samples and survival data of the training samples;
    adjusting network parameters of the initial gating network and the plurality of initial expert networks based on the loss function.
  10. An apparatus for predicting a survival risk rate, characterized by comprising:
    an acquisition unit, configured to acquire data of a sample to be predicted;
    a processing unit, configured to input the data of the sample to be predicted into a preset model, and to process the data of the sample to be predicted through the preset model to obtain a survival risk rate (HR) of the sample to be predicted, wherein the survival risk rate is used to represent the survival risk of the sample to be predicted;
    wherein the preset model comprises a gating network and a plurality of expert networks, the gating network is used to determine, according to the data of the sample to be predicted, a weight coefficient corresponding to each expert network, and the survival risk rate is the result of a weighted sum of the output values of the plurality of expert networks using the weight coefficient corresponding to each expert network.
  11. The apparatus according to claim 10, characterized in that the apparatus further comprises:
    a determination unit, configured to determine a risk function of the sample to be predicted based on the survival risk rate and a baseline risk function, wherein the risk function is used to indicate the survival rate of the sample to be predicted at different times.
  12. The apparatus according to claim 10 or 11, characterized in that any one of the plurality of expert networks comprises at least one candidate residual fully connected neural network (RFCN), and the output value of that expert network is the output value, among the output values of the at least one candidate RFCN, that satisfies a preset condition.
  13. The apparatus according to any one of claims 10-12, characterized in that the data of the sample to be predicted comprises non-Euclidean data.
  14. The apparatus according to any one of claims 10-13, characterized in that the apparatus further comprises:
    an interpretation unit, configured to interpret the preset model based on the data of the sample to be predicted and the survival risk rate of the sample to be predicted, so as to obtain the influence of different feature data in the data of the sample to be predicted on the survival risk rate.
  15. The apparatus according to claim 14, characterized in that, when the data of the sample to be predicted is case data of a patient, the interpretation unit is specifically configured to:
    interpret the preset model based on the case data of the patient and the survival risk rate of the patient, so as to obtain the influence of different feature data in the case data of the patient on the survival risk rate of the patient.
  16. The apparatus according to claim 14, characterized in that, when the data of the sample to be predicted is data of a device, the interpretation unit is specifically configured to:
    interpret the preset model based on the data of the device and the survival risk rate of the device, so as to obtain the influence of different feature data in the data of the device on the survival risk rate of the device.
  17. An apparatus for predicting a survival risk rate, characterized by comprising one or more processors and a memory, wherein the one or more processors are configured to invoke program instructions stored in the memory to perform the method according to any one of claims 1-9.
  18. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises program instructions which, when run on a computer or a processor, cause the computer or the processor to perform the method according to any one of claims 1-9.
  19. A computer program product, characterized in that, when the computer program product runs on an apparatus for predicting a survival risk rate, the apparatus is caused to perform the method according to any one of claims 1-9.
PCT/CN2022/081403 2021-07-15 2022-03-17 Method and device for predicting survival hazard ratio WO2023284321A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202110801028 2021-07-15
CN202110801028.5 2021-07-15
CN202210028933.6 2022-01-11
CN202210028933.6A CN115620902A (en) 2021-07-15 2022-01-11 Method and device for predicting survival risk rate

Publications (1)

Publication Number Publication Date
WO2023284321A1 true WO2023284321A1 (en) 2023-01-19

Family

ID=84857572

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/081403 WO2023284321A1 (en) 2021-07-15 2022-03-17 Method and device for predicting survival hazard ratio

Country Status (2)

Country Link
CN (1) CN115620902A (en)
WO (1) WO2023284321A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111243736A (en) * 2019-10-24 2020-06-05 中国人民解放军海军军医大学第三附属医院 Survival risk assessment method and system
CN111640510A (en) * 2020-04-09 2020-09-08 之江实验室 Disease prognosis prediction system based on deep semi-supervised multitask learning survival analysis
CN112561030A (en) * 2020-06-15 2021-03-26 中国电力科学研究院有限公司 Method and device for determining insulation state of mutual inductor based on neural network
WO2021114974A1 (en) * 2019-12-14 2021-06-17 支付宝(杭州)信息技术有限公司 User risk assessment method and apparatus, electronic device, and storage medium
CN113040711A (en) * 2021-03-03 2021-06-29 吾征智能技术(北京)有限公司 Cerebral stroke attack risk prediction system, equipment and storage medium

Also Published As

Publication number Publication date
CN115620902A (en) 2023-01-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22840966

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22840966

Country of ref document: EP

Kind code of ref document: A1