CN114822741A - Processing device, computer equipment and storage medium of patient classification model

Info

Publication number: CN114822741A
Application number: CN202210447273.5A
Authority: CN
Inventors: 徐卓扬; 赵越; 孙行智; 胡岗
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2022-04-26
Filing date: 2022-04-26
Publication date: 2022-07-29

Abstract

The embodiments of the present application belong to the fields of artificial intelligence and digital healthcare and relate to a processing apparatus, a computer device, and a storage medium for a patient classification model. The apparatus inputs training sample data of a patient into an initial patient classification model to obtain a short-term reward prediction parameter, a doctor tendency prediction parameter, and a long-term reward prediction parameter for each candidate classification result; acquires a short-term reward parameter and a doctor tendency parameter from the label; calculates a joint loss based on the short-term reward prediction parameter and the short-term reward parameter, the doctor tendency prediction parameter and the doctor tendency parameter, the short-term reward prediction parameter and the long-term reward prediction parameter, and the doctor tendency prediction parameter and the long-term reward prediction parameter, and adjusts the initial patient classification model according to the joint loss to obtain a patient classification model; and inputs sample data of a target patient into the patient classification model to obtain a patient classification result. The present application also relates to blockchain technology: the training sample data may be stored in a blockchain. The application improves the accuracy of the patient classification model.

Description

Processing device, computer equipment and storage medium of patient classification model

Technical Field

The present application relates to the field of digital medical technology, and in particular, to a processing apparatus, a computer device, and a storage medium for a patient classification model.

Background

With the development of computer technology, medical institutions such as hospitals increasingly use computers for medical diagnosis and medical management. Patient classification is an important issue in the medical field, and is often associated with treatment, diagnosis, risk assessment, etc., and accurate patient classification is of great significance.

Computer-implemented patient classification typically inputs patient sample data into a neural-network-based patient classification model, which predicts along a long-term time dimension and a short-term time dimension and outputs a patient classification result. However, predictions along the long-term time dimension tend to contain more errors because of the long time span, so the accuracy of the patient classification model is low.

Disclosure of Invention

An object of the embodiments of the present application is to provide a processing apparatus, a computer device and a storage medium for a patient classification model, so as to solve the problem of low accuracy of the patient classification model.

In order to solve the above technical problem, an embodiment of the present application provides a processing apparatus for a patient classification model, which adopts the following technical solution and comprises:

a training acquisition module, configured to acquire labeled training sample data of a patient;

a prediction acquisition module, configured to input the training sample data into an initial patient classification model, so as to output a short-term reward prediction parameter and a doctor tendency prediction parameter of the training sample data for each candidate classification result through a first network in the initial patient classification model, and output a long-term reward prediction parameter of the training sample data for each candidate classification result through a second network in the initial patient classification model;

a label acquisition module, configured to acquire the short-term reward parameter and the doctor tendency parameter of the training sample data from the label;

a loss calculation module, configured to calculate a joint loss based on the short-term reward prediction parameter and the short-term reward parameter, the doctor tendency prediction parameter and the doctor tendency parameter, the short-term reward prediction parameter and the long-term reward prediction parameter, and the doctor tendency prediction parameter and the long-term reward prediction parameter;

a model adjustment module, configured to adjust the initial patient classification model according to the joint loss until the joint loss meets a training stop condition, so as to obtain a patient classification model; and

a patient classification module, configured to acquire sample data of a target patient and input the sample data into the patient classification model to obtain a patient classification result of the target patient.

In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which includes a memory and a processor, where the memory stores computer readable instructions, and the processor implements the functions of the modules in the processing apparatus of the patient classification model when executing the computer readable instructions.

In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, on which computer-readable instructions are stored, and the computer-readable instructions, when executed by a processor, implement the functions of the modules in the processing device of the patient classification model.

Compared with the prior art, the embodiments of the present application mainly have the following beneficial effects. Labeled training sample data of a patient is acquired and input into an initial patient classification model. The initial patient classification model comprises a first network and a second network and can output a short-term reward prediction parameter, a doctor tendency prediction parameter, and a long-term reward prediction parameter of the training sample data for each candidate classification result. The short-term reward parameter measures the reward obtained by selecting each candidate classification result along a shorter time dimension, the long-term reward parameter measures the reward obtained by selecting each candidate classification result along a longer time dimension, and the doctor tendency prediction parameter represents the probability that each candidate classification result is selected based on doctor experience. The short-term reward parameter and the doctor tendency parameter of the training sample data are acquired from the label. The joint loss is calculated from the short-term reward prediction parameter and the short-term reward parameter, the doctor tendency prediction parameter and the doctor tendency parameter, the short-term reward prediction parameter and the long-term reward prediction parameter, and the doctor tendency prediction parameter and the long-term reward prediction parameter. The short-term reward prediction parameter and the doctor tendency prediction parameter are trained by supervised learning, while the long-term reward prediction parameter is trained by deep reinforcement learning and is constrained by the short-term reward prediction parameter and the doctor tendency prediction parameter. As a result, the model's predictions better match doctor experience, errors in reinforcement learning are reduced, and the joint loss is calculated more accurately. The accuracy of the patient classification model obtained after adjustment according to the joint loss is therefore improved, which in turn improves the accuracy of classifying the target patient with the patient classification model.

Drawings

In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.

FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;

FIG. 2 is a schematic diagram of one embodiment of a processing device for a patient classification model according to the present application;

FIG. 3 is a schematic block diagram of one embodiment of a computer device according to the present application.

Detailed Description

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages and the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.

It should be noted that the processing device of the patient classification model provided in the embodiment of the present application is generally disposed in a server.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to FIG. 2, a schematic structural diagram of one embodiment of a processing device of a patient classification model according to the present application is shown. The processing device 200 of the patient classification model may include: a training acquisition module 201, a prediction acquisition module 202, a label acquisition module 203, a loss calculation module 204, a model adjustment module 205, and a patient classification module 206, wherein:

The training acquisition module 201 is configured to acquire labeled training sample data of a patient.

In this embodiment, the electronic device (e.g., the server shown in FIG. 1) on which the processing device of the patient classification model runs may communicate with the terminal through a wired connection or a wireless connection. It should be noted that the wireless connection may include, but is not limited to, a 3G/4G/5G connection, a Wi-Fi connection, a Bluetooth connection, a WiMAX connection, a Zigbee connection, a UWB (ultra-wideband) connection, and other wireless connections now known or developed in the future.

Specifically, the present application aims to classify a target patient through a patient classification model, and before classifying the target patient, the patient classification model needs to be obtained through model training in advance. In the model training phase, training sample data of a patient needs to be acquired, wherein the training sample data is related to the patient and is provided with a label.

The training sample data used in the training phase and the sample data used in the application phase may be follow-up data generated by follow-up visits to patients. A follow-up visit is an observation method in which the hospital, by communication or other means, regularly tracks changes in the condition of a patient who has previously visited the hospital and guides the patient's recovery.

Multiple follow-up visits may be made to the patient in advance, with follow-up data being generated for each visit. Follow-up data of multiple follow-up visits of a patient can be acquired at one time, and then the acquired follow-up data is determined as training sample data or sample data.

The follow-up data may include basic information of the patient, disease examination information, historical medication information, and doctor prescription information. The basic information of the patient may be demographic information about the patient, including sex, age, occupation, marital status, places of entry and exit, and the like. The disease examination information may be information obtained by performing a medical examination on the patient, including symptom information describing the patient's symptoms, a medical examination report, and the like. The historical medication information is the medication information of the patient before the follow-up visit to which the follow-up data corresponds, and the doctor prescription information is the prescription information issued to the patient after that follow-up visit.

Because the follow-up data comprises basic information of the patient, disease examination information, historical medication information, and doctor prescription information, it is rich in data dimensions. This helps ensure the accuracy of the trained patient classification model and, in turn, the accuracy of patient classification.

It should be emphasized that, in order to further ensure the privacy and security of the training sample data, the training sample data may also be stored in a node of a blockchain. It will be appreciated that the sample data may also be stored in a node of a blockchain.

The blockchain referred to in the present application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, in which each data block contains information about a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.

The prediction acquisition module 202 is configured to input the training sample data into the initial patient classification model, output the short-term reward prediction parameter and the doctor tendency prediction parameter of the training sample data for each candidate classification result through a first network in the initial patient classification model, and output the long-term reward prediction parameter of the training sample data for each candidate classification result through a second network in the initial patient classification model.

Specifically, the patient classification model in the present application is a DQN (Deep Q-Network) model, a deep reinforcement learning model that optimizes a long-term cumulative objective in sequential decision problems. Reinforcement learning is an artificial intelligence method in which an agent takes an action for a state based on a policy, obtains a reward, and then optimizes the policy using the obtained reward. A policy specifies which action should be taken in a given state so as to maximize the expected cumulative reward. The DQN model uses a neural network to fit the policy: the state is input into the neural network, the neural network outputs the Q value (expected cumulative reward) corresponding to each action, and the action with the maximum Q value is the action that the DQN model selects. Taking the classification of diabetic patients as an example, the input state is a multidimensional vector formed from the sample data (basic information of the patient, disease examination information, historical medication information, and doctor prescription information), and the action is a one-hot encoding of the patient classification result. The cumulative reward parameter Q includes a long-term reward parameter, which evaluates outcomes along a longer time dimension, and a short-term reward parameter, which evaluates outcomes along a shorter time dimension. For example, when the patient is diabetic, the long-term reward parameter is defined as sign(whether complications occur at the last follow-up visit) × 5, and the short-term reward parameter is defined as sign(whether glycated hemoglobin reaches the target at the next follow-up visit) × 1. The value of the long-term reward parameter is generally larger than that of the short-term reward parameter, in line with the conventional setting in deep reinforcement learning.

In the present application, the training sample data is input into the initial patient classification model. The initial patient classification model modifies the network structure of the traditional DQN model and is referred to as the Long-Short-split-Q-network (LSSQN) model. The LSSQN model comprises a first network and a second network. In the first network, the input data passes through two shared neural network (NN) layers, after which one independent neural network layer outputs the short-term reward prediction parameter R and another independent neural network layer outputs the doctor tendency prediction parameter P. In the second network, the input data passes through two neural network layers, which output the long-term reward prediction parameter F.
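The two-network layout described above can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation of the application: the class name `LSSQNSketch`, the layer sizes, the random initialization, the ReLU activations, the sigmoid on the P head, and the choice to feed the same state vector to both networks are all assumptions made for the example.

```python
import math
import random


def dense(inputs, weights, biases, relu=True):
    """Fully connected layer: one output per row of `weights`."""
    outputs = []
    for row, bias in zip(weights, biases):
        value = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, value) if relu else value)
    return outputs


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def init_layer(rng, n_in, n_out):
    """Return (weights, biases) with small random weights (assumed init)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class LSSQNSketch:
    """Hypothetical forward pass with the layout described in the text."""

    def __init__(self, state_dim, hidden_dim, n_candidates, seed=0):
        rng = random.Random(seed)
        # First network: two shared layers, then two independent heads.
        self.shared1 = init_layer(rng, state_dim, hidden_dim)
        self.shared2 = init_layer(rng, hidden_dim, hidden_dim)
        self.head_r = init_layer(rng, hidden_dim, n_candidates)
        self.head_p = init_layer(rng, hidden_dim, n_candidates)
        # Second network: two layers ending in one value per candidate.
        self.long1 = init_layer(rng, state_dim, hidden_dim)
        self.long2 = init_layer(rng, hidden_dim, n_candidates)

    def forward(self, state):
        hidden = dense(state, *self.shared1)
        hidden = dense(hidden, *self.shared2)
        # R: short-term reward prediction per candidate classification result.
        r = dense(hidden, *self.head_r, relu=False)
        # P: doctor tendency prediction, squashed to (0, 1) (assumption).
        p = [sigmoid(v) for v in dense(hidden, *self.head_p, relu=False)]
        # F: long-term reward prediction per candidate classification result.
        long_hidden = dense(state, *self.long1)
        f = dense(long_hidden, *self.long2, relu=False)
        return r, p, f
```

Calling `LSSQNSketch(state_dim=4, hidden_dim=8, n_candidates=3).forward([0.1, 0.2, 0.3, 0.4])` returns three lists of length 3, one value of R, P, and F per candidate classification result.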

N candidate classification results are preset in the initial patient classification model. Each candidate classification result represents a disease sub-category under a certain disease category, and each candidate classification result corresponds to a short-term reward prediction parameter R, a doctor tendency prediction parameter P, and a long-term reward prediction parameter F. The short-term reward prediction parameter R is the reward, measured along the shorter time dimension, that results when a candidate classification result (action) is taken for the input sample data (state). The long-term reward prediction parameter F is the reward, measured along the longer time dimension, that results when a candidate classification result (action) is taken for the input sample data (state). The doctor tendency prediction parameter P is the probability, measured from the doctor's perspective, that a candidate classification result (action) is taken for the input sample data (state). The larger the value of a reward parameter, the higher the reward and the more likely the corresponding candidate classification result is to be taken.

The present application uses three pairs of concepts: the long-term reward prediction parameter and the long-term reward parameter, the short-term reward prediction parameter and the short-term reward parameter, and the doctor tendency prediction parameter and the doctor tendency parameter. The two terms in each pair belong to the same concept. The long-term reward prediction parameter, the short-term reward prediction parameter, and the doctor tendency prediction parameter are used in the model training stage; the long-term reward parameter, the short-term reward parameter, and the doctor tendency parameter are used in the model application stage.

Patient classification in this application is the further classification of patients with a certain disease to determine the disease sub-category of patients under a certain disease category. For example, diabetic patients are divided into 10 subcategories.

Each disease category has a corresponding patient classification model. That is, for diabetes and heart disease, different patient classification models need to be used. Therefore, in training and application, a disease identifier may be obtained first, and the disease identifier may be an identifier of a disease suffered by a patient, specifically, a disease name or a disease code. Then, a corresponding initial patient classification model or patient classification model is selected according to the disease identification.
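Selecting a model by disease identifier can be sketched as a simple lookup. The registry, the function names, and the identifier strings below are hypothetical and serve only to illustrate the idea of one model per disease category.

```python
from typing import Callable, Dict, List

# A model is represented here as any callable from a state vector to the
# index of a candidate classification result (assumed interface).
Classifier = Callable[[List[float]], int]

MODEL_REGISTRY: Dict[str, Classifier] = {}


def register_model(disease_id: str, model: Classifier) -> None:
    """Associate a disease identifier (name or code) with its model."""
    MODEL_REGISTRY[disease_id] = model


def select_model(disease_id: str) -> Classifier:
    """Return the model for a disease identifier, or fail loudly."""
    try:
        return MODEL_REGISTRY[disease_id]
    except KeyError:
        raise KeyError(
            f"no patient classification model registered for {disease_id!r}"
        ) from None
```

Failing loudly on an unknown identifier, rather than falling back to a default model, avoids silently classifying a patient with a model trained for a different disease.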

The label acquisition module 203 is configured to acquire the short-term reward parameter and the doctor tendency parameter of the training sample data from the label.

Specifically, the training sample data has a label, which records the true short-term reward parameter and the true doctor tendency parameter of the training sample data; both can be acquired from the label. The doctor tendency parameter may be annotated by a doctor or medical professional.

The loss calculation module 204 is configured to calculate a joint loss based on the short-term reward prediction parameter and the short-term reward parameter, the doctor tendency prediction parameter and the doctor tendency parameter, the short-term reward prediction parameter and the long-term reward prediction parameter, and the doctor tendency prediction parameter and the long-term reward prediction parameter.

Specifically, the loss function in the present application is a joint loss function, and includes several sub-loss functions. The loss brought by each sub-loss function can be calculated according to the short-term reward prediction parameter and the short-term reward parameter, the doctor tendency prediction parameter and the doctor tendency parameter, the short-term reward prediction parameter and the long-term reward prediction parameter, the doctor tendency prediction parameter and the long-term reward prediction parameter, and then the joint loss brought by the joint loss function is calculated.

Further, the loss calculation module 204 may include: a first calculation submodule, a second calculation submodule, a third calculation submodule, a fourth calculation submodule, and a joint calculation submodule, wherein:

The first calculation submodule is configured to calculate a first loss based on the short-term reward prediction parameter and the short-term reward parameter.

Specifically, the first loss may be expressed as:

$$L_R = (R_{true} - R_{predict})^2 \tag{1}$$

where $R_{true}$ is the short-term reward parameter and $R_{predict}$ is the short-term reward prediction parameter.

The second calculation submodule is configured to calculate a second loss according to the doctor tendency prediction parameter and the doctor tendency parameter.

Specifically, the second loss may be expressed as:

$$L_P = -P_{true} \log(P_{predict}) - (1 - P_{true}) \log(1 - P_{predict}) \tag{2}$$

where $P_{true}$ is the doctor tendency parameter, which indicates whether the candidate classification result a (action) was adopted for the patient's training sample data $s_t$ (state): it is 1 if so and 0 otherwise. $P_{predict}$ is the doctor tendency prediction parameter.
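Formulas (1) and (2) are a squared error and a binary cross-entropy, respectively. A minimal per-sample sketch follows; the function names and the small clamp `eps`, which keeps the logarithm finite when the prediction is exactly 0 or 1, are assumptions added for the example and are not part of the formulas.

```python
import math


def short_term_reward_loss(r_true, r_predict):
    """Formula (1): squared error between true and predicted short-term reward."""
    return (r_true - r_predict) ** 2


def doctor_tendency_loss(p_true, p_predict, eps=1e-12):
    """Formula (2): binary cross-entropy between the 0/1 label and prediction.

    `eps` is an assumed numerical safeguard, not part of formula (2).
    """
    p = min(max(p_predict, eps), 1.0 - eps)
    return -p_true * math.log(p) - (1 - p_true) * math.log(1 - p)
```

For instance, `short_term_reward_loss(1.0, 0.5)` gives 0.25, and `doctor_tendency_loss(1, 0.5)` gives log 2, the loss of an uninformative prediction.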

The third calculation submodule is configured to calculate a third loss according to the short-term reward prediction parameter and the long-term reward prediction parameter.

Specifically, the third loss may be expressed as:

where $s_t$ and $a_t$ are, respectively, the training sample data and the candidate classification result at time point t. $F(s_t, a_t)$ is the long-term reward prediction parameter obtained at time point t when the training sample data $s_t$ is input and the candidate classification result is $a_t$.

The short-term reward prediction parameter enters this loss in gradient-free form: it is the value obtained at time point t when the training sample data $s_t$ is input and the candidate classification result is a. $F(s_{t+1}, a)$ is the long-term reward prediction parameter obtained at time point t+1 when the training sample data $s_{t+1}$ is input and the candidate classification result is a.

To construct the long-term reward parameter, the maximum value over the candidate classification results is selected as the learning target in model training.

In this embodiment, a long-term reward parameter F is constructed, and the long-term reward prediction parameter F is trained and fitted according to the short-term reward prediction parameter R, so that the error of the long-term reward prediction parameter is reduced, and the model error in a long-term dimension is reduced.

The fourth calculation submodule is configured to calculate a fourth loss according to the doctor tendency prediction parameter and the long-term reward prediction parameter.

Specifically, the fourth loss may be expressed as:

where $s_t$ and $a_t$ are, respectively, the training sample data and the candidate classification result at time point t.

The doctor tendency prediction parameter enters this loss in gradient-free form, as a fixed estimate: it is the value obtained at time point t when the training sample data $s_t$ is input and the candidate classification result is $a_t$. $F(s_t, a_t)$ is the long-term reward prediction parameter obtained at time point t when the training sample data $s_t$ is input and the candidate classification result is $a_t$.

It is to be understood that the training of the first network is supervised learning and the training of the second network is reinforcement learning.

In this embodiment, the long-term reward prediction parameter F is constrained and regularized by the doctor tendency prediction parameter P, which introduces the influence of doctor knowledge and experience. The long-term reward prediction parameter output by the model therefore agrees better with doctor experience, and reinforcement learning extrapolates in a direction more consistent with doctor experience. This reduces the risk that the model learns unreasonable decisions and makes the learned decision logic safer and more reasonable.

The joint calculation submodule is configured to perform a linear operation on the first loss, the second loss, the third loss, and the fourth loss to obtain the joint loss.

Specifically, in one embodiment, the joint loss may be expressed as:

$$L = L_R + L_P + L_F + L_{reg} \tag{5}$$

In this embodiment, the joint loss takes the long-term reward, the short-term reward, and the doctor tendency into account, so the loss is measured more accurately, which helps ensure the accuracy of the finally obtained patient classification model.
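The combination in formula (5) can be sketched as below. The optional `weights` argument is an assumption that illustrates the more general "linear operation" mentioned in the text; with the default weights of 1 the function reduces to the plain sum of formula (5).

```python
def joint_loss(l_r, l_p, l_f, l_reg, weights=(1.0, 1.0, 1.0, 1.0)):
    """Linear combination of the four sub-losses.

    With the default weights this is L = L_R + L_P + L_F + L_reg.
    The weights themselves are a hypothetical generalization.
    """
    w_r, w_p, w_f, w_reg = weights
    return w_r * l_r + w_p * l_p + w_f * l_f + w_reg * l_reg
```

For example, `joint_loss(0.25, 0.5, 1.0, 0.25)` returns 2.0.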

The model adjustment module 205 is configured to adjust the initial patient classification model according to the joint loss until the joint loss meets the training stop condition, so as to obtain the patient classification model.

Specifically, with minimizing the joint loss as the objective, the model parameters in the initial patient classification model are adjusted, and the adjusted initial patient classification model is trained iteratively. Training stops once the joint loss meets a preset training stop condition, yielding the patient classification model. The training stop condition may be that the joint loss is less than a preset loss threshold.
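The stop condition can be sketched as a loop that keeps adjusting the model until the loss falls below the preset threshold. Everything here is illustrative: `train_until_threshold` and `ToyModel` are hypothetical names, the toy model with a single parameter stands in for the real network, and the `max_iterations` guard is an added assumption so that the loop terminates even if the threshold is never reached.

```python
def train_until_threshold(step, loss_threshold, max_iterations=10_000):
    """Call `step()` until the returned loss is below `loss_threshold`.

    `step` performs one parameter adjustment and returns the current loss.
    Returns (last_loss, iterations_used).
    """
    loss = float("inf")
    for iteration in range(1, max_iterations + 1):
        loss = step()
        if loss < loss_threshold:
            return loss, iteration
    return loss, max_iterations


class ToyModel:
    """Stand-in for the model: one parameter w with loss w**2."""

    def __init__(self, w=5.0, learning_rate=0.1):
        self.w = w
        self.learning_rate = learning_rate

    def step(self):
        loss = self.w ** 2
        # Gradient of w**2 is 2*w; take one gradient-descent step.
        self.w -= self.learning_rate * 2.0 * self.w
        return loss
```

A call such as `train_until_threshold(ToyModel().step, loss_threshold=1e-3)` returns once the toy loss drops below the threshold.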

The patient classification module 206 is configured to acquire sample data of the target patient and input the sample data into the patient classification model to obtain a patient classification result of the target patient.

Specifically, in the model application stage, sample data of the target patient is acquired and input into the patient classification model. The trained patient classification model processes the sample data, determines the candidate classification result corresponding to the target patient, and generates the patient classification result according to the determined candidate classification result.

Further, the patient classification module 206 may include: a sample acquisition submodule, a reward acquisition submodule, a cumulative calculation submodule, and a result determination submodule, wherein:

The sample acquisition submodule is configured to acquire sample data of the target patient, the sample data comprising basic information of the patient, disease examination information, historical medication information, and doctor prescription information.

The reward acquisition submodule is configured to input the sample data into the patient classification model, output the short-term reward parameter of the sample data for each candidate classification result through the first network in the patient classification model, and output the long-term reward parameter of the sample data for each candidate classification result through the second network in the patient classification model.

The cumulative calculation submodule is configured to calculate the cumulative reward parameter of the sample data for each candidate classification result according to the short-term reward parameter and the long-term reward parameter.

The result determination submodule is configured to select the candidate classification result corresponding to the maximum cumulative reward parameter as the patient classification result.

Specifically, sample data of the target patient is acquired, and the sample data can be follow-up data of the patient, including basic information of the patient, disease examination information, historical medication information and doctor prescription information.

The sample data is input into the patient classification model. A first network in the patient classification model outputs the short-term reward parameter of the sample data under each candidate classification result, and a second network outputs the long-term reward parameter of the sample data under each candidate classification result.

Both the short-term reward parameters and the long-term reward parameters need to be considered when patient classification is performed. The short-term reward parameter and the long-term reward parameter under each candidate classification result are added to obtain the cumulative reward parameter of the sample data for that candidate classification result.

Each cumulative reward parameter corresponds to one candidate classification result. The larger the value of the cumulative reward parameter, the larger the reward obtained when that candidate classification result (the action) is taken for the input sample data (the state). Therefore, the candidate classification result corresponding to the maximum cumulative reward parameter may be selected, and the patient classification result may be generated from the selected candidate classification result and its cumulative reward parameter.
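The selection step described above can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function name `select_classification`, the candidate codes, and the numeric values are hypothetical, and the two input lists merely stand in for the outputs of the first and second networks.

```python
from typing import Sequence, Tuple


def select_classification(
    candidates: Sequence[str],
    short_term: Sequence[float],
    long_term: Sequence[float],
) -> Tuple[str, float]:
    """Return the candidate with the largest cumulative reward parameter."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    if not (len(candidates) == len(short_term) == len(long_term)):
        raise ValueError("one short-term and one long-term value per candidate")
    # Cumulative reward parameter = short-term + long-term, per candidate.
    cumulative = [s + l for s, l in zip(short_term, long_term)]
    # Index of the maximum cumulative reward parameter.
    best = max(range(len(candidates)), key=lambda i: cumulative[i])
    return candidates[best], cumulative[best]


# Hypothetical candidate codes and network outputs (assumed values).
result = select_classification(
    ["A", "B", "C"],
    [0.25, 0.5, 0.125],
    [0.5, 0.75, 1.0],
)
print(result)
```

Returning the cumulative reward parameter together with the selected candidate mirrors the text, which generates the patient classification result from both.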

In one embodiment, the first network may also output doctor tendency parameters, which can serve as a confidence level when the model is applied. The smaller the doctor tendency parameter, the less accurate the output of the model is taken to be. The doctor tendency parameter can also be included in the patient classification result to indicate the accuracy of that result. In one embodiment, if the value of the doctor tendency parameter is less than a preset doctor tendency threshold, an alarm instruction is triggered so that the relevant personnel can review or retrain the model.
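The threshold check on the doctor tendency parameter might look like the sketch below. The class `ClassificationReport`, the function `build_report`, and the default threshold of 0.5 are assumptions made for illustration; the text only states that a preset threshold exists and that falling below it triggers an alarm.

```python
from dataclasses import dataclass


@dataclass
class ClassificationReport:
    result: str             # selected candidate classification result
    doctor_tendency: float  # doctor tendency parameter shown as confidence
    needs_review: bool      # True when the alarm condition is met


def build_report(
    result: str,
    doctor_tendency: float,
    threshold: float = 0.5,  # hypothetical preset threshold
) -> ClassificationReport:
    """Attach the doctor tendency parameter and flag low-confidence results."""
    # The alarm fires only when the value is strictly below the threshold.
    return ClassificationReport(result, doctor_tendency, doctor_tendency < threshold)


report = build_report("B", 0.2)
print(report)
```

A caller could route reports with `needs_review` set to whoever reviews or retrains the model.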

In the embodiment, the patient classification model outputs the short-term reward parameters and the long-term reward parameters, the cumulative reward parameters are calculated according to the short-term reward parameters and the long-term reward parameters, all candidate classification results are comprehensively measured from the long-term time dimension and the short-term time dimension, and the accuracy of the patient classification results is improved.

Further, the processing device 200 of the patient classification model may further include a prescription query module and a display module, wherein:

The prescription query module is used for querying the prescription information associated with the patient classification result.

The display module is used for displaying the patient classification result and the prescription information of the target patient through a terminal.

Specifically, the patient classification result may be an identifier, for example a code. Different codes represent different disease subcategories, and different disease subcategories correspond to different prescription information.

The prescription information corresponding to the patient classification result is queried, and the patient classification result and the prescription information of the target patient are then displayed through a terminal. The terminal can be one used by the target patient or by a doctor, so that the classification of the target patient and the determination of the prescription information are carried out automatically.

In this embodiment, the prescription information corresponding to the patient classification result is queried and displayed, so that the prescription information of the target patient is automatically determined.
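A lookup from classification code to prescription information can be sketched as a simple table query. Everything named here is hypothetical: the table `PRESCRIPTION_TABLE`, the codes `SUBTYPE_01` and `SUBTYPE_02`, the placeholder prescription strings, and the two helper functions. A real system would presumably query a database rather than an in-memory dictionary.

```python
from typing import Dict, Optional

# Hypothetical lookup table: classification code -> prescription information.
PRESCRIPTION_TABLE: Dict[str, str] = {
    "SUBTYPE_01": "prescription information for subcategory 01",
    "SUBTYPE_02": "prescription information for subcategory 02",
}


def query_prescription(
    code: str, table: Dict[str, str] = PRESCRIPTION_TABLE
) -> Optional[str]:
    """Return the prescription information associated with a classification code."""
    return table.get(code)


def format_for_terminal(patient_id: str, code: str) -> str:
    """Build the text a terminal would display for the target patient."""
    prescription = query_prescription(code)
    if prescription is None:
        prescription = "no prescription information found"
    return f"patient={patient_id} | classification={code} | prescription={prescription}"


print(format_for_terminal("patient-001", "SUBTYPE_02"))
```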

In this embodiment, labeled training sample data of patients is acquired and input into an initial patient classification model. The initial patient classification model comprises a first network and a second network and outputs, for each candidate classification result, a short-term reward prediction parameter, a doctor tendency prediction parameter, and a long-term reward prediction parameter of the training sample data. The short-term reward parameter measures the reward obtained by selecting each candidate classification result over a shorter time dimension, the long-term reward parameter measures that reward over a longer time dimension, and the doctor tendency prediction parameter represents the probability of selecting each candidate classification result according to doctors' experience. The short-term reward parameter and the doctor tendency parameter of the training sample data are acquired from the label. The joint loss is calculated from four pairs: the short-term reward prediction parameter and the short-term reward parameter; the doctor tendency prediction parameter and the doctor tendency parameter; the short-term reward prediction parameter and the long-term reward prediction parameter; and the doctor tendency prediction parameter and the long-term reward prediction parameter. The short-term reward prediction parameter and the doctor tendency prediction parameter are learned by supervised learning, while the long-term reward prediction parameter is learned by deep reinforcement learning and is constrained by the short-term reward prediction parameter and the doctor tendency prediction parameter. The model can therefore make predictions that better match doctors' experience, errors in reinforcement learning are reduced, and the joint loss is calculated more accurately. As a result, the accuracy of the patient classification model obtained after adjustment according to the joint loss is improved, which in turn improves the accuracy of classifying the target patient with the patient classification model.
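The joint loss described above can be sketched in plain Python. The first and second losses follow the squared-error and cross-entropy forms given in claims 3 and 4. The third and fourth losses are taken here as already-computed numbers, because their formulas are not reproduced in this text. The function names, the clamping constant `eps`, and the equal default weights are assumptions; the claims only state that the four losses are combined by a linear operation.

```python
import math
from typing import Sequence


def first_loss(r_true: float, r_predict: float) -> float:
    """Squared error between the short-term reward parameter and its prediction."""
    return (r_true - r_predict) ** 2


def second_loss(p_true: float, p_predict: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy between the doctor tendency parameter and its prediction."""
    # Clamp the prediction away from 0 and 1 so log() stays finite (assumption).
    p = min(max(p_predict, eps), 1.0 - eps)
    return -p_true * math.log(p) - (1.0 - p_true) * math.log(1.0 - p)


def joint_loss(
    losses: Sequence[float],
    weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0),  # hypothetical weights
) -> float:
    """Linear combination of the first, second, third, and fourth losses."""
    if len(losses) != 4 or len(weights) != 4:
        raise ValueError("expected four losses and four weights")
    return sum(w * l for w, l in zip(weights, losses))


l1 = first_loss(1.0, 0.5)
l2 = second_loss(1.0, 0.5)
# Third and fourth losses are placeholders supplied by the caller (assumed values).
total = joint_loss([l1, l2, 0.0, 0.0])
print(l1, l2, total)
```

In a training framework the same combination would be applied to tensors so that gradients flow back through both networks.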

The embodiments of the application can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, methods, techniques, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain the best result.

The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.

The application can be applied to the field of intelligent medical treatment, and therefore the construction of a smart city is promoted.

Those skilled in the art will appreciate that the functions of the modules in the embodiments described above can be implemented by computer-readable instructions directing the relevant hardware. The computer-readable instructions can be stored in a computer-readable storage medium and, when executed, can carry out the functions of the modules in the embodiments described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a Read-Only Memory (ROM), or it may be a Random Access Memory (RAM).

In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 3, fig. 3 is a block diagram of a basic structure of a computer device according to the present embodiment.

The computer device 3 comprises a memory 31, a processor 32, and a network interface 33 that are communicatively connected to each other via a system bus. It is noted that only the computer device 3 having the components 31-33 is shown in the figure, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.

The memory 31 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 31 may be an internal storage unit of the computer device 3, such as a hard disk or a memory of the computer device 3. In other embodiments, the memory 31 may also be an external storage device of the computer device 3, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the computer device 3. Of course, the memory 31 may also comprise both an internal storage unit of the computer device 3 and an external storage device thereof. In this embodiment, the memory 31 is generally used for storing an operating system and various types of application software installed in the computer device 3, so as to implement the functions of the modules in the processing device of the patient classification model. Further, the memory 31 may also be used to temporarily store various types of data that have been output or are to be output.

In some embodiments, the processor 32 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 32 is typically used to control the overall operation of the computer device 3. In this embodiment, the processor 32 is configured to execute the computer readable instructions stored in the memory 31, or to process data, so as to implement the functions of the modules in the processing device of the patient classification model.

The network interface 33 may comprise a wireless network interface or a wired network interface, and the network interface 33 is generally used for establishing communication connection between the computer device 3 and other electronic devices.

In the present embodiment, the processor executes the computer readable instructions stored in the memory to implement the functions of the modules in the processing device of the patient classification model as described in the above embodiments.

The present application further provides another embodiment, which is a computer-readable storage medium storing computer-readable instructions executable by at least one processor to cause the at least one processor to perform the functions of the modules in the processing device of the patient classification model as described above.

In this embodiment, labeled training sample data of patients is acquired and input into an initial patient classification model. The initial patient classification model comprises a first network and a second network and outputs, for each candidate classification result, a short-term reward prediction parameter, a doctor tendency prediction parameter, and a long-term reward prediction parameter of the training sample data. The short-term reward parameter measures the reward obtained by selecting each candidate classification result over a shorter time dimension, the long-term reward parameter measures that reward over a longer time dimension, and the doctor tendency prediction parameter represents the probability of selecting each candidate classification result according to doctors' experience. The short-term reward parameter and the doctor tendency parameter of the training sample data are acquired from the label. The joint loss is calculated from four pairs: the short-term reward prediction parameter and the short-term reward parameter; the doctor tendency prediction parameter and the doctor tendency parameter; the short-term reward prediction parameter and the long-term reward prediction parameter; and the doctor tendency prediction parameter and the long-term reward prediction parameter. The short-term reward prediction parameter and the doctor tendency prediction parameter are learned by supervised learning, while the long-term reward prediction parameter is learned by deep reinforcement learning and is constrained by the short-term reward prediction parameter and the doctor tendency prediction parameter. The model can therefore make predictions that better match doctors' experience, errors in reinforcement learning are reduced, and the joint loss is calculated more accurately. As a result, the accuracy of the patient classification model obtained after adjustment according to the joint loss is improved, which in turn improves the accuracy of classifying the target patient with the patient classification model.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.

It is to be understood that the above-described embodiments are merely some, not all, of the embodiments of the present application, and that the accompanying drawings show preferred embodiments of the application without limiting its scope. This application may be embodied in many different forms; these embodiments are provided so that the disclosure of the application will be understood thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features. All equivalent structures made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, fall within the protection scope of the present application.

Claims

1. A processing apparatus for a patient classification model, comprising:

2. The patient classification model processing apparatus of claim 1, wherein the loss calculation module comprises:

a first calculation sub-module for calculating a first loss based on the short-term reward prediction parameter and the short-term reward parameter;

a second calculation sub-module for calculating a second loss based on the doctor tendency prediction parameter and the doctor tendency parameter;

a third calculation submodule for calculating a third loss based on the short-term reward prediction parameter and the long-term reward prediction parameter;

a fourth calculation submodule for calculating a fourth loss based on the doctor tendency prediction parameter and the long-term reward prediction parameter;

and a joint calculation sub-module for performing a linear operation on the first loss, the second loss, the third loss and the fourth loss to obtain the joint loss.

3. The patient classification model processing apparatus of claim 2, wherein the first loss is expressed as:

$L_R = (R_{true} - R_{predict})^2$

wherein $R_{true}$ is the short-term reward parameter, and $R_{predict}$ is the short-term reward prediction parameter.

4. The patient classification model processing apparatus of claim 2, wherein the second loss is expressed as:

$L_P = -P_{true} \log(P_{predict}) - (1 - P_{true}) \log(1 - P_{predict})$

wherein $P_{true}$ is the doctor tendency parameter, and $P_{predict}$ is the doctor tendency prediction parameter.

5. The patient classification model processing apparatus of claim 2, wherein the third loss is expressed as:

wherein $s_t$ and $a_t$ are, respectively, the training sample data and the candidate classification result at time point $t$; $f(s_t, a_t)$ is the long-term reward prediction parameter obtained at time point $t$ when the input training sample data is $s_t$ and the candidate classification result is $a_t$;

the short-term reward prediction parameter in the formula is taken in gradient-free form and is obtained at time point $t$ when the input training sample data is $s_t$ and the candidate classification result is $a$; $f(s_{t+1}, a)$ is the long-term reward prediction parameter obtained at time point $t+1$ when the input training sample data is $s_{t+1}$ and the candidate classification result is $a$;

and the remaining term is the constructed long-term reward parameter.

6. The patient classification model processing apparatus of claim 2, wherein the fourth loss is expressed as:

wherein $s_t$ and $a_t$ are, respectively, the training sample data and the candidate classification result at time point $t$;

the doctor tendency prediction parameter in the formula is taken in gradient-free form and is obtained at time point $t$ when the input training sample data is $s_t$ and the candidate classification result is $a_t$; and $f(s_t, a_t)$ is the long-term reward prediction parameter obtained at time point $t$ when the input training sample data is $s_t$ and the candidate classification result is $a_t$.

7. The patient classification model processing apparatus of claim 1, wherein the patient classification module comprises:

a sample acquisition sub-module for acquiring sample data of a target patient, the sample data comprising basic information of the patient, disease examination information, historical medication information and doctor prescription information;

a reward acquisition sub-module for inputting the sample data into the patient classification model, for outputting short-term reward parameters of the sample data for the candidate classification results through a first network in the patient classification model, and for outputting long-term reward parameters of the sample data for the candidate classification results through a second network in the patient classification model;

a cumulative calculation sub-module for calculating the cumulative reward parameters of the sample data for the candidate classification results according to the short-term reward parameters and the long-term reward parameters;

8. The patient classification model processing apparatus as claimed in claim 1, further comprising:

a prescription inquiry module for inquiring prescription information associated with the patient classification result;

and a display module for displaying the patient classification result and the prescription information of the target patient through a terminal.

9. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, implement the functions of the modules in the processing apparatus of the patient classification model according to any one of claims 1 to 8.

10. A computer readable storage medium having computer readable instructions stored thereon which, when executed by a processor, implement the functions of the modules in a processing apparatus of a patient classification model according to any one of claims 1 to 8.