CN112017788B

CN112017788B - Disease ordering method, device, equipment and medium based on reinforcement learning model

Info

Publication number: CN112017788B
Application number: CN202010929683.4A
Authority: CN
Inventors: 唐蕊
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2020-09-07
Filing date: 2020-09-07
Publication date: 2023-07-04
Anticipated expiration: 2040-09-07
Also published as: CN112017788A; WO2021151355A1

Abstract

The invention applies to the technical field of artificial intelligence, relates to blockchain technology, and can be used in intelligent medical treatment. It discloses a disease ordering method, device, equipment and medium based on a reinforcement learning model. The method comprises: acquiring condition data of a patient and inputting it into an auxiliary diagnosis model; acquiring the disease ordering result output by the auxiliary diagnosis model; determining weights of a plurality of suspected diseases in the region to which the patient belongs according to a preset weight model; updating the disease ordering result according to those weights to obtain an updated disease ordering result; and determining a suspected disease ordering result of the patient according to the updated disease ordering result and outputting it. Building on an existing auxiliary diagnosis model, the invention takes the actual disease conditions of different regions into account, so that the final suspected disease ordering result is better optimized and the accuracy of the suspected disease output is improved.

Description

Disease ordering method, device, equipment and medium based on reinforcement learning model

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a disease ordering method, device, equipment and medium based on a reinforcement learning model.

Background

With the rapid development of artificial intelligence technology, auxiliary diagnosis in clinical decision support systems is generally implemented by building an auxiliary diagnosis model with machine learning or deep learning methods. That is, the condition information of a patient is input into the auxiliary diagnosis model, the model outputs a list of suspected diseases for the patient, and a doctor can refer to that list when diagnosing the patient, so that the model assists the doctor's diagnosis.

Generally, an existing auxiliary diagnosis model supports a plurality of diseases, and its performance on those diseases is basically stable. Several disease types are designated as dominant disease types according to how the model performs on them. In other words, in an existing auxiliary diagnosis model the dominant disease types and the corresponding per-disease performance are fixed, so that doctors in all regions apply a uniform standard when using the model.

However, the probability of contracting a given disease differs between regions; that is, the dominant disease types (the disease types that occur most frequently locally) differ from region to region. In existing auxiliary diagnosis models, the performance on each disease is determined uniformly and the dominant disease types of different regions are not considered. As a result, the diagnostic performance of the model is not sufficiently optimized, the disease output deviates from the actual local diagnosis situation, and accuracy is reduced.

Disclosure of Invention

The invention provides a disease ordering method, device, equipment and medium based on a reinforcement learning model, which are used for solving the problem that in the prior art, the accuracy of disease output results is low because disease conditions in different areas are not considered by an auxiliary diagnosis model.

A method of disease ordering based on reinforcement learning models, comprising:

acquiring patient condition data and inputting the patient condition data into an auxiliary diagnosis model;

obtaining a disease ordering result output by the auxiliary diagnosis model, wherein the disease ordering result is a result of ordering a plurality of suspected diseases according to the probability of obtaining each disease by the patient;

determining weights of the plurality of suspected diseases in the region where the patient belongs according to a preset weight model, wherein the preset weight model is a reinforcement learning model obtained by performing disease weight learning according to disease diagnosis data of the region where the patient belongs;

updating the suspected disease sorting result according to the weights of the suspected diseases in the region where the patient belongs to so as to obtain an updated disease sorting result;

and determining a suspected disease sequencing result of the patient according to the updated disease sequencing result, and outputting the suspected disease sequencing result.

A reinforcement learning model-based disease ordering apparatus comprising:

the first acquisition module is used for acquiring the disease data of the patient and inputting the disease data of the patient into the auxiliary diagnosis model;

the second acquisition module is used for acquiring a disease ordering result output by the auxiliary diagnosis model, wherein the disease ordering result is a result of ordering a plurality of suspected diseases according to the probability of each disease obtained by the patient;

the first determining module is used for determining weights of the plurality of suspected diseases in the region where the patient belongs according to a preset weight model, wherein the preset weight model is a reinforcement learning model obtained by performing disease weight learning according to disease diagnosis data of the region where the patient belongs;

the updating module is used for updating the suspected disease sequencing results according to the weights of the suspected diseases in the region where the patient belongs to so as to obtain updated disease sequencing results;

and the second determining module is used for determining the suspected disease ordering result of the patient according to the updated disease ordering result and outputting the suspected disease ordering result.

A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the reinforcement learning model based disease ordering method described above when the computer program is executed.

A computer readable storage medium storing a computer program which when executed by a processor performs the steps of the reinforcement learning model-based disease ordering method described above.

In the scheme provided by the above disease ordering method, device, equipment and medium based on a reinforcement learning model, condition data of a patient is acquired and input into an auxiliary diagnosis model, and the disease ordering result output by the auxiliary diagnosis model is obtained, the disease ordering result being a result of ordering a plurality of suspected diseases according to the probability of the patient obtaining each disease. Weights of the plurality of suspected diseases in the region to which the patient belongs are then determined according to a preset weight model, the preset weight model being a reinforcement learning model obtained by performing disease weight learning on disease diagnosis data of that region. The disease ordering result is updated according to those weights to obtain an updated disease ordering result, and finally the suspected disease ordering result of the patient is determined according to the updated disease ordering result and output. In the invention, a preset weight model based on the disease diagnosis data of each region is obtained through training, the weight of each suspected disease in the region to which the patient belongs is determined according to that model, and the disease ordering result is reordered according to those weights. Building on the existing auxiliary diagnosis model, the actual disease conditions of different regions are taken into account, so that the final suspected disease ordering result is better optimized and the accuracy of the suspected disease output is improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic view of an application environment of a reinforcement learning model-based disease ordering method according to an embodiment of the present invention;

FIG. 2 is a flow chart of a reinforcement learning model-based disease ordering method according to an embodiment of the present invention;

FIG. 3 is a flow chart illustrating the implementation of step S30 in FIG. 2 according to the present invention;

FIG. 4 is a flow chart illustrating the implementation of step S40 in FIG. 2 according to the present invention;

FIG. 5 is a flowchart illustrating the implementation of step S50 in FIG. 2 according to the present invention;

FIG. 6 is a schematic diagram of an acquisition process of a preset weight model according to an embodiment of the invention;

FIG. 7 is a flow chart illustrating the implementation of step S04 in FIG. 6 according to the present invention;

FIG. 8 is a schematic diagram of a disease ordering apparatus based on reinforcement learning model according to an embodiment of the invention;

FIG. 9 is a schematic diagram of a computer device according to an embodiment of the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The disease ordering method based on a reinforcement learning model provided by the embodiment of the invention can be applied in the application environment shown in FIG. 1, in which a terminal device communicates with a server through a network. The server acquires the condition data of a patient from the terminal device, inputs it into the auxiliary diagnosis model, and acquires the disease ordering result output by the auxiliary diagnosis model, the disease ordering result being a result of ordering a plurality of suspected diseases according to the probability of the patient obtaining each disease. The server then determines weights of the plurality of suspected diseases in the region to which the patient belongs according to a preset weight model, the preset weight model being a reinforcement learning model obtained by performing disease weight learning on disease diagnosis data of that region. The server updates the disease ordering result according to those weights to obtain an updated disease ordering result, determines the suspected disease ordering result of the patient according to the updated result, and outputs it to the terminal device, thereby improving the accuracy of the suspected disease output.

The terminal device may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.

In this embodiment, the auxiliary diagnosis model, the preset weight model, and the data related to the model inputs and outputs are all stored in a blockchain network. Blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic means, each block containing a batch of network transaction information that is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer. Storing the auxiliary diagnosis model, the preset weight model and the related data in the blockchain network allows them to be queried and processed conveniently and rapidly, improving processing speed.
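The description of data blocks "linked by cryptographic means" can be illustrated with a minimal hash-chain sketch. This is only a toy illustration of the linking idea, not the blockchain network of the embodiment; the function names `make_block` and `verify_chain`, the all-zero genesis hash and the payload contents are assumptions made for the example.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def make_block(payload: dict, prev_hash: str) -> dict:
    """Build a block whose hash covers its payload and the previous block's hash."""
    body = json.dumps({"payload": payload, "prev_hash": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"payload": payload, "prev_hash": prev_hash, "hash": digest}


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check that each block points at its predecessor."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        expected = make_block(block["payload"], block["prev_hash"])["hash"]
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical records standing in for stored model data.
first = make_block({"record": "auxiliary diagnosis model"}, GENESIS_PREV_HASH)
second = make_block({"record": "preset weight model"}, first["hash"])
chain = [first, second]
```

Changing any stored payload makes `verify_chain` return `False`, which is the anti-counterfeiting property the paragraph refers to.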

In one embodiment, as shown in fig. 2, a disease ordering method based on reinforcement learning model is provided, and the method is applied to the server in fig. 1 for illustration, and includes the following steps:

S10: Patient condition data is acquired and input into an auxiliary diagnosis model.

The condition data is the medical record data of the patient and comprises basic information of the patient, condition information self-reported by the patient, and examination data. The basic information comprises routine data such as the age, region and sex of the patient, and the examination data comprises image data and the like.
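One way to picture the condition data described above is as a small record type. The sketch below is illustrative only: the class and field names are assumptions, and the sample values are invented for the example rather than taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BasicInfo:
    # Routine data named in the description: age, region and sex.
    age: int
    region: str
    sex: str


@dataclass
class PatientConditionData:
    basic_info: BasicInfo
    self_reported_condition: str
    # Examination data, e.g. references to image files (hypothetical format).
    examination_data: List[str] = field(default_factory=list)


# Invented sample record.
record = PatientConditionData(
    basic_info=BasicInfo(age=45, region="region_A", sex="F"),
    self_reported_condition="cough for two weeks",
    examination_data=["chest_image_001"],
)
```

The `region` field is what a later step would use to pick the preset weight model for the region to which the patient belongs.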

S20: A disease ordering result output by the auxiliary diagnosis model is obtained, wherein the disease ordering result is a result of ordering a plurality of suspected diseases according to the probability of the patient obtaining each disease.

In the conventional approach, after the condition data of the patient is input into the auxiliary diagnosis model, the model outputs a disease ordering result for the patient, that is, a result of ordering a plurality of suspected diseases according to the probability of the patient obtaining each disease, and a doctor performs auxiliary diagnosis based on that result to finally determine the patient's disease. In this embodiment, once the auxiliary diagnosis model has produced its disease ordering result, that result is obtained so that it can be optimized according to the weights of the plurality of suspected diseases in the region to which the patient belongs, which further improves the accuracy of the disease output for the patient and the assistance given to doctors.

S30: Weights of the plurality of suspected diseases in the region to which the patient belongs are determined according to a preset weight model, wherein the preset weight model is a reinforcement learning model obtained by performing disease weight learning according to disease diagnosis data of the region to which the patient belongs.

After the disease ordering result output by the auxiliary diagnosis model is obtained, the weights of the plurality of suspected diseases in the region to which the patient belongs are determined according to the preset weight model. The region to which the patient belongs may be the region of the patient's long-term residence, the region of the patient's household registration, or the region where the patient is being treated.

S40: updating the suspected disease sequencing result according to the weights of the suspected diseases in the region of the patient so as to obtain the updated disease sequencing result.

After the weights of the plurality of suspected diseases in the region to which the patient belongs are determined according to the preset weight model, the suspected disease ordering result is updated according to those weights: the probability of obtaining each suspected disease is updated, and the plurality of suspected diseases are reordered according to the updated probabilities to obtain the updated disease ordering result, which therefore has higher accuracy.

S50: A suspected disease ordering result of the patient is determined according to the updated disease ordering result, and the suspected disease ordering result is output.

After the updated disease ordering result is obtained, determining a suspected disease ordering result of the patient according to the updated disease ordering result, and outputting the suspected disease ordering result of the patient, so that a doctor can diagnose the disease of the patient with the aid of the suspected disease ordering result with higher accuracy.

In this embodiment, the disease data of the patient is input into the auxiliary diagnosis model to obtain the disease sorting result output by the auxiliary diagnosis model, on this basis, the weights of the multiple suspected diseases in the region to which the patient belongs are determined according to the preset weight model, and the disease sorting result is optimized and updated according to the weights of the multiple suspected diseases, so that the suspected disease sorting result which is more optimized and is close to the actual disease condition in each region can be automatically obtained, the automated processing process of artificial intelligence and disease identification is realized, the better suspected disease sorting result can be obtained without manual participation, and the doctor can conveniently use the optimized suspected disease sorting result as a reference in the subsequent diagnosis, thereby improving the accuracy of disease diagnosis. The scheme can be applied to the intelligent medical field, so that the construction of the intelligent city is promoted.

In the disease ordering method based on a reinforcement learning model, condition data of a patient is acquired and input into the auxiliary diagnosis model to obtain the disease ordering result output by the model; weights of the plurality of suspected diseases in the region to which the patient belongs are then determined according to the preset weight model; the suspected disease ordering result is updated according to those weights to obtain an updated disease ordering result; and finally the suspected disease ordering result of the patient is determined according to the updated disease ordering result. A preset weight model based on the disease diagnosis data of each region is obtained through training, the weight of each suspected disease in the region to which the patient belongs is determined according to that model, and the disease ordering result is reordered according to those weights. Building on the existing auxiliary diagnosis model, the actual disease conditions of different regions are taken into account, so that the final suspected disease ordering result is better optimized and the accuracy of the suspected disease output is improved.
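The overall flow of steps S10 to S50 can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the auxiliary diagnosis model is stood in for by any callable returning per-disease probabilities, the trained preset weight models are stood in for by a plain dictionary of per-region, per-disease weights, and all names and numbers (`rank_diseases`, `toy_model`, `region_A`) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Ranking = List[Tuple[str, float]]


def rank_diseases(
    condition_data: dict,
    auxiliary_model: Callable[[dict], Dict[str, float]],
    region_weights: Dict[str, Dict[str, float]],
    preset_number: int = 10,
) -> Ranking:
    # S10/S20: feed the condition data to the auxiliary model, read its probabilities.
    probabilities = auxiliary_model(condition_data)
    # S30: look up the weights for the region to which the patient belongs.
    weights = region_weights.get(condition_data["region"], {})
    # S40: multiply each probability by its weight; unweighted diseases are unchanged.
    updated = {
        disease: prob * weights[disease] if disease in weights else prob
        for disease, prob in probabilities.items()
    }
    # S50: order from high to low and keep the preset number of entries.
    ordered = sorted(updated.items(), key=lambda item: item[1], reverse=True)
    return ordered[:preset_number]


def toy_model(_condition_data: dict) -> Dict[str, float]:
    # Stand-in for the auxiliary diagnosis model; the numbers are made up.
    return {"disease_1": 0.30, "disease_2": 0.25, "disease_3": 0.35}


result = rank_diseases(
    {"region": "region_A"},
    toy_model,
    {"region_A": {"disease_2": 1.5}},
    preset_number=2,
)
# result == [("disease_2", 0.375), ("disease_3", 0.35)]
```

With these invented numbers the top entry moves from disease_3 to disease_2 once the regional weight is applied, which is the kind of reordering the method aims for.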

In one embodiment, as shown in fig. 3, in step S30, weights of a plurality of suspected diseases in the region to which the patient belongs are determined according to a preset weight model, which specifically includes the following steps:

S31: The state output by the preset weight model of the region to which the patient belongs is taken as the weights of a plurality of dominant disease types in the region to which the patient belongs.

After the disease ordering result output by the auxiliary diagnosis model is obtained, a trained preset weight model of the region to which the patient belongs needs to be obtained, and the state output by the preset weight model of the region to which the patient belongs is used as the weights of a plurality of dominant disease types in the region to which the patient belongs, so that the disease ordering result of the auxiliary diagnosis model is updated according to the weights of the dominant disease types in the region to which the patient belongs.

S32: the disease type of each of the plurality of suspected diseases is determined.

After the weights of the plurality of dominant disease types in the region to which the patient belongs are determined, the plurality of suspected diseases are classified into disease types according to their similarity, that is, the disease type of each of the plurality of suspected diseases is determined. Diseases are grouped by similarity so that, when the suspected disease ordering result is subsequently updated according to the weights, adjusting one disease has less effect on the model's performance for other, similar diseases.

For example, suppose the plurality of suspected diseases comprises four diseases: disease A, disease B, disease C and disease D. Disease B and disease D belong to different disease types, while disease A and disease C belong to the same disease type, which differs from the types of disease B and disease D. The disease ordering result output by the auxiliary diagnosis model then covers three disease types.

In this embodiment, the determination process of the plurality of suspected diseases and the disease types is only illustrated as an example, and in other embodiments, the disease types of the plurality of suspected diseases may be determined in other manners, which will not be described herein.

S33: It is determined whether the disease type of each suspected disease is one of the plurality of dominant disease types in the region to which the patient belongs.

After the disease type of each suspected disease is determined, it is determined whether that disease type is one of the plurality of dominant disease types of the region to which the patient belongs, so that the weight of each disease can be determined according to the result.

S34: If the disease type of a suspected disease is one of the plurality of dominant disease types in the region to which the patient belongs, the weight of that dominant disease type is used as the weight of the corresponding suspected disease, so as to obtain the weights of the plurality of suspected diseases.

If the disease type of a suspected disease is one of the plurality of dominant disease types in the region to which the patient belongs, the weight of the disease type is the weight of the matching dominant disease type, and the weight of each suspected disease of that type is the weight of the dominant disease type; in this way the weights of the plurality of suspected diseases are obtained. Because different suspected diseases can be similar to one another, classifying the suspected diseases by disease type and weighting them by type reduces the influence that similar suspected diseases have on each other. That is, similar diseases are considered together rather than one at a time, and their weights are updated simultaneously, which reduces the effect on other diseases of optimizing a single disease.

For example, suppose the plurality of suspected diseases comprises four diseases: disease A, disease B, disease C and disease D. The disease type of disease A and disease C is disease type 1, the disease type of disease B is disease type 2, and the disease type of disease D is disease type 3. If disease type 1 is a dominant disease type in the region to which the patient belongs, the weight of that dominant disease type is the weight of disease type 1, and the weight of disease A and disease C is the weight of disease type 1. If disease type 2 is a dominant disease type in the region to which the patient belongs, the weight of that dominant disease type is the weight of disease type 2, and the weight of disease B is the weight of disease type 2.

In this embodiment, the disease type 1 is a dominant disease type in the region where the patient belongs or the disease type 2 is a dominant disease type in the region where the patient belongs, which is only an exemplary illustration, and in other embodiments, the disease type 3 may be a dominant disease type.

After determining whether the disease type of each suspected disease is one of the plurality of dominant disease types in the region to which the patient belongs, if the disease type of a suspected disease is not one of those dominant disease types, the probability of obtaining that suspected disease is not updated.

In this embodiment, the state output by the preset weight model of the region is used as the weights of the plurality of dominant disease types in the region; the disease type of each of the plurality of suspected diseases is then determined, and it is determined whether that disease type is one of the plurality of dominant disease types in the region to which the patient belongs; if so, the weight of the dominant disease type is used as the weight of the corresponding suspected disease, so as to obtain the weights of the plurality of suspected diseases. This refines the process of determining the weights of the plurality of suspected diseases in the region to which the patient belongs according to the preset weight model. Because each disease takes the weight of its disease type, the influence on similar suspected diseases is reduced and the weights are more accurate, so the subsequently updated disease ordering result is more accurate.
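Steps S31 to S34 amount to a lookup from disease to disease type to dominant-type weight. The sketch below assumes the state output by the preset weight model is already available as a dictionary from dominant disease type to weight, and that the disease-type classification is given as a dictionary; both inputs, the function name and the weight value 1.2 are assumptions made for illustration.

```python
from typing import Dict


def suspected_disease_weights(
    disease_types: Dict[str, str],
    dominant_type_weights: Dict[str, float],
) -> Dict[str, float]:
    """Return a weight for every suspected disease whose type is dominant.

    disease_types: suspected disease -> disease type (S32).
    dominant_type_weights: dominant disease type -> weight, standing in for
        the state output by the preset weight model of the region (S31).
    """
    weights = {}
    for disease, disease_type in disease_types.items():
        # S33: is this disease's type one of the region's dominant types?
        if disease_type in dominant_type_weights:
            # S34: the disease inherits the weight of its dominant type.
            weights[disease] = dominant_type_weights[disease_type]
    # Diseases of non-dominant types get no weight, so their probability
    # is left as it was.
    return weights


# The four-disease example from the text, with an invented weight value.
disease_types = {"A": "type_1", "B": "type_2", "C": "type_1", "D": "type_3"}
weights = suspected_disease_weights(disease_types, {"type_1": 1.2})
# weights == {"A": 1.2, "C": 1.2}
```

Diseases A and C share a type, so they receive the same weight together, which is the "uniform consideration of similar diseases" described above.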

In one embodiment, as shown in fig. 4, in step S40, the updating of the suspected disease sorting result according to the weights of the suspected diseases in the region to which the patient belongs specifically includes the following steps:

S41: The probability of obtaining each suspected disease is determined according to the suspected disease ordering result.

After weights of a plurality of suspected diseases in the region of the patient are determined according to a preset weight model, the obtaining probability of each suspected disease is determined according to the suspected disease sequencing result. Namely, the suspected disease sorting result comprises a plurality of suspected diseases and the obtaining probability of each suspected disease, and the obtaining probability of each suspected disease is extracted from the suspected disease sorting result.

S42: the product between the weight of the suspected disease in the region to which the patient belongs and the probability of obtaining the suspected disease is determined as the final probability of obtaining the suspected disease.

After the probability of obtaining each suspected disease is determined according to the suspected disease sequencing result, the product between the weight of the suspected disease in the region of the patient and the probability of obtaining the suspected disease is determined as the final probability of obtaining the suspected disease.

For example, in Table 1 the first and second columns are the suspected diseases and the probability of obtaining each suspected disease as output by the auxiliary diagnosis model, the third and fourth columns are the disease type and the weight of each suspected disease, and the fifth column is the updated probability of obtaining the suspected disease, i.e., the final probability of obtaining the suspected disease.

TABLE 1

As can be seen from Table 1, after the probability of obtaining each suspected disease output by the auxiliary diagnosis model is updated according to the weight of the suspected disease, the probabilities of some suspected diseases change, and the suspected disease with the highest probability changes from disease 3 to disease 2, so that the updated result is closer to the actual situation in the region to which the patient belongs.

S43: The ordering of the plurality of suspected diseases is updated according to the final probability of obtaining each suspected disease.

After determining the final obtaining probability of the suspected diseases, updating the ordering of the suspected diseases according to the final obtaining probability of each suspected disease to obtain updated disease ordering results. For example, the multiple suspected diseases may be ranked in order of the probability of obtaining from large to small according to the magnitude of the final probability of obtaining, so as to obtain an updated disease ranking result.
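Steps S41 to S43 can be sketched as a multiply-then-sort over the ranking. This is a minimal sketch: the function name and the numbers are invented, and since the text does not describe renormalizing the products, the sketch does not renormalize either.

```python
from typing import Dict, List, Tuple


def update_ranking(
    ranking: List[Tuple[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Apply regional weights to a ranking and reorder it.

    ranking: (suspected disease, probability) pairs from the auxiliary model (S41).
    weights: suspected disease -> weight; diseases without a weight are unchanged.
    """
    # S42: final probability = weight * probability where a weight exists.
    final = [
        (disease, prob * weights[disease]) if disease in weights else (disease, prob)
        for disease, prob in ranking
    ]
    # S43: reorder from the largest to the smallest final probability.
    return sorted(final, key=lambda item: item[1], reverse=True)


# Invented numbers chosen so the top entry changes after weighting.
original = [("disease_3", 0.5), ("disease_2", 0.375), ("disease_1", 0.125)]
updated = update_ranking(original, {"disease_2": 1.5})
# updated == [("disease_2", 0.5625), ("disease_3", 0.5), ("disease_1", 0.125)]
```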

In this embodiment, ordering the plurality of suspected diseases from the largest to the smallest probability of obtaining is merely illustrative. In other embodiments the suspected diseases may be ordered in other ways. For example, they may be ordered by the average probability of obtaining of each disease type: the disease types are first ordered from the largest to the smallest average probability, and the suspected diseases within each disease type are then ordered by their own probability of obtaining, thereby obtaining the updated disease ordering result.
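The alternative ordering by disease-type average can be sketched as a two-level sort. The function name, the input dictionaries and the numbers are assumptions for illustration; ordering within a type from high to low is also an assumption, since the text only says diseases within a type are ordered by their probability.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def order_by_type_average(
    final_probs: Dict[str, float],
    disease_types: Dict[str, str],
) -> List[Tuple[str, float]]:
    # Group the suspected diseases by disease type.
    groups = defaultdict(list)
    for disease, prob in final_probs.items():
        groups[disease_types[disease]].append((disease, prob))

    def average(members: List[Tuple[str, float]]) -> float:
        return sum(prob for _, prob in members) / len(members)

    # Order the types by average probability, largest first.
    ordered_types = sorted(groups, key=lambda t: average(groups[t]), reverse=True)

    # Within each type, order the diseases by their own probability.
    result = []
    for disease_type in ordered_types:
        result.extend(sorted(groups[disease_type], key=lambda item: item[1], reverse=True))
    return result


# Invented example: A has the highest single probability, but its type's
# average (0.3125) is below type_2's (0.375), so B is listed first.
probs = {"A": 0.5, "B": 0.375, "C": 0.125, "D": 0.25}
types = {"A": "type_1", "B": "type_2", "C": "type_1", "D": "type_3"}
grouped = order_by_type_average(probs, types)
# grouped == [("B", 0.375), ("A", 0.5), ("C", 0.125), ("D", 0.25)]
```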

In this embodiment, the step of updating the suspected disease ordering result according to the weights of the suspected diseases in the region where the patient belongs is refined by determining the obtaining probability of each suspected disease according to the suspected disease ordering result, determining the product between the weights of the suspected diseases in the region where the patient belongs and the obtaining probability of the suspected diseases as the final obtaining probability of the suspected diseases, and updating the ordering of the plurality of suspected diseases according to the final obtaining probability of each suspected disease.

In one embodiment, as shown in fig. 5, in step S50, a suspected disease ordering result of the patient is determined according to the updated disease ordering result, which specifically includes the following steps:

S51: and determining the probability of obtaining suspected diseases in the updated disease ordering result.

After the updated disease ordering result is obtained, the probability of obtaining the suspected disease is determined in the updated disease ordering result, i.e. the final probability of obtaining after updating according to the weights is determined.

S52: and sorting the suspected diseases from high to low according to the probability of obtaining the suspected diseases, and obtaining a suspected disease sorting list.

After the probability of obtaining the suspected diseases is determined, the suspected diseases are ranked from high to low according to the probability of obtaining the suspected diseases, and a suspected disease ranking list is obtained.

S53: and selecting the preset number of suspected diseases and the probability of obtaining the suspected diseases from the suspected disease sorting list as the suspected disease sorting result of the patient.

And after the suspected disease sorting list is obtained, selecting the preset number of suspected diseases and the obtaining probability of the suspected diseases in the suspected disease sorting list as the suspected disease sorting result of the patient.

For example, if the preset number is 10, the first 10 suspected diseases in the suspected disease sorting list, together with their obtaining probabilities, are selected as the suspected disease ordering result of the patient and output. The final disease ordering result is then clear at a glance, so that the doctor can browse and refer to it quickly as an aid in diagnosing the actual condition of the patient, and the output efficiency of the final disease ordering result is improved.

In this embodiment, the preset number of 10 is only an exemplary illustration, and in other embodiments, the preset number may also be other values, which are not described herein.

In this embodiment, the obtaining probability of each suspected disease is determined in the updated disease ordering result, the suspected diseases are sorted from high to low according to their obtaining probabilities to obtain a suspected disease sorting list, and the first preset number of suspected diseases and their obtaining probabilities are selected from the suspected disease sorting list as the suspected disease ordering result of the patient. This refines the step of determining the suspected disease ordering result of the patient according to the updated disease ordering result, improves the output efficiency of the final disease ordering result, and makes the final result clear at a glance so that the doctor can browse and refer to it quickly.
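Steps S51 to S53 amount to a sort followed by truncation. The sketch below assumes the final obtaining probabilities are held in a dictionary; the function name `select_top` and the sample data are hypothetical.

```python
def select_top(final_probs, preset_number=10):
    """Return the first `preset_number` (disease, probability) pairs.

    final_probs: {disease: final obtaining probability} (assumed layout)
    """
    # S52: sort the suspected diseases from high to low probability.
    sorting_list = sorted(final_probs.items(),
                          key=lambda item: item[1], reverse=True)
    # S53: keep only the first preset number of entries.
    return sorting_list[:preset_number]


# Illustrative data only.
final_probs = {"A": 0.10, "B": 0.45, "C": 0.30, "D": 0.15}
result = select_top(final_probs, preset_number=2)
```

If the list holds fewer entries than the preset number, the slice simply returns all of them.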

In an embodiment, before determining weights of the plurality of suspected diseases in the region to which the patient belongs according to the preset weight model, the disease weight learning is further required to be performed according to the disease diagnosis data of the region to which the patient belongs to obtain the preset weight model, so that more accurate weights of the plurality of suspected diseases can be obtained according to the preset weight model. As shown in fig. 6, before step S30, the preset weight model is specifically obtained by:

S01: and determining k dominant disease types of the region to which the patient belongs, wherein the dominant disease types are a plurality of disease types with the occurrence frequency of the diseases higher than the preset frequency in the region to which the patient belongs.

The k dominant disease types of the region to which the patient belongs are determined. The dominant disease types are disease types in the auxiliary diagnosis model whose occurrence frequency in the region to which the patient belongs is higher than the preset frequency. That is, the k disease types whose occurrence frequency in the region exceeds the preset frequency are taken as the dominant disease types and used to train the preset weight model.
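One way to pick the dominant disease types is to count how often each disease type appears in the region's diagnosis records. The sketch below assumes that "occurrence frequency" means the share of records belonging to a disease type; the embodiment does not fix this definition, and the function name `dominant_disease_types` and the records are hypothetical.

```python
from collections import Counter


def dominant_disease_types(diagnosed_types, preset_frequency):
    """Return disease types whose share of records exceeds preset_frequency.

    diagnosed_types: list with one disease type per diagnosis record
    (assumed input format).
    """
    counts = Counter(diagnosed_types)
    total = len(diagnosed_types)
    # most_common() yields the types from most to least frequent.
    return [t for t, c in counts.most_common() if c / total > preset_frequency]


# Illustrative records only.
records = ["respiratory"] * 5 + ["digestive"] * 3 + ["skin"] * 2
dominant = dominant_disease_types(records, preset_frequency=0.25)
k = len(dominant)
```

Here k is simply the number of disease types that pass the threshold.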

S02: the weights of k dominant disease species are defined as the states of the pre-trained model, which are vectors of k dimensions.

After determining k dominant disease species of the region to which the patient belongs, the weights of the k dominant disease species are defined as the state of a pre-training model, wherein the state of the pre-training model is a vector of k dimensions.

The pre-training model may be a DQN (Deep Q-Network) model. In other embodiments, the pre-training model may be another reinforcement learning model, which is not described herein. In this embodiment, the DQN model is taken as an example of the pre-training model for explanation.

S03: the vectors of the k dimensions are input into a neural network of the pre-training model to obtain actions of the pre-training model.

After the state of the pre-training model is defined, the k-dimensional vector representing the weights of the k dominant disease types is input into the neural network of the DQN model to obtain the action of the DQN model. An action increases or decreases the weight of one of the k dominant disease types in the region to which the patient belongs, and is likewise represented as a k-dimensional vector.

For example, in the DQN model, the state is a k-dimensional vector representing the current weights of the k disease types, and the action is represented by a k-dimensional one-hot vector. With k = 3, the action vector has the form [disease type 1, disease type 2, disease type 3]: the action [0, 1, 0] increases the weight of disease type 2, and the action [0, 0, -1] decreases the weight of disease type 3. Each action changes the weight of only one disease type, and the state of the DQN model is updated according to the current state and the action.

In this embodiment, k is 3, which is only illustrated as an example, and in other embodiments, k may be other values, which are not described herein.
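The state update driven by a signed one-hot action can be sketched as below. The embodiment does not state how much a weight changes per action, so the step size of 0.1 is an assumption, as are the function name `apply_action` and the starting weights.

```python
def apply_action(state, action, step=0.1):
    """Return the next state after one signed one-hot action.

    state:  list of k weights, one per dominant disease type
    action: list of k entries, exactly one of which is +1 or -1
    step:   assumed size of a single weight change
    """
    if len(state) != len(action):
        raise ValueError("state and action must both have k entries")
    if sorted(abs(a) for a in action) != [0] * (len(action) - 1) + [1]:
        raise ValueError("exactly one entry of the action must be +1 or -1")
    return [w + step * a for w, a in zip(state, action)]


# k = 3, matching the example in the text; starting weights are illustrative.
state = [1.0, 1.0, 1.0]
after_increase = apply_action(state, [0, 1, 0])            # raise type 2
after_decrease = apply_action(after_increase, [0, 0, -1])  # lower type 3
```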

S04: determining rewards of the pre-training model according to disease diagnosis data of the region to which the patient belongs.

The rewards of the pre-training model are determined according to the disease diagnosis data of the region to which the patient belongs. The rewards act as feedback during training of the pre-training model, and the current state of the pre-training model is updated using the rewards.

For example, the disease diagnosis data of the region to which the patient belongs includes the disease ordering results of the auxiliary diagnosis model for diagnosed patients. During training of the pre-training model, the state is updated repeatedly. After each state update, the disease ordering results of the auxiliary diagnosis model are updated according to the updated state to obtain the disease performance in that state. If the disease performance in the current state is improved, the reward is 1; if the disease performance in the current state is unchanged, the reward is 0; if the disease performance in the current state is reduced, the reward is -1.

In this embodiment, the manner of determining the rewards is only an example. In other embodiments, the rewards may be determined in other manners, which are not described herein.

S05: and adjusting the state, the action and the rewards to learn the weight of the pre-training model, so as to obtain a preset weight model.

During weight learning, the state, the action and the reward of the pre-training model are adjusted continuously until the loss function of the pre-training model no longer changes. At this point the state of the pre-training model is stable, its performance on the disease diagnosis data of the region to which the patient belongs no longer changes, and training of the pre-training model is complete. The pre-training model in the stable state is taken as the preset weight model, and the k-dimensional vector represented by the stable state is the output of the preset weight model; that is, the k-dimensional vector output by the preset weight model gives the weights of the k dominant disease types.

In this embodiment, k dominant disease types of the region to which the patient belongs are determined, the dominant disease types being disease types whose occurrence frequency in that region is higher than the preset frequency. The weights of the k dominant disease types are defined as the state of the pre-training model, the state being a k-dimensional vector. The k-dimensional vector is then input into the neural network of the pre-training model to obtain the action of the pre-training model, and the rewards of the pre-training model are determined according to the disease diagnosis data of the region to which the patient belongs. Finally, the state, the action and the rewards are adjusted to perform weight learning on the pre-training model, so as to obtain the preset weight model. This clarifies the process of obtaining the preset weight model. Because the preset weight model is obtained from the disease diagnosis data of the region to which the patient belongs, it stays close to the actual data of that region, which improves the accuracy of the preset weight model and provides a basis for optimizing the disease ordering result of the auxiliary diagnosis model.

In one embodiment, the disease diagnosis data of the region to which the patient belongs includes disease diagnosis results of a plurality of diagnosed patients and disease sequencing results of the auxiliary diagnosis model for the plurality of diagnosed patients, as shown in fig. 7, in step S04, that is, rewards of the pre-training model are determined according to the disease diagnosis data of the region to which the patient belongs, and specifically includes the following steps:

S041: updating the disease ordering results of the auxiliary diagnosis model for the plurality of diagnosed patients according to the weights of the dominant disease types in each state, so as to determine updated disease results of the plurality of diagnosed patients in each state, wherein an updated disease result is the disease with the highest obtaining probability after the disease ordering result of a diagnosed patient is updated.

The disease diagnosis data of the region to which the patient belongs includes the disease diagnosis results of a plurality of diagnosed patients and the disease ordering results of the auxiliary diagnosis model for those diagnosed patients. As the state of the pre-training model is updated, the weights of the dominant disease types in each state are acquired. The disease ordering results of the auxiliary diagnosis model for the plurality of diagnosed patients are then updated according to the weights of the dominant disease types in each state, so as to obtain updated ordering results in each state, and the updated disease results of the plurality of diagnosed patients in each state are determined from the updated ordering results. Here, an updated disease result is the disease with the highest obtaining probability after the disease ordering result of a diagnosed patient is updated.

S042: and determining the accuracy of the updated disease results of the plurality of diagnosed patients in each state according to the disease diagnosis results of the plurality of diagnosed patients, so as to obtain the accuracy of the disease results in each state.

After updated disease results of the plurality of diagnosed patients in each state are obtained, the accuracy of the updated disease results of the plurality of diagnosed patients in each state is determined based on the disease diagnosis results of the plurality of diagnosed patients to obtain the accuracy of the disease results in each state.

For example, in a given state there are disease diagnosis results for m diagnosed patients and, correspondingly, updated disease results for the same m diagnosed patients. If the updated disease results of n of these patients are consistent with their disease diagnosis results, the accuracy of the updated disease results in that state is n/m. Repeating the above steps yields the accuracy of the updated disease results in each state.
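The n/m accuracy can be computed directly by comparing the two lists patient by patient. In the sketch below the function name `result_accuracy` and the sample diseases are hypothetical, and both lists are assumed to be in the same patient order.

```python
def result_accuracy(updated_results, diagnosis_results):
    """Return n / m, the share of updated results matching the diagnoses.

    updated_results:   top-ranked disease per diagnosed patient in one state
    diagnosis_results: confirmed disease per diagnosed patient, same order
    """
    m = len(diagnosis_results)
    if m == 0 or m != len(updated_results):
        raise ValueError("need two non-empty lists of equal length")
    n = sum(1 for u, d in zip(updated_results, diagnosis_results) if u == d)
    return n / m


# Illustrative data only: 3 of the 4 updated results match.
updated_results = ["influenza", "dengue fever", "gastritis", "influenza"]
diagnosis_results = ["influenza", "dengue fever", "gastritis", "pneumonia"]
accuracy = result_accuracy(updated_results, diagnosis_results)
```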

The k-dimensional vector of the initial state of the pre-training model is the average accuracy of various dominant disease types, namely, the average accuracy of a plurality of updated disease results obtained by updating a plurality of disease sequencing results of the auxiliary diagnosis model in the initial state.

S043: and determining rewards of the next state in the pre-training model according to the accuracy rate of the disease result in the front and rear states.

After obtaining the accuracy of the disease results in each state, determining rewards of the next state in the pre-training model according to the accuracy of the disease results in the front and back states.

For example, let accu_before denote the accuracy of the disease results in the previous state, let accu_now denote the accuracy of the disease results in the current state, and let the threshold be 0.01. The pre-training model then determines the reward as follows: if |accu_before - accu_now| > threshold and accu_before < accu_now, the accuracy of the updated disease results has improved, and the reward value of the pre-training model is 1; if |accu_before - accu_now| < threshold, the accuracy of the updated disease results is unchanged, and the reward value of the pre-training model is 0; if |accu_before - accu_now| > threshold and accu_before > accu_now, the accuracy of the updated disease results has decreased, and the reward value of the pre-training model is -1.

In this embodiment, the threshold is 0.01, which is only exemplary, and in other embodiments, the threshold may be other values less than 0.01, which is not described herein.
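The three reward cases translate into a short function. The text compares the accuracy difference with the threshold using strict inequalities and does not say what happens when the difference equals the threshold exactly; the sketch below assumes that case yields a reward of 0. The function name `reward` is hypothetical.

```python
def reward(accu_before, accu_now, threshold=0.01):
    """Map the change in accuracy between two states to a reward of 1, 0 or -1."""
    diff = accu_now - accu_before
    if abs(diff) > threshold:
        # Accuracy changed by more than the threshold: reward its direction.
        return 1 if diff > 0 else -1
    # Change within the threshold (equality included by assumption).
    return 0


improved = reward(0.70, 0.75)    # accuracy rose by 0.05
unchanged = reward(0.70, 0.705)  # change is below the threshold
reduced = reward(0.75, 0.70)     # accuracy fell by 0.05
```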

In this embodiment, the disease ordering results of the auxiliary diagnosis model are updated according to the weights of the dominant disease types in each state to determine the updated disease results of the plurality of diagnosed patients in each state. The accuracy of these updated disease results in each state is then determined according to the disease diagnosis results of the plurality of diagnosed patients, and the reward of the next state in the pre-training model is determined according to the accuracy of the disease results in the previous and current states. This refines the process of determining the rewards of the pre-training model according to the disease diagnosis data of the region to which the patient belongs and provides a basis for determining the rewards. As a result, the disease performance output by the pre-training model combined with the auxiliary diagnosis model during training approaches the actual disease diagnosis results of the diagnosed patients, which improves the accuracy of the preset weight model.
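Steps S041 to S043 can be combined into a small environment that a reinforcement learning agent would interact with. This is only a sketch of the environment side and does not implement the DQN itself. The class name `WeightEnv`, the step size of 0.1, the neutral starting weights of 1.0 (a simplification of the initial state described above) and all sample data are assumptions.

```python
class WeightEnv:
    """Toy environment: the state holds the weights of the dominant disease types."""

    def __init__(self, rankings, diagnoses, disease_type, dominant_types,
                 step_size=0.1, threshold=0.01):
        self.rankings = rankings          # per patient: {disease: probability}
        self.diagnoses = diagnoses        # per patient: confirmed disease
        self.disease_type = disease_type  # {disease: disease type}
        self.dominant_types = dominant_types
        self.step_size = step_size        # assumed size of one weight change
        self.threshold = threshold
        self.state = [1.0] * len(dominant_types)  # assumed neutral start
        self.accuracy = self._accuracy()

    def _accuracy(self):
        """S041 and S042: re-rank every patient, then score the top result."""
        weight_of = dict(zip(self.dominant_types, self.state))
        hits = 0
        for probs, truth in zip(self.rankings, self.diagnoses):
            final = {d: p * weight_of.get(self.disease_type[d], 1.0)
                     for d, p in probs.items()}
            hits += max(final, key=final.get) == truth
        return hits / len(self.diagnoses)

    def step(self, action):
        """Apply a signed one-hot action; return (next state, reward) as in S043."""
        self.state = [w + self.step_size * a
                      for w, a in zip(self.state, action)]
        accu_now = self._accuracy()
        diff = accu_now - self.accuracy
        r = 0 if abs(diff) <= self.threshold else (1 if diff > 0 else -1)
        self.accuracy = accu_now
        return list(self.state), r


# Illustrative data only: two diagnosed patients, two dominant disease types.
env = WeightEnv(
    rankings=[{"influenza": 0.50, "dengue fever": 0.48},
              {"influenza": 0.60, "dengue fever": 0.30}],
    diagnoses=["dengue fever", "influenza"],
    disease_type={"influenza": "respiratory", "dengue fever": "mosquito-borne"},
    dominant_types=["mosquito-borne", "respiratory"],
)
_, r1 = env.step([1, 0])   # raise the mosquito-borne weight
_, r2 = env.step([0, -1])  # lower the respiratory weight
```

In this sample the first action corrects the first patient's top result, so accuracy rises from 0.5 to 1.0 and the reward is 1. The second action leaves accuracy at 1.0, so the reward is 0.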

It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation process of the embodiments of the present invention in any way.

In an embodiment, a reinforcement learning model-based disease ordering apparatus is provided, which corresponds one-to-one to the reinforcement learning model-based disease ordering method in the above embodiments. As shown in fig. 8, the reinforcement learning model-based disease ordering apparatus includes a first obtaining module 801, a second obtaining module 802, a first determining module 803, an updating module 804, and a second determining module 805. The functional modules are described in detail as follows:

a first obtaining module 801, configured to obtain patient condition data, and input the patient condition data into an auxiliary diagnostic model;

a second obtaining module 802, configured to obtain a disease ordering result output by the auxiliary diagnostic model, where the disease ordering result is a result of ordering a plurality of suspected diseases according to a probability that the patient obtains each disease;

a first determining module 803, configured to determine weights of the plurality of suspected diseases in the region to which the patient belongs according to a preset weight model, where the preset weight model is a reinforcement learning model obtained by performing disease weight learning according to disease diagnosis data of the region to which the patient belongs;

an updating module 804, configured to update the suspected disease sorting result according to the weights of the multiple suspected diseases in the region to which the patient belongs, so as to obtain an updated disease sorting result;

a second determining module 805, configured to determine a suspected disease ordering result of the patient according to the updated disease ordering result, and output the suspected disease ordering result.

Further, the first determining module 803 is specifically configured to:

taking the state output by a preset weight model of the region to which the patient belongs as the weight of a plurality of dominant disease species in the region to which the patient belongs;

determining a disease type for each of the plurality of suspected diseases;

determining whether the disease type of each suspected disease is one of the plurality of dominant disease types of the region to which the patient belongs;

and if the disease type of a suspected disease is one of the plurality of dominant disease types of the region to which the patient belongs, taking the weight of that dominant disease type as the weight of the corresponding suspected disease, so as to obtain the weights of the plurality of suspected diseases.
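The lookup performed by the first determining module can be sketched as a mapping from disease type to weight. The function name `suspected_disease_weights` and the sample data are hypothetical, and the neutral weight of 1.0 given to suspected diseases whose type is not a dominant disease type is an assumption of this sketch.

```python
def suspected_disease_weights(suspected, disease_type, dominant_types, state,
                              default_weight=1.0):
    """Assign each suspected disease the weight of its dominant disease type.

    state holds one weight per entry of dominant_types, in the same order.
    default_weight (an assumption) is used when the type is not dominant.
    """
    type_weight = dict(zip(dominant_types, state))
    return {d: type_weight.get(disease_type[d], default_weight)
            for d in suspected}


# Illustrative data only.
weights = suspected_disease_weights(
    suspected=["influenza", "dengue fever", "eczema"],
    disease_type={"influenza": "respiratory",
                  "dengue fever": "mosquito-borne",
                  "eczema": "skin"},
    dominant_types=["respiratory", "mosquito-borne"],
    state=[0.9, 1.4],
)
```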

Further, the updating module 804 is specifically configured to:

determining the acquisition probability of each suspected disease according to the suspected disease sequencing result;

determining a product between a weight of the suspected disease in a region to which the patient belongs and an obtained probability of the suspected disease as a final obtained probability of the suspected disease;

and updating the ranking of the plurality of suspected diseases according to the final obtained probability of each suspected disease.

Further, the second determining module 805 is specifically configured to:

determining an acquisition probability of the suspected disease in the updated disease ordering result;

sorting the suspected diseases from high to low according to the probability of obtaining the suspected diseases to obtain a suspected disease sorting list;

and selecting the preset number of suspected diseases and the probability of obtaining the suspected diseases from the suspected disease sorting list as the suspected disease sorting result of the patient.

Further, the reinforcement learning model-based disease ordering apparatus further includes a model training module 806, where the model training module 806 is specifically configured to:

determining k dominant disease species of a region to which the patient belongs, wherein the dominant disease species are a plurality of disease species with higher occurrence frequency of diseases in the region to which the patient belongs;

defining the weights of k dominant disease species as the state of a pre-training model, wherein the state is a vector of k dimensions;

inputting the k-dimensional vector into a neural network of the pre-training model to obtain an action of the pre-training model;

determining rewards of the pre-training model according to disease diagnosis data of the region to which the patient belongs;

and adjusting the state, the action and the rewards to perform weight learning on the pre-training model to obtain the preset weight model.

Further, the disease diagnosis data of the region to which the patient belongs includes disease diagnosis results of a plurality of diagnosed patients and disease sequencing results of the auxiliary diagnosis model for the plurality of diagnosed patients, and the model training module 806 is specifically further configured to:

updating the disease sequencing results of the auxiliary diagnosis model for a plurality of diagnosed patients according to the weights of dominant disease types in each state so as to determine updated disease results of the plurality of diagnosed patients in each state, wherein the updated disease results are diseases with highest acquisition probability after updating the disease sequencing results of the diagnosed patients;

determining the accuracy of the updated disease results of the plurality of diagnosed patients in each state according to the disease diagnosis results of the plurality of diagnosed patients, so as to obtain the accuracy of the disease results in each state;

and determining rewards of the next state in the pre-training model according to the accuracy rate of the disease result in the front and rear states.

For specific limitations on the reinforcement learning model-based disease ordering apparatus, reference may be made to the above limitations on the reinforcement learning model-based disease ordering method, which are not repeated here. Each of the above modules in the reinforcement learning model-based disease ordering apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 9. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing auxiliary diagnosis models, preset weight models, disease sequencing results and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements a method for disease ordering based on reinforcement learning models.

In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the reinforcement learning model-based disease ordering method in the above embodiments when executing the computer program.

In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, implements the steps of the reinforcement learning model-based disease ordering method in the above embodiments.

Those skilled in the art will appreciate that all or part of the methods in the above embodiments may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.

The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims

1. A method of disease ordering based on reinforcement learning models, comprising:

determining weights of the plurality of suspected diseases in the region where the patient belongs according to a preset weight model, wherein the preset weight model is a reinforcement learning model obtained by performing disease weight learning according to disease diagnosis data of the region where the patient belongs, and the preset weight model is obtained by the following steps:

determining k dominant disease species of the region to which the patient belongs, wherein the dominant disease species are a plurality of disease species with the occurrence frequency of the disease higher than a preset frequency in the region to which the patient belongs;

adjusting the state, the action and the rewards to perform weight learning on the pre-training model to obtain the preset weight model;

2. The reinforcement learning model-based disease ordering method of claim 1, wherein the disease diagnosis data of the region to which the patient belongs includes disease diagnosis results of a plurality of diagnosed patients and disease ordering results of the auxiliary diagnosis model for a plurality of diagnosed patients, wherein determining rewards of the pre-training model based on the disease diagnosis data of the region to which the patient belongs comprises:

3. The reinforcement learning model-based disease ordering method of claim 1, wherein the determining weights of the plurality of suspected diseases in the region to which the patient belongs according to a preset weight model comprises:

determining a disease type for each of the plurality of suspected diseases;

4. The reinforcement learning model-based disease ordering method of any one of claims 1-3, wherein updating the suspected disease ordering result according to the weights of the plurality of suspected diseases in the region to which the patient belongs comprises:

5. The reinforcement learning model-based disease ordering method of any one of claims 1-3, wherein said determining a suspected disease ordering result for the patient from the updated disease ordering result comprises:

6. A reinforcement learning model-based disease ordering apparatus, comprising:

the first determining module is configured to determine weights of the plurality of suspected diseases in the region where the patient belongs according to a preset weight model, where the preset weight model is a reinforcement learning model obtained by performing disease weight learning according to disease diagnosis data of the region where the patient belongs, and the preset weight model is obtained by:

7. The reinforcement learning model-based disease ordering apparatus of claim 6, wherein the first determination module is specifically configured to:

taking the state output by the preset weight model of the region to which the patient belongs as the weight of a plurality of dominant disease species in the region to which the patient belongs;

determining a disease type for each of the plurality of suspected diseases;

and if the disease type of the suspected disease is one of the plurality of dominant disease types in the region to which the patient belongs, taking the weight of that dominant disease type as the weight of the suspected disease, so as to obtain the weights of the plurality of suspected diseases.

8. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the reinforcement learning model based disease ordering method according to any one of claims 1 to 5 when the computer program is executed.

9. A computer readable storage medium storing a computer program which when executed by a processor implements the steps of the reinforcement learning model-based disease ordering method of any one of claims 1 to 5.