WO2023178789A1

WO2023178789A1 - Disease risk estimation network optimization method and apparatus, medium, and device

Info

Publication number: WO2023178789A1
Application number: PCT/CN2022/089727
Authority: WO
Inventors: 徐卓扬; 赵婷婷; 胡岗; 孙行智; 赵越
Original assignee: 平安科技（深圳）有限公司
Priority date: 2022-03-21
Filing date: 2022-04-28
Publication date: 2023-09-28
Also published as: CN114743665A

Abstract

The present application relates to the technical fields of artificial intelligence and digital medical treatment, and discloses a disease risk estimation network optimization method and system, a storage medium, and a computer device. The method comprises: obtaining a patient sample library; randomly selecting at least three patient samples from the patient sample library; inputting sample information of the at least three patient samples into a preset neural network in pairs, and calculating a first distance between every two patient samples by using the neural network, wherein the neural network is used for estimating a disease risk of a patient; calculating a loss value of the neural network according to the first distances; writing the loss value into a loss value list, and determining whether the loss value list meets a preset convergence condition; and if not, adjusting parameters of the neural network according to the loss value, and returning to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence condition. The method of the present application improves the accuracy of the neural network for disease risk estimation.

Description

Optimization method, device, medium and equipment for disease risk estimation network

This application claims priority to the Chinese patent application filed with the China Patent Office on March 21, 2022, with application number 202210278345.8 and entitled "Optimization method, device, medium and equipment for disease risk estimation network", the entire content of which is incorporated into this application by reference.

Technical field

This application relates to the fields of artificial intelligence and digital medical technology, and in particular to an optimization method, device, medium and equipment for a disease risk estimation network.

Background

With the rise of artificial intelligence technology, its application scenarios are becoming increasingly varied, and it can support functions such as auxiliary disease diagnosis, health management, and remote consultation. The inventors found that, when diagnosing a patient's disease, artificial intelligence technology can be used to determine whether the patient belongs to a high-risk group for the disease, providing a reference for the doctor's diagnosis and improving the doctor's diagnostic efficiency and accuracy. However, existing similar-patient estimation models have low accuracy, and their estimation results are often inconsistent with the patient's actual condition.

Summary of the invention

In view of this, this application provides an optimization method, device, medium and equipment for disease risk estimation network, which improves the accuracy of the neural network used for disease risk estimation.

According to one aspect of the present application, an optimization method for a disease risk estimation network is provided, including:

Obtain a patient sample library;

Randomly select at least three patient samples from the patient sample library;

The sample information of the at least three patient samples is input into a preset neural network in pairs, and the neural network is used to calculate the first distance between every two patient samples, wherein the neural network is used to estimate a patient's disease risk;

Calculate the loss value of the neural network according to the first distance;

Write the loss value into a loss value list, and determine whether the loss value list satisfies the preset convergence conditions, wherein the loss value list includes the neural network loss value calculated each time;

If not, adjust the parameters of the neural network according to the loss value, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence conditions.

According to another aspect of the present application, an optimization device for a disease risk estimation network is provided, including:

An acquisition module, used to obtain a patient sample library;

An initialization module, used to randomly select at least three patient samples from the patient sample library;

A calculation module, configured to input the sample information of the at least three patient samples into a preset neural network in pairs, and use the neural network to calculate the first distance between every two patient samples, wherein the neural network is used to estimate a patient's disease risk;

The calculation module is also used to calculate the loss value of the neural network according to the first distance;

A judgment module, configured to write the loss value into a loss value list, and judge whether the loss value list satisfies the preset convergence condition, wherein the loss value list includes the neural network loss value calculated each time;

An optimization module, used to adjust the parameters of the neural network according to the loss value if the convergence conditions are not satisfied, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence conditions.

According to yet another aspect of the present application, a storage medium is provided with a computer program stored thereon. When the computer program is executed by a processor, the optimization method for the disease risk estimation network is implemented, including:

Obtain a patient sample library; randomly select at least three patient samples from the patient sample library; input the sample information of the at least three patient samples into a preset neural network in pairs, and use the neural network to calculate the first distance between every two patient samples, wherein the neural network is used to estimate the patient's disease risk; calculate the loss value of the neural network according to the first distance; write the loss value into a loss value list, and determine whether the loss value list satisfies the preset convergence condition, wherein the loss value list includes the neural network loss value calculated each time; if not, adjust the parameters of the neural network according to the loss value, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list satisfies the preset convergence condition.

According to yet another aspect of the present application, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor. When the processor executes the computer program, the above optimization method for the disease risk estimation network is implemented.

With the above technical solution of the optimization method, device, medium and equipment for a disease risk estimation network, at least three patient samples are input at the same time to train the neural network. Through multiple training cycles, the importance of different features of the patient samples can be distinguished, which effectively improves the accuracy of the neural network's judgment on target patients. In addition, since patient samples with the same and with different outcomes are trained at the same time, the training efficiency is high and the accuracy of the neural network is high.

The above description is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the description, and so that the above and other purposes, features and advantages of the present application are more apparent, specific embodiments of the present application are set out below.

Description of the drawings

The drawings described here are used to provide a further understanding of the present application and constitute a part of the present application. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of the present application. In the drawings:

Figure 1 shows a schematic flow chart of an optimization method for a disease risk estimation network provided by an embodiment of the present application;

Figure 2 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of the present application;

Figure 3 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of the present application;

Figure 4 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of the present application;

Figure 5 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of the present application;

Figure 6 shows a structural block diagram of an optimization device for a disease risk estimation network provided by an embodiment of the present application;

Figure 7 shows a structural block diagram of a computer device provided by an embodiment of the present application.

Detailed description

The present application will be described in detail below with reference to the accompanying drawings and embodiments. It should be noted that, as long as there is no conflict, the embodiments and features in the embodiments of this application can be combined with each other.

Embodiments of the present application provide a decentralized adaptive collaborative training method based on blockchain, which can be applied to electronic devices capable of running instructions or programs. The electronic devices can be, but are not limited to, personal computers, notebook computers, smartphones, tablets and portable wearable devices, and the method can also be implemented using an independent server or a server cluster composed of multiple servers. The present application is described in detail below through specific embodiments.

Please refer to Figure 1. Figure 1 is a schematic flow chart of an optimization method for a disease risk estimation network provided by an embodiment of the present application, including the following steps:

S101: Obtain patient sample library;

S102: Randomly select at least three patient samples from the patient sample library;

S103: Input the sample information of at least three patient samples into the preset neural network in pairs, and use the neural network to calculate the first distance between each two patient samples, where the neural network is used to estimate the patient's disease risk;

The method provided by this application is used to optimize the disease risk estimation network, where the disease risk estimation network can be a neural network, and the neural network can estimate the patient's disease risk, specifically estimating whether the patient is a high-risk group for the disease.

This application uses machine learning to optimize the neural network by training on patient samples. Specifically, taking three patient samples randomly selected from the patient sample library as an example, the sample information of the first patient sample and the second patient sample is input into the neural network to obtain the first distance between the first patient sample and the second patient sample. Similarly, the sample information of the first patient sample and the third patient sample is input into the neural network to obtain the first distance between the first patient sample and the third patient sample, and the sample information of the second patient sample and the third patient sample is input into the neural network to obtain the first distance between the second patient sample and the third patient sample. The three output first distances are then used to optimize the neural network.

The first distance may be a normalized distance whose value lies in the interval [0, 1].

It should be understood that the neural network here can be a self-organizing feature map network or a learning vector quantization network, or other neural networks, which are not limited here.
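The pairwise input described above can be sketched as follows. This is a minimal illustration, not the network of the application: the one-layer `embed` function, the `weights` values, and the squashing `1 - exp(-x)` used to keep the distance in [0, 1) are all assumptions introduced for the example.

```python
import itertools
import math

def embed(features, weights):
    """Hypothetical one-layer network: tanh of a weighted sum per output unit."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]

def first_distance(a, b, weights):
    """Distance between two samples' embeddings, squashed into [0, 1)."""
    ea, eb = embed(a, weights), embed(b, weights)
    euclid = math.sqrt(sum((x - y) ** 2 for x, y in zip(ea, eb)))
    return 1.0 - math.exp(-euclid)

def pairwise_first_distances(samples, weights):
    """Return {(i, j): distance} for every unordered pair of samples."""
    return {
        (i, j): first_distance(samples[i], samples[j], weights)
        for i, j in itertools.combinations(range(len(samples)), 2)
    }

# Illustrative parameters and three made-up feature vectors.
weights = [[0.5, -0.2], [0.1, 0.3]]
samples = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
distances = pairwise_first_distances(samples, weights)
```

With three samples the dictionary holds three first distances, one per pair, matching the three pairings described above.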

As shown in Figure 2, in step S103, before inputting the sample information of at least three patient samples into the preset neural network in pairs, the following steps are included:

S103-1: Determine the disease information of each patient sample in at least three patient samples;

S103-2: If the disease information of the at least three patient samples is the same, randomly re-select at least three patient samples from the patient sample library.

For steps S103-1 and S103-2, after at least three patient samples are randomly selected, it is determined whether their disease information is the same. If the disease information of all the patient samples is the same, at least three patient samples are randomly selected again, until the disease information of at least one patient sample differs from that of the other patient samples. The number of patient samples re-selected may differ from the number selected in the current round.

The disease information can be diseased or not diseased. For example, if all the randomly selected patient samples are diseased, or all are not diseased, the selection is repeated until at least two non-diseased samples and at least one diseased sample are obtained, or at least two diseased samples and at least one non-diseased sample are obtained.

By selecting multiple patient samples in this way and inputting them into the neural network, this application trains on samples with the same disease information and samples with different disease information at the same time. In other words, the neural network's ability to handle similarity relationships and its ability to handle distinguishing relationships are trained simultaneously, so training is more efficient and an accurate neural network model can be obtained faster.
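The re-selection in steps S103-1 and S103-2 can be sketched as a simple rejection loop. The dictionary layout with a `"diseased"` key and the function name `select_mixed_samples` are hypothetical, chosen only for illustration.

```python
import random

def select_mixed_samples(library, n=3, rng=random):
    """Draw n samples at random, redrawing while all share the same disease information."""
    if n < 3:
        raise ValueError("at least three patient samples are required")
    if len({s["diseased"] for s in library}) < 2:
        # Without both outcomes in the library the loop below would never end.
        raise ValueError("library must contain diseased and non-diseased samples")
    while True:
        chosen = rng.sample(library, n)
        if len({s["diseased"] for s in chosen}) > 1:
            return chosen

# Made-up library: two diseased and three non-diseased samples.
library = [
    {"id": 1, "diseased": True},
    {"id": 2, "diseased": True},
    {"id": 3, "diseased": False},
    {"id": 4, "diseased": False},
    {"id": 5, "diseased": False},
]
chosen = select_mixed_samples(library, n=3, rng=random.Random(0))
```

The guard on the library contents is an addition for safety in this sketch; the source text does not discuss a library that contains only one outcome.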

S104: Calculate the loss value of the neural network based on the first distance;

When the sample information of any two of the at least three patient samples is input into the neural network, the network outputs the first distance between those two patient samples. A loss function can then be constructed, and each first distance is substituted into the loss function to calculate the loss value of the neural network.

Each piece of sample information can contain multiple features. The similarities and differences of each feature across the two pieces of sample information are compared, and the per-feature comparisons are combined to obtain the first distance.

As shown in Figure 3, in step S104, calculating the loss value of the neural network based on the first distance includes the following steps:

S104-1: Select any two patient samples from at least three patient samples as target samples, determine whether the disease information of the two target samples is the same, and determine the preset values corresponding to the two target samples based on the judgment results;

S104-2: Use the difference between the first distance between the two target samples and the preset value as the middle difference, and use the square of the middle difference as the sub-loss value between the two target samples;

S104-3: Determine the loss value based on the sub-loss value between each two target samples.

For steps S104-1 to S104-3, two target samples are selected from the at least three patient samples, and the preset value corresponding to the two target samples is set based on their disease information; that is, the preset value depends on the disease information of the two target samples. The square of the difference between the first distance and the preset value is then used as the sub-loss value between the two target samples. The sub-loss value between every two target samples is obtained in the same way, and the loss value of the neural network is determined based on all the sub-loss values.

In this step, the first distance can reflect whether the disease information of the two target samples is the same, and the sub-loss value can reflect the calculation error for the two target samples. This application uses the sub-loss value to represent the closeness of the first distance to the preset value, and uses square processing to make the sub-loss value a non-negative number, eliminating the impact of negative numbers on the calculation of the final loss value.

For example, taking three patient samples as an example, if the disease information of the first target sample and the second target sample is the same, and the disease information of the third target sample differs from that of the first two, the preset value corresponding to the first and second target samples is 0, the preset value corresponding to the first and third target samples is 1, and the preset value corresponding to the second and third target samples is also 1.

After the first distances $e(p_1,p_2)$, $e(p_1,p_3)$ and $e(p_2,p_3)$ between every two target samples are determined, the sub-loss value between the first and second target samples is $L_1 = (e(p_1,p_2) - 0)^2$, the sub-loss value between the first and third target samples is $L_2 = (e(p_1,p_3) - 1)^2$, and the sub-loss value between the second and third target samples is $L_3 = (e(p_2,p_3) - 1)^2$. The loss value of the neural network is then determined from all the sub-loss values as $L = L_1 + L_2 + L_3$, where $p_1$, $p_2$ and $p_3$ are the sample information of the first, second and third target samples respectively.

In addition, if the value of the first distance lies in [0, 1], the preset value corresponding to two target samples with the same disease information is 0, and the preset value corresponding to two target samples with different disease information is 1. If the first distance lies in [0, d], the preset value corresponding to two target samples with the same disease information is 0, and the preset value corresponding to two target samples with different disease information is d.
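As a worked illustration of steps S104-1 to S104-3, the sketch below sums the squared differences between each first distance and its preset value (0 for a pair with the same disease information, the upper bound of the distance range otherwise). The function name, the `d_max` parameter, and the distance values are assumptions made up for the example.

```python
import itertools

def pairwise_squared_loss(distances, diseased, d_max=1.0):
    """Sum of squared gaps between each first distance and its preset value."""
    loss = 0.0
    for i, j in itertools.combinations(range(len(diseased)), 2):
        # Same disease information -> preset 0; different -> preset d_max.
        preset = 0.0 if diseased[i] == diseased[j] else d_max
        loss += (distances[(i, j)] - preset) ** 2
    return loss

# Made-up first distances for three target samples; the third differs in outcome.
distances = {(0, 1): 0.2, (0, 2): 0.7, (1, 2): 0.9}
diseased = [True, True, False]
loss = pairwise_squared_loss(distances, diseased)
# Sub-losses: (0.2 - 0)^2 + (0.7 - 1)^2 + (0.9 - 1)^2, approximately 0.14
```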

S105: Write the loss value into the loss value list, and determine whether the loss value list meets the preset convergence conditions, where the loss value list includes the loss value of the neural network calculated each time;

S106: If not satisfied, adjust the parameters of the neural network according to the loss value, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence conditions.

In this step, a loop method is used to adjust the parameters of the neural network based on the loss value multiple times, so that the loss value records generated during the loop process meet the convergence conditions, that is, the loss value converges.

Specifically, after the loss value is obtained and written into the loss value list, if the loss value list is determined to meet the convergence conditions, the current neural network is considered to need no further optimization and the operation ends. If the loss value list does not meet the convergence conditions, the parameters of the neural network are adjusted to reduce the loss value. The method then returns to the step of randomly selecting at least three patient samples and inputs the re-selected patient samples into the neural network for training; that is, the adjusted parameters are used to calculate a new first distance and a new loss value, and the parameters are adjusted again to reduce the new loss value. After many cycles, the first distance that the neural network calculates between two pieces of sample information is closer to the corresponding preset value.

For example, in the aforementioned embodiment, $L_1 = (e(p_1,p_2) - 0)^2$, $L_2 = (e(p_1,p_3) - 1)^2$ and $L_3 = (e(p_2,p_3) - 1)^2$. After many cycles, the value of $e(p_1,p_2)$ approaches 0, and the values of $e(p_1,p_3)$ and $e(p_2,p_3)$ approach 1. The loss value is effectively reduced, and the calculation accuracy of the neural network is improved.
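The cycle of selecting samples, computing the loss value, and adjusting parameters can be sketched as below. Everything here is an assumption for illustration: the `distance` function is a two-parameter stand-in for the neural network, the gradient is estimated by finite differences rather than backpropagation, the library values are made up, and the loop runs a fixed number of steps instead of testing the convergence condition.

```python
import itertools
import math
import random

def distance(a, b, w):
    """Hypothetical stand-in for the network: weighted feature gap squashed into [0, 1)."""
    gap = sum(wk * wk * abs(x - y) for wk, x, y in zip(w, a, b))
    return 1.0 - math.exp(-gap)

def loss_value(batch, w):
    """Sum of squared gaps between each pair's distance and its preset value (0 or 1)."""
    total = 0.0
    for (xa, ya), (xb, yb) in itertools.combinations(batch, 2):
        preset = 0.0 if ya == yb else 1.0
        total += (distance(xa, xb, w) - preset) ** 2
    return total

def numeric_gradient(batch, w, eps=1e-5):
    """Central finite-difference estimate of d(loss)/d(w)."""
    grad = []
    for k in range(len(w)):
        up = w[:k] + [w[k] + eps] + w[k + 1:]
        down = w[:k] + [w[k] - eps] + w[k + 1:]
        grad.append((loss_value(batch, up) - loss_value(batch, down)) / (2 * eps))
    return grad

def train(library, w, steps=200, lr=0.5, seed=0):
    """Repeat: draw three samples, record the loss value, adjust the parameters."""
    rng = random.Random(seed)
    losses = []
    for _ in range(steps):
        batch = rng.sample(library, 3)
        if len({label for _, label in batch}) < 2:
            continue  # all outcomes identical: re-select
        losses.append(loss_value(batch, w))
        grad = numeric_gradient(batch, w)
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w, losses

# Made-up (features, diseased) pairs.
library = [
    ([1.0, 0.3], True), ([0.9, 0.8], True), ([1.1, 0.1], True),
    ([0.0, 0.2], False), ([0.1, 0.9], False), ([-0.1, 0.5], False),
]
final_w, losses = train(library, [0.5, 0.5])
```

In a real implementation the parameter update would typically come from an automatic-differentiation framework; the finite-difference gradient is used here only to keep the sketch dependency-free.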

Among them, in step S105, it is judged whether the loss value list meets the preset convergence conditions, which specifically includes:

If the number of loss values in the loss value list is greater than or equal to a first preset quantity threshold m, and the (N+1)th to (N+m-1)th loss values are each not less than the Nth loss value, the loss value list is determined to satisfy the preset convergence conditions, where m is a positive integer, m > 1, and N is a positive integer.

Specifically, one loss value is obtained in each training cycle. After multiple cycles, the loss value list contains multiple loss values. The number of loss values being greater than or equal to the first preset quantity threshold m means that the number of cycles is greater than or equal to m. The larger the value of m, the greater the number of cycles and the higher the accuracy of the neural network.

In addition, the (N+1)th to (N+m-1)th loss values being not less than the Nth loss value means that the Nth loss value is less than or equal to each of the m-1 loss values that follow it. In this case, the loss value can be considered to have reached a steady state, and the loss value list meets the convergence conditions.

For example, if m = 10 is preset, and the loss value list includes at least 10 loss values, and the (N+1)th to (N+9)th loss values are each not less than the Nth loss value, the loss value list can be considered to meet the convergence conditions, and the loop ends.

Furthermore, at this time, the current parameters can be used as the final parameters of the neural network, or the parameters used when outputting the Nth loss value can be used as the final parameters of the neural network.
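One way to read the convergence condition is as a patience-style check over the most recent m loss values. The sketch below makes that reading explicit: it assumes N is the position of the first value in the last m entries of the list, which the source text does not state, and the function name `has_converged` is hypothetical.

```python
def has_converged(loss_values, m=10):
    """True if the list has >= m values and none of the last m-1 is below the one before them."""
    if m <= 1:
        raise ValueError("m must be an integer greater than 1")
    if len(loss_values) < m:
        return False
    window = loss_values[-m:]          # Nth value followed by the next m-1 values
    return all(later >= window[0] for later in window[1:])

still_falling = has_converged([5.0, 4.0, 3.0, 2.0, 1.0], m=3)
plateaued = has_converged([5.0, 4.0, 3.0, 1.0, 2.0, 1.5], m=3)
```

Under this reading, the parameters recorded when the Nth loss value was produced are the ones that gave the lowest loss in the window, which is consistent with the option of using them as the final parameters.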

Among them, as shown in Figure 4, after step S106, the following steps are also included:

S107: Calculate the second distance between the target patient and each patient sample in the patient sample library respectively;

S108: Sort the second distances from small to large to obtain a distance list, and use the first k second distances in the distance list as the target distance, where k is a preset positive integer;

S109: Determine whether the target patient belongs to a high-risk population based on the patient sample corresponding to the target distance;

S110: If yes, generate recommended drug data based on the drug information of the patient sample corresponding to the target distance.

For steps S107 to S110, after multiple cycles of parameter adjustment have made the loss value list meet the preset convergence conditions and the final neural network has been obtained, the neural network can be used to analyze the disease risk of a target patient, that is, to determine whether the target patient belongs to a group with a high risk of disease.

Specifically, the information of the target patient and the sample information of each patient sample in the patient sample library are input into the neural network, and the neural network is used to process the information to obtain the second distance between the target patient and each patient sample. It can be understood that the second distance can represent the degree of similarity between the target patient and the patient sample. The smaller the second distance, the more similar the target patient is to the patient sample. In this case, if the disease information of the patient sample is disease, then the target patient is more likely to be sick.

Based on this, the k second distances with the smallest values can be taken as the target distances, and the target patient can be analyzed based on the patient samples corresponding to the target distances. If the patient samples corresponding to the target distances are diseased, the target patient can be considered to belong to a high-risk population; if the patient samples corresponding to the target distances are not diseased, the target patient can be considered not to belong to the high-risk population.

Furthermore, if it is determined that the target patient belongs to a group with a high risk of disease, the drug information of the patient samples corresponding to the target distances is analyzed, for example which drugs those patients take and which drugs are included in the doctors' diagnoses and prescriptions. Recommended drug data for the target patient is then generated from this drug information to assist doctors in diagnosing and prescribing, improving doctors' work efficiency and accuracy.

Wherein, the second distances can be sorted in order from small to large to obtain a distance list. At this time, the first k second distances in the distance list are the k second distances with the smallest values. Of course, the second distances can also be sorted in descending order to obtain a distance list. In this case, the last k second distances in the distance list are the k second distances with the smallest values.
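Selecting the k smallest second distances and summarizing the drug information of the corresponding samples can be sketched as follows. The `(distance, sample)` tuple layout, the `"drugs"` key, the frequency-based ranking, and the sample values are all assumptions for illustration; the source does not specify how recommended drug data is derived from the drug information.

```python
import heapq
from collections import Counter

def nearest_samples(second_distances, k):
    """Return the k (distance, sample) pairs with the smallest second distance."""
    return heapq.nsmallest(k, second_distances, key=lambda pair: pair[0])

def recommend_drugs(neighbors, top=3):
    """Rank drugs by how often they appear among the nearest samples (hypothetical rule)."""
    counts = Counter(drug for _, sample in neighbors for drug in sample["drugs"])
    return [drug for drug, _ in counts.most_common(top)]

# Made-up second distances between a target patient and four library samples.
second_distances = [
    (0.8, {"id": "a", "drugs": ["metformin"]}),
    (0.1, {"id": "b", "drugs": ["metformin", "sulfonylurea"]}),
    (0.3, {"id": "c", "drugs": ["metformin"]}),
    (0.6, {"id": "d", "drugs": ["GLP-1"]}),
]
neighbors = nearest_samples(second_distances, k=2)
suggested = recommend_drugs(neighbors, top=1)
```

`heapq.nsmallest` returns the selected pairs in ascending order of distance, so it is equivalent to sorting from small to large and taking the first k entries. In the flow described above, the drug summary would be produced only after the target patient has been judged to be high risk.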

Among them, in step S109, judging whether the target patient belongs to a group with a high risk of disease based on the patient sample corresponding to the target distance includes the following steps:

S109-1: Among the patient samples corresponding to the target distances, determine the patient samples whose disease information indicates disease as target samples;

S109-2: If the number of target samples is greater than the second preset quantity threshold, it is determined that the target patient belongs to a population with a high risk of disease; and/or,

S109-3: If the sum of the second distances of all target samples is less than the preset distance threshold, it is determined that the target patient belongs to a population with a high risk of disease.

For steps S109-1 to S109-3, when analyzing the disease risk of the target patient, the patient samples corresponding to the target distances are used as the basis for analysis. The disease information of each of these patient samples is examined; if a patient sample is diseased, it is determined to be a target sample. The target patient's risk is then analyzed based on the number of target samples or on the second distances between the target samples and the target patient. This application thus provides two ways of determining whether the target patient belongs to a high-risk population, which suit different scenarios or needs.

Specifically, if the analysis is based on the number of target samples, then when the number of target samples is greater than the second preset quantity threshold, that is, when enough of the patient samples corresponding to the target distances have disease information indicating disease, the target patient can be considered to belong to a population with a high risk of disease.

If the analysis is based on the second distances between the target samples and the target patient, the sum of the second distances between all target samples and the target patient is calculated. If the sum is less than the preset distance threshold, that is, if the diseased samples among the patient samples corresponding to the target distances are similar enough to the target patient, the target patient can be considered to belong to a population with a high risk of disease.
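The two decision rules in steps S109-2 and S109-3 can be sketched as below. The function name, the `(second_distance, diseased)` tuple layout, and the thresholds are hypothetical. The sketch combines the rules with a logical OR when both thresholds are given, and it returns False for the distance rule when there are no diseased neighbors (an empty sum would otherwise trivially fall below any positive threshold); both choices are assumptions not stated in the source.

```python
def is_high_risk(neighbors, count_threshold=None, distance_threshold=None):
    """Apply the count rule and/or the distance-sum rule to the k nearest samples."""
    # Second distances of the target samples, i.e. the diseased neighbors.
    target = [dist for dist, diseased in neighbors if diseased]
    by_count = count_threshold is not None and len(target) > count_threshold
    by_distance = (
        distance_threshold is not None
        and bool(target)
        and sum(target) < distance_threshold
    )
    return by_count or by_distance

# Made-up nearest neighbors: two diseased, one not.
neighbors = [(0.1, True), (0.2, True), (0.4, False)]
count_rule = is_high_risk(neighbors, count_threshold=1)
distance_rule = is_high_risk(neighbors, distance_threshold=0.5)
```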

As shown in Figure 5, in step S101, before at least three patient samples are randomly selected from the patient sample library, the following steps are included:

S100-1: Obtain patient data and generate patient samples based on the patient data. The patient samples include sample information and disease information, and the sample information includes basic patient information, drug information, and test information;

S100-2: Establish a patient sample library based on patient samples.

For steps S100-1 to S100-2, it is first necessary to establish a patient sample database, and then select patients from the patient sample database.

The patient's basic information includes gender, age, income, occupation, marriage and childbirth history, past medical history, genetic history, and so on. The disease information includes the disease type and whether the patient is diseased. The test information corresponds to the disease type and includes the examination items usually required for that type of disease together with the examination results. The drug information also corresponds to the disease type and can include the patient's medication information and the doctor's diagnosis and prescription information.

The examination items in the test information can be obtained from the patient's historical medical records or provided by an experienced doctor. For example, for patient A, when the disease type is diabetes, the test information may include glycated hemoglobin, low-density lipoprotein cholesterol, blood uric acid, urine protein, triglycerides, fasting blood sugar, and so on. In the drug information corresponding to the disease type, the patient's medication information can include whether the patient uses metformin, sulfonylureas, GLP-1, DPP4, and so on.
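The structure of a patient sample described in steps S100-1 and S100-2 might be represented as follows. The class name, field names, and every value in the record are hypothetical and chosen only to mirror the categories named above.

```python
from dataclasses import dataclass

@dataclass
class PatientSample:
    basic_info: dict      # e.g. gender, age, past medical history
    test_info: dict       # examination items and results for the disease type
    drug_info: dict       # medication and prescription information
    disease_type: str     # part of the disease information
    diseased: bool        # part of the disease information

def build_sample_library(records):
    """Turn raw patient records (dicts) into a list of PatientSample objects."""
    return [PatientSample(**record) for record in records]

# One made-up record for illustration only.
records = [
    {
        "basic_info": {"gender": "F", "age": 58},
        "test_info": {"glycated_hemoglobin": 7.9, "fasting_blood_sugar": 8.4},
        "drug_info": {"metformin": True, "sulfonylureas": False},
        "disease_type": "diabetes",
        "diseased": True,
    },
]
library = build_sample_library(records)
```

Keeping the disease information separate from the sample information reflects the description above, where only the sample information is input into the neural network and the disease information determines the preset value used in the loss.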

It can be seen that in the above scheme, at least three patient samples are input at the same time to train the neural network. Since patient samples with the same and with different outcomes are trained at the same time, the training efficiency is high and the accuracy of the neural network is high. In addition, this application not only measures the similarity of sample information but also introduces disease information, which characterizes the outcome, into the training process. Through multiple training cycles, the importance of different features of the patient samples can be distinguished, further improving the accuracy of the neural network's judgment on target patients. Furthermore, the neural network can capture nonlinear relationships among features, which addresses the low efficiency of existing approaches that use a linear model to calculate the first distance.

It should be understood that the sequence number of each step in the above embodiment does not mean the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.

In one embodiment, an optimization device for a disease risk estimation network is provided, and the device for optimizing a disease risk estimation network corresponds one-to-one to the optimization method for the disease risk estimation network in the above embodiment. As shown in Figure 6, the optimization device of the disease risk estimation network includes: an acquisition module, an initialization module, a calculation module, a judgment module and an optimization module. The detailed description of each functional module is as follows:

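Acquisition module, used to obtain a patient sample library;

Initialization module, used to randomly select at least three patient samples from the patient sample library;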

The calculation module is used to input the sample information of at least three patient samples into a preset neural network in pairs, and use the neural network to calculate the first distance between each two patient samples, wherein the neural network is used to estimate the patient's disease risk;

The calculation module is also used to calculate the loss value of the neural network based on the first distance;

A judgment module, used to write the loss value into the loss value list, and judge whether the loss value list meets the preset convergence conditions, where the loss value list includes the neural network loss value calculated each time;

The optimization module is used to adjust the parameters of the neural network according to the loss value if the preset convergence conditions are not met, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence conditions.

In one embodiment, the calculation module is specifically used for:

Select any two patient samples from at least three patient samples as target samples, determine whether the disease information of the two target samples is the same, and determine the preset values corresponding to the two target samples based on the judgment results;

The difference between the first distance between the two target samples and the preset value is used as the intermediate difference, and the square of the intermediate difference is used as the sub-loss value between the two target samples;

The loss value is determined based on the sub-loss value between each two target samples.
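By way of illustration and not limitation, the loss calculation described above can be sketched as follows. The function name `pairwise_loss`, the preset values 0.0 (same disease information) and 1.0 (different disease information), and the use of a plain sum to combine the sub-loss values are assumptions made for this sketch; the embodiment only states that the loss value is determined based on the sub-loss values.

```python
from itertools import combinations

def pairwise_loss(distances, labels, same_target=0.0, diff_target=1.0):
    """Combine squared (first distance - preset value) over every sample pair.

    distances: dict mapping an index pair (i, j) with i < j to the first distance
    labels:    disease information of each sample, in the same index order
    same_target / diff_target: assumed preset values, not fixed by the source
    """
    total = 0.0
    for i, j in combinations(range(len(labels)), 2):
        # The preset value depends on whether the disease information matches.
        preset = same_target if labels[i] == labels[j] else diff_target
        intermediate = distances[(i, j)] - preset   # intermediate difference
        total += intermediate ** 2                  # sub-loss value of this pair
    return total

# Three samples: two diseased (1) and one not diseased (0).
labels = [1, 1, 0]
distances = {(0, 1): 0.5, (0, 2): 0.5, (1, 2): 1.0}
loss = pairwise_loss(distances, labels)
```

With three samples there are three pairs, so each training step contributes three sub-loss values; a mean or weighted combination could be substituted for the sum.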

In one embodiment, the calculation module is also used to:

Determine disease information for each of at least three patient samples;

If the disease information of the at least three patient samples is all the same, at least three patient samples are randomly selected again from the patient sample library.
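The re-selection step above can be sketched as follows, purely as an example. The function name `draw_samples`, the dictionary key `"disease"`, and the guard that rejects a library with only one disease label are assumptions of this sketch and are not taken from the embodiment.

```python
import random

def draw_samples(library, n=3, rng=random):
    """Randomly draw n samples, drawing again while all disease labels are equal."""
    if len({s["disease"] for s in library}) < 2:
        # Assumed guard: without two distinct labels the loop could never end.
        raise ValueError("library needs at least two distinct disease labels")
    while True:
        batch = rng.sample(library, n)
        if len({s["disease"] for s in batch}) > 1:
            return batch

library = [
    {"id": "a", "disease": 1},
    {"id": "b", "disease": 1},
    {"id": "c", "disease": 1},
    {"id": "d", "disease": 0},
]
batch = draw_samples(library, 3, random.Random(0))
```

Passing a seeded `random.Random` instance keeps the draw reproducible; the module-level `random` is used by default.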

In one embodiment, determining whether the loss value list meets the preset convergence conditions specifically includes:

If the number of loss values in the loss value list is greater than or equal to the first preset quantity threshold m, and the (N+1)th to (N+m-1)th loss values are not less than the Nth loss value, then it is determined that the loss value list satisfies the preset convergence condition, where m is a positive integer, m>1, and N is a positive integer.
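One possible reading of this convergence condition is sketched below. The embodiment does not state how N is chosen; the sketch assumes that the Nth loss value is the one m entries from the end of the list, so that the check is applied to the most recent m loss values. The function name `has_converged` is hypothetical.

```python
def has_converged(losses, m):
    """Return True when the last m - 1 losses are all >= the loss preceding them."""
    if m <= 1:
        raise ValueError("m must be greater than 1")
    if len(losses) < m:
        return False                 # fewer than m loss values recorded so far
    window = losses[-m:]             # Nth to (N+m-1)th loss values (assumed N)
    reference = window[0]            # Nth loss value
    return all(value >= reference for value in window[1:])

stalled = has_converged([0.9, 0.5, 0.6, 0.5], m=3)      # no improvement after 0.5
improving = has_converged([0.9, 0.5, 0.4], m=3)         # still decreasing
```

Under this reading the condition behaves like early stopping: training ends once m-1 consecutive steps fail to produce a loss below the reference value.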

In one embodiment, the device further includes a sample library creation module, specifically used for:

Obtain patient data and generate patient samples based on the patient data, where the patient samples include sample information and disease information, and the sample information includes basic patient information, drug information, and test information;

Establish a patient sample library based on patient samples.
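As a non-limiting example, a patient sample and a patient sample library might be represented as follows. The class name `PatientSample`, the field names, the record keys (`"basic"`, `"drugs"`, `"tests"`, `"disease"`), and the 1/0 encoding of the disease information are all assumptions introduced for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class PatientSample:
    basic_info: Dict[str, float]   # basic patient information, e.g. age
    drug_info: Dict[str, float]    # drug information, e.g. 1.0 if metformin is used
    test_info: Dict[str, float]    # test information, e.g. glycated hemoglobin
    disease: int                   # disease information: 1 diseased, 0 not (assumed)

    def features(self) -> List[float]:
        """Flatten the sample information into a vector with a stable key order."""
        merged = {**self.basic_info, **self.drug_info, **self.test_info}
        return [merged[key] for key in sorted(merged)]

def build_library(records) -> List[PatientSample]:
    """Generate one patient sample per patient data record."""
    return [
        PatientSample(r["basic"], r["drugs"], r["tests"], r["disease"])
        for r in records
    ]

records = [
    {"basic": {"age": 60.0}, "drugs": {"metformin": 1.0},
     "tests": {"hba1c": 7.2}, "disease": 1},
]
library = build_library(records)
```

Sorting the keys keeps the feature order identical across samples, which is needed before two samples can be fed to the network as a pair.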

In one embodiment, the device further includes an analysis module, specifically used for:

Calculate the second distance between the target patient and each patient sample in the patient sample library respectively;

Sort the second distances in order from small to large to obtain a distance list, and use the first k second distances in the distance list as the target distance, where k is a preset positive integer;

Determine whether the target patient belongs to a high-risk population based on the patient samples corresponding to the target distance;

If so, recommended drug data is generated based on the drug information of the patient sample corresponding to the target distance.
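The selection of the first k second distances and the generation of recommended drug data can be illustrated as follows. The embodiment does not specify how the drug information is turned into a recommendation; ranking drugs by how often they appear among the k nearest samples is an assumption of this sketch, as are the names `k_nearest` and `recommend_drugs`.

```python
from collections import Counter

def k_nearest(second_distances, k):
    """Sort second distances in ascending order and keep the first k entries.

    second_distances: dict mapping a sample id to its second distance
    """
    ranked = sorted(second_distances.items(), key=lambda item: item[1])
    return ranked[:k]

def recommend_drugs(neighbours, drug_info):
    """Rank drugs by frequency among the nearest samples (assumed rule)."""
    counts = Counter()
    for sample_id, _distance in neighbours:
        counts.update(drug_info[sample_id])
    return [drug for drug, _count in counts.most_common()]

second_distances = {"a": 0.9, "b": 0.1, "c": 0.4}
drug_info = {
    "a": ["sulfonylurea"],
    "b": ["metformin"],
    "c": ["metformin", "GLP-1"],
}
neighbours = k_nearest(second_distances, k=2)
drugs = recommend_drugs(neighbours, drug_info)
```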

In one embodiment, the analysis module is specifically used to:

Among the patient samples corresponding to the target distance, the patient sample whose disease information is determined to be sick is the target sample;

If the number of target samples is greater than the second preset number threshold, it is determined that the target patient belongs to a population with a high risk of disease; and/or,

If the sum of the second distances of all target samples is less than the preset distance threshold, it is determined that the target patient belongs to a population with a high risk of disease.
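The two high-risk criteria above can be sketched as follows. The function name `is_high_risk` and the threshold parameter names are hypothetical. The sketch treats "and/or" as a logical OR over whichever thresholds are supplied, and it additionally assumes that the distance criterion applies only when at least one diseased sample is present, since an empty sum would otherwise always fall below the threshold.

```python
def is_high_risk(neighbours, count_threshold=None, distance_threshold=None):
    """Decide high risk from the k nearest samples.

    neighbours: list of (second_distance, diseased) pairs for the k nearest samples
    """
    # Target samples: nearest samples whose disease information is "diseased".
    target = [distance for distance, diseased in neighbours if diseased]
    by_count = count_threshold is not None and len(target) > count_threshold
    by_distance = (
        distance_threshold is not None
        and bool(target)                      # assumed guard against an empty sum
        and sum(target) < distance_threshold
    )
    return by_count or by_distance

neighbours = [(0.1, True), (0.2, True), (0.3, False)]
flag_by_count = is_high_risk(neighbours, count_threshold=1)
flag_by_distance = is_high_risk(neighbours, distance_threshold=0.5)
```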

In one embodiment, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor. When the processor executes the computer program, the following steps are implemented:

Randomly select at least three patient samples from the patient sample library, and input the sample information of at least three patient samples into a preset neural network in pairs, where the neural network is used to estimate the patient's disease risk;

Using a neural network to calculate the first distance between each two patient samples, and calculating the loss value of the neural network based on the first distance;

Write the loss value into the loss value list, and determine whether the loss value list meets the preset convergence conditions, where the loss value list includes the neural network loss value calculated each time;

If it is not satisfied, adjust the parameters of the neural network based on the loss value, randomly select at least three patient samples from the patient sample library again, and use the adjusted parameters to recalculate the first distance and the loss value;

If satisfied, the operation ends.
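The control flow of the steps above can be sketched as a loop. To keep the example self-contained, the neural network is replaced by a toy stand-in, `ToyDistanceModel`, which computes d(x, y) = w * |x - y| with a single weight on one-dimensional features; this stand-in, the learning rate, the preset values 0.0 and 1.0, the window size m, and the `max_steps` cap are all assumptions of this sketch and do not represent the network of the embodiment.

```python
import random
from itertools import combinations

class ToyDistanceModel:
    """Toy stand-in for the neural network: d(x, y) = w * |x - y|."""
    def __init__(self, w=0.0, lr=0.05):
        self.w, self.lr = w, lr

    def _pairs(self, batch):
        # Yield (feature gap, preset value) for every pair in the batch.
        for (xa, ya), (xb, yb) in combinations(batch, 2):
            preset = 0.0 if ya == yb else 1.0   # assumed preset values
            yield abs(xa - xb), preset

    def loss(self, batch):
        return sum((self.w * gap - preset) ** 2 for gap, preset in self._pairs(batch))

    def step(self, batch):
        # One gradient-descent update of w on the squared-difference loss.
        grad = sum(2 * (self.w * gap - preset) * gap for gap, preset in self._pairs(batch))
        self.w -= self.lr * grad

def train(library, model, m=5, max_steps=500, seed=0):
    rng = random.Random(seed)
    losses = []                                  # loss value list
    for _ in range(max_steps):
        batch = rng.sample(library, 3)
        if len({label for _, label in batch}) < 2:
            continue                             # same disease information: draw again
        losses.append(model.loss(batch))
        window = losses[-m:]
        if len(losses) >= m and all(v >= window[0] for v in window[1:]):
            break                                # convergence condition met: end
        model.step(batch)                        # otherwise adjust the parameters
    return losses

# Each sample is (feature, disease label); the features are made-up numbers.
library = [(0.0, 0), (0.1, 0), (0.2, 0), (1.0, 1), (1.1, 1), (1.2, 1)]
model = ToyDistanceModel()
losses = train(library, model)
```

Because batches are drawn at random, the recorded loss fluctuates from step to step, which is why the stopping rule compares a window of loss values rather than a single pair.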

The internal structure diagram of the computer equipment can be shown in Figure 7. The computer device includes a processor, memory, display screen, and input device connected by a system bus. Wherein, the processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes storage media and internal memory. The storage medium stores operating systems and computer programs. This internal memory provides an environment for the operating system and computer programs in the storage medium to run. When the computer program is executed by the processor, it implements the functions or steps of the optimization method of the disease risk estimation network.

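In one embodiment, a storage medium is provided, and the storage medium may be non-volatile or volatile. A computer program is stored on the storage medium. When the computer program is executed by a processor, the following steps are implemented:

Randomly select at least three patient samples from the patient sample library, and input the sample information of the at least three patient samples into a preset neural network in pairs, where the neural network is used to estimate the patient's disease risk;

Use the neural network to calculate the first distance between each two patient samples, and calculate the loss value of the neural network based on the first distance;

Write the loss value into the loss value list, and determine whether the loss value list meets the preset convergence conditions, where the loss value list includes the neural network loss value calculated each time;

If it is not satisfied, adjust the parameters of the neural network based on the loss value, randomly select at least three patient samples from the patient sample library again, and use the adjusted parameters to recalculate the first distance and the loss value;

If satisfied, the operation ends.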

It should be noted that for the above-mentioned functions or steps that can be implemented by the storage medium or the computer device, reference may be made to the relevant descriptions in the foregoing method embodiments. To avoid repetition, they are not described one by one here. Those skilled in the art can understand that the structure of the computer device provided in this embodiment does not constitute a limitation on the computer device, which may include more or fewer components, combine certain components, or use a different arrangement of components.

Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing relevant hardware through a computer program. The computer program can be stored in a non-volatile or volatile storage medium, and when executed, the computer program may include the processes of the above method embodiments. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Those skilled in the art can clearly understand that, for convenience and simplicity of description, only the above division of functional units and modules is used as an example. In actual applications, the above functions can be allocated to different functional units and modules as needed; that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above.

Through the above description of the embodiments, those skilled in the art can clearly understand that the present application can be implemented by means of software plus a necessary general hardware platform, or by hardware. First, multiple patient samples are selected and input into the neural network in pairs to calculate the first distance between each two patient samples; the loss value is then calculated based on the first distance, and the parameters of the neural network are adjusted based on the loss value, so as to train the neural network. Furthermore, because the multiple patient samples include both diseased and non-diseased samples, similarity relationships and distinguishing relationships can be trained at the same time, and the training efficiency is higher. In addition, after a neural network that meets the conditions is obtained, the target patient can be input into the neural network to determine, based on the distance between the target patient and each sample, whether the target patient belongs to a population with a high risk of disease. This realizes automatic prediction of the patient's disease risk and assists doctors in diagnosis, improving the efficiency and accuracy of their diagnoses.

Those skilled in the art can understand that the accompanying drawing is only a schematic diagram of a preferred implementation scenario, and the units or processes in the accompanying drawing are not necessarily necessary for implementing the present application. Those skilled in the art can understand that the units in the system in the implementation scenario can be distributed in the system in the implementation scenario according to the description of the implementation scenario, or can be correspondingly changed and located in one or more systems different from this implementation scenario. The units of the above implementation scenarios can be combined into one unit or further split into multiple sub-units.

The above-described embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions in the embodiments of this application, and should be included within the protection scope of this application.

Claims

An optimization method for disease risk estimation network, wherein the method includes:

Obtain a patient sample library;

Randomly select at least three patient samples from the patient sample library;

The sample information of the at least three patient samples is input into a preset neural network in pairs, and the neural network is used to calculate the first distance between each two patient samples, wherein the neural network is used to estimate the patient's disease risk;

Calculate the loss value of the neural network according to the first distance; write the loss value into a loss value list, and determine whether the loss value list meets the preset convergence conditions, where the loss value list includes the loss value of the neural network calculated each time;

If not, adjust the parameters of the neural network according to the loss value, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence conditions.
The method according to claim 1, wherein calculating the loss value of the neural network according to the first distance specifically includes:

Select any two patient samples among the at least three patient samples as target samples, determine whether the disease information of the two target samples is the same, and determine the preset values corresponding to the two target samples based on the judgment results;

The difference between the first distance between the two target samples and the preset value is used as the intermediate difference, and the square of the intermediate difference is used as the sub-loss value between the two target samples;

The loss value is determined based on the sub-loss value between each two target samples.
The method according to claim 2, wherein before inputting the sample information of the at least three patient samples into the preset neural network in pairs, the method further includes:

determining disease information for each of the at least three patient samples;

If the disease information of the at least three patient samples is the same, at least three patient samples are randomly selected again from the patient sample database.
The method according to claim 1, wherein determining whether the loss value list satisfies a preset convergence condition specifically includes:

If the number of loss values in the loss value list is greater than or equal to the first preset quantity threshold m, and the (N+1)th to (N+m-1)th loss values are not less than the Nth loss value, then it is determined that the loss value list satisfies the preset convergence condition, where m is a positive integer, m>1, and N is a positive integer.
The method according to claim 3, wherein before randomly selecting at least three patient samples from the patient sample library, the method further includes:

Obtain patient data, and generate the patient sample according to the patient data, wherein the patient sample includes the sample information and the disease information, and the sample information includes basic patient information, drug information, and test information;

The patient sample library is established based on the patient samples.
The method according to claim 5, wherein after the loss value list satisfies the preset convergence condition, the method further includes:

Calculate the second distance between the target patient and each patient sample in the patient sample library respectively;

Sort the second distances in order from small to large to obtain a distance list, and use the first k second distances in the distance list as target distances, where k is a preset positive integer;

Determine whether the target patient belongs to a high-risk population according to the patient sample corresponding to the target distance;

If so, recommended drug data is generated based on the drug information of the patient sample corresponding to the target distance.
The method according to claim 6, wherein determining whether the target patient belongs to a high-risk population based on the patient sample corresponding to the target distance specifically includes:

Among the patient samples corresponding to the target distance, the patient sample whose disease information is determined to be diseased is the target sample;

If the number of the target samples is greater than the second preset quantity threshold, it is determined that the target patient belongs to the high-risk population of the disease; and/or,

If the sum of the second distances of all the target samples is less than the preset distance threshold, it is determined that the target patient belongs to the high-risk population of the disease.
An optimization device for disease risk estimation network, wherein the device includes:

Acquisition module, used to obtain patient sample library;

An initialization module, used to randomly select at least three patient samples from the patient sample library;

A calculation module, configured to input the sample information of the at least three patient samples into a preset neural network in pairs, and use the neural network to calculate the first distance between each two patient samples, wherein the neural network is used to estimate the patient's disease risk;

The calculation module is also used to calculate the loss value of the neural network according to the first distance;

A judgment module, configured to write the loss value into a loss value list, and judge whether the loss value list satisfies the preset convergence condition, wherein the loss value list includes the neural network loss value calculated each time;

Optimization module, used to adjust the parameters of the neural network according to the loss value if the preset convergence condition is not satisfied, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence condition.
A storage medium with a computer program stored thereon, wherein when the computer program is executed by a processor, an optimization method for disease risk estimation network is implemented, including:

Obtain a patient sample library; randomly select at least three patient samples from the patient sample library; input the sample information of the at least three patient samples into a preset neural network in pairs, and use the neural network to calculate the first distance between each two patient samples, wherein the neural network is used to estimate the patient's disease risk; calculate the loss value of the neural network according to the first distance; write the loss value into a loss value list, and determine whether the loss value list satisfies the preset convergence condition, wherein the loss value list includes the neural network loss value calculated each time; if not, adjust the parameters of the neural network according to the loss value, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list satisfies the preset convergence condition.
The storage medium according to claim 9, wherein the calculating the loss value of the neural network according to the first distance specifically includes:

Select any two patient samples among the at least three patient samples as target samples, determine whether the disease information of the two target samples is the same, and determine the preset values corresponding to the two target samples based on the judgment results;

The difference between the first distance between the two target samples and the preset value is used as the intermediate difference, and the square of the intermediate difference is used as the sub-loss value between the two target samples;

The loss value is determined based on the sub-loss value between each two target samples.
The storage medium according to claim 10, wherein before inputting the sample information of the at least three patient samples into the preset neural network in pairs, it further includes:

determining disease information for each of the at least three patient samples;

If the disease information of the at least three patient samples is the same, at least three patient samples are randomly selected again from the patient sample database.
The storage medium according to claim 9, wherein the determining whether the loss value list satisfies a preset convergence condition specifically includes:

If the number of loss values in the loss value list is greater than or equal to the first preset quantity threshold m, and the (N+1)th to (N+m-1)th loss values are not less than the Nth loss value, then it is determined that the loss value list satisfies the preset convergence condition, where m is a positive integer, m>1, and N is a positive integer.
The storage medium according to claim 11, wherein before randomly selecting at least three patient samples from the patient sample library, the method further includes:

Obtain patient data, and generate the patient sample according to the patient data, wherein the patient sample includes the sample information and the disease information, and the sample information includes basic patient information, drug information, and test information;

The patient sample library is established based on the patient samples.
The storage medium according to claim 13, wherein after the loss value list satisfies the preset convergence condition, it further includes:

Calculate the second distance between the target patient and each patient sample in the patient sample library respectively;

Sort the second distances in order from small to large to obtain a distance list, and use the first k second distances in the distance list as target distances, where k is a preset positive integer;

Determine whether the target patient belongs to a high-risk population according to the patient sample corresponding to the target distance;

If so, generate recommended drug data based on the drug information of the patient sample corresponding to the target distance;

Determining whether the target patient belongs to a high-risk population based on the patient sample corresponding to the target distance specifically includes:

Among the patient samples corresponding to the target distance, the patient sample whose disease information is determined to be diseased is the target sample;

If the number of the target samples is greater than the second preset quantity threshold, it is determined that the target patient belongs to the high-risk population of the disease; and/or,

If the sum of the second distances of all the target samples is less than the preset distance threshold, it is determined that the target patient belongs to the high-risk population of the disease.
A computer device, including a storage medium, a processor and a computer program stored on the storage medium and executable on the processor, wherein the processor implements an optimization method for disease risk estimation network when executing the computer program, including:

Obtain a patient sample library; randomly select at least three patient samples from the patient sample library; input the sample information of the at least three patient samples into a preset neural network in pairs, and use the neural network to calculate the first distance between each two patient samples, wherein the neural network is used to estimate the patient's disease risk; calculate the loss value of the neural network according to the first distance; write the loss value into a loss value list, and determine whether the loss value list satisfies the preset convergence condition, wherein the loss value list includes the neural network loss value calculated each time; if not, adjust the parameters of the neural network according to the loss value, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list satisfies the preset convergence condition.
The computer device according to claim 15, wherein the calculating the loss value of the neural network according to the first distance specifically includes:

Select any two patient samples among the at least three patient samples as target samples, determine whether the disease information of the two target samples is the same, and determine the preset values corresponding to the two target samples based on the judgment results;

The difference between the first distance between the two target samples and the preset value is used as the intermediate difference, and the square of the intermediate difference is used as the sub-loss value between the two target samples;

The loss value is determined based on the sub-loss value between each two target samples.
The computer device according to claim 16, wherein before inputting the sample information of the at least three patient samples into the preset neural network in pairs, it further includes:

determining disease information for each of the at least three patient samples;

If the disease information of the at least three patient samples is the same, at least three patient samples are randomly selected again from the patient sample database.
The computer device according to claim 15, wherein the determining whether the loss value list satisfies a preset convergence condition specifically includes:

If the number of loss values in the loss value list is greater than or equal to the first preset quantity threshold m, and the (N+1)th to (N+m-1)th loss values are not less than the Nth loss value, then it is determined that the loss value list satisfies the preset convergence condition, where m is a positive integer, m>1, and N is a positive integer.
The computer device according to claim 17, wherein before randomly selecting at least three patient samples from the patient sample library, the method further includes:

Obtain patient data, and generate the patient sample according to the patient data, wherein the patient sample includes the sample information and the disease information, and the sample information includes basic patient information, drug information, and test information;

The patient sample library is established based on the patient samples.
The computer device according to claim 19, wherein after the loss value list satisfies the preset convergence condition, it further includes:

Calculate the second distance between the target patient and each patient sample in the patient sample library respectively;

Sort the second distances in order from small to large to obtain a distance list, and use the first k second distances in the distance list as target distances, where k is a preset positive integer;

Determine whether the target patient belongs to a high-risk population according to the patient sample corresponding to the target distance;

If so, generate recommended drug data based on the drug information of the patient sample corresponding to the target distance;

Determining whether the target patient belongs to a high-risk population based on the patient sample corresponding to the target distance specifically includes:

Among the patient samples corresponding to the target distance, the patient sample whose disease information is determined to be diseased is the target sample;

If the number of the target samples is greater than the second preset quantity threshold, it is determined that the target patient belongs to the high-risk population of the disease; and/or,

If the sum of the second distances of all the target samples is less than the preset distance threshold, it is determined that the target patient belongs to the high-risk population of the disease.