WO2023178789A1 - Disease risk estimation network optimization method and apparatus, medium, and device - Google Patents

Disease risk estimation network optimization method and apparatus, medium, and device

Info

Publication number
WO2023178789A1
WO2023178789A1 PCT/CN2022/089727 CN2022089727W WO2023178789A1 WO 2023178789 A1 WO2023178789 A1 WO 2023178789A1 CN 2022089727 W CN2022089727 W CN 2022089727W WO 2023178789 A1 WO2023178789 A1 WO 2023178789A1
Authority
WO
WIPO (PCT)
Prior art keywords
patient
loss value
target
samples
neural network
Application number
PCT/CN2022/089727
Other languages
French (fr)
Chinese (zh)
Inventor
徐卓扬
赵婷婷
胡岗
孙行智
赵越
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2023178789A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • This application relates to the fields of artificial intelligence and digital medical technology, and in particular to an optimization method, device, medium and equipment for a disease risk estimation network.
  • this application provides an optimization method, device, medium and equipment for disease risk estimation network, which improves the accuracy of the neural network used for disease risk estimation.
  • an optimization method for a disease risk estimation network including:
  • the sample information of the at least three patient samples is input into a preset neural network in pairs, and the neural network is used to calculate the first distance between each two patient samples, wherein the neural network is used to estimate patient risk;
  • the loss value list includes the neural network loss value calculated each time
  • an optimization device for a disease risk estimation network including:
  • Acquisition module used to obtain patient sample library
  • An initialization module used to randomly select at least three patient samples from the patient sample library
  • a calculation module configured to input the sample information of the at least three patient samples into a preset neural network in pairs, and use the neural network to calculate the first distance between each two patient samples, wherein the neural network is used to estimate the patient's disease risk;
  • the calculation module is also used to calculate the loss value of the neural network according to the first distance
  • a judgment module configured to write the loss value into a loss value list, and judge whether the loss value list satisfies the preset convergence condition, wherein the loss value list includes the neural network loss value calculated each time;
  • Optimization module used to adjust the parameters of the neural network according to the loss value if the convergence condition is not satisfied, and return to the step of randomly selecting at least three patient samples in the patient sample library until the loss value list meets the preset convergence conditions.
  • a storage medium is provided with a computer program stored thereon.
  • the optimization method for the disease risk estimation network is implemented, including:
  • a computer device including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor.
  • when the processor executes the computer program, the above-mentioned optimization method for the disease risk estimation network is implemented, including:
  • with the optimization method, device, medium and equipment for the disease risk estimation network provided by this application, at least three patient samples are input at the same time to train the neural network.
  • the importance of different characteristics of the patient samples can be distinguished, which effectively improves the accuracy of the neural network's judgment on target patients.
  • the training efficiency is high and the accuracy of the neural network is high.
  • Figure 1 shows a schematic flow chart of an optimization method for a disease risk estimation network provided by an embodiment of the present application
  • Figure 2 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of the present application
  • Figure 3 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of the present application
  • Figure 4 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of the present application
  • Figure 5 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of the present application
  • Figure 6 shows a structural block diagram of an optimization device for a disease risk estimation network provided by an embodiment of the present application
  • Figure 7 shows a structural block diagram of a computer device provided by an embodiment of the present application.
  • Embodiments of the present application provide an optimization method for a disease risk estimation network, which can be applied to electronic devices with the ability to run instructions or programs.
  • the electronic devices can be, but are not limited to, various personal computers, notebook computers, smartphones, tablets and portable wearable devices, and can also be implemented using independent servers or server clusters composed of multiple servers.
  • the present application is described in detail below through specific embodiments.
  • Figure 1 is a schematic flow chart of an optimization method for a disease risk estimation network provided by an embodiment of the present application, including the following steps:
  • S102 Randomly select at least three patient samples from the patient sample database
  • S103 Input the sample information of at least three patient samples into the preset neural network in pairs, and use the neural network to calculate the first distance between each two patient samples, where the neural network is used to estimate the patient's disease risk;
  • the method provided by this application is used to optimize the disease risk estimation network, where the disease risk estimation network can be a neural network, and the neural network can estimate the patient's disease risk, specifically estimating whether the patient is a high-risk group for the disease.
  • this application uses machine learning methods to optimize the neural network by training on patient samples. Specifically, taking three patient samples randomly selected from the patient sample database as an example, the sample information of the first patient sample and the second patient sample is input into the neural network to obtain the first distance between the first patient sample and the second patient sample; similarly, the sample information of the first patient sample and the third patient sample is input into the neural network to obtain the first distance between the first patient sample and the third patient sample; and the sample information of the second patient sample and the third patient sample is input into the neural network to obtain the first distance between the second patient sample and the third patient sample. The three output first distances are then used to optimize the neural network.
  • the first distance may be a distance after normalization, and its value is between [0,1].
  • the neural network here can be a self-organizing feature map network or a learning vector quantization network, or other neural networks, which are not limited here.
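The pairwise input described above can be sketched as follows. This is a minimal illustration, not the patent's network: `toy_distance` is a hypothetical stand-in for the neural network (a squashed Euclidean distance, chosen only so that the output lies in [0, 1)), and `pairwise_first_distances` is an illustrative helper name.

```python
import math
from itertools import combinations


def toy_distance(a, b):
    """Hypothetical stand-in for the neural network: maps two feature vectors to [0, 1)."""
    euclid = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return euclid / (1.0 + euclid)


def pairwise_first_distances(samples, distance_fn=toy_distance):
    """Return {(i, j): first distance} for every unordered pair of samples."""
    return {
        (i, j): distance_fn(samples[i], samples[j])
        for i, j in combinations(range(len(samples)), 2)
    }


# Three samples yield three pairs: (0, 1), (0, 2) and (1, 2).
samples = [[0.0, 1.0], [0.0, 1.0], [3.0, 5.0]]
first_distances = pairwise_first_distances(samples)
```

With more than three samples the same helper produces one first distance per unordered pair.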
  • step S103 before inputting the sample information of at least three patient samples into the preset neural network in pairs, the following steps are included:
  • S103-1 Determine the disease information of each patient sample in at least three patient samples
  • in steps S103-1 and S103-2, after randomly selecting at least three patient samples, determine whether their disease information is the same. If the disease information of all patient samples is the same, randomly select at least three patient samples again until the disease information of at least one patient sample differs from that of the others; the number of patient samples reselected can be different from the number of patient samples randomly selected this time.
  • the disease information can be diseased or non-diseased.
  • if the patient samples obtained through random selection are all diseased or all non-diseased, reselect until at least two non-diseased samples and at least one diseased sample are obtained, or at least two diseased samples and at least one non-diseased sample are obtained.
  • this application can simultaneously train samples with the same disease information and samples with different disease information, that is, the neural network's ability to process similar relationships and distinguishing relationships is simultaneously trained. Its training efficiency is higher and a more accurate neural network model can be obtained faster.
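The reselection rule of steps S103-1 and S103-2 can be sketched as a simple redraw loop. The sketch assumes each sample is a dict with a boolean `"diseased"` key (a hypothetical layout) and keeps the draw size fixed, although the text allows the reselected count to differ.

```python
import random


def select_mixed_samples(library, n=3, rng=random):
    """Randomly draw n samples, redrawing while all drawn samples share one disease label."""
    if len(library) < n or len({s["diseased"] for s in library}) < 2:
        # Without both labels in the library the redraw loop could never finish.
        raise ValueError("library needs at least n samples and both disease labels")
    while True:
        chosen = rng.sample(library, n)
        if len({s["diseased"] for s in chosen}) > 1:
            return chosen


library = [
    {"id": 1, "diseased": False},
    {"id": 2, "diseased": False},
    {"id": 3, "diseased": False},
    {"id": 4, "diseased": False},
    {"id": 5, "diseased": True},
]
chosen = select_mixed_samples(library, n=3, rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes the draw reproducible; the module-level `random` works as well.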
  • The sample information of two of the at least three patient samples is input into the neural network, which outputs the first distance between the two corresponding patient samples. A loss function can then be constructed, and each first distance is substituted into the loss function to calculate the loss value of the neural network.
  • each sample information can contain multiple features, compare the similarities and differences of each feature in the two sample information, and comprehensively analyze each feature to obtain the first distance.
  • step S104 calculating the loss value of the neural network based on the first distance includes the following steps:
  • S104-1 Select any two patient samples from at least three patient samples as target samples, determine whether the disease information of the two target samples is the same, and determine the preset values corresponding to the two target samples based on the judgment results;
  • S104-2 Use the difference between the first distance between the two target samples and the preset value as the middle difference, and use the square of the middle difference as the sub-loss value between the two target samples;
  • S104-3 Determine the loss value based on the sub-loss value between each two target samples.
  • two target samples are selected from at least three patient samples, and the preset value corresponding to the two target samples is set based on the disease information of the two target samples, that is, the preset value depends on the disease information of the two target samples.
  • the square of the difference between the first distance and the preset value is used as the sub-loss value between the two target samples, a similar method is used to obtain the sub-loss value between each two target samples, and the loss value of the neural network is determined based on all sub-loss values.
  • the first distance can reflect whether the disease information of the two target samples is the same, and the sub-loss value can reflect the calculation error for the two target samples.
  • This application uses the sub-loss value to represent the closeness of the first distance to the preset value, and uses square processing to make the sub-loss value a non-negative number, eliminating the impact of negative numbers on the calculation of the final loss value.
  • for example, when the first target sample and the second target sample have the same disease information and the third target sample differs from both, the preset value corresponding to the first target sample and the second target sample is 0, while the preset value corresponding to the first target sample and the third target sample is 1, and the preset value corresponding to the second target sample and the third target sample is also 1.
  • the sub-loss value between the first target sample and the second target sample can then be determined as the square of the difference between their first distance and the preset value 0, and likewise for the other two pairs with the preset value 1, where p1, p2 and p3 are the sample information of the first, second and third target samples respectively.
  • if the value of the first distance is in [0,1], it can be determined that the preset value corresponding to two target samples with the same disease information is 0, and the preset value corresponding to two target samples with different disease information is 1;
  • if the value of the first distance is in [0,d], it can be determined that the preset value corresponding to two target samples with the same disease information is 0, and the preset value corresponding to two target samples with different disease information is d.
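Steps S104-1 to S104-3 can be sketched as below. The text only says the loss value is determined "based on" the sub-loss values, so summing them is an assumption made for this sketch; `preset_value`, `loss_value` and `d_max` are illustrative names, and the distances in the example are made-up numbers.

```python
from itertools import combinations


def preset_value(label_a, label_b, d_max=1.0):
    """0 for two samples with the same disease information, d_max otherwise."""
    return 0.0 if label_a == label_b else d_max


def loss_value(distances, labels, d_max=1.0):
    """Return (loss, sub_losses) for first distances keyed by index pair (i, j)."""
    sub_losses = {}
    for i, j in combinations(range(len(labels)), 2):
        target = preset_value(labels[i], labels[j], d_max)
        # Sub-loss: square of the difference between first distance and preset value.
        sub_losses[(i, j)] = (distances[(i, j)] - target) ** 2
    # Assumption: the overall loss is the sum of all sub-loss values.
    return sum(sub_losses.values()), sub_losses


labels = [True, True, False]  # samples 0 and 1 diseased, sample 2 not
distances = {(0, 1): 0.2, (0, 2): 0.7, (1, 2): 0.9}  # illustrative first distances
loss, sub_losses = loss_value(distances, labels)
```

For a first distance normalized to [0, d], passing `d_max=d` gives the corresponding preset values.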
  • S105 Write the loss value into the loss value list, and determine whether the loss value list meets the preset convergence conditions, where the loss value list includes the loss value of the neural network calculated each time;
  • a loop method is used to adjust the parameters of the neural network based on the loss value multiple times, so that the loss value records generated during the loop process meet the convergence conditions, that is, the loss value converges.
  • if the loss value list meets the convergence conditions, the current neural network is considered to need no further optimization, so the operation ends; if the loss value list does not meet the convergence conditions, the parameters of the neural network are adjusted to reduce the loss value. The method then returns to the step of randomly selecting at least three patient samples and inputs the reselected patient samples into the neural network for training, that is, the adjusted parameters are used to recalculate a new first distance and a new loss value, and the parameters are adjusted again to reduce the new loss value. After many cycles, the first distance that the neural network calculates for two pieces of sample information comes closer to the preset value corresponding to their disease information.
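The overall loop (select, compute distances, compute loss, check convergence, adjust, repeat) can be sketched end to end. Everything model-specific here is an assumption: the "network" is replaced by a toy weighted distance squashed with `tanh`, the parameter update is plain gradient descent with a finite-difference gradient, the convergence check looks at the latest m loss values, and `max_iters` is a safety cap that the text does not mention.

```python
import math
import random
from itertools import combinations


def distance(weights, a, b):
    """Toy learnable distance in [0, 1): weighted absolute differences squashed by tanh."""
    z = sum(w * abs(x - y) for w, x, y in zip(weights, a, b))
    return math.tanh(max(z, 0.0))


def batch_loss(weights, batch):
    """Sum of squared differences between each pairwise distance and its preset value."""
    total = 0.0
    for (fa, la), (fb, lb) in combinations(batch, 2):
        target = 0.0 if la == lb else 1.0
        total += (distance(weights, fa, fb) - target) ** 2
    return total


def converged(history, m):
    """True when none of the latest m - 1 losses is below the loss just before them."""
    if len(history) < m:
        return False
    window = history[-m:]
    return all(v >= window[0] for v in window[1:])


def optimize(library, n_features, m=10, lr=0.1, max_iters=500, seed=0):
    """library: list of (features, diseased) tuples. Returns (weights, loss history)."""
    rng = random.Random(seed)
    weights, history, eps = [0.5] * n_features, [], 1e-4
    for _step in range(max_iters):
        batch = rng.sample(library, 3)
        if len({label for _features, label in batch}) < 2:
            continue  # all labels equal: reselect, as in steps S103-1 and S103-2
        current = batch_loss(weights, batch)
        history.append(current)  # write the loss value into the loss value list
        if converged(history, m):
            break
        grads = []
        for k in range(n_features):  # forward finite-difference gradient estimate
            bumped = list(weights)
            bumped[k] += eps
            grads.append((batch_loss(bumped, batch) - current) / eps)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights, history
```

A call such as `optimize(library, n_features=2)` returns the final parameters together with the recorded loss values; a real implementation would substitute an actual network and its own optimizer.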
  • step S105 it is judged whether the loss value list meets the preset convergence conditions, which specifically includes:
  • if the number of loss values in the loss value list is greater than or equal to the first preset quantity threshold m, and the (N+1)th to (N+m-1)th loss values are not less than the Nth loss value, it is determined that the loss value list satisfies the preset convergence conditions, where m is a positive integer, m>1, and N is a positive integer.
  • the loss value list contains multiple loss values.
  • the number of loss values being greater than or equal to the first preset quantity threshold m means that the number of cycles is greater than or equal to m.
  • the (N+1)th to (N+m-1)th loss values being not less than the Nth loss value means that the Nth loss value is less than or equal to the several loss values that follow it. In this case, it can be considered that the loss value has entered a steady state, and the loss value list meets the convergence conditions.
  • for example, with m = 10, if the loss value list includes at least 10 loss values and the (N+1)th to (N+9)th loss values are not less than the Nth loss value, the loss value list can be considered to meet the convergence condition, thus ending the loop.
  • the current parameters can be used as the final parameters of the neural network, or the parameters used when outputting the Nth loss value can be used as the final parameters of the neural network.
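The convergence condition can be sketched as a check on the loss value list. The text does not fix which N is examined, so this sketch assumes the check runs after every cycle and takes N as the start of the latest window of m values; `has_converged` is an illustrative name.

```python
def has_converged(loss_values, m):
    """Check the latest m losses: the m - 1 after the first must not be below it."""
    if m <= 1:
        raise ValueError("m must be an integer greater than 1")
    if len(loss_values) < m:
        return False  # fewer than m cycles so far
    window = loss_values[-m:]
    reference = window[0]  # the Nth loss value
    # The (N+1)th to (N+m-1)th loss values must all be >= the Nth loss value.
    return all(value >= reference for value in window[1:])
```

For example, `has_converged([5.0, 3.0, 3.1, 3.0], 3)` is true because neither of the last two values falls below 3.0, while a strictly decreasing list is never treated as converged.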
  • step S106 the following steps are also included:
  • S108 Sort the second distances from small to large to obtain a distance list, and use the first k second distances in the distance list as the target distance, where k is a preset positive integer;
  • S109 Determine whether the target patient belongs to a high-risk population based on the patient sample corresponding to the target distance;
  • the neural network can be used to analyze the disease risk of the target patient, that is, to determine whether the target patient belongs to a group with a high risk of disease.
  • the information of the target patient and the sample information of each patient sample in the patient sample library are input into the neural network, and the neural network is used to process the information to obtain the second distance between the target patient and each patient sample.
  • the second distance can represent the degree of similarity between the target patient and the patient sample. The smaller the second distance, the more similar the target patient is to the patient sample. In this case, if the disease information of the patient sample is disease, then the target patient is more likely to be sick.
  • the k second distances with the smallest values can be taken as the target distance, and the target patient can be analyzed based on the patient sample corresponding to the target distance. If the patient sample corresponding to the target distance is sick, the target patient can be considered to belong to a high-risk population; if the patient sample corresponding to the target distance is not sick, the target patient can be considered not to belong to the high-risk population.
  • the drug information of the patient sample corresponding to the target distance is analyzed, that is, what drugs are taken by the patient sample corresponding to the target distance, what drugs are included in the doctor's diagnosis and prescription, etc., and then, based on this drug information, recommended drug data is generated for the target patient to assist doctors in diagnosing and prescribing drugs, and to improve doctors' work efficiency and accuracy.
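One possible way to turn the drug information of the nearest patient samples into recommended drug data is a frequency count. The text does not specify how the recommendation is generated, so the counting rule, the function name `recommend_drugs` and the input layout (one list of drug names per sample) are all assumptions of this sketch.

```python
from collections import Counter


def recommend_drugs(neighbour_drug_lists, top_n=3):
    """Count how many neighbouring samples use each drug and return the most common."""
    counts = Counter(
        drug
        for drugs in neighbour_drug_lists
        for drug in set(drugs)  # count each drug at most once per sample
    )
    return counts.most_common(top_n)


# Illustrative drug lists for the k nearest patient samples.
neighbour_drug_lists = [["metformin", "sulfonylurea"], ["metformin"], ["GLP-1"]]
recommended = recommend_drugs(neighbour_drug_lists, top_n=2)
```

The output is a list of `(drug, count)` pairs intended as decision support for the doctor, not as a prescription.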
  • the second distances can be sorted in order from small to large to obtain a distance list.
  • the first k second distances in the distance list are the k second distances with the smallest values.
  • the second distances can also be sorted in descending order to obtain a distance list.
  • the last k second distances in the distance list are the k second distances with the smallest values.
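Step S108 can be sketched with a plain sort. The input layout, a list of `(sample_id, second_distance)` pairs, and the function names are assumptions; the second function shows that the descending variant selects the same samples.

```python
def k_nearest(second_distances, k):
    """Sort ascending by distance and keep the first k entries."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    ranked = sorted(second_distances, key=lambda item: item[1])
    return ranked[:k]


def k_nearest_descending(second_distances, k):
    """Sort descending by distance and keep the last k entries."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    ranked = sorted(second_distances, key=lambda item: item[1], reverse=True)
    return ranked[-k:]


second_distances = [("a", 0.4), ("b", 0.1), ("c", 0.9), ("d", 0.3)]
nearest = k_nearest(second_distances, 2)
nearest_desc = k_nearest_descending(second_distances, 2)
```

Both calls select samples "b" and "d"; only the order within the result differs.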
  • step S109 judging whether the target patient belongs to a group with a high risk of disease based on the patient sample corresponding to the target distance includes the following steps:
  • the patient sample corresponding to the target distance can be used as the basis for analysis. Based on this, the disease information of the patient sample corresponding to the target distance is analyzed. If the patient sample is sick, the patient sample is determined to be the target sample, and then the target is analyzed based on the number of target samples or the second distance between the target sample and the target patient. patient risk.
  • This application provides two methods for determining whether the target patient belongs to a high-risk population, which are suitable for different scenarios or needs.
  • if the analysis is based on the number of target samples, then when the number of target samples is greater than the second preset threshold, that is, when enough of the patient samples corresponding to the target distance have disease information indicating disease, the target patient can be considered to belong to a population with a high risk of disease.
  • if the analysis is based on the second distance between the target sample and the target patient, the sum of the second distances between all target samples and the target patient can be calculated. If the sum is less than the preset distance threshold, that is, when the diseased samples among the patient samples corresponding to the target distance are similar enough to the target patient, the target patient can be considered to belong to a population with a high risk of disease.
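The two decision rules can be sketched as follows, assuming each of the k nearest samples is a dict with hypothetical keys `"diseased"` and `"distance"`. One detail is an assumption of this sketch rather than something the text states: when no neighbour is diseased the distance sum would be 0 and would trivially pass the threshold, so that case is treated as not high risk.

```python
def high_risk_by_count(neighbours, count_threshold):
    """High risk when more than count_threshold of the neighbours are diseased."""
    target_samples = [n for n in neighbours if n["diseased"]]
    return len(target_samples) > count_threshold


def high_risk_by_distance(neighbours, distance_threshold):
    """High risk when the summed distance to the diseased neighbours is below the threshold."""
    target_samples = [n for n in neighbours if n["diseased"]]
    if not target_samples:
        return False  # assumption: an empty sum of 0 must not count as high risk
    return sum(n["distance"] for n in target_samples) < distance_threshold


neighbours = [
    {"diseased": True, "distance": 0.1},
    {"diseased": True, "distance": 0.2},
    {"diseased": False, "distance": 0.05},
]
by_count = high_risk_by_count(neighbours, count_threshold=1)
by_distance = high_risk_by_distance(neighbours, distance_threshold=0.5)
```

Because the sum grows with the number of diseased neighbours, the two rules can disagree; the text presents them as alternatives for different scenarios.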
  • step S101 before at least three patient samples are randomly selected from the patient sample library, the following steps are included:
  • S100-1 Obtain patient data and generate patient samples based on the patient data.
  • the patient samples include sample information and disease information, and the sample information includes basic patient information, drug information, and test information;
  • S100-2 Establish a patient sample library based on patient samples.
  • steps S100-1 to S100-2 it is first necessary to establish a patient sample database, and then select patients from the patient sample database.
  • the patient's basic information includes: gender, age, income, occupation, marriage and childbirth history, past medical history, genetic history, etc.
  • disease information includes: disease type and whether it is sick, etc.
  • test information corresponds to the disease type, including the examination items and examination results usually required for that type of disease.
  • the drug information is information corresponding to the type of disease, which can include the patient's medication information and the doctor's diagnosis and prescription information.
  • the examination items in the examination item information can be obtained based on the patient's historical medical records, or can also be provided by an experienced doctor.
  • the test item information may include: glycated hemoglobin, low-density lipoprotein cholesterol, blood uric acid, urine protein, triglycerides, fasting blood sugar, etc.; in the medication information corresponding to the disease type, the patient's medication information can include whether the patient uses metformin, sulfonylureas, GLP-1, DPP4, etc.
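The patient sample structure described in steps S100-1 and S100-2 can be sketched with dataclasses. The class names, field names and record keys are hypothetical, and the values in the example record are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class SampleInfo:
    basic: dict   # basic patient information, e.g. gender, age, past medical history
    drugs: dict   # drug information, e.g. whether metformin is used
    tests: dict   # test information, e.g. examination items and results


@dataclass
class PatientSample:
    sample_info: SampleInfo
    disease_type: str
    diseased: bool  # disease information: sick or not


def build_sample_library(patient_records):
    """Turn raw patient records (dicts) into a list of PatientSample objects."""
    library = []
    for record in patient_records:
        info = SampleInfo(
            basic=dict(record["basic"]),
            drugs=dict(record["drugs"]),
            tests=dict(record["tests"]),
        )
        library.append(
            PatientSample(
                sample_info=info,
                disease_type=record["disease_type"],
                diseased=bool(record["diseased"]),
            )
        )
    return library


records = [
    {
        "basic": {"age": 54, "gender": "F"},
        "drugs": {"metformin": True},
        "tests": {"fasting_blood_sugar": 7.2},
        "disease_type": "example disease",
        "diseased": True,
    }
]
library = build_sample_library(records)
```

Before being fed to a network, the three dictionaries would still need to be encoded as a numeric feature vector, which this sketch leaves out.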
  • the neural network can be used to estimate the characteristics of the nonlinear relationship, which solves the problem of low efficiency caused by the existing technology using a linear model to calculate the first distance.
  • sequence number of each step in the above embodiment does not mean the order of execution.
  • the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.
  • an optimization device for a disease risk estimation network is provided, and the device for optimizing a disease risk estimation network corresponds one-to-one to the optimization method for the disease risk estimation network in the above embodiment.
  • the optimization device of the disease risk estimation network includes: an acquisition module, an initialization module, a calculation module, a judgment module and an optimization module. The detailed description of each functional module is as follows:
  • Acquisition module used to obtain patient sample library
  • An initialization module used to randomly select at least three patient samples from the patient sample library
  • the calculation module is used to input the sample information of at least three patient samples into a preset neural network in pairs, and use the neural network to calculate the first distance between each two patient samples, wherein the neural network is used to estimate the patient's disease risk;
  • the calculation module is also used to calculate the loss value of the neural network based on the first distance
  • a judgment module used to write the loss value into the loss value list, and judge whether the loss value list meets the preset convergence conditions, where the loss value list includes the neural network loss value calculated each time;
  • the optimization module is used to adjust the parameters of the neural network according to the loss value if they are not satisfied, and return to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence conditions.
  • the calculation module is specifically used for:
  • the difference between the first distance between the two target samples and the preset value is used as the middle difference, and the square of the middle difference is used as the sub-loss value between the two target samples;
  • the loss value is determined based on the sub-loss value between each two target samples.
  • the computing module is also used to:
  • At least three patient samples will be randomly selected again from the patient sample database.
  • determining whether the loss value list meets the preset convergence conditions specifically includes:
  • the number of loss values in the loss value record is greater than or equal to the first preset quantity threshold m, and the N+1 to N+m-1th loss function values are not less than the Nth loss function value, then it is determined that the The loss value list satisfies the preset convergence condition, where m is a positive integer, m>1, and N is a positive integer.
  • the device further includes a sample library creation module, specifically used for:
  • patient samples include sample information and disease information
  • sample information includes basic patient information, drug information, and test information
  • the device further includes an analysis module, specifically used for:
  • recommended drug data is generated based on the drug information of the patient sample corresponding to the target distance.
  • the analysis module is specifically used to:
  • the patient sample whose disease information is determined to be sick is the target sample
  • the target patient belongs to a population with a high risk of disease; and/or,
  • the target patient belongs to a population with a high risk of disease.
  • a computer device including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor.
  • the processor executes the computer program, the following steps are implemented:
  • the loss value list includes the neural network loss value calculated each time;
  • the internal structure diagram of the computer equipment can be shown in Figure 7.
  • the computer device includes a processor, memory, display screen, and input device connected by a system bus. Wherein, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes storage media and internal memory.
  • the storage medium stores operating systems and computer programs. This internal memory provides an environment for the operating system and computer programs in the storage medium to run. When the computer program is executed by the processor, it implements the functions or steps of the optimization method of the disease risk estimation network.
  • a storage medium is provided, and the storage medium may be non-volatile or volatile.
  • a computer program is stored on the storage medium. When the computer program is executed by the processor, the following steps are implemented:
  • the loss value list includes the neural network loss value calculated each time;
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
  • the above functions may be allocated to different functional units or modules as needed, that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above.
  • the present application can be implemented by means of software plus a necessary general hardware platform, or can also be implemented by hardware.
  • multiple patient samples include both diseased samples and non-diseased samples.
  • the information of the target patient can be input into the neural network to determine whether the target patient belongs to a population with a high risk of disease based on the distance between the target patient and each sample, realizing automatic prediction of the patient's disease risk and providing assistance to doctors in diagnosis, so as to improve the efficiency and accuracy of doctors' diagnosis.
  • the accompanying drawing is only a schematic diagram of a preferred implementation scenario, and the units or processes in the accompanying drawing are not necessarily necessary for implementing the present application.
  • the units in the system in the implementation scenario can be distributed in the system in the implementation scenario according to the description of the implementation scenario, or can be correspondingly changed and located in one or more systems different from this implementation scenario.
  • the units of the above implementation scenarios can be combined into one unit or further split into multiple sub-units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Molecular Biology (AREA)
  • Toxicology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Medicinal Chemistry (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present application relates to the technical fields of artificial intelligence and digital medical treatment, and discloses a disease risk estimation network optimization method and system, a storage medium, and a computer device. The method comprises: obtaining a patient sample library; randomly selecting at least three patient samples from the patient sample library; inputting sample information of the at least three patient samples into a preset neural network in pairs, and calculating a first distance between every two patient samples by using the neural network, wherein the neural network is used for estimating a disease risk of a patient; calculating a loss value of the neural network according to the first distances; writing the loss value into a loss value list, and determining whether the loss value list meets a preset convergence condition; and if not, adjusting parameters of the neural network according to the loss value, and returning to the step of randomly selecting at least three patient samples from the patient sample library until the loss value list meets the preset convergence condition. The method of the present application improves the accuracy of the neural network for disease risk estimation.

Description

Optimization method, apparatus, medium, and device for a disease risk estimation network
This application claims priority to the Chinese patent application filed with the China Patent Office on March 21, 2022, with application number 202210278345.8 and entitled "Optimization method, apparatus, medium, and device for a disease risk estimation network", the entire content of which is incorporated into this application by reference.
Technical Field
This application relates to the fields of artificial intelligence and digital healthcare technology, and in particular to an optimization method, apparatus, medium, and device for a disease risk estimation network.
Background
With the rise of artificial intelligence technology, its application scenarios have grown increasingly rich, supporting functions such as computer-aided disease diagnosis, health management, and remote consultation. The inventors found that, when diagnosing a patient, artificial intelligence technology can be used to determine whether the patient belongs to a population with a high incidence of the disease, thereby providing a reference for the doctor and improving diagnostic efficiency and accuracy. However, existing similar-patient estimation models have low accuracy, and their estimates are often inconsistent with the patient's actual condition.
Summary of the Invention
In view of this, this application provides an optimization method, apparatus, medium, and device for a disease risk estimation network, which improve the accuracy of the neural network used for disease risk estimation.
According to one aspect of this application, an optimization method for a disease risk estimation network is provided, including:
obtaining a patient sample library;
randomly selecting at least three patient samples from the patient sample library;
inputting the sample information of the at least three patient samples into a preset neural network in pairs, and using the neural network to calculate a first distance between every two of the patient samples, wherein the neural network is used to estimate a patient's disease risk;
calculating a loss value of the neural network according to the first distances;
writing the loss value into a loss value list, and determining whether the loss value list satisfies a preset convergence condition, wherein the loss value list includes the loss value of the neural network obtained in each calculation;
if not, adjusting the parameters of the neural network according to the loss value, and returning to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
According to another aspect of this application, an optimization apparatus for a disease risk estimation network is provided, including:
an acquisition module, configured to obtain a patient sample library;
an initialization module, configured to randomly select at least three patient samples from the patient sample library;
a calculation module, configured to input the sample information of the at least three patient samples into a preset neural network in pairs, and to use the neural network to calculate a first distance between every two of the patient samples, wherein the neural network is used to estimate a patient's disease risk;
the calculation module being further configured to calculate a loss value of the neural network according to the first distances;
a judgment module, configured to write the loss value into a loss value list and to determine whether the loss value list satisfies a preset convergence condition, wherein the loss value list includes the loss value of the neural network obtained in each calculation;
an optimization module, configured to, if the condition is not satisfied, adjust the parameters of the neural network according to the loss value and return to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
According to yet another aspect of this application, a storage medium is provided, on which a computer program is stored. When the computer program is executed by a processor, the above optimization method for a disease risk estimation network is implemented, including:
obtaining a patient sample library; randomly selecting at least three patient samples from the patient sample library; inputting the sample information of the at least three patient samples into a preset neural network in pairs, and using the neural network to calculate a first distance between every two of the patient samples, wherein the neural network is used to estimate a patient's disease risk; calculating a loss value of the neural network according to the first distances; writing the loss value into a loss value list, and determining whether the loss value list satisfies a preset convergence condition, wherein the loss value list includes the loss value of the neural network obtained in each calculation; if not, adjusting the parameters of the neural network according to the loss value, and returning to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
According to a further aspect of this application, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor. When the processor executes the computer program, the above optimization method for a disease risk estimation network is implemented, including:
obtaining a patient sample library; randomly selecting at least three patient samples from the patient sample library; inputting the sample information of the at least three patient samples into a preset neural network in pairs, and using the neural network to calculate a first distance between every two of the patient samples, wherein the neural network is used to estimate a patient's disease risk; calculating a loss value of the neural network according to the first distances; writing the loss value into a loss value list, and determining whether the loss value list satisfies a preset convergence condition, wherein the loss value list includes the loss value of the neural network obtained in each calculation; if not, adjusting the parameters of the neural network according to the loss value, and returning to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
In the solutions implemented by the above optimization method, apparatus, medium, and device for a disease risk estimation network, at least three patient samples are input simultaneously to train the neural network. Multiple training cycles make it possible to distinguish the importance of different features of the patient samples, which effectively improves the accuracy of the neural network's judgment for a target patient. In addition, because patient samples with the same outcome and patient samples with different outcomes are trained at the same time, training efficiency is high and the neural network is highly accurate.
The above description is only an overview of the technical solutions of this application. So that the technical means of this application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of this application are more readily apparent, specific embodiments of this application are set out below.
Brief Description of the Drawings
The drawings described here provide a further understanding of this application and form a part of it. The illustrative embodiments of this application and their descriptions are used to explain this application and do not constitute an improper limitation of it. In the drawings:
Figure 1 shows a schematic flow chart of an optimization method for a disease risk estimation network provided by an embodiment of this application;
Figure 2 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of this application;
Figure 3 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of this application;
Figure 4 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of this application;
Figure 5 shows a schematic flow chart of another optimization method for a disease risk estimation network provided by an embodiment of this application;
Figure 6 shows a structural block diagram of an optimization apparatus for a disease risk estimation network provided by an embodiment of this application;
Figure 7 shows a structural block diagram of a computer device provided by an embodiment of this application.
Detailed Description
This application is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, provided there is no conflict, the embodiments of this application and the features in the embodiments can be combined with each other.
An embodiment of this application provides a blockchain-based decentralized adaptive collaborative training method, which can be applied in electronic devices capable of running instructions or programs. The electronic devices may be, but are not limited to, personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices; the method can also be implemented on a standalone server or on a server cluster composed of multiple servers. This application is described in detail below through specific embodiments.
Referring to Figure 1, which is a schematic flow chart of the optimization method for a disease risk estimation network provided by an embodiment of this application, the method includes the following steps:
S101: Obtain a patient sample library;
S102: Randomly select at least three patient samples from the patient sample library;
S103: Input the sample information of the at least three patient samples into a preset neural network in pairs, and use the neural network to calculate a first distance between every two patient samples, wherein the neural network is used to estimate a patient's disease risk;
The method provided by this application is used to optimize a disease risk estimation network. The disease risk estimation network may be a neural network that estimates a patient's disease risk, specifically whether the patient belongs to a population with a high incidence of the disease.
This application uses a machine learning method and optimizes the neural network by training it on patient samples. Specifically, taking three patient samples randomly selected from the patient sample library as an example: the sample information of the first and second patient samples is input into the neural network to obtain the first distance between the first and second patient samples; similarly, the sample information of the first and third patient samples is input into the neural network to obtain the first distance between the first and third patient samples; and the sample information of the second and third patient samples is input into the neural network to obtain the first distance between the second and third patient samples. The three output first distances are then used to optimize the neural network.
The first distance may be a normalized distance whose value lies in [0,1].
It should be understood that the neural network here may be a self-organizing feature map network or a learning vector quantization network, or may be another neural network, which is not limited here.
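The pairwise evaluation described above can be sketched as follows. This is a minimal Python illustration, not the network of this application: `toy_distance` is a hypothetical stand-in for the neural network (a weighted feature difference squashed into [0, 1)), and the feature vectors and weights are assumptions made only for the example.

```python
import math
from itertools import combinations


def toy_distance(a, b, weights):
    """Hypothetical stand-in for the neural network.

    Computes a weighted absolute feature difference and squashes it
    into [0, 1), so identical samples have distance 0.
    """
    raw = sum(w * abs(x - y) for w, x, y in zip(weights, a, b))
    return 1.0 - math.exp(-raw)


def pairwise_first_distances(samples, weights):
    """Return {(i, j): first distance} for every unordered pair of samples."""
    return {
        (i, j): toy_distance(samples[i], samples[j], weights)
        for i, j in combinations(range(len(samples)), 2)
    }


# Three samples yield three first distances: (0, 1), (0, 2), (1, 2).
distances = pairwise_first_distances(
    [[1.0, 2.0], [1.0, 2.0], [3.0, 0.0]], weights=[0.5, 0.5]
)
```

With n selected samples this produces n(n-1)/2 first distances, which is why three samples give exactly three values.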
As shown in Figure 2, in step S103, before the sample information of the at least three patient samples is input into the preset neural network in pairs, the method includes the following steps:
S103-1: Determine the disease information of each of the at least three patient samples;
S103-2: If the disease information of the at least three patient samples is all the same, randomly select at least three patient samples from the patient sample library again.
For steps S103-1 and S103-2: after at least three patient samples are randomly selected, it is determined whether their disease information is the same. If the disease information of all patient samples is the same, at least three patient samples are randomly selected again, until at least one patient sample has disease information different from that of the others. The number of patient samples reselected may differ from the number selected in the current round.
The disease information may be "diseased" or "not diseased". For example, if all randomly selected patient samples are diseased, or all are not diseased, the selection is repeated until at least two non-diseased samples and at least one diseased sample are obtained, or at least two diseased samples and at least one non-diseased sample are obtained.
By selecting multiple patient samples in this way and inputting them into the neural network, this application trains on samples with the same disease information and on samples with different disease information at the same time. In other words, the neural network's ability to handle both similarity and discrimination relationships is trained simultaneously, so training is more efficient and a more accurate neural network model can be obtained more quickly.
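The reselection rule of steps S103-1 and S103-2 can be sketched as a simple rejection loop. The sample representation (a dict with a hypothetical `diseased` field), the function name, and the `max_tries` safety limit are assumptions for illustration and do not appear in this application.

```python
import random


def select_mixed_samples(library, n=3, rng=random, max_tries=1000):
    """Randomly pick n >= 3 samples, redrawing while all labels are identical."""
    if n < 3:
        raise ValueError("at least three patient samples are required")
    for _ in range(max_tries):
        picked = rng.sample(library, n)
        # S103-1: collect the disease information of each picked sample.
        labels = {sample["diseased"] for sample in picked}
        # S103-2: accept only if both diseased and non-diseased samples appear.
        if len(labels) > 1:
            return picked
    raise RuntimeError("could not draw a mixed batch; check the sample library")


# Hypothetical library: every fourth patient is diseased.
library = [{"id": i, "diseased": i % 4 == 0} for i in range(20)]
batch = select_mixed_samples(library, n=3, rng=random.Random(0))
```

The `max_tries` guard is a design choice for the sketch: without it, a library containing only one class would loop forever.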
S104: Calculate the loss value of the neural network according to the first distances;
The sample information of two of the at least three patient samples is input into the neural network, which outputs the first distance between the two corresponding patient samples. A loss function can then be constructed, and each first distance is substituted into the loss function to calculate the loss value of the neural network.
Each piece of sample information may contain multiple features. The two pieces of sample information are compared feature by feature, and the similarities and differences of all features are analyzed together to obtain the first distance.
As shown in Figure 3, in step S104, calculating the loss value of the neural network according to the first distances includes the following steps:
S104-1: Select any two of the at least three patient samples as target samples, determine whether the disease information of the two target samples is the same, and determine the preset value corresponding to the two target samples according to the result;
S104-2: Take the difference between the first distance between the two target samples and the preset value as an intermediate difference, and take the square of the intermediate difference as the sub-loss value between the two target samples;
S104-3: Determine the loss value according to the sub-loss value between every two target samples.
For steps S104-1 to S104-3: two target samples are selected from the at least three patient samples, and the preset value corresponding to these two target samples is set based on their disease information; that is, the preset value depends on the disease information of the two target samples. The square of the difference between the first distance and the preset value is then used as the sub-loss value between the two target samples. The sub-loss value between every two target samples is obtained in the same way, and the loss value of the neural network is determined from all sub-loss values.
In this step, the first distance reflects whether the disease information of the two target samples is the same, and the sub-loss value reflects the calculation error for these two target samples. The sub-loss value characterizes how close the first distance is to the preset value, and squaring the difference makes the sub-loss value non-negative, which eliminates the influence of negative numbers on the final loss value.
For example, with three selected patient samples, if the disease information of the first and second target samples is the same while that of the third target sample differs from the first two, the preset value for the first and second target samples can be set to 0, the preset value for the first and third target samples to 1, and the preset value for the second and third target samples also to 1.
After the distances $e(p_1,p_2)$, $e(p_1,p_3)$, and $e(p_2,p_3)$ between every two target samples are determined, the sub-loss value between the first and second target samples is $L_1 = (e(p_1,p_2) - 0)^2$, the sub-loss value between the first and third target samples is $L_2 = (e(p_1,p_3) - 1)^2$, and the sub-loss value between the second and third target samples is $L_3 = (e(p_2,p_3) - 1)^2$. The loss value of the neural network is then determined from all sub-loss values as $L = L_1 + L_2 + L_3$, where $p_1$, $p_2$, and $p_3$ are the sample information of the first, second, and third target samples, respectively.
In addition, if the value of the first distance lies in [0,1], the preset value for two target samples with the same disease information can be set to 0 and the preset value for two target samples with different disease information to 1; if the value of the first distance lies in [0,d], the preset value for two target samples with the same disease information can be set to 0 and the preset value for two target samples with different disease information to d.
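The loss construction of steps S104-1 to S104-3 can be sketched as below. It follows the rule stated in the text that a pair with the same disease information has preset value 0 and a pair with different disease information has preset value 1 (or d for distances in [0,d]). The function name, the dictionary layout of the distances, and the example numbers are assumptions for illustration.

```python
from itertools import combinations


def pairwise_loss(distances, labels, d_max=1.0):
    """Sum of squared differences between first distances and preset values.

    distances: {(i, j): first distance} for i < j
    labels:    disease information per sample (any comparable values)
    d_max:     upper bound of the distance range, used as the preset value
               for pairs whose disease information differs
    """
    loss = 0.0
    for i, j in combinations(range(len(labels)), 2):
        # S104-1: preset value depends on whether the labels match.
        preset = 0.0 if labels[i] == labels[j] else d_max
        # S104-2: squared intermediate difference is the sub-loss value.
        loss += (distances[(i, j)] - preset) ** 2
    # S104-3: the total loss is the sum of all sub-loss values.
    return loss


# Samples 0 and 1 share a label; sample 2 differs.
example = pairwise_loss(
    {(0, 1): 0.2, (0, 2): 0.7, (1, 2): 0.9},
    labels=[True, True, False],
)
```

In this example the sub-loss values are 0.2², 0.3², and 0.1², so the total is approximately 0.14.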
S105: Write the loss value into the loss value list, and determine whether the loss value list satisfies the preset convergence condition, wherein the loss value list includes the loss value of the neural network obtained in each calculation;
S106: If not, adjust the parameters of the neural network according to the loss value, and return to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
In these steps, the parameters of the neural network are adjusted repeatedly according to the loss value in a loop, so that the loss values recorded during the loop satisfy the convergence condition; that is, the loss value converges.
Specifically, after the loss value is obtained and written into the loss value list, if the list is determined to satisfy the convergence condition, the current neural network is considered to need no further optimization and the computation ends. If the list is determined not to satisfy the convergence condition, the parameters of the neural network are adjusted to reduce the loss value. The method then returns to the step of randomly selecting at least three patient samples and inputs the newly selected patient samples into the neural network for training; that is, new first distances and a new loss value are calculated with the adjusted parameters, and the parameters are adjusted again to reduce the new loss value. After many cycles, the first distance that the neural network calculates between two pieces of sample information comes closer to the preset value.
For example, in the foregoing embodiment, $L_1 = (e(p_1,p_2) - 0)^2$, $L_2 = (e(p_1,p_3) - 1)^2$, and $L_3 = (e(p_2,p_3) - 1)^2$. After many cycles, the value of $e(p_1,p_2)$ approaches 0, while the values of $e(p_1,p_3)$ and $e(p_2,p_3)$ approach 1. The loss value is effectively reduced, and the calculation accuracy of the neural network is improved.
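The loop of steps S102 to S106 can be sketched end to end on a toy model. Everything model-specific here is an assumption for illustration: the distance function is a hypothetical stand-in for the neural network, parameters are adjusted by a finite-difference gradient step rather than by whatever optimizer an actual implementation would use, and `max_iters` is an added safety limit.

```python
import math
import random
from itertools import combinations


def distance(a, b, w):
    """Toy stand-in for the network: squashed weighted difference in [0, 1)."""
    raw = sum(wi * wi * abs(x - y) for wi, x, y in zip(w, a, b))
    return 1.0 - math.exp(-raw)


def batch_loss(batch, w):
    """Sum of squared (first distance - preset value) over all pairs."""
    total = 0.0
    for (xa, ya), (xb, yb) in combinations(batch, 2):
        preset = 0.0 if ya == yb else 1.0
        total += (distance(xa, xb, w) - preset) ** 2
    return total


def train(library, w, m=10, lr=0.1, eps=1e-4, max_iters=500, seed=0):
    """Repeat: sample, compute loss, record it, check convergence, adjust."""
    rng = random.Random(seed)
    history = []  # the loss value list
    for _ in range(max_iters):
        batch = rng.sample(library, 3)            # S102
        if len({y for _, y in batch}) == 1:
            continue                              # all labels equal: redraw
        loss = batch_loss(batch, w)               # S103 and S104
        history.append(loss)                      # S105
        window_ok = len(history) >= m and all(
            v >= history[-m] for v in history[-m + 1:]
        )
        if window_ok:
            break                                 # convergence condition met
        # S106: finite-difference gradient step on each parameter.
        grad = []
        for k in range(len(w)):
            bumped = w[:k] + [w[k] + eps] + w[k + 1:]
            grad.append((batch_loss(batch, bumped) - loss) / eps)
        w = [wk - lr * g for wk, g in zip(w, grad)]
    return w, history


# Hypothetical library of (features, diseased) pairs.
library = [
    ([0.0, 1.0], False), ([0.1, 5.0], False),
    ([1.0, 2.0], True), ([0.9, 4.0], True),
]
weights, losses = train(library, [0.5, 0.5])
```

Because each iteration draws a fresh random batch, the recorded loss values are noisy, which is one reason a window-based stopping rule is used instead of a single threshold.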
In step S105, determining whether the loss value list satisfies the preset convergence condition specifically includes:
If the number of loss values in the loss value list is greater than or equal to a first preset quantity threshold m, and the (N+1)-th to (N+m-1)-th loss values are all not less than the N-th loss value, the loss value list is determined to satisfy the preset convergence condition, where m is a positive integer with m > 1 and N is a positive integer.
Specifically, each training cycle yields one loss value, so after many cycles the loss value list contains multiple loss values. The number of loss values being greater than or equal to the first preset quantity threshold m means that the number of cycles is greater than or equal to m. The larger the value of m, the more cycles are run and the higher the accuracy of the neural network.
In addition, the (N+1)-th to (N+m-1)-th loss values all being not less than the N-th loss value means that the N-th loss value is less than or equal to the several loss values that follow it. In this case, the loss value can be considered to have reached a steady state, and the loss value list satisfies the convergence condition.
For example, with m = 10 preset, if the loss value list contains at least 10 loss values and the (N+1)-th to (N+9)-th loss values are all not less than the N-th loss value, the loss value list can be considered to satisfy the convergence condition at this point, and the loop ends.
Further, either the current parameters or the parameters used when the N-th loss value was output can be taken as the final parameters of the neural network.
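The convergence test can be sketched as a check on the most recent window of the loss value list. Choosing N so that the window covers the last m entries is an assumption of this sketch; because the check runs after every cycle, any earlier qualifying window would already have stopped the loop.

```python
def has_converged(loss_history, m):
    """Return True if the last m losses show no improvement over the first of them.

    The N-th loss is taken as the first entry of the trailing window of
    length m; the following m-1 losses must all be >= that value.
    """
    if m <= 1:
        raise ValueError("m must be an integer greater than 1")
    if len(loss_history) < m:
        return False  # fewer than m loss values recorded so far
    window = loss_history[-m:]
    return all(value >= window[0] for value in window[1:])


still_improving = has_converged([5.0, 4.0, 3.0, 2.0, 1.0], m=3)
plateaued = has_converged([5.0, 4.0, 3.0, 3.1, 3.0], m=3)
```

Here `still_improving` is False because the loss keeps decreasing, and `plateaued` is True because neither of the two losses after 3.0 falls below it.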
其中,如图4所示,步骤S106后,还包括如下步骤:Among them, as shown in Figure 4, after step S106, the following steps are also included:
S107:分别计算目标患者与患者样本库中每个患者样本之间的第二距离;S107: Calculate the second distance between the target patient and each patient sample in the patient sample library respectively;
S108:将第二距离按照由小至大的顺序排序,得到距离列表,并将距离列表中前k个第二距离作为目标距离,其中,k为预设正整数;S108: Sort the second distances from small to large to obtain a distance list, and use the first k second distances in the distance list as the target distance, where k is a preset positive integer;
S109:根据目标距离对应的患者样本判断目标患者是否属于疾病高发人群;S109: Determine whether the target patient belongs to a high-risk population based on the patient sample corresponding to the target distance;
S110:若属于,则根据目标距离对应的患者样本的药物信息生成推荐药物数据。S110: If yes, generate recommended drug data based on the drug information of the patient sample corresponding to the target distance.
For steps S107 to S110, after the parameters have been adjusted over multiple iterations so that the loss value list satisfies the preset convergence condition and the final neural network is obtained, the neural network can be used to analyze the disease risk of a target patient, that is, to determine whether the target patient belongs to a high-risk population for the disease.
Specifically, the information of the target patient and the sample information of each patient sample in the patient sample library are input into the neural network, which processes them to obtain the second distance between the target patient and each patient sample. It can be understood that the second distance characterizes the degree of similarity between the target patient and a patient sample: the smaller the second distance, the more similar the two are. In that case, if the disease information of the patient sample indicates disease, the target patient is more likely to have the disease.
On this basis, the k second distances with the smallest values can be taken as the target distances, and the target patient is analyzed according to the corresponding patient samples. If the patient samples corresponding to the target distances are diseased, the target patient can be considered to belong to the high-risk population; if they are not diseased, the target patient can be considered not to belong to it.
Further, if the target patient is determined to belong to the high-risk population, the drug information of the patient samples corresponding to the target distances is analyzed, that is, which drugs those patients took, which drugs their doctors prescribed, and so on. Recommended drug data for the target patient is then generated from this drug information to assist doctors in diagnosing and prescribing, improving their efficiency and accuracy.
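One possible way to turn the neighbours' drug information into recommended drug data is sketched below. The application does not specify how the recommendation is derived; ranking drugs by how many of the k nearest patient samples used them is an assumption made here, and the function name `recommend_drugs` and the parameter `top_n` are hypothetical.

```python
from collections import Counter


def recommend_drugs(neighbor_drug_lists, top_n=3):
    """Rank drugs by how many of the nearest patient samples used them.

    neighbor_drug_lists: one list of drug names per patient sample that
    corresponds to a target distance (assumed representation).
    """
    counts = Counter()
    for drugs in neighbor_drug_lists:
        counts.update(set(drugs))  # count each drug once per patient sample
    return [drug for drug, _ in counts.most_common(top_n)]


# Illustrative input only; these are not data from the application.
neighbors = [
    ["metformin", "sulfonylurea"],
    ["metformin"],
    ["metformin", "DPP4"],
    ["sulfonylurea"],
]
print(recommend_drugs(neighbors, top_n=2))
```

Such a ranking is only meant as decision support for the doctor, in line with the stated purpose of assisting diagnosis and prescription.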
The second distances may be sorted in ascending order to obtain the distance list, in which case the first k second distances in the list are the k smallest. Alternatively, they may be sorted in descending order, in which case the last k second distances in the list are the k smallest.
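The two orderings described above select the same k patient samples, which the following minimal sketch illustrates. The representation of each entry as a `(sample_id, distance)` pair and the function names are assumptions made for illustration.

```python
def k_nearest_ascending(second_distances, k):
    """Sort ascending by distance and keep the first k entries."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return sorted(second_distances, key=lambda pair: pair[1])[:k]


def k_nearest_descending(second_distances, k):
    """Sort descending by distance and keep the last k entries."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return sorted(second_distances, key=lambda pair: pair[1], reverse=True)[-k:]


# Hypothetical (sample_id, second_distance) pairs.
distances = [("p1", 0.9), ("p2", 0.1), ("p3", 0.4), ("p4", 0.7)]
nearest_a = k_nearest_ascending(distances, 2)
nearest_b = k_nearest_descending(distances, 2)
print(nearest_a, nearest_b)
```

Both calls return the samples `p2` and `p3`; only the order within the result differs.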
In step S109, determining whether the target patient belongs to the high-risk population based on the patient samples corresponding to the target distances includes the following steps:
S109-1: Among the patient samples corresponding to the target distances, determine the patient samples whose disease information indicates disease as target samples;
S109-2: If the number of target samples is greater than a second preset quantity threshold, determine that the target patient belongs to the high-risk population; and/or,
S109-3: If the sum of the second distances of all target samples is less than a preset distance threshold, determine that the target patient belongs to the high-risk population.
For steps S109-1 to S109-3, when analyzing the disease risk of the target patient, the patient samples corresponding to the target distances serve as the basis for analysis. The disease information of each such patient sample is examined; if it indicates disease, that patient sample is determined to be a target sample. The disease risk of the target patient is then analyzed according to either the number of target samples or the second distances between the target samples and the target patient. The present application thus provides two methods for determining whether the target patient belongs to the high-risk population, suited to different scenarios or requirements.
Specifically, if the analysis is based on the number of target samples, then when that number is greater than the second preset quantity threshold, that is, when sufficiently many of the patient samples corresponding to the target distances have disease information indicating disease, the target patient can be considered to belong to the high-risk population.
If the analysis is based on the second distances between the target samples and the target patient, the sum of those second distances can be calculated. If the sum is less than the preset distance threshold, that is, when the diseased samples among the patient samples corresponding to the target distances are sufficiently similar to the target patient, the target patient can be considered to belong to the high-risk population.
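The two criteria of steps S109-2 and S109-3 can be sketched as follows. The input representation (a list of `(second_distance, is_diseased)` pairs for the k nearest patient samples) and the function name are assumptions. The sketch combines the criteria with a logical "or" when both thresholds are supplied, which is one reading of "and/or". It also requires at least one target sample before applying the distance criterion, since an empty sum would otherwise trivially fall below any positive threshold; that guard is an addition not stated in the application.

```python
def is_high_risk(neighbors, count_threshold=None, distance_sum_threshold=None):
    """Decide high-risk membership from the k nearest patient samples.

    neighbors: list of (second_distance, is_diseased) pairs (assumed layout).
    """
    # S109-1: target samples are the neighbours whose disease information is "diseased".
    target_distances = [dist for dist, diseased in neighbors if diseased]

    # S109-2: more target samples than the second preset quantity threshold.
    by_count = (
        count_threshold is not None and len(target_distances) > count_threshold
    )
    # S109-3: sum of second distances below the preset distance threshold.
    by_distance = (
        distance_sum_threshold is not None
        and len(target_distances) > 0  # guard added in this sketch
        and sum(target_distances) < distance_sum_threshold
    )
    return by_count or by_distance


# Hypothetical neighbours: two diseased samples close by, one healthy sample further away.
example = [(0.1, True), (0.2, True), (0.5, False)]
print(is_high_risk(example, count_threshold=1))
print(is_high_risk(example, distance_sum_threshold=0.25))
```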
As shown in Figure 5, before at least three patient samples are randomly selected from the patient sample library in step S101, the method includes the following steps:
S100-1: Obtain patient data and generate patient samples from the patient data, where each patient sample includes sample information and disease information, and the sample information includes basic patient information, drug information, and test information;
S100-2: Establish the patient sample library from the patient samples.
For steps S100-1 to S100-2, the patient sample library must first be established so that patient samples can then be selected from it.
The basic patient information includes gender, age, income, occupation, marital and reproductive history, past medical history, family history of hereditary disease, and so on. The disease information includes the disease type and whether the patient has the disease. The test information corresponds to the disease type and includes the test items usually required when examining for that type of disease, together with their results. The drug information also corresponds to the disease type and may include the patient's medication information and the doctor's diagnosis and prescription information.
The test items in the test item information may be derived from the patient's historical medical records or provided by experienced doctors. For example, for patient A with diabetes as the disease type, the test item information may include glycated hemoglobin, low-density lipoprotein cholesterol, blood uric acid, urine protein, triglycerides, fasting blood glucose, and so on; in the medication information corresponding to the disease type, the patient's medication information may include whether metformin, sulfonylureas, GLP-1, or DPP4 is used.
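A patient sample as described in steps S100-1 and S100-2 could be represented roughly as below. The class name `PatientSample`, the field names, and the helper `build_sample_library` are hypothetical, and every value in the example record is made up purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PatientSample:
    # Sample information
    basic_info: Dict[str, object]  # e.g. gender, age, occupation
    drug_info: Dict[str, bool]     # e.g. whether metformin is used
    test_info: Dict[str, float]    # test item name -> result
    # Disease information
    disease_type: str
    is_diseased: bool


def build_sample_library(records: List[dict]) -> List[PatientSample]:
    """S100-1/S100-2: turn raw patient records into a patient sample library."""
    return [PatientSample(**record) for record in records]


# Illustrative record; the numbers are invented, not clinical data.
library = build_sample_library([
    {
        "basic_info": {"gender": "F", "age": 58},
        "drug_info": {"metformin": True, "sulfonylurea": False},
        "test_info": {"glycated_hemoglobin": 7.2, "fasting_blood_glucose": 8.1},
        "disease_type": "diabetes",
        "is_diseased": True,
    }
])
print(library[0].disease_type, library[0].is_diseased)
```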
It can be seen that, in the above scheme, at least three patient samples are input at the same time to train the neural network. Because patient samples with the same outcome and with different outcomes are trained on together, training is efficient and the resulting neural network is accurate. In addition, the present application does not measure similarity on sample information alone: the disease information, which characterizes the outcome, is also introduced during training, and repeated training iterations allow the relative importance of different patient-sample features to be distinguished, further improving the accuracy of the neural network's judgment for the target patient. Furthermore, because a neural network can estimate nonlinear relationships, the scheme addresses problems such as the low efficiency that results when the prior art uses a linear model to calculate the first distance.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
In one embodiment, an optimization apparatus for a disease risk estimation network is provided, which corresponds one-to-one to the optimization method for a disease risk estimation network in the above embodiments. As shown in Figure 6, the apparatus includes an acquisition module, an initialization module, a calculation module, a judgment module, and an optimization module. The functional modules are described in detail as follows:
The acquisition module is configured to obtain a patient sample library;
The initialization module is configured to randomly select at least three patient samples from the patient sample library;
The calculation module is configured to input the sample information of the at least three patient samples into a preset neural network in pairs, and to use the neural network to calculate a first distance between every two patient samples, where the neural network is used to estimate a patient's disease risk;
The calculation module is further configured to calculate a loss value of the neural network according to the first distances;
The judgment module is configured to write the loss value into a loss value list and to judge whether the loss value list satisfies a preset convergence condition, where the loss value list includes the neural network loss value obtained in each calculation;
The optimization module is configured to, if the condition is not satisfied, adjust the parameters of the neural network according to the loss value and return to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
In one embodiment, the calculation module is specifically configured to:
Select any two of the at least three patient samples as target samples, judge whether the disease information of the two target samples is the same, and determine a preset value corresponding to the two target samples according to the judgment result;
Take the difference between the first distance between the two target samples and the preset value as an intermediate difference, and take the square of the intermediate difference as the sub-loss value between the two target samples;
Determine the loss value according to the sub-loss values between every two target samples.
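The sub-loss computation above can be sketched as follows. Several details are assumptions, because this passage does not fix them: the preset values (here `same_target=0.0` for equal disease information and `diff_target=1.0` otherwise), the aggregation of sub-loss values by summation, and the stand-in `distance_fn`, which takes the place of the neural network that produces the first distance.

```python
from itertools import combinations


def pairwise_loss(samples, distance_fn, same_target=0.0, diff_target=1.0):
    """Sum of squared (first distance - preset value) over every pair of samples.

    samples: list of dicts with a boolean "diseased" key (assumed layout).
    distance_fn: placeholder for the neural network's first-distance output.
    """
    total = 0.0
    for a, b in combinations(samples, 2):
        # The preset value depends on whether the disease information matches.
        target = same_target if a["diseased"] == b["diseased"] else diff_target
        intermediate = distance_fn(a, b) - target
        total += intermediate ** 2  # sub-loss value for this pair
    return total


# Toy distance on a single hypothetical feature "x", standing in for the network.
def toy_distance(a, b):
    return abs(a["x"] - b["x"])


batch = [
    {"x": 0.0, "diseased": True},
    {"x": 2.0, "diseased": False},
    {"x": 3.0, "diseased": False},
]
print(pairwise_loss(batch, toy_distance))
```

For this toy batch the three sub-loss values are (2 - 1)^2 = 1, (3 - 1)^2 = 4, and (1 - 0)^2 = 1, so the summed loss is 6.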
In one embodiment, the calculation module is further configured to:
Determine the disease information of each of the at least three patient samples;
If the disease information of the at least three patient samples is all the same, randomly select at least three patient samples from the patient sample library again.
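The re-selection rule above amounts to rejection sampling: batches whose disease information is identical across all samples are discarded and redrawn. A minimal sketch follows; the dictionary layout of a sample, the function name, and the `max_tries` safety cap are assumptions added for illustration.

```python
import random


def draw_mixed_batch(library, n=3, rng=None, max_tries=1000):
    """Randomly draw n patient samples, redrawing while all share the same disease information."""
    if n < 3:
        raise ValueError("at least three patient samples are required")
    if len({sample["diseased"] for sample in library}) < 2:
        raise ValueError("library must contain both diseased and non-diseased samples")
    rng = rng or random.Random()
    for _ in range(max_tries):  # safety cap, not part of the described method
        batch = rng.sample(library, n)
        if len({sample["diseased"] for sample in batch}) > 1:
            return batch
    raise RuntimeError("no mixed batch found within max_tries draws")


# Hypothetical library with one diseased and four non-diseased samples.
toy_library = [{"id": i, "diseased": i == 0} for i in range(5)]
mixed = draw_mixed_batch(toy_library, n=3, rng=random.Random(0))
print([sample["id"] for sample in mixed])
```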
In one embodiment, judging whether the loss value list satisfies the preset convergence condition specifically includes:
If the number of loss values in the loss value record is greater than or equal to a first preset quantity threshold m, and the (N+1)-th to (N+m-1)-th loss function values are all not less than the N-th loss function value, determining that the loss value list satisfies the preset convergence condition, where m is a positive integer, m > 1, and N is a positive integer.
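The convergence condition can be read as an early-stopping test: training stops once the loss has failed to drop below a reference value for m - 1 consecutive calculations. The sketch below assumes that N indexes the first of the most recent m loss values, so the window is the tail of the list; the passage does not state how N is chosen, so this is an interpretation, and the function name is hypothetical.

```python
def has_converged(loss_values, m):
    """Check the condition on the last m loss values (assumed placement of N)."""
    if m <= 1:
        raise ValueError("m must be an integer greater than 1")
    if len(loss_values) < m:
        return False  # fewer than m loss values recorded so far
    window = loss_values[-m:]
    reference = window[0]  # the N-th loss value
    # The (N+1)-th to (N+m-1)-th values must all be >= the N-th value.
    return all(value >= reference for value in window[1:])


print(has_converged([5.0, 4.0, 3.0, 3.5, 3.2], m=3))
print(has_converged([5.0, 4.0, 3.0, 2.0], m=3))
```

In the first call the last three values are 3.0, 3.5, 3.2, none of which improves on 3.0, so the condition holds; in the second the loss is still decreasing, so it does not.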
In one embodiment, the apparatus further includes a sample library establishment module, specifically configured to:
Obtain patient data and generate patient samples from the patient data, where each patient sample includes sample information and disease information, and the sample information includes basic patient information, drug information, and test information;
Establish the patient sample library from the patient samples.
In one embodiment, the apparatus further includes an analysis module, specifically configured to:
Calculate a second distance between the target patient and each patient sample in the patient sample library;
Sort the second distances in ascending order to obtain a distance list, and take the first k second distances in the distance list as target distances, where k is a preset positive integer;
Determine, based on the patient samples corresponding to the target distances, whether the target patient belongs to a high-risk population for the disease;
If so, generate recommended drug data based on the drug information of the patient samples corresponding to the target distances.
In one embodiment, the analysis module is specifically configured to:
Among the patient samples corresponding to the target distances, determine the patient samples whose disease information indicates disease as target samples;
If the number of target samples is greater than a second preset quantity threshold, determine that the target patient belongs to the high-risk population; and/or,
If the sum of the second distances of all target samples is less than a preset distance threshold, determine that the target patient belongs to the high-risk population.
In one embodiment, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor. When the processor executes the computer program, the following steps are implemented:
Randomly select at least three patient samples from a patient sample library, and input the sample information of the at least three patient samples into a preset neural network in pairs, where the neural network is used to estimate a patient's disease risk;
Use the neural network to calculate a first distance between every two patient samples, and calculate a loss value of the neural network according to the first distances;
Write the loss value into a loss value list, and judge whether the loss value list satisfies a preset convergence condition, where the loss value list includes the neural network loss value obtained in each calculation;
If the condition is not satisfied, adjust the parameters of the neural network according to the loss value, randomly select at least three patient samples from the patient sample library again, and recalculate the first distances and the loss value using the adjusted parameters;
If the condition is satisfied, end the operation.
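The control flow of the steps above can be sketched as a loop over four callables. All four (`sample_batch`, `compute_loss`, `adjust_params`, `converged`) are hypothetical placeholders for the sample selection, the neural network's distance and loss computation, the parameter update, and the convergence test; the `max_rounds` cap is a safety measure added in this sketch and is not part of the described method.

```python
def optimize(sample_batch, compute_loss, adjust_params, converged, max_rounds=10_000):
    """Repeat select -> loss -> check -> adjust until the loss value list converges."""
    loss_values = []  # the loss value list
    for _ in range(max_rounds):
        batch = sample_batch()           # randomly select at least three patient samples
        loss = compute_loss(batch)       # first distances -> loss value
        loss_values.append(loss)         # write the loss value into the list
        if converged(loss_values):       # preset convergence condition satisfied
            break                        # end the operation
        adjust_params(loss)              # otherwise adjust the network parameters
    return loss_values


# Toy run with scripted loss values in place of a real network.
scripted = iter([3.0, 2.0, 2.5, 1.0])
updates = []
history = optimize(
    sample_batch=lambda: None,
    compute_loss=lambda batch: next(scripted),
    adjust_params=updates.append,
    converged=lambda values: len(values) >= 2 and values[-1] >= values[-2],
)
print(history, updates)
```

In the toy run the loop stops at the third loss value, the first one that fails to improve on its predecessor, after two parameter adjustments.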
The internal structure of the computer device may be as shown in Figure 7. The computer device includes a processor, a memory, a display screen, and an input device connected through a system bus. The processor provides computing and control capabilities. The memory includes a storage medium and an internal memory. The storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the storage medium. When executed by the processor, the computer program implements the functions or steps of the above optimization method for a disease risk estimation network.
In one embodiment, a storage medium is provided, which may be non-volatile or volatile. A computer program is stored on the storage medium, and when the computer program is executed by a processor, the following steps are implemented:
Randomly select at least three patient samples from a patient sample library, and input the sample information of the at least three patient samples into a preset neural network in pairs, where the neural network is used to estimate a patient's disease risk;
Use the neural network to calculate a first distance between every two patient samples, and calculate a loss value of the neural network according to the first distances;
Write the loss value into a loss value list, and judge whether the loss value list satisfies a preset convergence condition, where the loss value list includes the neural network loss value obtained in each calculation;
If the condition is not satisfied, adjust the parameters of the neural network according to the loss value, randomly select at least three patient samples from the patient sample library again, and recalculate the first distances and the loss value using the adjusted parameters;
If the condition is satisfied, end the operation.
It should be noted that, for the functions or steps that the storage medium or computer device can implement, reference may be made to the relevant descriptions in the foregoing method embodiments; to avoid repetition, they are not described again here. Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the computer device, which may include more or fewer components, combine certain components, or arrange the components differently.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile or volatile storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is only an example. In practical applications, the above functions can be allocated to different functional units and modules as needed; that is, the internal structure of the apparatus can be divided into different functional units or modules to complete all or part of the functions described above.
From the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software together with a necessary general-purpose hardware platform, or by hardware. In the present application, multiple patient samples are first selected and input into the neural network in pairs to calculate the first distance between every two patient samples; the loss value is then calculated from the first distances, and the parameters of the neural network are adjusted according to the loss value, thereby training the neural network. Further, because the multiple patient samples include both diseased and non-diseased samples, similarity relationships and distinguishing relationships are learned at the same time during training, which makes training more efficient. In addition, once a neural network satisfying the condition has been obtained, the target patient can be input into it, and whether the target patient belongs to the high-risk population is determined from the distances between the target patient and each sample. This achieves automatic estimation of a patient's disease risk and assists doctors in diagnosis, improving diagnostic efficiency and accuracy.
Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and that the units or processes in the drawings are not necessarily required for implementing the present application. Those skilled in the art will also understand that the units of the system in an implementation scenario may be distributed in that system as described, or may be changed accordingly and located in one or more systems different from the present implementation scenario. The units of the above implementation scenario may be combined into one unit or further split into multiple sub-units.
The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application.

Claims (20)

  1. An optimization method for a disease risk estimation network, wherein the method comprises:
    obtaining a patient sample library;
    randomly selecting at least three patient samples from the patient sample library;
    inputting the sample information of the three patient samples into a preset neural network in pairs, and using the neural network to calculate a first distance between every two of the patient samples, wherein the neural network is used to estimate a patient's disease risk;
    calculating a loss value of the neural network according to the first distance; writing the loss value into a loss value list, and judging whether the loss value list satisfies a preset convergence condition, wherein the loss value list includes the neural network loss value obtained in each calculation;
    if not, adjusting the parameters of the neural network according to the loss value, and returning to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
  2. The method according to claim 1, wherein calculating the loss value of the neural network according to the first distance specifically comprises:
    selecting any two of the at least three patient samples as target samples, judging whether the disease information of the two target samples is the same, and determining a preset value corresponding to the two target samples according to the judgment result;
    taking the difference between the first distance between the two target samples and the preset value as an intermediate difference, and taking the square of the intermediate difference as the sub-loss value between the two target samples;
    determining the loss value according to the sub-loss values between every two of the target samples.
  3. The method according to claim 2, wherein before inputting the sample information of the at least three patient samples into the preset neural network in pairs, the method further comprises:
    determining the disease information of each of the at least three patient samples;
    if the disease information of the at least three patient samples is all the same, randomly selecting at least three patient samples from the patient sample library again.
  4. The method according to claim 1, wherein judging whether the loss value list satisfies the preset convergence condition specifically comprises:
    if the number of loss values in the loss value record is greater than or equal to a first preset quantity threshold m, and the (N+1)-th to (N+m-1)-th loss function values are all not less than the N-th loss function value, determining that the loss value list satisfies the preset convergence condition, wherein m is a positive integer, m > 1, and N is a positive integer.
  5. The method according to claim 3, wherein before randomly selecting at least three patient samples from the patient sample library, the method further comprises:
    obtaining patient data, and generating the patient samples from the patient data, wherein each patient sample includes the sample information and the disease information, and the sample information includes basic patient information, drug information, and test information;
    establishing the patient sample library from the patient samples.
  6. The method according to claim 5, wherein after the loss value list satisfies the preset convergence condition, the method further comprises:
    calculating a second distance between the target patient and each patient sample in the patient sample library;
    sorting the second distances in ascending order to obtain a distance list, and taking the first k second distances in the distance list as target distances, wherein k is a preset positive integer;
    determining, based on the patient samples corresponding to the target distances, whether the target patient belongs to a high-risk population for the disease;
    if so, generating recommended drug data based on the drug information of the patient samples corresponding to the target distances.
  7. The method according to claim 6, wherein determining, based on the patient samples corresponding to the target distances, whether the target patient belongs to the high-risk population specifically comprises:
    among the patient samples corresponding to the target distances, determining the patient samples whose disease information indicates disease as target samples;
    if the number of the target samples is greater than a second preset quantity threshold, determining that the target patient belongs to the high-risk population; and/or,
    if the sum of the second distances of all the target samples is less than a preset distance threshold, determining that the target patient belongs to the high-risk population.
  8. An optimization apparatus for a disease risk estimation network, wherein the apparatus comprises:
    an acquisition module, configured to acquire a patient sample library;
    an initialization module, configured to randomly select at least three patient samples from the patient sample library;
    a calculation module, configured to input the sample information of the at least three patient samples in pairs into a preset neural network, and to use the neural network to calculate a first distance between every two of the patient samples, wherein the neural network is used to estimate a patient's disease risk;
    the calculation module being further configured to calculate a loss value of the neural network according to the first distances;
    a judgment module, configured to write the loss value into a loss value list and to judge whether the loss value list satisfies a preset convergence condition, wherein the loss value list includes the loss value of the neural network obtained in each calculation; and
    an optimization module, configured to, if the condition is not satisfied, adjust the parameters of the neural network according to the loss value and return to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
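The control flow shared by the modules of claim 8 can be sketched as a plain Python loop. This is only an outline under stated assumptions: `compute_loss`, `adjust_parameters` and `has_converged` are hypothetical callables standing in for the calculation, optimization and judgment modules, and `max_iterations` is a safety cap that the claims do not mention.

```python
import random
from typing import Any, Callable, List, Optional, Sequence


def optimize(
    library: Sequence[Any],
    compute_loss: Callable[[List[Any]], float],
    adjust_parameters: Callable[[float], None],
    has_converged: Callable[[List[float]], bool],
    n_samples: int = 3,
    rng: Optional[random.Random] = None,
    max_iterations: int = 10_000,  # assumption: not part of the claims
) -> List[float]:
    """Repeat sample -> loss -> convergence check -> parameter update."""
    rng = rng or random.Random()
    loss_values: List[float] = []  # the "loss value list"
    for _ in range(max_iterations):
        # Randomly select at least three patient samples from the library.
        batch = rng.sample(list(library), n_samples)
        # Pairwise first distances and the resulting loss are delegated
        # to the hypothetical compute_loss callable.
        loss = compute_loss(batch)
        loss_values.append(loss)
        if has_converged(loss_values):
            break
        # Not converged: adjust the network parameters and sample again.
        adjust_parameters(loss)
    return loss_values
```

Passing the three steps in as callables keeps the sketch independent of any particular network or deep learning framework.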
  9. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements an optimization method for a disease risk estimation network, the method comprising:
    acquiring a patient sample library; randomly selecting at least three patient samples from the patient sample library; inputting the sample information of the three patient samples in pairs into a preset neural network, and using the neural network to calculate a first distance between every two of the patient samples, wherein the neural network is used to estimate a patient's disease risk; calculating a loss value of the neural network according to the first distances; writing the loss value into a loss value list, and judging whether the loss value list satisfies a preset convergence condition, wherein the loss value list includes the loss value of the neural network obtained in each calculation; and if not, adjusting the parameters of the neural network according to the loss value and returning to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
  10. The storage medium according to claim 9, wherein calculating the loss value of the neural network according to the first distances specifically comprises:
    selecting any two of the at least three patient samples as target samples, judging whether the disease information of the two target samples is the same, and determining, according to the judgment result, a preset value corresponding to the two target samples;
    taking the difference between the first distance between the two target samples and the preset value as an intermediate difference, and taking the square of the intermediate difference as a sub-loss value between the two target samples; and
    determining the loss value according to the sub-loss values between every two of the target samples.
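The pairwise loss of claim 10 can be written down in a few lines of Python. The sketch below makes three assumptions that this section does not state: the preset value is 0 for a pair with the same disease information and 1 for a pair with different disease information, the sub-loss values are combined by summation, and `distance(i, j)` is a hypothetical callable returning the first distance that the network produced for samples `i` and `j`.

```python
from itertools import combinations
from typing import Callable, Sequence


def pairwise_loss(
    labels: Sequence[int],
    distance: Callable[[int, int], float],
    same_target: float = 0.0,   # assumed preset value for equal disease information
    diff_target: float = 1.0,   # assumed preset value for differing disease information
) -> float:
    """Sum of squared (first distance - preset value) over all sample pairs."""
    total = 0.0
    for i, j in combinations(range(len(labels)), 2):
        # Choose the preset value from the comparison of disease information.
        preset = same_target if labels[i] == labels[j] else diff_target
        # Intermediate difference, squared, is the sub-loss for this pair.
        total += (distance(i, j) - preset) ** 2
    return total
```

For three samples there are three pairs, so the loss is the sum of three sub-loss values; minimizing it pulls same-label pairs toward one preset distance and different-label pairs toward the other.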
  11. The storage medium according to claim 10, wherein, before the sample information of the at least three patient samples is input in pairs into the preset neural network, the method further comprises:
    determining the disease information of each of the at least three patient samples; and
    if the disease information of the at least three patient samples is all the same, randomly selecting at least three patient samples from the patient sample library again.
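The re-selection step of claim 11 amounts to rejection sampling, which a short Python sketch can make concrete. The dictionary key `"diseased"`, the function name `draw_mixed_batch` and the `max_tries` cap are assumptions for illustration; the claim itself only says to select again when all disease information is identical.

```python
import random
from typing import Any, Dict, List, Optional


def draw_mixed_batch(
    library: List[Dict[str, Any]],
    n_samples: int = 3,
    rng: Optional[random.Random] = None,
    max_tries: int = 1000,  # assumption: guards against a single-class library
) -> List[Dict[str, Any]]:
    """Draw samples until the batch contains more than one disease label."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        batch = rng.sample(library, n_samples)
        # Accept the batch only if the disease information is not all the same.
        if len({sample["diseased"] for sample in batch}) > 1:
            return batch
    raise RuntimeError("could not draw a batch with differing disease information")
```

A batch with only one label would contain no different-label pair, so this check ensures every accepted batch contributes both kinds of pair to the loss.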
  12. The storage medium according to claim 9, wherein judging whether the loss value list satisfies the preset convergence condition specifically comprises:
    if the number of loss values in the loss value record is greater than or equal to a first preset quantity threshold m, and the (N+1)-th through (N+m-1)-th loss function values are all not less than the N-th loss function value, determining that the loss value list satisfies the preset convergence condition, where m is a positive integer, m > 1, and N is a positive integer.
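One possible reading of the convergence condition in claim 12 is sketched below in Python. The claim does not say which N is meant; the sketch assumes N indexes the first entry of the most recent window of m loss values, so the check is whether none of the last m-1 values fell below the value recorded m-1 steps earlier. The function name `has_converged` is hypothetical.

```python
from typing import Sequence


def has_converged(loss_values: Sequence[float], m: int) -> bool:
    """Patience-style check over the most recent m loss values."""
    if m <= 1:
        raise ValueError("m must be an integer greater than 1")
    # Fewer than m recorded loss values: the condition cannot hold yet.
    if len(loss_values) < m:
        return False
    window = loss_values[-m:]  # N-th value followed by the next m-1 values
    # Converged when every later value is not less than the N-th value.
    return all(value >= window[0] for value in window[1:])
```

Under this reading, training stops once the loss has failed to improve for m-1 consecutive iterations.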
  13. The storage medium according to claim 11, wherein, before randomly selecting at least three patient samples from the patient sample library, the method further comprises:
    acquiring patient data, and generating the patient samples according to the patient data, wherein each patient sample includes the sample information and the disease information, and the sample information includes basic patient information, drug information and test information; and
    establishing the patient sample library according to the patient samples.
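The structure of a patient sample described in claim 13 can be pictured as a small record type. The following Python sketch is illustrative only: the class name, the field names and the input keys `"basic"`, `"drugs"`, `"tests"` and `"diseased"` are all hypothetical, since the claim names the categories of information but not their encoding.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class PatientSample:
    basic_info: Dict[str, float]   # basic patient information
    drug_info: List[str]           # drug information
    test_info: Dict[str, float]    # test (laboratory) information
    diseased: bool                 # disease information


def build_library(records: List[Dict[str, Any]]) -> List[PatientSample]:
    """Turn raw patient records into a patient sample library."""
    return [
        PatientSample(
            basic_info=record["basic"],
            drug_info=record["drugs"],
            test_info=record["tests"],
            diseased=record["diseased"],
        )
        for record in records
    ]
```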
  14. The storage medium according to claim 13, wherein, after the loss value list satisfies the preset convergence condition, the method further comprises:
    calculating a second distance between the target patient and each patient sample in the patient sample library;
    sorting the second distances in ascending order to obtain a distance list, and taking the first k second distances in the distance list as target distances, where k is a preset positive integer;
    determining, based on the patient samples corresponding to the target distances, whether the target patient belongs to a high-risk population for the disease; and
    if so, generating recommended drug data based on the drug information of the patient samples corresponding to the target distances;
    wherein determining, based on the patient samples corresponding to the target distances, whether the target patient belongs to the high-risk population for the disease specifically comprises:
    among the patient samples corresponding to the target distances, determining the patient samples whose disease information indicates disease as target samples;
    if the number of the target samples is greater than a second preset quantity threshold, determining that the target patient belongs to the high-risk population for the disease; and/or
    if the sum of the second distances of all the target samples is less than a preset distance threshold, determining that the target patient belongs to the high-risk population for the disease.
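The neighbour selection and drug recommendation steps of claim 14 can be sketched as follows in Python. The `distance` callable stands in for the trained network, the `"drugs"` key is a hypothetical field name, and ranking drugs by how often they occur among the neighbours is an assumption: the claim only says the recommendation is generated from the neighbours' drug information.

```python
from collections import Counter
from typing import Any, Callable, Dict, List

Sample = Dict[str, Any]


def k_nearest(
    target: Sample,
    library: List[Sample],
    distance: Callable[[Sample, Sample], float],
    k: int,
) -> List[Sample]:
    """Sort by second distance in ascending order and keep the first k samples."""
    ranked = sorted(library, key=lambda sample: distance(target, sample))
    return ranked[:k]


def recommend_drugs(neighbors: List[Sample], top_n: int = 3) -> List[str]:
    """Assumed rule: recommend the drugs most frequent among the neighbours."""
    counts = Counter(drug for sample in neighbors for drug in sample["drugs"])
    return [drug for drug, _ in counts.most_common(top_n)]
```

In the claimed flow, `recommend_drugs` would be called only after the target patient has been judged to belong to the high-risk population.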
  15. A computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor, when executing the computer program, implements an optimization method for a disease risk estimation network, the method comprising:
    acquiring a patient sample library; randomly selecting at least three patient samples from the patient sample library; inputting the sample information of the three patient samples in pairs into a preset neural network, and using the neural network to calculate a first distance between every two of the patient samples, wherein the neural network is used to estimate a patient's disease risk; calculating a loss value of the neural network according to the first distances; writing the loss value into a loss value list, and judging whether the loss value list satisfies a preset convergence condition, wherein the loss value list includes the loss value of the neural network obtained in each calculation; and if not, adjusting the parameters of the neural network according to the loss value and returning to the step of randomly selecting at least three patient samples from the patient sample library, until the loss value list satisfies the preset convergence condition.
  16. The computer device according to claim 15, wherein calculating the loss value of the neural network according to the first distances specifically comprises:
    selecting any two of the at least three patient samples as target samples, judging whether the disease information of the two target samples is the same, and determining, according to the judgment result, a preset value corresponding to the two target samples;
    taking the difference between the first distance between the two target samples and the preset value as an intermediate difference, and taking the square of the intermediate difference as a sub-loss value between the two target samples; and
    determining the loss value according to the sub-loss values between every two of the target samples.
  17. The computer device according to claim 16, wherein, before the sample information of the at least three patient samples is input in pairs into the preset neural network, the method further comprises:
    determining the disease information of each of the at least three patient samples; and
    if the disease information of the at least three patient samples is all the same, randomly selecting at least three patient samples from the patient sample library again.
  18. The computer device according to claim 15, wherein judging whether the loss value list satisfies the preset convergence condition specifically comprises:
    if the number of loss values in the loss value record is greater than or equal to a first preset quantity threshold m, and the (N+1)-th through (N+m-1)-th loss function values are all not less than the N-th loss function value, determining that the loss value list satisfies the preset convergence condition, where m is a positive integer, m > 1, and N is a positive integer.
  19. The computer device according to claim 17, wherein, before randomly selecting at least three patient samples from the patient sample library, the method further comprises:
    acquiring patient data, and generating the patient samples according to the patient data, wherein each patient sample includes the sample information and the disease information, and the sample information includes basic patient information, drug information and test information; and
    establishing the patient sample library according to the patient samples.
  20. The computer device according to claim 19, wherein, after the loss value list satisfies the preset convergence condition, the method further comprises:
    calculating a second distance between the target patient and each patient sample in the patient sample library;
    sorting the second distances in ascending order to obtain a distance list, and taking the first k second distances in the distance list as target distances, where k is a preset positive integer;
    determining, based on the patient samples corresponding to the target distances, whether the target patient belongs to a high-risk population for the disease; and
    if so, generating recommended drug data based on the drug information of the patient samples corresponding to the target distances;
    wherein determining, based on the patient samples corresponding to the target distances, whether the target patient belongs to the high-risk population for the disease specifically comprises:
    among the patient samples corresponding to the target distances, determining the patient samples whose disease information indicates disease as target samples;
    if the number of the target samples is greater than a second preset quantity threshold, determining that the target patient belongs to the high-risk population for the disease; and/or
    if the sum of the second distances of all the target samples is less than a preset distance threshold, determining that the target patient belongs to the high-risk population for the disease.
PCT/CN2022/089727 2022-03-21 2022-04-28 Disease risk estimation network optimization method and apparatus, medium, and device WO2023178789A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210278345.8A CN114743665A (en) 2022-03-21 2022-03-21 Optimization method, device, medium and equipment of disease risk estimation network
CN202210278345.8 2022-03-21

Publications (1)

Publication Number Publication Date
WO2023178789A1 (en) 2023-09-28

Family

ID=82276211

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/089727 WO2023178789A1 (en) 2022-03-21 2022-04-28 Disease risk estimation network optimization method and apparatus, medium, and device

Country Status (2)

Country Link
CN (1) CN114743665A (en)
WO (1) WO2023178789A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509963A (en) * 2017-02-28 2018-09-07 株式会社日立制作所 Target otherness detection method based on deep learning and target otherness detection device
CN109493971A (en) * 2019-01-25 2019-03-19 中电健康云科技有限公司 Other fatty liver prediction technique and device are known each other based on tongue
US20190251333A1 (en) * 2017-06-02 2019-08-15 Tencent Technology (Shenzhen) Company Limited Face detection training method and apparatus, and electronic device
US20200345288A1 (en) * 2018-04-24 2020-11-05 Industry Academic Cooperation Foundation, Hallym University A 3-dimensional measurement method for eye movement and fully automated deep-learning based system for vertigo diagnosis
CN112017742A (en) * 2020-09-08 2020-12-01 平安科技(深圳)有限公司 Triage data processing method and device, computer equipment and storage medium
CN113705311A (en) * 2021-04-02 2021-11-26 腾讯科技(深圳)有限公司 Image processing method and apparatus, storage medium, and electronic apparatus

Also Published As

Publication number Publication date
CN114743665A (en) 2022-07-12

Similar Documents

Publication Publication Date Title
Muhlestein et al. Predicting inpatient length of stay after brain tumor surgery: developing machine learning ensembles to improve predictive performance
US11710571B2 (en) Long short-term memory model-based disease prediction method and apparatus, and computer device
CN109659033B (en) Chronic disease state of an illness change event prediction device based on recurrent neural network
Bashir et al. BagMOOV: A novel ensemble for heart disease prediction bootstrap aggregation with multi-objective optimized voting
CN112016295B (en) Symptom data processing method, symptom data processing device, computer equipment and storage medium
US20220044809A1 (en) Systems and methods for using deep learning to generate acuity scores for critically ill or injured patients
JP2017537365A (en) Bayesian causal network model for medical examination and treatment based on patient data
Siristatidis et al. Predicting IVF outcome: a proposed web-based system using artificial intelligence
WO2021164388A1 (en) Triage fusion model training method, triage method, apparatus, device, and medium
US20210125072A1 (en) Neural network dynamic layer-selective training of patient diagnostic states
WO2021114635A1 (en) Patient grouping model constructing method, patient grouping method, and related device
CN111612278A (en) Life state prediction method and device, electronic equipment and storage medium
CN112447270A (en) Medication recommendation method, device, equipment and storage medium
CN116864139A (en) Disease risk assessment method, device, computer equipment and readable storage medium
Yu et al. Predict or draw blood: An integrated method to reduce lab tests
WO2021139223A1 (en) Method and apparatus for interpretation of clustering model, computer device, and storage medium
WO2020087971A1 (en) Prediction model-based hospitalization rationality prediction method and related products
Kadum et al. Machine learning-based telemedicine framework to prioritize remote patients with multi-chronic diseases for emergency healthcare services
Mansoori et al. Optimization of Tree‐Based Machine Learning Models to Predict the Length of Hospital Stay Using Genetic Algorithm
Li et al. StratMed: Relevance stratification between biomedical entities for sparsity on medication recommendation
WO2023178789A1 (en) Disease risk estimation network optimization method and apparatus, medium, and device
CN116719926A (en) Congenital heart disease report data screening method and system based on intelligent medical treatment
WO2023050668A1 (en) Clustering model construction method based on causal inference and medical data processing method
Wang et al. Semisupervised transfer learning for evaluation of model classification performance
Zhang et al. Application of L 1/2 regularization logistic method in heart disease diagnosis

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22932846

Country of ref document: EP

Kind code of ref document: A1