WO2021151326A1 - Electronic medical record screening method and apparatus based on adversarial network, and device and medium - Google Patents

Info

Publication number
WO2021151326A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
discrimination
discriminator
simulation
loss value
Prior art date
Application number
PCT/CN2020/124219
Other languages
French (fr)
Chinese (zh)
Inventor
李彦轩
唐蕊
孙行智
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021151326A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • This application relates to the field of machine learning, and in particular to a method, device, equipment, and medium for screening electronic medical records based on an adversarial network.
  • A medical record documents the treatment information generated during a patient's diagnosis and treatment, and is of great significance to the doctor's diagnosis and treatment.
  • Medical records are gradually becoming electronic, and the resulting electronic medical records are stored in the hospital's electronic medical record database.
  • A screening method for electronic medical records based on an adversarial network, including:
  • the comprehensive discriminator is used to process the medical record to be screened, and the processing result of the medical record to be screened is obtained.
  • An electronic medical record screening device based on an adversarial network, including:
  • the first generating module is used to generate simulated misdiagnosis data through the first generator
  • the second generating module is used to generate simulated missed diagnosis data through the second generator
  • a training module configured to obtain real normal data, and use the real normal data, the simulated misdiagnosis data, and the simulated missed diagnosis data to train the initial comprehensive discriminator;
  • the determining discriminator module is used to determine the trained initial comprehensive discriminator as the comprehensive discriminator after the training is completed;
  • the screening module is configured to use the comprehensive discriminator to process the medical record to be screened, and obtain the processing result of the medical record to be screened.
  • A computer device including a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
  • the comprehensive discriminator is used to process the medical record to be screened, and the processing result of the medical record to be screened is obtained.
  • One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the following steps:
  • the comprehensive discriminator is used to process the medical record to be screened, and the processing result of the medical record to be screened is obtained.
  • In the above adversarial-network-based electronic medical record screening method, device, computer equipment, and storage medium, the first generator produces a large amount of simulated misdiagnosis data close to real data, which improves the initial comprehensive discriminator's ability to discriminate misdiagnosis data.
  • The second generator produces a large amount of simulated missed diagnosis data close to real data, which improves the initial comprehensive discriminator's ability to discriminate missed diagnosis data.
  • The discriminative ability of the model is greatly improved.
  • The trained initial comprehensive discriminator is determined as the comprehensive discriminator, yielding a discriminator that can screen medical records.
  • The comprehensive discriminator is used to process the medical records to be screened, and the processing results of the medical records to be screened are obtained.
  • Using the comprehensive discriminator to screen whether medical records are normal can improve the accuracy and efficiency of medical record screening and reduce its cost.
  • This application can solve the screening problem of electronic medical records. This application can be applied to the smart medical field of smart cities, so as to promote the construction of smart cities.
  • FIG. 1 is a schematic diagram of an application environment of an electronic medical record screening method based on an adversarial network in an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of an electronic medical record screening method based on an adversarial network in an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of an electronic medical record screening method based on an adversarial network in an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of an electronic medical record screening method based on an adversarial network in an embodiment of the present application;
  • FIG. 5 is a schematic flowchart of an electronic medical record screening method based on an adversarial network in an embodiment of the present application;
  • FIG. 6 is a schematic flowchart of an electronic medical record screening method based on an adversarial network in an embodiment of the present application;
  • FIG. 7 is a schematic flowchart of an electronic medical record screening method based on an adversarial network in an embodiment of the present application;
  • FIG. 8 is a schematic structural diagram of an electronic medical record screening device based on an adversarial network in an embodiment of the present application;
  • FIG. 9 is a schematic diagram of a computer device in an embodiment of the present application.
  • The electronic medical record screening method based on an adversarial network can be applied in the application environment shown in FIG. 1, in which the client communicates with the server.
  • The client includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • The server can be implemented as an independent server or as a server cluster composed of multiple servers.
  • A method for screening electronic medical records based on an adversarial network is provided.
  • The method is described using its application to the server in FIG. 1 as an example, and includes the following steps:
  • The first generator is a simulation generator obtained by training a Generative Adversarial Network (GAN), and is used to generate simulated misdiagnosis data.
  • The first generator can generate a large amount of simulated misdiagnosis data close to real data, ensuring that the initial comprehensive discriminator has enough misdiagnosis training data to improve its ability to discriminate misdiagnosis data.
  • Simulated misdiagnosis data is a type of medical record data, and refers to medical records that have misdiagnosis problems.
  • The second generator is likewise a simulation generator obtained by training a Generative Adversarial Network (GAN), and is used to generate simulated missed diagnosis data.
  • The second generator can generate a large amount of simulated missed diagnosis data close to real data, ensuring that the initial comprehensive discriminator has enough missed diagnosis training data to improve its ability to discriminate missed diagnosis data.
  • Simulated missed diagnosis data is a type of medical record data, and refers to medical records that have missed diagnosis problems.
  • Real normal data refers to real medical record data without any missed diagnosis or misdiagnosis.
  • the initial comprehensive discriminator is a three-class classifier.
  • The discrimination data of the initial comprehensive discriminator can be fed back to the first generator and the second generator, which strengthens the coupling between the first generator, the second generator, and the initial comprehensive discriminator (through the loss function) and further improves the discrimination ability of the initial comprehensive discriminator.
  • The discrimination data of the initial comprehensive discriminator converges.
  • The trained initial comprehensive discriminator can be determined as the comprehensive discriminator.
  • The comprehensive discriminator can be used to discriminate the type of medical record data.
  • Because the initial comprehensive discriminator is trained on the simulated data of both the first generator and the second generator (simulated misdiagnosis data and simulated missed diagnosis data), the final comprehensive discriminator has good discrimination capability and can accurately distinguish the types of electronic medical records.
  • the medical record to be screened refers to the medical record that needs to be screened.
  • the comprehensive discriminator is used to process the medical records to be screened, and the processing results of the medical records to be screened can be obtained.
  • The first generator produces a large amount of simulated misdiagnosis data close to real data, which improves the initial comprehensive discriminator's ability to discriminate misdiagnosis data.
  • The second generator produces a large amount of simulated missed diagnosis data close to real data, which improves the initial comprehensive discriminator's ability to discriminate missed diagnosis data.
  • The trained initial comprehensive discriminator is determined as the comprehensive discriminator, yielding a discriminator that can screen medical records.
  • The comprehensive discriminator is used to process the medical records to be screened, and the processing results of the medical records to be screened are obtained.
  • Using the comprehensive discriminator to screen whether medical records are normal can improve the accuracy and efficiency of medical record screening and reduce its cost.
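The screening step can be illustrated with a minimal Python sketch. It assumes that the comprehensive discriminator, as a three-class classifier, outputs one probability per class and that the processing result is the most probable class. The class names, their order, and the function name `screen_record` are hypothetical and do not come from the application.

```python
from typing import Sequence

# Hypothetical class order; the text only states that the discriminator is a
# three-class classifier over normal, misdiagnosis, and missed diagnosis records.
CLASSES = ("normal", "misdiagnosis", "missed_diagnosis")


def screen_record(class_probs: Sequence[float]) -> str:
    """Return the most probable class label for one medical record."""
    if len(class_probs) != len(CLASSES):
        raise ValueError("expected one probability per class")
    best = max(range(len(CLASSES)), key=lambda i: class_probs[i])
    return CLASSES[best]


# Example: the second probability is largest, so the record is flagged as a misdiagnosis.
print(screen_record([0.1, 0.7, 0.2]))  # misdiagnosis
```

How the probabilities are produced depends on the trained model, which this sketch does not attempt to reproduce.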
  • Before step S10, that is, before generating simulated misdiagnosis data through the first generator, the method further includes:
  • the first initial generator receives the first random noise, and generates the first simulation data
  • the first discriminator receives the true misdiagnosis data with a label of 1, and generates first real discrimination data; the first discriminator receives the first simulation data with a label of 0, and generates first simulation discrimination data;
  • The first adversarial neural network includes a first initial generator and a first discriminator.
  • the first random noise can be generated by a random algorithm.
  • the generator will obtain a larger first generation loss value, and adjust the calculation parameters in the first initial generator according to the first generation loss value, so that the data generated by the first initial generator gradually approaches the true misdiagnosis data.
  • If the first discriminator makes a misjudgment when discriminating the first simulation data, it will also obtain a larger first discrimination loss value, and the calculation parameters in the first discriminator are adjusted according to the first discrimination loss value, so that it has a stronger ability to distinguish between the first simulation data and the real misdiagnosis data.
  • Repeating the step of updating the first discriminator refers to repeating the steps related to the first discriminator in steps S101-S104.
  • Repeating the step of updating the first generator refers to repeating the steps related to the first initial generator in steps S101-S104.
  • the update steps of the first discriminator and the first initial generator are performed simultaneously.
  • If the first simulation discrimination data is in the first preset range, the first adversarial neural network can be considered to meet the first preset termination condition.
  • If the first simulation discrimination data that the first discriminator outputs for the first simulation data is 0.5, it is difficult for the first discriminator to determine whether the first simulation data output by the first initial generator is real. That is, the first simulation data generated by the first initial generator is very similar to the real misdiagnosis data. At this point, the first adversarial neural network has converged.
  • The first initial generator receives the first random noise and generates first simulation data; the first initial generator continuously generates new first simulation data.
  • The first discriminator receives the true misdiagnosis data with a label of 1 and generates first real discrimination data; the first discriminator receives the first simulation data with a label of 0 and generates first simulation discrimination data. Because the first discriminator discriminates the real misdiagnosis data and the first simulation data at the same time, its ability to discriminate misdiagnosis data improves.
  • The first discriminator is updated according to the first discrimination loss value, and the first initial generator is updated according to the first generation loss value.
  • The parameters of the respective models are gradually updated through the loss values to improve the accuracy of the models.
  • The step of updating the first discriminator and the step of updating the first initial generator are repeated until the first preset termination condition is met, that is, until the first simulation discrimination data is in the first preset range, which completes the training of the model.
  • the first initial generator that satisfies the first preset termination condition is determined as the first generator, so as to obtain a first generator that can be used to generate first simulation data.
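The termination condition can be sketched as a simple range check. The sketch below assumes that the "preset range" is an interval around 0.5 applied to the mean discriminator score on a batch of simulation data. The bounds 0.45 and 0.55, the use of a batch mean, and the function name `meets_termination` are illustrative assumptions, since the text does not state the range.

```python
from statistics import mean
from typing import Sequence


def meets_termination(sim_scores: Sequence[float],
                      low: float = 0.45,    # assumed lower bound of the preset range
                      high: float = 0.55    # assumed upper bound of the preset range
                      ) -> bool:
    """True when the mean discriminator score on simulation data lies in [low, high]."""
    return low <= mean(sim_scores) <= high


# Scores near 0.5 mean the discriminator can no longer tell simulated from real data.
print(meets_termination([0.50, 0.52, 0.48]))  # True
print(meets_termination([0.05, 0.10]))        # False
```

The same check would apply to the second adversarial neural network with its own preset range.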
  • In step S103, calculating the first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data includes:
  • x_r1 represents the true misdiagnosis data
  • D_1(x) represents the first real discrimination data
  • E is the expectation operator
  • x_f1 represents the first simulation data
  • z_1 represents the first random noise
  • G_1(z_1) represents the first simulation data
  • D_1(G_1(z_1)) represents the first simulation discrimination data
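The loss formula itself does not survive in this text, so the following Python sketch uses the standard GAN discriminator loss as a stand-in that is consistent with the symbols listed above: the negative mean of log D_1(x_r1) plus the negative mean of log(1 - D_1(G_1(z_1))). Whether the application uses exactly this form is an assumption, and the function name and the clipping constant `eps` are hypothetical.

```python
import math
from typing import Sequence


def discriminator_loss(real_scores: Sequence[float],
                       sim_scores: Sequence[float],
                       eps: float = 1e-12) -> float:
    """Standard GAN discriminator loss (assumed form, not taken from the source).

    real_scores: D_1(x_r1) for real misdiagnosis records (label 1)
    sim_scores:  D_1(G_1(z_1)) for simulated records (label 0)
    """
    # Batch means stand in for the expectation E; eps guards against log(0).
    real_term = sum(math.log(max(s, eps)) for s in real_scores) / len(real_scores)
    sim_term = sum(math.log(max(1.0 - s, eps)) for s in sim_scores) / len(sim_scores)
    return -(real_term + sim_term)


# A discriminator that separates the two sets well has a lower loss than one
# that outputs 0.5 everywhere.
print(discriminator_loss([0.9], [0.1]) < discriminator_loss([0.5], [0.5]))  # True
```

Under the same assumption, the second discrimination loss would have the same shape with D_2, G_2, x_r2, and z_2 in place of the first network's symbols.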
  • In step S103, calculating the first generation loss value of the first initial generator according to the first simulation discrimination data includes:
  • D_1(x_f1) is the first simulation discrimination data
  • D_3(x_f1) is the discrimination data generated after the initial comprehensive discriminator discriminates the first simulation data
  • the coefficient of the log D_3(x_f1) term is a hyperparameter
  • G_1 refers to the first initial generator
  • D_1 refers to the first discriminator.
  • the first initial generator is used to generate the first simulation data
  • the first discriminator is used to discriminate the first simulation data and generate the first simulation discrimination data; it is also used to discriminate the real misdiagnosis data and generate the first real discrimination data.
  • The training of the first initial generator and the first discriminator is an adversarial process.
  • The coefficient of the log D_3(x_f1) term is a hyperparameter, which can be set before model training.
  • A loss term, the hyperparameter-weighted log D_3(x_f1), which includes the discrimination data of D_3 (the initial comprehensive discriminator), can be added.
  • Adding this log D_3(x_f1) term can make the distribution of the misdiagnosis data generated by the first generator meet the requirements of the actual scene.
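To make the role of the extra term concrete, the sketch below combines a non-saturating adversarial term on D_1(x_f1) with a weighted term on D_3(x_f1). The exact formula and the symbol of the hyperparameter are not recoverable from this text, so the signs, the non-saturating form, the reading of D_3(x_f1) as the probability that the initial comprehensive discriminator assigns to the misdiagnosis class, and the parameter name `weight` are all assumptions.

```python
import math
from typing import Sequence


def generator_loss(d1_sim_scores: Sequence[float],
                   d3_sim_scores: Sequence[float],
                   weight: float,
                   eps: float = 1e-12) -> float:
    """Assumed generator loss: -mean(log D_1(x_f1)) - weight * mean(log D_3(x_f1)).

    d1_sim_scores: first discriminator outputs on simulated records
    d3_sim_scores: comprehensive discriminator outputs on the same records
                   (assumed here to be the misdiagnosis-class probability)
    weight:        hyperparameter fixed before training
    """
    adversarial = -sum(math.log(max(s, eps)) for s in d1_sim_scores) / len(d1_sim_scores)
    auxiliary = -sum(math.log(max(s, eps)) for s in d3_sim_scores) / len(d3_sim_scores)
    return adversarial + weight * auxiliary


# With weight 0 only the adversarial term remains; a larger weight makes the
# generator care more about the comprehensive discriminator's judgment.
print(generator_loss([0.5], [0.5], 0.0) < generator_loss([0.5], [0.5], 1.0))  # True
```

In this reading, a low D_3 score on simulated records raises the generation loss, which pushes the generator toward data that the comprehensive discriminator also recognizes as the intended class.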
  • Before step S20, that is, before generating the simulated missed diagnosis data through the second generator, the method further includes:
  • the second initial generator receives the second random noise, and generates second simulation data
  • the second discriminator receives the real missed diagnosis data with a label of 1, and generates second real discrimination data; the second discriminator receives the second simulated data with a label of 0, and generates second simulated discrimination data;
  • the second adversarial neural network includes a second initial generator and a second discriminator.
  • the second random noise can be generated by a random algorithm.
  • the generator will obtain a larger second generation loss value, and adjust the calculation parameters in the second initial generator according to the second generation loss value, so that the data generated by the second initial generator gradually approaches the real missed diagnosis data.
  • If the second discriminator makes a misjudgment when discriminating the second simulation data, it will also obtain a larger second discrimination loss value, and the calculation parameters in the second discriminator are adjusted according to the second discrimination loss value, so that it has a stronger ability to distinguish between the second simulation data and the real missed diagnosis data.
  • Repeating the step of updating the second discriminator refers to repeating the steps related to the second discriminator in steps S201-S204.
  • Repeating the step of updating the second generator refers to repeating the steps related to the second initial generator in steps S201-S204.
  • the update steps of the second discriminator and the second initial generator are performed simultaneously.
  • If the second simulation discrimination data is in the second preset range, the second adversarial neural network can be considered to meet the second preset termination condition.
  • If the second simulation discrimination data that the second discriminator outputs for the second simulation data is 0.5, it is difficult for the second discriminator to determine whether the second simulation data output by the second initial generator is real. That is, the second simulation data generated by the second initial generator is very similar to the real missed diagnosis data. At this point, the second adversarial neural network has converged.
  • The second initial generator receives the second random noise and generates second simulation data; the second initial generator continuously generates new second simulation data.
  • The second discriminator receives the true missed diagnosis data with a label of 1 and generates second real discrimination data; the second discriminator receives the second simulation data with a label of 0 and generates second simulation discrimination data. Because the second discriminator discriminates the real missed diagnosis data and the second simulation data at the same time, its ability to discriminate missed diagnosis data improves.
  • the second discriminator is updated according to the second discriminant loss value, and the second initial generator is updated according to the second generation loss value.
  • The parameters of the respective models are gradually updated through the loss values to improve the accuracy of the models.
  • The step of updating the second discriminator and the step of updating the second initial generator are repeated until the second preset termination condition is met, that is, until the second simulation discrimination data is in the second preset range, which completes the training of the model.
  • the second initial generator that meets the second preset termination condition is determined as the second generator to obtain a second generator that can be used to generate second simulation data.
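The labeling convention used when feeding a discriminator (real records with label 1, simulated records with label 0) can be sketched as follows. The function name `build_discriminator_batch`, the shuffling step, and the `seed` parameter are illustrative assumptions, and records are treated as opaque values.

```python
import random
from typing import Any, List, Optional, Sequence, Tuple


def build_discriminator_batch(real_records: Sequence[Any],
                              sim_records: Sequence[Any],
                              seed: Optional[int] = None) -> List[Tuple[Any, int]]:
    """Pair real records with label 1 and simulated records with label 0."""
    batch = [(r, 1) for r in real_records] + [(s, 0) for s in sim_records]
    # Shuffling is an assumption; it keeps the discriminator from seeing
    # all real records before all simulated ones.
    random.Random(seed).shuffle(batch)
    return batch


batch = build_discriminator_batch(["real_a", "real_b"], ["sim_a"], seed=0)
print(sum(label for _, label in batch))  # 2
```

The same helper would serve either adversarial network, since both use the 1/0 labeling described in the text.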
  • In step S203, calculating the second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulation discrimination data includes:
  • x_r2 represents the real missed diagnosis data
  • D_2(x) represents the second real discrimination data
  • E is the expectation operator
  • x_f2 represents the second simulation data
  • z_2 represents the second random noise
  • G_2(z_2) represents the second simulation data
  • D_2(G_2(z_2)) represents the second simulation discrimination data
  • In step S203, calculating the second generation loss value of the second initial generator according to the second simulation discrimination data includes:
  • D_2(x_f2) is the second simulation discrimination data
  • D_3(x_f2) is the discrimination data generated after the initial comprehensive discriminator discriminates the second simulation data
  • the coefficient of the log D_3(x_f2) term is a hyperparameter
  • G_2 refers to the second initial generator
  • D_2 refers to the second discriminator.
  • the second initial generator is used to generate second simulation data
  • the second discriminator is used to discriminate the second simulation data and generate the second simulation discrimination data; it is also used to discriminate the true missed diagnosis data and generate the second real discrimination data.
  • The training of the second initial generator and the second discriminator is an adversarial process.
  • The coefficient of the log D_3(x_f2) term is a hyperparameter, which can be set before model training.
  • A loss term, the hyperparameter-weighted log D_3(x_f2), which includes the discrimination data of D_3 (the initial comprehensive discriminator), can be added.
  • Adding this log D_3(x_f2) term can make the distribution of the missed diagnosis data generated by the second generator meet the requirements of the actual scene.
  • Step S30, that is, acquiring real normal data and using the real normal data, the simulated misdiagnosis data, and the simulated missed diagnosis data to train the initial comprehensive discriminator, includes:
  • S302: calculating a comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed diagnosis rate, the misdiagnosis rate, and the normal rate;
  • The missed diagnosis rate in the comprehensive discrimination data is the ratio of the number of medical records that the initial comprehensive discriminator judges to be missed diagnoses to the total number of discriminated medical records. For example, if the total number of discriminated medical records is 100 and the initial comprehensive discriminator judges 4 of them to be missed diagnoses, the missed diagnosis rate in the comprehensive discrimination data is 4%.
  • The misdiagnosis rate in the comprehensive discrimination data is the ratio of the number of medical records that the initial comprehensive discriminator judges to be misdiagnoses to the total number of discriminated medical records; the normal rate in the comprehensive discrimination data is the ratio of the number of medical records that the initial comprehensive discriminator judges to be normal to the total number of discriminated medical records.
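The three rates are plain proportions of the discriminator's judgments, which the short Python sketch below computes. The label strings and the function name `discrimination_rates` are hypothetical. The example reuses the 4-in-100 missed diagnosis figure from the text, while the split of the remaining 96 records is invented for illustration.

```python
from collections import Counter
from typing import Dict, Sequence

LABELS = ("missed_diagnosis", "misdiagnosis", "normal")  # hypothetical label names


def discrimination_rates(predicted_labels: Sequence[str]) -> Dict[str, float]:
    """Share of discriminated records assigned to each of the three classes."""
    total = len(predicted_labels)
    if total == 0:
        raise ValueError("no records were discriminated")
    counts = Counter(predicted_labels)
    return {label: counts[label] / total for label in LABELS}


# 100 records, 4 judged as missed diagnoses (the other counts are made up).
labels = ["missed_diagnosis"] * 4 + ["misdiagnosis"] * 6 + ["normal"] * 90
rates = discrimination_rates(labels)
print(rates["missed_diagnosis"])  # 0.04
```

Because every record receives exactly one of the three labels, the three rates sum to 1.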
  • the step of repeatedly updating the initial comprehensive discriminator refers to repeatedly performing steps S301-S303.
  • The preset convergence condition can mean that the comprehensive discrimination loss value approaches a certain value.
  • Step S302, that is, calculating the comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed diagnosis rate, the misdiagnosis rate, and the normal rate, includes:
  • the missed diagnosis rate, the misdiagnosis rate, and the normal rate are processed by a comprehensive loss function to generate the comprehensive discrimination loss value, and the comprehensive loss function is:
  • the comprehensive discrimination loss value can be calculated by the comprehensive loss function.
  • the comprehensive discriminant loss value approaches a certain value.
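One way to operationalize "the loss value approaches a certain value" is to check that the most recent loss values stay within a small band. The sketch below does this over a sliding window. The window size, the tolerance, and the function name `has_converged` are assumptions, and the comprehensive loss function itself is not reproduced because it is missing from this text.

```python
from typing import Sequence


def has_converged(loss_history: Sequence[float],
                  window: int = 5,      # assumed number of recent updates to inspect
                  tol: float = 1e-3     # assumed width of the acceptable band
                  ) -> bool:
    """True when the last `window` loss values differ by at most `tol`."""
    if len(loss_history) < window:
        return False
    recent = loss_history[-window:]
    return max(recent) - min(recent) <= tol


history = [1.0, 0.5, 0.3012, 0.3010, 0.3011, 0.3010, 0.3011]
print(has_converged(history))  # True
```

A training loop would append the comprehensive discrimination loss value after each update and stop repeating steps S301-S303 once this check passes.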
  • An electronic medical record screening device based on an adversarial network is provided, and the device corresponds one-to-one to the electronic medical record screening method based on an adversarial network in the above-mentioned embodiment.
  • The electronic medical record screening device based on an adversarial network includes a first generation module 10, a second generation module 20, a training module 30, a determining discriminator module 40, and a screening module 50.
  • the detailed description of each functional module is as follows:
  • the first generating module 10 is configured to generate simulated misdiagnosis data through the first generator
  • the second generating module 20 is used to generate simulated missed diagnosis data through the second generator
  • the training module 30 is configured to obtain real normal data, and use the real normal data, the simulated misdiagnosis data, and the simulated missed diagnosis data to train the initial comprehensive discriminator;
  • the determining discriminator module 40 is used to determine the trained initial comprehensive discriminator as a comprehensive discriminator after the training is completed;
  • the screening module 50 is configured to use the comprehensive discriminator to process the medical record to be screened, and obtain the processing result of the medical record to be screened.
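The way the five modules hand data to one another can be sketched as a small Python class. This is only an illustration of the data flow: the class name `ScreeningDevice`, the callable signatures, and the `Record` type are hypothetical, and the generators and the training routine are passed in as placeholders rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Record = Sequence[float]                   # hypothetical record representation
Discriminator = Callable[[Record], str]    # maps a record to a class label


@dataclass
class ScreeningDevice:
    generate_misdiagnosis: Callable[[int], List[Record]]  # first generation module 10
    generate_missed: Callable[[int], List[Record]]        # second generation module 20
    train: Callable[[List[Record], List[Record], List[Record]], Discriminator]  # training module 30

    def build_and_screen(self, normal: List[Record], n_sim: int,
                         to_screen: List[Record]) -> List[str]:
        misdiagnosed = self.generate_misdiagnosis(n_sim)
        missed = self.generate_missed(n_sim)
        # Module 40: the trained model becomes the comprehensive discriminator.
        discriminator = self.train(normal, misdiagnosed, missed)
        # Module 50: screen each record with the comprehensive discriminator.
        return [discriminator(record) for record in to_screen]


# Stub components stand in for trained models.
device = ScreeningDevice(
    generate_misdiagnosis=lambda n: [[1.0]] * n,
    generate_missed=lambda n: [[2.0]] * n,
    train=lambda normal, mis, missed: (lambda r: "normal" if r[0] < 0.5 else "misdiagnosis"),
)
print(device.build_and_screen([[0.0]], 2, [[0.1], [0.9]]))  # ['normal', 'misdiagnosis']
```

Keeping the generators and the training routine as injected callables mirrors the modular description without committing to any particular model architecture.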
  • the first generation module 10 includes:
  • the first initial generator receives the first random noise, and generates the first simulation data
  • the first discriminating unit is used for the first discriminator to receive the true misdiagnosis data with a label of 1 and generate first real discrimination data, and for the first discriminator to receive the first simulation data with a label of 0 and generate first simulation discrimination data;
  • the first loss value calculating unit is configured to calculate the first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data, and to calculate the first generation loss value of the first initial generator according to the first simulation discrimination data;
  • a first update unit configured to update the first discriminator according to the first discrimination loss value, and update the first initial generator according to the first generation loss value
  • the first iterative update unit is configured to repeat the step of updating the first discriminator and the step of updating the first initial generator until the first preset termination condition is satisfied, the first preset termination condition being that the first simulation discrimination data is in the first preset range;
  • a first generator determining unit, configured to determine the first initial generator meeting the first preset termination condition as the first generator.
  • The first loss value calculating unit includes:
  • a first discrimination loss value calculating unit, configured to process the first real discrimination data and the first simulation discrimination data through a first discrimination loss function to generate the first discrimination loss value, and the first discrimination loss function is:
  • x_r1 represents the true misdiagnosis data
  • D_1(x) represents the first real discrimination data
  • E is the expectation operator
  • x_f1 represents the first simulation data
  • z_1 represents the first random noise
  • G_1(z_1) represents the first simulation data
  • D_1(G_1(z_1)) represents the first simulation discrimination data
  • a first generation loss value calculating unit, configured to process the first simulation discrimination data through a first generation loss function to generate the first generation loss value, and the first generation loss function is:
  • D_1(x_f1) is the first simulation discrimination data
  • D_3(x_f1) is the discrimination data generated after the initial comprehensive discriminator discriminates the first simulation data
  • the coefficient of the log D_3(x_f1) term is a hyperparameter
  • the second generation module 20 includes:
  • the second initial generator receives the second random noise, and generates the second simulation data
  • the second discriminating unit is used for the second discriminator to receive the true missed diagnosis data with a label of 1 and generate second real discrimination data, and for the second discriminator to receive the second simulation data with a label of 0 and generate second simulation discrimination data;
  • the second loss value calculating unit is configured to calculate the second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulation discrimination data, and to calculate the second generation loss value of the second initial generator according to the second simulation discrimination data;
  • a second update unit configured to update the second discriminator according to the second discrimination loss value, and update the second initial generator according to the second generation loss value
  • the second iterative update unit is configured to repeat the step of updating the second discriminator and the step of updating the second initial generator until the second preset termination condition is satisfied, the second preset termination condition being that the second simulation discrimination data is in the second preset range;
  • a second generator determining unit, configured to determine the second initial generator meeting the second preset termination condition as the second generator.
  • the unit for calculating the second loss value includes:
  • a second discrimination loss calculation unit, configured to process the second real discrimination data and the second simulation discrimination data through a second discrimination loss function to generate the second discrimination loss value, the second discrimination loss function being:
  • $x_{r2}$ represents the real missed-diagnosis data
  • $D_2(x)$ represents the second real discrimination data
  • $E$ is the expectation operator
  • $x_{f2}$ represents the second simulation data
  • $z_2$ represents the second random noise
  • $G_2(z_2)$ represents the second simulation data
  • $D_2(G_2(z_2))$ represents the second simulation discrimination data
  • a second generation loss calculation unit, configured to process the second simulation discrimination data through a second generation loss function to generate the second generation loss value, the second generation loss function being:
  • $D_2(x_{f2})$ is the second simulation discrimination data
  • $D_3(x_{f2})$ is the discrimination data generated when the initial comprehensive discriminator discriminates the second simulation data
  • is a hyperparameter
  • the training module 30 includes:
  • a comprehensive discrimination data generating unit, configured to use the initial comprehensive discriminator to discriminate the real normal data, the simulated misdiagnosis data and the simulated missed-diagnosis data and to generate comprehensive discrimination data, the comprehensive discrimination data including a missed-diagnosis rate, a misdiagnosis rate and a normal rate;
  • an initial comprehensive discriminator updating unit, configured to update the initial comprehensive discriminator according to the comprehensive discrimination loss value;
  • an initial comprehensive discriminator iterative updating unit, configured to repeat the step of updating the initial comprehensive discriminator until the comprehensive discrimination loss value meets the preset convergence condition.
  • a comprehensive discrimination loss value generating unit, further configured to process the missed-diagnosis rate, the misdiagnosis rate and the normal rate through a comprehensive loss function to generate the comprehensive discrimination loss value, the comprehensive loss function being:
  • the modules in the above adversarial-network-based electronic medical record screening apparatus can be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 9.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store the data involved in the above-mentioned electronic medical record screening method.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection. When the computer-readable instructions are executed by the processor, a method for screening electronic medical records based on an adversarial network is implemented.
  • a computer device including a memory, a processor, and computer-readable instructions stored on the memory and capable of running on the processor, and the processor implements the following steps when the processor executes the computer-readable instructions:
  • the comprehensive discriminator is used to process the medical record to be screened, and the processing result of the medical record to be screened is obtained.
  • one or more computer-readable storage media storing computer-readable instructions are provided.
  • the readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media.
  • the readable storage medium stores computer readable instructions, and when the computer readable instructions are executed by one or more processors, the following steps are implemented:
  • the comprehensive discriminator is used to process the medical record to be screened, and the processing result of the medical record to be screened is obtained.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Abstract

An electronic medical record screening method and apparatus based on an adversarial network, and a device and a medium. The method comprises: a first generator generating simulated misdiagnosis data (S10); a second generator generating simulated missed diagnosis data (S20); acquiring real normal data, and training an initial synthetic discriminator by using the real normal data, the simulated misdiagnosis data and the simulated missed diagnosis data (S30); after the training is completed, determining the trained initial synthetic discriminator to be a synthetic discriminator (S40); and processing, by using the synthetic discriminator, medical records to be subjected to screening to obtain a processing result regarding said medical records (S50). The problem of screening electronic medical records is solved, and medical record screening costs are reduced.

Description

Method, apparatus, device and medium for screening electronic medical records based on an adversarial network
This application claims priority to Chinese patent application No. 202010941842.2, filed with the Chinese Patent Office on September 9, 2020 and entitled "Electronic Medical Record Screening Method, Apparatus, Device and Medium Based on Adversarial Network", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of machine learning, and in particular to a method, apparatus, device and medium for screening electronic medical records based on an adversarial network.
Background
A medical record documents the treatment information generated while a patient is diagnosed and treated, and is of great significance to a doctor's diagnosis and treatment. With the spread of computer technology, medical records have gradually become electronic, and the resulting electronic medical records are stored in the hospital's electronic medical record database.
However, the inventors found that, because doctors differ in skill and experience, the quality of recorded medical records is uneven, which easily leads to errors in archived electronic medical records and creates hidden medical risks. Electronic medical records therefore need to be screened so that problem records are found in time. Manual screening incurs high labor and time costs and greatly increases a hospital's operating costs. There is thus an urgent need for a non-manual screening method that picks out problem records from electronic medical records.
Summary
In view of this, it is necessary to address the above technical problem by providing a method, apparatus, computer device and storage medium for screening electronic medical records based on an adversarial network, so as to solve the problem of screening electronic medical records.
A method for screening electronic medical records based on an adversarial network, comprising:
generating simulated misdiagnosis data through a first generator;
generating simulated missed-diagnosis data through a second generator;
acquiring real normal data, and training an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data and the simulated missed-diagnosis data;
after the training is completed, determining the trained initial comprehensive discriminator as a comprehensive discriminator;
processing a medical record to be screened using the comprehensive discriminator, to obtain a processing result for the medical record to be screened.
An apparatus for screening electronic medical records based on an adversarial network, comprising:
a first generating module, configured to generate simulated misdiagnosis data through a first generator;
a second generating module, configured to generate simulated missed-diagnosis data through a second generator;
a training module, configured to acquire real normal data and to train an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data and the simulated missed-diagnosis data;
a discriminator determining module, configured to determine, after the training is completed, the trained initial comprehensive discriminator as a comprehensive discriminator;
a screening module, configured to process a medical record to be screened using the comprehensive discriminator, to obtain a processing result for the medical record to be screened.
A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
generating simulated misdiagnosis data through a first generator;
generating simulated missed-diagnosis data through a second generator;
acquiring real normal data, and training an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data and the simulated missed-diagnosis data;
after the training is completed, determining the trained initial comprehensive discriminator as a comprehensive discriminator;
processing a medical record to be screened using the comprehensive discriminator, to obtain a processing result for the medical record to be screened.
One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
generating simulated misdiagnosis data through a first generator;
generating simulated missed-diagnosis data through a second generator;
acquiring real normal data, and training an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data and the simulated missed-diagnosis data;
after the training is completed, determining the trained initial comprehensive discriminator as a comprehensive discriminator;
processing a medical record to be screened using the comprehensive discriminator, to obtain a processing result for the medical record to be screened.
In the above method, apparatus, computer device and storage medium for screening electronic medical records based on an adversarial network, the first generator generates a large amount of simulated misdiagnosis data that is close to real data, which improves the initial comprehensive discriminator's ability to discriminate misdiagnosis data. The second generator generates a large amount of simulated missed-diagnosis data that is close to real data, which improves the initial comprehensive discriminator's ability to discriminate missed-diagnosis data. Real normal data is acquired, and the initial comprehensive discriminator is trained using the real normal data, the simulated misdiagnosis data and the simulated missed-diagnosis data; because enough data is available for training, the model's discrimination ability is greatly improved. After the training is completed, the trained initial comprehensive discriminator is determined as the comprehensive discriminator, yielding a discriminator that can screen medical records. The comprehensive discriminator processes the medical record to be screened and produces a processing result; screening whether a record is normal in this way can improve the accuracy and efficiency of medical record screening and reduce its cost. This application can thus solve the problem of screening electronic medical records. It can be applied in the smart healthcare field of smart cities, thereby promoting the construction of smart cities.
The details of one or more embodiments of this application are set forth in the drawings and description below; other features and advantages of this application will become apparent from the description, the drawings and the claims.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment of the adversarial-network-based electronic medical record screening method in an embodiment of this application;
FIG. 2 is a schematic flowchart of the adversarial-network-based electronic medical record screening method in an embodiment of this application;
FIG. 3 is a schematic flowchart of the adversarial-network-based electronic medical record screening method in an embodiment of this application;
FIG. 4 is a schematic flowchart of the adversarial-network-based electronic medical record screening method in an embodiment of this application;
FIG. 5 is a schematic flowchart of the adversarial-network-based electronic medical record screening method in an embodiment of this application;
FIG. 6 is a schematic flowchart of the adversarial-network-based electronic medical record screening method in an embodiment of this application;
FIG. 7 is a schematic flowchart of the adversarial-network-based electronic medical record screening method in an embodiment of this application;
FIG. 8 is a schematic structural diagram of the adversarial-network-based electronic medical record screening apparatus in an embodiment of this application;
FIG. 9 is a schematic diagram of a computer device in an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of this application.
The adversarial-network-based electronic medical record screening method provided in this embodiment can be applied in the application environment shown in FIG. 1, in which a client communicates with a server. The client includes, but is not limited to, personal computers, notebook computers, smartphones, tablet computers and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.
In an embodiment, as shown in FIG. 2, a method for screening electronic medical records based on an adversarial network is provided. Taking the method as applied to the server in FIG. 1 as an example, it includes the following steps:
S10. Generate simulated misdiagnosis data through the first generator.
In this embodiment, the first generator is a simulation generator obtained by training within a generative adversarial network (GAN), and is used to generate simulated misdiagnosis data. The first generator can generate a large amount of simulated misdiagnosis data that is close to real data, ensuring that the initial comprehensive discriminator has enough misdiagnosis training data and thereby improving its ability to discriminate misdiagnosis data. Simulated misdiagnosis data is a type of medical record data: it represents medical records that contain a misdiagnosis.
S20. Generate simulated missed-diagnosis data through the second generator.
Similarly, the second generator is a simulation generator obtained by training within a generative adversarial network (GAN), and is used to generate simulated missed-diagnosis data. The second generator can generate a large amount of simulated missed-diagnosis data that is close to real data, ensuring that the initial comprehensive discriminator has enough missed-diagnosis training data and thereby improving its ability to discriminate missed-diagnosis data. Simulated missed-diagnosis data is a type of medical record data: it represents medical records that contain a missed diagnosis.
S30. Acquire real normal data, and train an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data and the simulated missed-diagnosis data.
Real normal data is genuine medical record data that contains neither a missed diagnosis nor a misdiagnosis. The initial comprehensive discriminator is a three-class classifier. Training it on real normal data, simulated misdiagnosis data and simulated missed-diagnosis data at the same time can greatly improve its ability to discriminate these three types of data, so that it can accurately determine the category of a medical record. Specifically, during training, the discrimination data of the initial comprehensive discriminator can be fed back to the first generator and the second generator, which strengthens the coupling (through the loss functions) between the first generator, the second generator and the initial comprehensive discriminator and further improves the discrimination ability of the initial comprehensive discriminator.
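As a rough illustration of what a three-class discriminator output and its loss could look like, the sketch below turns three raw scores into a missed-diagnosis rate, a misdiagnosis rate and a normal rate with a softmax, then scores them with categorical cross-entropy. The softmax, the cross-entropy loss, the class names and the example scores are all assumptions made for illustration; the comprehensive loss function actually used in the application is not reproduced in this text.

```python
import math

# Hypothetical class order, chosen only for this illustration.
CLASSES = ("missed_diagnosis", "misdiagnosis", "normal")


def softmax(logits):
    """Convert raw scores into rates that are positive and sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(rates, true_index):
    """Negative log of the rate assigned to the true class (assumed loss)."""
    return -math.log(rates[true_index])


# Made-up raw scores for one record whose true class is "normal".
rates = softmax([0.2, 0.1, 2.0])
loss = cross_entropy(rates, CLASSES.index("normal"))
```

The loss is small when the rate for the true class is high and grows as the classifier shifts probability toward the wrong classes, which is the signal a three-class discriminator would be updated on.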
S40. After the training is completed, determine the trained initial comprehensive discriminator as the comprehensive discriminator.
When training is complete, the discrimination data of the initial comprehensive discriminator has converged, and the trained initial comprehensive discriminator can be determined as the comprehensive discriminator. The comprehensive discriminator can be used to determine the type of medical record data. Because the initial comprehensive discriminator is trained on the simulated data of both the first generator and the second generator (the simulated misdiagnosis data and the simulated missed-diagnosis data), the resulting comprehensive discriminator has good discrimination ability and can accurately distinguish the categories of electronic medical records.
S50. Process the medical record to be screened using the comprehensive discriminator, and obtain the processing result for the medical record to be screened.
In this embodiment, the medical record to be screened is a medical record that needs screening. Processing it with the comprehensive discriminator yields a processing result. There are three possible results: missed-diagnosis record, misdiagnosis record and normal record. Missed-diagnosis records and misdiagnosis records are both abnormal records (also called problem records). Because the comprehensive discriminator provided in this embodiment has good discrimination ability, it can distinguish normal records from abnormal records with high accuracy, which greatly reduces the processing time and cost of medical record screening and lowers the hospital's operating costs.
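Step S50 can be sketched as a small screening routine that maps each record's three rates to one of the three results and flags abnormal records. Everything named here is hypothetical: `screen_records`, the record identifiers, and `stub_discriminator`, which merely looks up made-up rates and stands in for a trained comprehensive discriminator.

```python
from typing import Callable, Dict, List

Rates = Dict[str, float]

# Made-up outputs standing in for a trained comprehensive discriminator.
FAKE_OUTPUTS: Dict[str, Rates] = {
    "record-001": {"missed_diagnosis": 0.05, "misdiagnosis": 0.10, "normal": 0.85},
    "record-002": {"missed_diagnosis": 0.70, "misdiagnosis": 0.20, "normal": 0.10},
}


def stub_discriminator(record_id: str) -> Rates:
    """Placeholder: a real discriminator would compute rates from the record."""
    return FAKE_OUTPUTS[record_id]


def screen_records(record_ids: List[str],
                   discriminator: Callable[[str], Rates]) -> List[dict]:
    """Label each record with its most likely class and flag abnormal ones."""
    results = []
    for record_id in record_ids:
        rates = discriminator(record_id)
        label = max(rates, key=rates.get)  # class with the highest rate
        results.append({
            "record": record_id,
            "label": label,
            "abnormal": label != "normal",  # missed diagnosis or misdiagnosis
        })
    return results


results = screen_records(["record-001", "record-002"], stub_discriminator)
```

Taking the class with the highest rate is one simple decision rule; a deployment might instead apply a threshold so that borderline records are sent for manual review.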
In steps S10-S50, the first generator generates a large amount of simulated misdiagnosis data that is close to real data, which improves the initial comprehensive discriminator's ability to discriminate misdiagnosis data. The second generator generates a large amount of simulated missed-diagnosis data that is close to real data, which improves the initial comprehensive discriminator's ability to discriminate missed-diagnosis data. Real normal data is acquired, and the initial comprehensive discriminator is trained using the real normal data, the simulated misdiagnosis data and the simulated missed-diagnosis data; because enough data is available for training, the model's discrimination ability is greatly improved. After the training is completed, the trained initial comprehensive discriminator is determined as the comprehensive discriminator, yielding a discriminator that can screen medical records. The comprehensive discriminator processes the medical record to be screened and produces a processing result; screening whether a record is normal in this way can improve the accuracy and efficiency of medical record screening and reduce its cost.
Optionally, as shown in FIG. 3, before step S10, that is, before generating the simulated misdiagnosis data through the first generator, the method further includes:
S101. In a first adversarial neural network, a first initial generator receives first random noise and generates first simulation data;
S102. A first discriminator receives real misdiagnosis data labeled 1 and generates first real discrimination data; the first discriminator receives the first simulation data labeled 0 and generates first simulation discrimination data;
S103. Calculate a first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data; calculate a first generation loss value of the first initial generator according to the first simulation discrimination data;
S104. Update the first discriminator according to the first discrimination loss value, and update the first initial generator according to the first generation loss value;
S105. Repeat the step of updating the first discriminator and the step of updating the first initial generator until a first preset termination condition is satisfied, the first preset termination condition being that the first simulation discrimination data is within a first preset range;
S106. Determine the first initial generator that satisfies the first preset termination condition as the first generator.
In this embodiment, the first adversarial neural network includes the first initial generator and the first discriminator. The first random noise can be produced by a random algorithm. In the initial stage of training, the first simulation data that the first initial generator produces from the first random noise differs considerably from the real misdiagnosis data and is easily identified by the first discriminator. The generator then receives a large first generation loss value and adjusts the parameters of the first initial generator according to it, so that the generated data gradually approaches the real misdiagnosis data. Likewise, if the first discriminator misjudges the first simulation data, it receives a large first discrimination loss value and adjusts its own parameters accordingly, which strengthens its ability to distinguish the first simulation data from the real misdiagnosis data.
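The behavior described above, a large generation loss when simulated data is easily spotted and a large discrimination loss when the discriminator misjudges, can be sketched with the standard binary cross-entropy GAN losses. These formulas are an assumption: the application's own loss functions are not reproduced in this text, and its generation loss additionally involves the initial comprehensive discriminator's output $D_3(x_{f1})$ weighted by a hyperparameter, which this sketch omits. The function names and the `eps` guard are illustrative.

```python
import math


def discriminator_loss(real_scores, fake_scores, eps=1e-12):
    """Assumed standard GAN discriminator loss.

    real_scores: D1(x) for real misdiagnosis data (label 1).
    fake_scores: D1(G1(z1)) for simulated data (label 0).
    """
    real_term = -sum(math.log(s + eps) for s in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - s + eps) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores, eps=1e-12):
    """Assumed non-saturating generator loss: large when D1(G1(z1)) is near 0."""
    return -sum(math.log(s + eps) for s in fake_scores) / len(fake_scores)


# Early training: simulated data is easily identified, so its scores are low.
early_g = generator_loss([0.05, 0.10])
# Later training: simulated data is harder to tell apart from real data.
late_g = generator_loss([0.50, 0.48])
```

With these definitions the generation loss shrinks as the simulated data's scores rise, while a discriminator that scores real and simulated data alike at 0.5 incurs a loss of about 2 ln 2.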
Repeating the step of updating the first discriminator means repeating the parts of steps S101-S104 that relate to the first discriminator. Repeating the step of updating the first initial generator means repeating the parts of steps S101-S104 that relate to the first initial generator. The first discriminator and the first initial generator are updated at the same time.
When the first simulation discrimination data is within the first preset range, the first adversarial neural network can be considered to satisfy the first preset termination condition. For example, when the first simulation discrimination data that the first discriminator produces for the first simulation data is 0.5, the first discriminator can hardly tell whether the first simulation data output by the first initial generator is real. In other words, the first simulation data is very close to the real misdiagnosis data, and the first adversarial neural network has converged.
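A termination check of this kind might look like the sketch below. The bounds 0.45 and 0.55 and the use of a batch mean are assumptions for illustration; the text only states that the first simulation discrimination data must fall within a first preset range, with 0.5 given as an example value.

```python
def meets_termination(fake_scores, low=0.45, high=0.55):
    """Return True when the mean simulation discrimination value is in range.

    fake_scores: discriminator outputs D1(G1(z1)) for a batch of simulated data.
    low, high: hypothetical bounds of the first preset range.
    """
    mean_score = sum(fake_scores) / len(fake_scores)
    return low <= mean_score <= high


converged = meets_termination([0.50, 0.52, 0.48])   # scores hover around 0.5
still_training = meets_termination([0.05, 0.10])    # simulated data easily spotted
```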
In steps S101-S106, within the first adversarial neural network, the first initial generator receives the first random noise and generates the first simulated data; here, the first initial generator continually generates new first simulated data. The first discriminator receives the real misdiagnosis data labeled 1 and generates the first real discrimination data; the first discriminator also receives the first simulated data labeled 0 and generates the first simulated discrimination data. Because the first discriminator discriminates both the real misdiagnosis data and the first simulated data, its ability to discriminate misdiagnosis data improves. The first discrimination loss value of the first discriminator is calculated from the first real discrimination data and the first simulated discrimination data, and the first generation loss value of the first initial generator is calculated from the first simulated discrimination data; these loss values are used to update the model parameters. The first discriminator is updated according to the first discrimination loss value and the first initial generator is updated according to the first generation loss value; the loss values thus progressively update the parameters of each model and improve its accuracy. The step of updating the first discriminator and the step of updating the first initial generator are repeated until the first preset termination condition is satisfied, namely that the first simulated discrimination data falls within the first preset range, which completes the training of the model. The first initial generator that satisfies the first preset termination condition is determined as the first generator, which can then be used to generate the first simulated data.
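The loop of steps S101-S106 can be illustrated with a deliberately tiny, one-dimensional sketch. Everything concrete here is an assumption made for illustration: the linear generator G(z) = a*z + b, the logistic discriminator D(x) = sigmoid(w*x + c), the standard binary cross-entropy losses, the learning rate, the warm-up period, and the averaging window are not taken from the text, which does not disclose the network architectures. The warm-up is added because an untrained discriminator also outputs values near 0.5.

```python
import math
import random


def sigmoid(t):
    # Clamp the logit so math.exp cannot overflow.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def train_toy_gan(real_samples, max_steps=500, lr=0.05,
                  low=0.45, high=0.55, warmup=50, window=20, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters, G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator parameters, D(x) = sigmoid(w*x + c)
    recent = []       # latest discriminator scores on simulated data
    for step in range(1, max_steps + 1):
        z = rng.gauss(0.0, 1.0)             # S101: random noise
        x_fake = a * z + b                  # S101: simulated data (label 0)
        x_real = rng.choice(real_samples)   # S102: real data (label 1)
        d_real = sigmoid(w * x_real + c)    # S102: real discrimination data
        d_fake = sigmoid(w * x_fake + c)    # S102: simulated discrimination data

        # S103: gradients of L_D = -log(d_real) - log(1 - d_fake).
        grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
        grad_c = (d_real - 1.0) + d_fake
        # S103: gradients of L_G = -log(d_fake) (non-saturating form, assumed).
        grad_a = (d_fake - 1.0) * w * z
        grad_b = (d_fake - 1.0) * w

        # S104: both models are updated from the same forward pass.
        w, c = w - lr * grad_w, c - lr * grad_c
        a, b = a - lr * grad_a, b - lr * grad_b

        # S105: stop once the averaged simulated score is in the preset range.
        recent = (recent + [d_fake])[-window:]
        mean_fake = sum(recent) / len(recent)
        if step >= warmup and low <= mean_fake <= high:
            break
    # S106: the generator that met the condition is kept.
    return {"steps": step, "mean_fake_score": mean_fake, "generator": (a, b)}
```

Calling `train_toy_gan([2.8, 3.0, 3.2])` returns the number of steps taken, the last averaged score on simulated data, and the generator parameters; a real system would use neural networks over medical record features instead of scalars.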
Optionally, as shown in FIG. 4, in step S103, calculating the first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulated discrimination data includes:
S1031. Processing the first real discrimination data and the first simulated discrimination data with a first discrimination loss function to generate the first discrimination loss value, the first discrimination loss function being:
Figure PCTCN2020124219-appb-000001
where x_r1 denotes the real misdiagnosis data, D_1(x) denotes the first real discrimination data, E is the expectation operator, x_f1 denotes the first simulated data, z_1 denotes the first random noise, G_1(z_1) denotes the first simulated data, D_1(G_1(z_1)) denotes the first simulated discrimination data, and
Figure PCTCN2020124219-appb-000002
denotes the first discrimination loss value;
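The formula image for the first discrimination loss function is not reproduced in this text. As a hedged illustration only, the sketch below uses the standard GAN discriminator loss, which is consistent with the symbols defined above (an expectation over real data and an expectation over simulated data) but is an assumption, not a transcription of the original formula. The function name and the `eps` clamp are likewise hypothetical.

```python
import math


def discriminator_loss(real_scores, sim_scores, eps=1e-12):
    """Assumed form: -E[log D(x_r)] - E[log(1 - D(G(z)))], estimated by batch means.

    real_scores: discriminator outputs on real data (label 1).
    sim_scores:  discriminator outputs on simulated data (label 0).
    eps:         lower clamp that keeps the logarithm finite.
    """
    real_term = sum(math.log(max(s, eps)) for s in real_scores) / len(real_scores)
    sim_term = sum(math.log(max(1.0 - s, eps)) for s in sim_scores) / len(sim_scores)
    return -(real_term + sim_term)
```

Under this assumed form, the loss is small when real data scores near 1 and simulated data scores near 0, and it grows when the discriminator misjudges either kind.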
In step S103, calculating the first generation loss value of the first initial generator according to the first simulated discrimination data includes:
S1032. Processing the first simulated discrimination data with a first generation loss function to generate the first generation loss value, the first generation loss function being:
Figure PCTCN2020124219-appb-000003
where D_1(x_f1) is the first simulated discrimination data, D_3(x_f1) is the discrimination data generated after the initial comprehensive discriminator discriminates the first simulated data, α is a hyperparameter, and
Figure PCTCN2020124219-appb-000004
is the first generation loss value.
Here, G_1 refers to the first initial generator and D_1 refers to the first discriminator. The first initial generator generates the first simulated data. The first discriminator discriminates the first simulated data to generate the first simulated discrimination data, and also discriminates the real misdiagnosis data to generate the first real discrimination data. The first initial generator and the first discriminator are trained against each other in an adversarial process. α is a hyperparameter that can be set before model training.
When the first generation loss value is calculated, a loss term containing the discrimination data of D_3 (the initial comprehensive discriminator), namely α·log D_3(x_f1), can be added. Adding α·log D_3(x_f1) allows the distribution of the misdiagnosis data generated by the first generator to meet the requirements of the actual scenario.
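The effect of the additional term α·log D_3(x_f1) can be illustrated with a short sketch. The formula image is not reproduced in this text, so the form below is an assumption: a non-saturating generator loss on D_1 plus the α-weighted log term on D_3, both negated so that minimising the loss raises both scores. The sign convention, the function name, and the default α are hypothetical.

```python
import math


def generator_loss(d1_scores, d3_scores, alpha=0.5, eps=1e-12):
    """Assumed form: -E[log D_1(x_f) + alpha * log D_3(x_f)], estimated by a batch mean.

    d1_scores: outputs of the paired discriminator D_1 on simulated data.
    d3_scores: outputs of the comprehensive discriminator D_3 on the same data.
    alpha:     hyperparameter weighting the D_3 term, set before training.
    """
    if len(d1_scores) != len(d3_scores):
        raise ValueError("d1_scores and d3_scores must have the same length")
    total = 0.0
    for d1, d3 in zip(d1_scores, d3_scores):
        total += math.log(max(d1, eps)) + alpha * math.log(max(d3, eps))
    return -total / len(d1_scores)
```

Setting `alpha=0` reduces this sketch to a generator loss that depends on D_1 alone, which makes the contribution of the D_3 term easy to isolate during experiments.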
Optionally, as shown in FIG. 5, before step S20, that is, before generating the simulated missed-diagnosis data by the second generator, the method further includes:
S201. In the second adversarial neural network, the second initial generator receives second random noise and generates second simulated data;
S202. The second discriminator receives real missed-diagnosis data labeled 1 and generates second real discrimination data; the second discriminator receives the second simulated data labeled 0 and generates second simulated discrimination data;
S203. Calculating a second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulated discrimination data; calculating a second generation loss value of the second initial generator according to the second simulated discrimination data;
S204. Updating the second discriminator according to the second discrimination loss value, and updating the second initial generator according to the second generation loss value;
S205. Repeating the step of updating the second discriminator and the step of updating the second initial generator until a second preset termination condition is satisfied, the second preset termination condition being that the second simulated discrimination data falls within a second preset range;
S206. Determining the second initial generator that satisfies the second preset termination condition as the second generator.
In this embodiment, the second adversarial neural network includes a second initial generator and a second discriminator. The second random noise may be produced by a random algorithm. In the initial stage of training, the second simulated data that the second initial generator produces from the second random noise differs considerably from the real missed-diagnosis data and is easily identified by the second discriminator. The second initial generator then receives a relatively large second generation loss value and adjusts its calculation parameters according to that value, so that the data it generates gradually approaches the real missed-diagnosis data. Meanwhile, if the second discriminator misjudges the second simulated data, it likewise receives a relatively large second discrimination loss value and adjusts its calculation parameters according to that value, which strengthens its ability to distinguish the second simulated data from the real missed-diagnosis data.
Repeating the step of updating the second discriminator means repeating those of steps S201-S204 that relate to the second discriminator. Repeating the step of updating the second initial generator means repeating those of steps S201-S204 that relate to the second initial generator. The update steps of the second discriminator and the second initial generator are performed simultaneously.
When the second simulated discrimination data falls within the second preset range, the second adversarial neural network can be considered to satisfy the second preset termination condition. For example, when the second simulated discrimination data that the second discriminator outputs for the second simulated data is 0.5, the second discriminator can hardly determine whether the second simulated data output by the second initial generator is real. In other words, the second simulated data closely approximates the real missed-diagnosis data, and the second adversarial neural network has converged.
In steps S201-S206, within the second adversarial neural network, the second initial generator receives the second random noise and generates the second simulated data; here, the second initial generator continually generates new second simulated data. The second discriminator receives the real missed-diagnosis data labeled 1 and generates the second real discrimination data; the second discriminator also receives the second simulated data labeled 0 and generates the second simulated discrimination data. Because the second discriminator discriminates both the real missed-diagnosis data and the second simulated data, its ability to discriminate missed-diagnosis data improves. The second discrimination loss value of the second discriminator is calculated from the second real discrimination data and the second simulated discrimination data, and the second generation loss value of the second initial generator is calculated from the second simulated discrimination data; these loss values are used to update the model parameters. The second discriminator is updated according to the second discrimination loss value and the second initial generator is updated according to the second generation loss value; the loss values thus progressively update the parameters of each model and improve its accuracy. The step of updating the second discriminator and the step of updating the second initial generator are repeated until the second preset termination condition is satisfied, namely that the second simulated discrimination data falls within the second preset range, which completes the training of the model. The second initial generator that satisfies the second preset termination condition is determined as the second generator, which can then be used to generate the second simulated data.
Optionally, as shown in FIG. 6, in step S203, calculating the second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulated discrimination data includes:
S2031. Processing the second real discrimination data and the second simulated discrimination data with a second discrimination loss function to generate the second discrimination loss value, the second discrimination loss function being:
Figure PCTCN2020124219-appb-000005
where x_r2 denotes the real missed-diagnosis data, D_2(x) denotes the second real discrimination data, E is the expectation operator, x_f2 denotes the second simulated data, z_2 denotes the second random noise, G_2(z_2) denotes the second simulated data, D_2(G_2(z_2)) denotes the second simulated discrimination data, and
Figure PCTCN2020124219-appb-000006
denotes the second discrimination loss value;
In step S203, calculating the second generation loss value of the second initial generator according to the second simulated discrimination data includes:
S2032. Processing the second simulated discrimination data with a second generation loss function to generate the second generation loss value, the second generation loss function being:
Figure PCTCN2020124219-appb-000007
where D_2(x_f2) is the second simulated discrimination data, D_3(x_f2) is the discrimination data generated after the initial comprehensive discriminator discriminates the second simulated data, β is a hyperparameter, and
Figure PCTCN2020124219-appb-000008
is the second generation loss value.
Here, G_2 refers to the second initial generator and D_2 refers to the second discriminator. The second initial generator generates the second simulated data. The second discriminator discriminates the second simulated data to generate the second simulated discrimination data, and also discriminates the real missed-diagnosis data to generate the second real discrimination data. The second initial generator and the second discriminator are trained against each other in an adversarial process. β is a hyperparameter that can be set before model training.
When the second generation loss value is calculated, a loss term containing the discrimination data of D_3 (the initial comprehensive discriminator), namely β·log D_3(x_f2), can be added. Adding β·log D_3(x_f2) allows the distribution of the missed-diagnosis data generated by the second generator to meet the requirements of the actual scenario.
Optionally, as shown in FIG. 7, step S30, that is, acquiring real normal data and training the initial comprehensive discriminator with the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data, includes:
S301. Using the initial comprehensive discriminator to discriminate the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data to generate comprehensive discrimination data, the comprehensive discrimination data including a missed-diagnosis rate, a misdiagnosis rate, and a normal rate;
S302. Calculating a comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed-diagnosis rate, the misdiagnosis rate, and the normal rate;
S303. Updating the initial comprehensive discriminator according to the comprehensive discrimination loss value;
S304. Repeating the step of updating the initial comprehensive discriminator until the comprehensive discrimination loss value satisfies a preset convergence condition.
In this embodiment, the missed-diagnosis rate in the comprehensive discrimination data is the ratio of the number of medical records that the initial comprehensive discriminator judges to be missed diagnoses to the total number of medical records discriminated. For example, if 100 medical records are discriminated and the initial comprehensive discriminator judges 4 of them to be missed diagnoses, the missed-diagnosis rate in the comprehensive discrimination data is 4%. Likewise, the misdiagnosis rate in the comprehensive discrimination data is the ratio of the number of records judged to be misdiagnoses to the total number of records discriminated, and the normal rate is the ratio of the number of records judged to be normal to the total number of records discriminated.
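The three rates can be computed directly from the labels that the comprehensive discriminator assigns. The sketch below reproduces the 4% example; the label strings "missed", "misdiagnosed", and "normal" and the function name are hypothetical, and the counts of 6 misdiagnosed and 90 normal records are made up to complete the example.

```python
from collections import Counter


def comprehensive_rates(predicted_labels):
    """Return each class's share of the discriminated records.

    predicted_labels: one label per record, each being "missed",
    "misdiagnosed", or "normal" (hypothetical label names).
    """
    if not predicted_labels:
        raise ValueError("predicted_labels must not be empty")
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return {label: counts[label] / total
            for label in ("missed", "misdiagnosed", "normal")}


# 100 records, 4 of which are judged to be missed diagnoses.
labels = ["missed"] * 4 + ["misdiagnosed"] * 6 + ["normal"] * 90
rates = comprehensive_rates(labels)
```

Here `rates["missed"]` is 0.04, matching the 4% in the text, and the three rates sum to 1 because every record receives exactly one label.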
When the comprehensive discrimination loss value is calculated, the true missed-diagnosis rate, misdiagnosis rate, and normal rate are needed in addition to the comprehensive discrimination data generated by the initial comprehensive discriminator.
Repeating the step of updating the initial comprehensive discriminator means repeatedly performing steps S301-S303.
The preset convergence condition may be that the comprehensive discrimination loss value approaches a certain fixed value.
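One way to test whether a loss value "approaches a certain fixed value" is to check that its recent values have stopped moving. The sketch below is an assumption about how such a check might be written; the window length, the tolerance, and the function name are hypothetical and are not specified in the text.

```python
def has_converged(loss_history, window=5, tol=1e-3):
    """Return True when the last `window` loss values vary by at most `tol`.

    loss_history: comprehensive discrimination loss values, oldest first.
    window, tol:  hypothetical settings for the preset convergence condition.
    """
    if len(loss_history) < window:
        return False  # not enough iterations to judge
    recent = loss_history[-window:]
    return max(recent) - min(recent) <= tol
```

In a training loop, steps S301-S303 would be repeated and the new loss appended to `loss_history` until this function returns True.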
Optionally, step S302, that is, calculating the comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed-diagnosis rate, the misdiagnosis rate, and the normal rate, includes:
Processing the missed-diagnosis rate, the misdiagnosis rate, and the normal rate with a comprehensive loss function to generate the comprehensive discrimination loss value, the comprehensive loss function being:
Figure PCTCN2020124219-appb-000009
where, for k=1, y_1 is the true missed-diagnosis rate and
Figure PCTCN2020124219-appb-000010
is the missed-diagnosis rate in the comprehensive discrimination data; for k=2, y_2 is the true misdiagnosis rate and
Figure PCTCN2020124219-appb-000011
is the misdiagnosis rate in the comprehensive discrimination data; for k=3, y_3 is the true normal rate and
Figure PCTCN2020124219-appb-000012
is the normal rate in the comprehensive discrimination data; and
Figure PCTCN2020124219-appb-000013
is the comprehensive discrimination loss value.
Here, the comprehensive discrimination loss value can be calculated with the comprehensive loss function. When the initial comprehensive discriminator converges, the comprehensive discrimination loss value approaches a certain fixed value.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
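The formula image for the comprehensive loss function is not reproduced in this text. Since the loss is indexed over k = 1, 2, 3 and compares each true rate y_k with the corresponding predicted rate, a cross-entropy form is one plausible reading; the sketch below uses it strictly as an assumption, and the function name and `eps` clamp are hypothetical.

```python
import math


def comprehensive_loss(true_rates, predicted_rates, eps=1e-12):
    """Assumed form: -sum over k of y_k * log(y_hat_k), for k = 1, 2, 3.

    true_rates:      (missed-diagnosis, misdiagnosis, normal) true rates y_k.
    predicted_rates: the same three rates from the comprehensive discrimination data.
    eps:             lower clamp that keeps the logarithm finite.
    """
    if len(true_rates) != len(predicted_rates):
        raise ValueError("true_rates and predicted_rates must have the same length")
    return -sum(y * math.log(max(y_hat, eps))
                for y, y_hat in zip(true_rates, predicted_rates))
```

Under this assumed form, the loss is smallest when the predicted rates equal the true rates, so a discriminator whose judgments reproduce the true class proportions drives the loss toward its minimum.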
In an embodiment, an adversarial-network-based electronic medical record screening apparatus is provided, which corresponds one-to-one to the adversarial-network-based electronic medical record screening method in the above embodiments. As shown in FIG. 8, the apparatus includes a first generation module 10, a second generation module 20, a training module 30, a discriminator determination module 40, and a screening module 50. The functional modules are described in detail as follows:
The first generation module 10 is configured to generate simulated misdiagnosis data by a first generator;
The second generation module 20 is configured to generate simulated missed-diagnosis data by a second generator;
The training module 30 is configured to acquire real normal data and to train an initial comprehensive discriminator with the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data;
The discriminator determination module 40 is configured to determine, after training is completed, the trained initial comprehensive discriminator as a comprehensive discriminator;
The screening module 50 is configured to process a medical record to be screened with the comprehensive discriminator and obtain a processing result for the medical record to be screened.
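The division of work among the five modules can be sketched as a small class. This is an illustrative skeleton only: the class name, the callable signatures, the list-of-floats record type, and the string result are all assumptions, and the three injected callables stand in for the generators and the training procedure, which the skeleton does not implement.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Record = Sequence[float]  # hypothetical feature vector for one medical record


@dataclass
class ScreeningApparatus:
    generate_misdiagnosis: Callable[[int], List[Record]]  # first generation module 10
    generate_missed: Callable[[int], List[Record]]        # second generation module 20
    train: Callable[[List[Record], List[Record], List[Record]],
                    Callable[[Record], str]]              # training module 30
    discriminator: Optional[Callable[[Record], str]] = None

    def build(self, real_normal: List[Record], n_simulated: int) -> None:
        sim_misdiagnosis = self.generate_misdiagnosis(n_simulated)
        sim_missed = self.generate_missed(n_simulated)
        # Determination module 40: the trained model becomes the comprehensive discriminator.
        self.discriminator = self.train(real_normal, sim_misdiagnosis, sim_missed)

    def screen(self, record: Record) -> str:
        # Screening module 50: process one medical record to be screened.
        if self.discriminator is None:
            raise RuntimeError("call build() before screen()")
        return self.discriminator(record)
```

Passing the generators and the trainer in as callables keeps the skeleton independent of any particular network library, which matches the statement that the modules may be realised in software, hardware, or a combination.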
Optionally, the first generation module 10 includes:
a first simulated data generating unit, configured so that, in the first adversarial neural network, the first initial generator receives first random noise and generates first simulated data;
a first discrimination unit, configured so that the first discriminator receives real misdiagnosis data labeled 1 and generates first real discrimination data, and the first discriminator receives the first simulated data labeled 0 and generates first simulated discrimination data;
a first loss value calculating unit, configured to calculate the first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulated discrimination data, and to calculate the first generation loss value of the first initial generator according to the first simulated discrimination data;
a first update unit, configured to update the first discriminator according to the first discrimination loss value and to update the first initial generator according to the first generation loss value;
a first iterative update unit, configured to repeat the step of updating the first discriminator and the step of updating the first initial generator until the first preset termination condition is satisfied, the first preset termination condition being that the first simulated discrimination data falls within the first preset range;
a first generator determining unit, configured to determine the first initial generator that satisfies the first preset termination condition as the first generator.
Optionally, the first loss value calculating unit includes:
a first discrimination loss value calculating unit, configured to process the first real discrimination data and the first simulated discrimination data with the first discrimination loss function to generate the first discrimination loss value, the first discrimination loss function being:
Figure PCTCN2020124219-appb-000014
where x_r1 denotes the real misdiagnosis data, D_1(x) denotes the first real discrimination data, E is the expectation operator, x_f1 denotes the first simulated data, z_1 denotes the first random noise, G_1(z_1) denotes the first simulated data, D_1(G_1(z_1)) denotes the first simulated discrimination data, and
Figure PCTCN2020124219-appb-000015
denotes the first discrimination loss value;
a first generation loss value calculating unit, configured to process the first simulated discrimination data with the first generation loss function to generate the first generation loss value, the first generation loss function being:
Figure PCTCN2020124219-appb-000016
where D_1(x_f1) is the first simulated discrimination data, D_3(x_f1) is the discrimination data generated after the initial comprehensive discriminator discriminates the first simulated data, α is a hyperparameter, and
Figure PCTCN2020124219-appb-000017
is the first generation loss value.
Optionally, the second generation module 20 includes:
a second simulated data generating unit, configured so that, in the second adversarial neural network, the second initial generator receives second random noise and generates second simulated data;
a second discrimination unit, configured so that the second discriminator receives real missed-diagnosis data labeled 1 and generates second real discrimination data, and the second discriminator receives the second simulated data labeled 0 and generates second simulated discrimination data;
a second loss value calculating unit, configured to calculate the second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulated discrimination data, and to calculate the second generation loss value of the second initial generator according to the second simulated discrimination data;
a second update unit, configured to update the second discriminator according to the second discrimination loss value and to update the second initial generator according to the second generation loss value;
a second iterative update unit, configured to repeat the step of updating the second discriminator and the step of updating the second initial generator until the second preset termination condition is satisfied, the second preset termination condition being that the second simulated discrimination data falls within the second preset range;
a second generator determining unit, configured to determine the second initial generator that satisfies the second preset termination condition as the second generator.
Optionally, the second loss value calculating unit includes:
a second discrimination loss value calculating unit, configured to process the second real discrimination data and the second simulated discrimination data with the second discrimination loss function to generate the second discrimination loss value, the second discrimination loss function being:
Figure PCTCN2020124219-appb-000018
where x_r2 denotes the real missed-diagnosis data, D_2(x) denotes the second real discrimination data, E is the expectation operator, x_f2 denotes the second simulated data, z_2 denotes the second random noise, G_2(z_2) denotes the second simulated data, D_2(G_2(z_2)) denotes the second simulated discrimination data, and
Figure PCTCN2020124219-appb-000019
denotes the second discrimination loss value;
a second generation loss value calculating unit, configured to process the second simulated discrimination data with the second generation loss function to generate the second generation loss value, the second generation loss function being:
Figure PCTCN2020124219-appb-000020
where D_2(x_f2) is the second simulated discrimination data, D_3(x_f2) is the discrimination data generated after the initial comprehensive discriminator discriminates the second simulated data, β is a hyperparameter, and
Figure PCTCN2020124219-appb-000021
is the second generation loss value.
Optionally, the training module 30 includes:
a comprehensive discrimination data generating unit, configured to use the initial comprehensive discriminator to discriminate the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data to generate comprehensive discrimination data, the comprehensive discrimination data including a missed-diagnosis rate, a misdiagnosis rate, and a normal rate;
a comprehensive discrimination loss value generating unit, configured to calculate the comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed-diagnosis rate, the misdiagnosis rate, and the normal rate;
an initial comprehensive discriminator update unit, configured to update the initial comprehensive discriminator according to the comprehensive discrimination loss value;
an initial comprehensive discriminator iterative update unit, configured to repeat the step of updating the initial comprehensive discriminator until the comprehensive discrimination loss value satisfies the preset convergence condition.
Optionally, the comprehensive discrimination loss value generating unit is further configured to process the missed-diagnosis rate, the misdiagnosis rate, and the normal rate through a comprehensive loss function to generate the comprehensive discrimination loss value, the comprehensive loss function being:
Figure PCTCN2020124219-appb-000022
Here, when k=1, $y_1$ is the true missed-diagnosis rate and
Figure PCTCN2020124219-appb-000023
is the missed-diagnosis rate in the comprehensive discrimination data; when k=2, $y_2$ is the true misdiagnosis rate and
Figure PCTCN2020124219-appb-000024
is the misdiagnosis rate in the comprehensive discrimination data; when k=3, $y_3$ is the true normal rate and
Figure PCTCN2020124219-appb-000025
is the normal rate in the comprehensive discrimination data;
Figure PCTCN2020124219-appb-000026
is the comprehensive discrimination loss value.
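The comprehensive loss function compares, for each of the three classes k = 1, 2, 3, a true rate $y_k$ with the rate output by the initial comprehensive discriminator. Its exact form is given only in the figure, so the following is a sketch under the assumption that it is a three-class categorical cross-entropy, $-\sum_{k=1}^{3} y_k \log \hat{y}_k$. The name `comprehensive_loss`, the symbol $\hat{y}_k$ for the predicted rate, and the `eps` clamp are illustrative assumptions.

```python
import math


def comprehensive_loss(y_true, y_pred, eps=1e-12):
    """Assumed three-class cross-entropy (not taken from the application).

    y_true: true rates (missed diagnosis, misdiagnosis, normal), e.g. one-hot.
    y_pred: rates output by the comprehensive discriminator, in the same order.
    """
    if len(y_true) != 3 or len(y_pred) != 3:
        raise ValueError("expected three rates: missed diagnosis, misdiagnosis, normal")
    # Clamp predictions away from zero so the logarithm stays finite.
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, y_pred))


# A record whose true class is "normal" (k = 3), scored by the discriminator.
example_loss = comprehensive_loss([0.0, 0.0, 1.0], [0.1, 0.2, 0.7])
```

With a one-hot true label, only the predicted rate of the true class contributes, so the loss falls as the discriminator assigns that class a higher rate.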
For the specific limitations of the electronic medical record screening apparatus based on an adversarial network, refer to the limitations of the electronic medical record screening method based on an adversarial network given above; they are not repeated here. Each module of the apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 9. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores the data involved in the electronic medical record screening method described above. The network interface of the computer device communicates with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement an electronic medical record screening method based on an adversarial network.
In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the following steps are implemented:
generating simulated misdiagnosis data through a first generator;
generating simulated missed-diagnosis data through a second generator;
acquiring real normal data, and training an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data;
after the training is completed, determining the trained initial comprehensive discriminator as the comprehensive discriminator;
using the comprehensive discriminator to process a medical record to be screened, and obtaining a processing result of the medical record to be screened.
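The last step above leaves open how the processing result is derived from the output of the comprehensive discriminator. One plausible reading, offered only as an assumption, is that the discriminator outputs a missed-diagnosis rate, a misdiagnosis rate, and a normal rate for the record, and that the record is assigned the class with the highest rate. The names `LABELS` and `screen_record` are hypothetical.

```python
# Class order follows k = 1, 2, 3 as used for the comprehensive loss function.
LABELS = ("missed diagnosis", "misdiagnosis", "normal")


def screen_record(rates):
    """Return the label with the highest rate (an assumed decision rule).

    rates: (missed_diagnosis_rate, misdiagnosis_rate, normal_rate) as output
    by a trained comprehensive discriminator for one medical record.
    """
    if len(rates) != len(LABELS):
        raise ValueError("expected one rate per label")
    best = max(range(len(LABELS)), key=lambda i: rates[i])
    return LABELS[best]


# Hypothetical discriminator output for a single record to be screened.
result = screen_record((0.1, 0.7, 0.2))
```

A deployment might instead apply a threshold to the missed-diagnosis and misdiagnosis rates, so that borderline records are flagged for manual review rather than being labelled normal.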
In one embodiment, one or more computer-readable storage media storing computer-readable instructions are provided. The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media. When the computer-readable instructions stored on the readable storage medium are executed by one or more processors, the following steps are implemented:
generating simulated misdiagnosis data through a first generator;
generating simulated missed-diagnosis data through a second generator;
acquiring real normal data, and training an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data;
after the training is completed, determining the trained initial comprehensive discriminator as the comprehensive discriminator;
using the comprehensive discriminator to process a medical record to be screened, and obtaining a processing result of the medical record to be screened.
A person of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments can be carried out by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a non-volatile readable storage medium or a volatile readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is only an example. In practical applications, the functions described above may be assigned to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above.
The embodiments described above are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments can still be modified, or some of their technical features replaced with equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the scope of protection of this application.

Claims (20)

  1. An electronic medical record screening method based on an adversarial network, comprising:
    generating simulated misdiagnosis data through a first generator;
    generating simulated missed-diagnosis data through a second generator;
    acquiring real normal data, and training an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data;
    after the training is completed, determining the trained initial comprehensive discriminator as a comprehensive discriminator;
    using the comprehensive discriminator to process a medical record to be screened, and obtaining a processing result of the medical record to be screened.
  2. The electronic medical record screening method based on an adversarial network according to claim 1, wherein, before the generating of simulated misdiagnosis data through the first generator, the method further comprises:
    in a first adversarial neural network, receiving, by a first initial generator, first random noise, and generating first simulation data;
    receiving, by a first discriminator, real misdiagnosis data labelled 1, and generating first real discrimination data; receiving, by the first discriminator, the first simulation data labelled 0, and generating first simulation discrimination data;
    calculating a first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data; calculating a first generation loss value of the first initial generator according to the first simulation discrimination data;
    updating the first discriminator according to the first discrimination loss value, and updating the first initial generator according to the first generation loss value;
    repeating the step of updating the first discriminator and the step of updating the first initial generator until a first preset termination condition is met, the first preset termination condition being that the first simulation discrimination data falls within a first preset range;
    determining the first initial generator that meets the first preset termination condition as the first generator.
  3. The electronic medical record screening method based on an adversarial network according to claim 2, wherein the calculating of the first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data comprises:
    processing the first real discrimination data and the first simulation discrimination data through a first discrimination loss function to generate the first discrimination loss value, the first discrimination loss function being:
    Figure PCTCN2020124219-appb-100001
    where $x_{r1}$ denotes the real misdiagnosis data, $D_1(x)$ denotes the first real discrimination data, $E$ is the expectation operator, $x_{f1}$ denotes the first simulation data, $z_1$ denotes the first random noise, $G_1(z_1)$ denotes the first simulation data, $D_1(G_1(z_1))$ denotes the first simulation discrimination data, and
    Figure PCTCN2020124219-appb-100002
    denotes the first discrimination loss value;
    the calculating of the first generation loss value of the first initial generator according to the first simulation discrimination data comprises:
    processing the first simulation discrimination data through a first generation loss function to generate the first generation loss value, the first generation loss function being:
    Figure PCTCN2020124219-appb-100003
    where $D_1(x_{f1})$ is the first simulation discrimination data, $D_3(x_{f1})$ is the discrimination data generated when the initial comprehensive discriminator discriminates the first simulation data, $\alpha$ is a hyperparameter, and
    Figure PCTCN2020124219-appb-100004
    is the first generation loss value.
  4. The electronic medical record screening method based on an adversarial network according to claim 1, wherein, before the generating of simulated missed-diagnosis data through the second generator, the method further comprises:
    in a second adversarial neural network, receiving, by a second initial generator, second random noise, and generating second simulation data;
    receiving, by a second discriminator, real missed-diagnosis data labelled 1, and generating second real discrimination data; receiving, by the second discriminator, the second simulation data labelled 0, and generating second simulation discrimination data;
    calculating a second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulation discrimination data; calculating a second generation loss value of the second initial generator according to the second simulation discrimination data;
    updating the second discriminator according to the second discrimination loss value, and updating the second initial generator according to the second generation loss value;
    repeating the step of updating the second discriminator and the step of updating the second initial generator until a second preset termination condition is met, the second preset termination condition being that the second simulation discrimination data falls within a second preset range;
    determining the second initial generator that meets the second preset termination condition as the second generator.
  5. The electronic medical record screening method based on an adversarial network according to claim 4, wherein the calculating of the second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulation discrimination data comprises:
    processing the second real discrimination data and the second simulation discrimination data through a second discrimination loss function to generate the second discrimination loss value, the second discrimination loss function being:
    Figure PCTCN2020124219-appb-100005
    where $x_{r2}$ denotes the real missed-diagnosis data, $D_2(x)$ denotes the second real discrimination data, $E$ is the expectation operator, $x_{f2}$ denotes the second simulation data, $z_2$ denotes the second random noise, $G_2(z_2)$ denotes the second simulation data, $D_2(G_2(z_2))$ denotes the second simulation discrimination data, and
    Figure PCTCN2020124219-appb-100006
    denotes the second discrimination loss value;
    the calculating of the second generation loss value of the second initial generator according to the second simulation discrimination data comprises:
    processing the second simulation discrimination data through a second generation loss function to generate the second generation loss value, the second generation loss function being:
    Figure PCTCN2020124219-appb-100007
    where $D_2(x_{f2})$ is the second simulation discrimination data, $D_3(x_{f2})$ is the discrimination data generated when the initial comprehensive discriminator discriminates the second simulation data, $\beta$ is a hyperparameter, and
    Figure PCTCN2020124219-appb-100008
    is the second generation loss value.
  6. The electronic medical record screening method based on an adversarial network according to claim 1, wherein the acquiring of real normal data and the training of the initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data comprise:
    using the initial comprehensive discriminator to discriminate the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data to generate comprehensive discrimination data, the comprehensive discrimination data including a missed-diagnosis rate, a misdiagnosis rate, and a normal rate;
    calculating a comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed-diagnosis rate, the misdiagnosis rate, and the normal rate;
    updating the initial comprehensive discriminator according to the comprehensive discrimination loss value;
    repeating the step of updating the initial comprehensive discriminator until the comprehensive discrimination loss value meets a preset convergence condition.
  7. The electronic medical record screening method based on an adversarial network according to claim 6, wherein the calculating of the comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed-diagnosis rate, the misdiagnosis rate, and the normal rate comprises:
    processing the missed-diagnosis rate, the misdiagnosis rate, and the normal rate through a comprehensive loss function to generate the comprehensive discrimination loss value, the comprehensive loss function being:
    Figure PCTCN2020124219-appb-100009
    where, when k=1, $y_1$ is the true missed-diagnosis rate and
    Figure PCTCN2020124219-appb-100010
    is the missed-diagnosis rate in the comprehensive discrimination data; when k=2, $y_2$ is the true misdiagnosis rate and
    Figure PCTCN2020124219-appb-100011
    is the misdiagnosis rate in the comprehensive discrimination data; when k=3, $y_3$ is the true normal rate and
    Figure PCTCN2020124219-appb-100012
    is the normal rate in the comprehensive discrimination data;
    Figure PCTCN2020124219-appb-100013
    is the comprehensive discrimination loss value.
  8. An electronic medical record screening apparatus based on an adversarial network, comprising:
    a first generating module, configured to generate simulated misdiagnosis data through a first generator;
    a second generating module, configured to generate simulated missed-diagnosis data through a second generator;
    a training module, configured to acquire real normal data, and to train an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data;
    a discriminator determining module, configured to determine, after the training is completed, the trained initial comprehensive discriminator as a comprehensive discriminator;
    a screening module, configured to use the comprehensive discriminator to process a medical record to be screened, and to obtain a processing result of the medical record to be screened.
  9. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
    generating simulated misdiagnosis data through a first generator;
    generating simulated missed-diagnosis data through a second generator;
    acquiring real normal data, and training an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data, and the simulated missed-diagnosis data;
    after the training is completed, determining the trained initial comprehensive discriminator as a comprehensive discriminator;
    using the comprehensive discriminator to process a medical record to be screened, and obtaining a processing result of the medical record to be screened.
  10. The computer device according to claim 9, wherein, before the generating of simulated misdiagnosis data through the first generator, the steps further comprise:
    in a first adversarial neural network, receiving, by a first initial generator, first random noise, and generating first simulation data;
    receiving, by a first discriminator, real misdiagnosis data labelled 1, and generating first real discrimination data; receiving, by the first discriminator, the first simulation data labelled 0, and generating first simulation discrimination data;
    calculating a first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data; calculating a first generation loss value of the first initial generator according to the first simulation discrimination data;
    updating the first discriminator according to the first discrimination loss value, and updating the first initial generator according to the first generation loss value;
    repeating the step of updating the first discriminator and the step of updating the first initial generator until a first preset termination condition is met, the first preset termination condition being that the first simulation discrimination data falls within a first preset range;
    determining the first initial generator that meets the first preset termination condition as the first generator.
  11. The computer device according to claim 10, wherein the calculating of the first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data comprises:
    processing the first real discrimination data and the first simulation discrimination data through a first discrimination loss function to generate the first discrimination loss value, the first discrimination loss function being:
    Figure PCTCN2020124219-appb-100014
    where $x_{r1}$ denotes the real misdiagnosis data, $D_1(x)$ denotes the first real discrimination data, $E$ is the expectation operator, $x_{f1}$ denotes the first simulation data, $z_1$ denotes the first random noise, $G_1(z_1)$ denotes the first simulation data, $D_1(G_1(z_1))$ denotes the first simulation discrimination data, and
    Figure PCTCN2020124219-appb-100015
    denotes the first discrimination loss value;
    the calculating of the first generation loss value of the first initial generator according to the first simulation discrimination data comprises:
    processing the first simulation discrimination data through a first generation loss function to generate the first generation loss value, the first generation loss function being:
    Figure PCTCN2020124219-appb-100016
    where $D_1(x_{f1})$ is the first simulation discrimination data, $D_3(x_{f1})$ is the discrimination data generated when the initial comprehensive discriminator discriminates the first simulation data, $\alpha$ is a hyperparameter, and
    Figure PCTCN2020124219-appb-100017
    is the first generation loss value.
  12. The computer device according to claim 9, wherein, before the generating of simulated missed-diagnosis data through the second generator, the steps further comprise:
    in a second adversarial neural network, receiving, by a second initial generator, second random noise, and generating second simulation data;
    receiving, by a second discriminator, real missed-diagnosis data labelled 1, and generating second real discrimination data; receiving, by the second discriminator, the second simulation data labelled 0, and generating second simulation discrimination data;
    calculating a second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulation discrimination data; calculating a second generation loss value of the second initial generator according to the second simulation discrimination data;
    updating the second discriminator according to the second discrimination loss value, and updating the second initial generator according to the second generation loss value;
    repeating the step of updating the second discriminator and the step of updating the second initial generator until a second preset termination condition is met, the second preset termination condition being that the second simulation discrimination data falls within a second preset range;
    determining the second initial generator that meets the second preset termination condition as the second generator.
  13. The computer device according to claim 12, wherein the calculating of the second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulation discrimination data comprises:
    processing the second real discrimination data and the second simulation discrimination data through a second discrimination loss function to generate the second discrimination loss value, the second discrimination loss function being:
    Figure PCTCN2020124219-appb-100018
    where $x_{r2}$ denotes the real missed-diagnosis data, $D_2(x)$ denotes the second real discrimination data, $E$ is the expectation operator, $x_{f2}$ denotes the second simulation data, $z_2$ denotes the second random noise, $G_2(z_2)$ denotes the second simulation data, $D_2(G_2(z_2))$ denotes the second simulation discrimination data, and
    Figure PCTCN2020124219-appb-100019
    denotes the second discrimination loss value;
    the calculating of the second generation loss value of the second initial generator according to the second simulation discrimination data comprises:
    processing the second simulation discrimination data through a second generation loss function to generate the second generation loss value, the second generation loss function being:
    Figure PCTCN2020124219-appb-100020
    where $D_2(x_{f2})$ is the second simulation discrimination data, $D_3(x_{f2})$ is the discrimination data generated when the initial comprehensive discriminator discriminates the second simulation data, $\beta$ is a hyperparameter, and
    Figure PCTCN2020124219-appb-100021
    is the second generation loss value.
  14. The computer device according to claim 9, wherein the acquiring of real normal data and the training of the initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data and the simulated missed diagnosis data comprises:
    discriminating the real normal data, the simulated misdiagnosis data and the simulated missed diagnosis data with the initial comprehensive discriminator to generate comprehensive discrimination data, the comprehensive discrimination data comprising a missed diagnosis rate, a misdiagnosis rate and a normal rate;
    calculating a comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed diagnosis rate, the misdiagnosis rate and the normal rate;
    updating the initial comprehensive discriminator according to the comprehensive discrimination loss value; and
    repeating the step of updating the initial comprehensive discriminator until the comprehensive discrimination loss value satisfies a preset convergence condition.
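The claim names three output rates and a loss computed from them, but not the form of that loss. As an illustration only, the sketch below assumes the comprehensive discriminator emits three raw scores that a softmax turns into the missed diagnosis rate, misdiagnosis rate and normal rate, and that the loss is the mean cross-entropy against the known class of each record. The class order, the function names and the choice of cross-entropy are all assumptions, not taken from the claim.

```python
import math

# Hypothetical class order; the claim names the three rates but not their layout.
CLASSES = ("missed", "misdiagnosed", "normal")

def softmax(logits):
    """Turn raw scores into three rates that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def comprehensive_loss(batch_logits, batch_labels):
    """Mean cross-entropy over a batch (an assumed loss, not the claimed one)."""
    losses = []
    for logits, label in zip(batch_logits, batch_labels):
        rates = softmax(logits)
        losses.append(-math.log(rates[CLASSES.index(label)]))
    return sum(losses) / len(losses)

# One record scored strongly as "normal", one scored strongly as "missed".
rates = softmax([0.2, 0.1, 2.0])
loss = comprehensive_loss([[0.2, 0.1, 2.0], [1.5, 0.0, 0.3]], ["normal", "missed"])
```

With this choice, the loss falls as the rate assigned to the correct class rises, which gives the update step a quantity to minimise.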
  15. One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    generating simulated misdiagnosis data with a first generator;
    generating simulated missed diagnosis data with a second generator;
    acquiring real normal data, and training an initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data and the simulated missed diagnosis data;
    after the training is completed, determining the trained initial comprehensive discriminator as a comprehensive discriminator; and
    processing a medical record to be screened with the comprehensive discriminator to obtain a processing result for the medical record to be screened.
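The five steps above can be sketched end to end as follows. This is a toy under stated assumptions, not the claimed implementation: records are plain numeric vectors, the two "generators" are hypothetical functions that merely shift random noise, and the comprehensive discriminator is replaced by a nearest-centroid classifier so that the example stays short and runnable. Every name in the block is illustrative.

```python
import math
import random

def generate(generator, n, noise_dim=3, seed=0):
    """Feed n random-noise vectors through a generator."""
    rng = random.Random(seed)
    return [generator([rng.gauss(0.0, 1.0) for _ in range(noise_dim)]) for _ in range(n)]

def train_centroid_discriminator(labelled):
    """Stand-in 'training': average the feature vectors of each label."""
    sums, counts = {}, {}
    for x, label in labelled:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

    def discriminator(x):
        # Softmax over negative squared distances gives three rates summing to 1.
        scores = {lab: -sum((a - b) ** 2 for a, b in zip(x, c))
                  for lab, c in centroids.items()}
        m = max(scores.values())
        exps = {lab: math.exp(s - m) for lab, s in scores.items()}
        total = sum(exps.values())
        return {lab: e / total for lab, e in exps.items()}

    return discriminator

# Hypothetical trained generators: shift the noise so the classes are separable.
first_generator = lambda z: [v + 3.0 for v in z]     # simulated misdiagnosis data
second_generator = lambda z: [v - 3.0 for v in z]    # simulated missed diagnosis data

real_normal = generate(lambda z: z, 50, seed=1)      # placeholder for real normal records
labelled = ([(x, "normal") for x in real_normal]
            + [(x, "misdiagnosed") for x in generate(first_generator, 50, seed=2)]
            + [(x, "missed") for x in generate(second_generator, 50, seed=3)])

discriminator = train_centroid_discriminator(labelled)
rates = discriminator([3.1, 2.9, 3.0])               # record to be screened
result = max(rates, key=rates.get)                   # label with the highest rate
```

The point of the sketch is the data flow: only the normal class needs real records, while the two abnormal classes are filled in by the generators before the three-way discriminator is trained.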
  16. The readable storage medium according to claim 15, wherein, before the generating of simulated misdiagnosis data with the first generator, the steps further comprise:
    in a first adversarial neural network, receiving, by a first initial generator, first random noise and generating first simulation data;
    receiving, by a first discriminator, real misdiagnosis data labeled 1 and generating first real discrimination data; and receiving, by the first discriminator, the first simulation data labeled 0 and generating first simulation discrimination data;
    calculating a first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data; and calculating a first generation loss value of the first initial generator according to the first simulation discrimination data;
    updating the first discriminator according to the first discrimination loss value, and updating the first initial generator according to the first generation loss value;
    repeating the step of updating the first discriminator and the step of updating the first initial generator until a first preset termination condition is satisfied, the first preset termination condition being that the first simulation discrimination data falls within a first preset range; and
    determining the first initial generator that satisfies the first preset termination condition as the first generator.
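The control flow of this claim, alternating the two updates and stopping once the discriminator's output on simulated data lies in a preset range, can be sketched as below. The range bounds, the round limit and the function names are assumptions; the two update functions are toy stand-ins that nudge a single number playing the role of D_1(G_1(z_1)), not real network updates.

```python
def train_until_in_range(update_discriminator, update_generator, simulated_score,
                         low=0.45, high=0.55, max_rounds=10_000):
    """Alternate the two updates until the discriminator's score on simulated
    data falls inside [low, high]; return the rounds used, or None on failure."""
    for round_no in range(1, max_rounds + 1):
        update_discriminator()
        update_generator()
        if low <= simulated_score() <= high:
            return round_no
    return None

# Toy stand-ins: one number plays the role of D_1(G_1(z_1)).
state = {"score": 0.05}

def update_discriminator():
    # The discriminator gets slightly better at rejecting simulated data.
    state["score"] *= 0.98

def update_generator():
    # The generator fools the discriminator a little more.
    state["score"] += 0.1 * (1.0 - state["score"])

rounds = train_until_in_range(update_discriminator, update_generator,
                              lambda: state["score"])
```

A range centred on 0.5 is a common reading of such a condition, since a score near 0.5 means the discriminator can no longer tell simulated from real data, but the claim itself leaves the range unspecified.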
  17. The readable storage medium according to claim 16, wherein the calculating of the first discrimination loss value of the first discriminator according to the first real discrimination data and the first simulation discrimination data comprises:
    processing the first real discrimination data and the first simulation discrimination data with a first discrimination loss function to generate the first discrimination loss value, the first discrimination loss function being:
    Figure PCTCN2020124219-appb-100022
    wherein x_r1 denotes the real misdiagnosis data, D_1(x) denotes the first real discrimination data, E is the expectation operator, x_f1 denotes the first simulation data, z_1 denotes the first random noise, G_1(z_1) denotes the first simulation data, D_1(G_1(z_1)) denotes the first discrimination loss value, and
    Figure PCTCN2020124219-appb-100023
    denotes the first discrimination loss value;
    and the calculating of the first generation loss value of the first initial generator according to the first simulation discrimination data comprises:
    processing the first simulation discrimination data with a first generation loss function to generate the first generation loss value, the first generation loss function being:
    Figure PCTCN2020124219-appb-100024
    wherein D_1(x_f1) is the first simulation discrimination data, D_3(x_f1) is the discrimination data generated after the initial comprehensive discriminator discriminates the first simulation data, α is a hyperparameter, and
    Figure PCTCN2020124219-appb-100025
    is the first generation loss value.
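The claimed loss functions appear only as figures and are not reproduced in this text. As a point of orientation, the sketch below computes the conventional GAN discriminator loss, -E[log D_1(x_r1)] - E[log(1 - D_1(x_f1))], from lists of discriminator outputs. The generator loss shows one plausible way an α-weighted term from the comprehensive discriminator D_3 could be combined with the generator's own term. Both forms are assumptions and may differ from the formulas in the figures.

```python
import math

def discriminator_loss(real_scores, sim_scores):
    """Conventional GAN form: -E[log D(x_r)] - E[log(1 - D(x_f))]."""
    real_term = sum(math.log(s) for s in real_scores) / len(real_scores)
    sim_term = sum(math.log(1.0 - s) for s in sim_scores) / len(sim_scores)
    return -(real_term + sim_term)

def generator_loss(sim_scores, d3_scores, alpha):
    """Assumed form: -E[log D_1(x_f1)] - alpha * E[log D_3(x_f1)].

    d3_scores is taken here to be the rate D_3 assigns to the target class
    for each simulated record; this reading is an assumption.
    """
    own_term = sum(math.log(s) for s in sim_scores) / len(sim_scores)
    d3_term = sum(math.log(s) for s in d3_scores) / len(d3_scores)
    return -(own_term + alpha * d3_term)

# Scores are probabilities in (0, 1); the values below are illustrative only.
d_loss = discriminator_loss(real_scores=[0.9, 0.8], sim_scores=[0.2, 0.1])
g_loss = generator_loss(sim_scores=[0.2, 0.1], d3_scores=[0.3, 0.4], alpha=0.5)
```

Under these assumed forms, α controls how strongly the generator is pushed to produce records that the comprehensive discriminator also places in the intended class, in addition to fooling its own discriminator.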
  18. The readable storage medium according to claim 15, wherein, before the generating of simulated missed diagnosis data with the second generator, the steps further comprise:
    in a second adversarial neural network, receiving, by a second initial generator, second random noise and generating second simulation data;
    receiving, by a second discriminator, real missed diagnosis data labeled 1 and generating second real discrimination data; and receiving, by the second discriminator, the second simulation data labeled 0 and generating second simulation discrimination data;
    calculating a second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulation discrimination data; and calculating a second generation loss value of the second initial generator according to the second simulation discrimination data;
    updating the second discriminator according to the second discrimination loss value, and updating the second initial generator according to the second generation loss value;
    repeating the step of updating the second discriminator and the step of updating the second initial generator until a second preset termination condition is satisfied, the second preset termination condition being that the second simulation discrimination data falls within a second preset range; and
    determining the second initial generator that satisfies the second preset termination condition as the second generator.
  19. The readable storage medium according to claim 18, wherein the calculating of the second discrimination loss value of the second discriminator according to the second real discrimination data and the second simulation discrimination data comprises:
    processing the second real discrimination data and the second simulation discrimination data with a second discrimination loss function to generate the second discrimination loss value, the second discrimination loss function being:
    Figure PCTCN2020124219-appb-100026
    wherein x_r2 denotes the real missed diagnosis data, D_2(x) denotes the second real discrimination data, E is the expectation operator, x_f2 denotes the second simulation data, z_2 denotes the second random noise, G_2(z_2) denotes the second simulation data, D_2(G_2(z_2)) denotes the second discrimination loss value, and
    Figure PCTCN2020124219-appb-100027
    denotes the second discrimination loss value;
    and the calculating of the second generation loss value of the second initial generator according to the second simulation discrimination data comprises:
    processing the second simulation discrimination data with a second generation loss function to generate the second generation loss value, the second generation loss function being:
    Figure PCTCN2020124219-appb-100028
    wherein D_2(x_f2) is the second simulation discrimination data, D_3(x_f2) is the discrimination data generated after the initial comprehensive discriminator discriminates the second simulation data, β is a hyperparameter, and
    Figure PCTCN2020124219-appb-100029
    is the second generation loss value.
  20. The readable storage medium according to claim 15, wherein the acquiring of real normal data and the training of the initial comprehensive discriminator using the real normal data, the simulated misdiagnosis data and the simulated missed diagnosis data comprises:
    discriminating the real normal data, the simulated misdiagnosis data and the simulated missed diagnosis data with the initial comprehensive discriminator to generate comprehensive discrimination data, the comprehensive discrimination data comprising a missed diagnosis rate, a misdiagnosis rate and a normal rate;
    calculating a comprehensive discrimination loss value of the initial comprehensive discriminator according to the missed diagnosis rate, the misdiagnosis rate and the normal rate;
    updating the initial comprehensive discriminator according to the comprehensive discrimination loss value; and
    repeating the step of updating the initial comprehensive discriminator until the comprehensive discrimination loss value satisfies a preset convergence condition.
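The claim leaves the "preset convergence condition" open. One common choice, assumed here, is to stop when two successive loss values differ by less than a tolerance. The sketch below shows that stopping rule in isolation; the update step is a hypothetical stand-in (gradient descent on a one-parameter quadratic) rather than a real discriminator update, and the tolerance and round limit are illustrative.

```python
def train_until_converged(update_step, tol=1e-6, max_rounds=10_000):
    """Call update_step() (which returns the current loss) until two successive
    losses differ by less than tol. Returns (rounds_used, last_loss)."""
    previous = None
    for round_no in range(1, max_rounds + 1):
        loss = update_step()
        if previous is not None and abs(previous - loss) < tol:
            return round_no, loss
        previous = loss
    return max_rounds, previous

# Stand-in for "update the initial comprehensive discriminator":
# gradient descent on the toy loss (w - 3)^2.
params = {"w": 0.0}

def update_step(lr=0.1):
    grad = 2.0 * (params["w"] - 3.0)
    params["w"] -= lr * grad
    return (params["w"] - 3.0) ** 2

rounds, final_loss = train_until_converged(update_step)
```

Other conditions, such as the loss dropping below a fixed threshold or a cap on the number of rounds, would fit the same loop by changing only the test inside it.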
PCT/CN2020/124219 2020-09-09 2020-10-28 Electronic medical record screening method and apparatus based on adversarial network, and device and medium WO2021151326A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010941842.2A CN112017790B (en) 2020-09-09 2020-09-09 Electronic medical record screening method, device, equipment and medium based on countermeasure network
CN202010941842.2 2020-09-09

Publications (1)

Publication Number Publication Date
WO2021151326A1 true WO2021151326A1 (en) 2021-08-05

Family

ID=73521424

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124219 WO2021151326A1 (en) 2020-09-09 2020-10-28 Electronic medical record screening method and apparatus based on adversarial network, and device and medium

Country Status (2)

Country Link
CN (1) CN112017790B (en)
WO (1) WO2021151326A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160180022A1 (en) * 2014-12-18 2016-06-23 Fortinet, Inc. Abnormal behaviour and fraud detection based on electronic medical records
CN109003678A (en) * 2018-06-12 2018-12-14 清华大学 A kind of generation method and system emulating text case history
CN110060774A (en) * 2019-04-29 2019-07-26 赵蕾 A kind of thyroid nodule recognition methods based on production confrontation network
CN110808095A (en) * 2019-09-18 2020-02-18 平安科技(深圳)有限公司 Method for identifying diagnosis result, method for training model, computer device and storage medium
CN110910976A (en) * 2019-10-12 2020-03-24 平安国际智慧城市科技股份有限公司 Medical record detection method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170099737A (en) * 2016-02-23 2017-09-01 노을 주식회사 Contact-type staining patch and staining method using the same

Also Published As

Publication number Publication date
CN112017790A (en) 2020-12-01
CN112017790B (en) 2023-06-20

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20916998; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20916998; Country of ref document: EP; Kind code of ref document: A1)