WO2021151291A1 - Disease risk analysis method, apparatus, electronic device, and computer storage medium - Google Patents

Disease risk analysis method, apparatus, electronic device, and computer storage medium

Info

Publication number
WO2021151291A1
WO2021151291A1 PCT/CN2020/112330 CN2020112330W WO2021151291A1 WO 2021151291 A1 WO2021151291 A1 WO 2021151291A1 CN 2020112330 W CN2020112330 W CN 2020112330W WO 2021151291 A1 WO2021151291 A1 WO 2021151291A1
Authority
WO
WIPO (PCT)
Prior art keywords
data set
target
data
disease
classification
Prior art date
Application number
PCT/CN2020/112330
Other languages
French (fr)
Chinese (zh)
Inventor
李映雪
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021151291A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment

Definitions

  • This application relates to the field of data processing technology, and in particular to a disease risk analysis method, device, electronic equipment, and computer-readable storage medium.
  • the methods for analyzing the risk of disease mainly use logistic regression, decision tree and other interpretable methods.
  • the inventor realized that these methods involve many subjective human factors, so both their prediction accuracy and their prediction efficiency are low. Therefore, how to analyze disease risk with high accuracy and high efficiency has become an urgent problem to be solved.
  • a disease risk analysis method provided by this application includes:
  • the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
  • the present application also provides a disease risk analysis device, which includes:
  • the data classification module is used to obtain a training data set, and classify the training data set to obtain a classification data set;
  • the model training module is used to train a plurality of pre-built weak classifiers using the classification data set, select a plurality of target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model;
  • the target data acquisition module is used to acquire the user data set to be judged, and preprocess the user data set to be judged to obtain the target data set;
  • the index relationship establishment module is used to establish an index relationship between the target data set and the classification data set;
  • the data matching module is used to match the target data set with the classification data set according to the index relationship to obtain a matching data set;
  • the analysis and calculation module is used to analyze and calculate the matching data set by using the disease model to obtain a disease analysis result.
  • This application also provides an electronic device, which includes:
  • a memory storing at least one instruction; and
  • a processor that executes the instructions stored in the memory to implement the following steps:
  • the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
  • This application also provides a computer-readable storage medium, including a storage data area and a storage program area.
  • the storage data area stores data created according to the use of blockchain nodes, and the storage program area stores a computer program; wherein the following steps are implemented when the computer program is executed by a processor:
  • the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
  • FIG. 1 is a schematic flowchart of a disease risk analysis method provided by an embodiment of the application;
  • FIG. 2 is a schematic diagram of modules of a disease risk analysis device provided by an embodiment of the application;
  • FIG. 3 is a schematic diagram of the internal structure of an electronic device for implementing a disease risk analysis method provided by an embodiment of the application.
  • the execution subject of the disease risk analysis method provided in the embodiment of the present application includes, but is not limited to, at least one of the electronic devices that can be configured to execute the method provided in the embodiment of the present application, such as a server and a terminal.
  • the disease risk analysis method can be executed by software or hardware installed in a terminal device or a server device, and the software can be a blockchain platform.
  • the server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, etc.
  • Blockchain is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the underlying platform of the blockchain can include processing modules such as user management, basic services, smart contracts, and operation monitoring.
  • the user management module is responsible for identity information management of all blockchain participants, including maintaining public and private key generation (account management), key management, and maintaining the correspondence between a user's real identity and blockchain address (authority management); where authorized, it also supervises and audits the transactions of certain real identities and provides risk control rule configuration (risk control audit).
  • the basic service module is deployed on all blockchain node devices to verify the validity of business requests and, after consensus is reached on a valid request, to record it to storage. For a new business request, the basic service first performs interface adaptation, parsing, and authentication (interface adaptation), then encrypts the business information through the consensus algorithm (consensus management), transmits it completely and consistently to the shared ledger after encryption (network communication), and records and stores it.
  • the smart contract module is responsible for contract registration and issuance, contract triggering, and contract execution.
  • the operation monitoring module is mainly responsible for deployment during product release, configuration modification, contract settings, cloud adaptation, and visual output of real-time status during product operation, such as alarms, network condition monitoring, and node device health monitoring.
  • Referring to FIG. 1, a schematic flowchart of a disease risk analysis method provided by an embodiment of this application is shown.
  • the method can be executed by a device, and the device can be implemented by software and/or hardware.
  • the disease risk analysis method includes:
  • the training data set is a data set that records the user's disease history information.
  • the training data set includes, but is not limited to: information related to physical condition (such as gender, age, and allergy history), disease name, disease type, disease symptoms, and disease medication.
  • the training data can be obtained from a database used for storing patient data in various hospitals, and the data stored in the database is the data after desensitizing the patient data.
  • when the training data set is classified, it is classified according to different characteristics. For example, the training data in the training data set is grouped by disease type, or grouped by disease symptoms, so that the classification data in the classification data set corresponds to different diseases.
  • the embodiment of the present application stores the classification data set in a pre-built disease database.
  • the disease database may be a MySQL database, an Oracle database, or the like.
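The classification step can be pictured as grouping records by one chosen feature. The following is a minimal sketch under stated assumptions: the field names (`disease_type`, `symptom`), the sample records, and the helper `classify_by_feature` are all hypothetical and only illustrate the idea of grouping by the same disease type.

```python
from collections import defaultdict


def classify_by_feature(records, feature="disease_type"):
    """Group training records by the value of one feature (hypothetical helper)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[feature]].append(record)
    return dict(groups)


# Illustrative, made-up records; real data would come from the desensitized database.
training_data = [
    {"age": 54, "disease_type": "cardiovascular", "symptom": "chest pain"},
    {"age": 37, "disease_type": "respiratory", "symptom": "cough"},
    {"age": 61, "disease_type": "cardiovascular", "symptom": "palpitations"},
]
classified = classify_by_feature(training_data)
```

Passing `feature="symptom"` would give the alternative grouping by disease symptoms.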
  • the weak classifier is:
  • h(x_i, p, θ) is the classification result of the weak classifier
  • x_i is the classification data in the classification data set
  • p is the indicator parameter of the preset inequality sign direction
  • θ is the preset classification threshold
  • multiple different classification thresholds are preset for θ to obtain multiple weak classifiers.
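A weak classifier defined by a data value, an inequality-direction parameter p, and a threshold is commonly realized as a threshold "stump". The exact formula is not shown in the text above, so the form below (output 1 when p·x < p·threshold, otherwise 0) is an assumption, as are the names `make_weak_classifier`, `thresholds`, and the numeric values.

```python
def make_weak_classifier(p, theta):
    """Build a threshold stump (assumed form): 1 if p * x < p * theta, else 0.

    p = 1 fires for values below the threshold; p = -1 flips the inequality.
    """
    def h(x):
        return 1 if p * x < p * theta else 0
    return h


# Presetting several thresholds (and both directions) yields several weak classifiers.
thresholds = [30.0, 45.0, 60.0]
weak_classifiers = [
    make_weak_classifier(p, theta) for theta in thresholds for p in (1, -1)
]
```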
  • Using multiple weak classifiers to classify the classification data set can obtain a variety of classification results, where each weak classifier corresponds to a classification result.
  • an error rate function is used to select, according to the classification results, multiple target weak classifiers from the plurality of trained weak classifiers.
  • the error rate function is:
  • w_i is the classification data set
  • y_i is the classification result of the classification data in the classification data set.
  • multiple weak classifiers whose error rate is less than a preset error threshold are selected as the target weak classifiers.
  • the number of the target weak classifiers is consistent with the number of categories of the classification data in the classification data set.
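Selecting classifiers whose error rate falls below a preset threshold can be sketched as follows. The sketch uses a weighted misclassification rate with uniform per-sample weights, which is an assumption of this example rather than something stated above; the function names, candidate classifiers, and data are hypothetical.

```python
def error_rate(h, samples, labels, weights):
    """Weighted fraction of samples on which classifier h disagrees with the label."""
    total = sum(weights)
    wrong = sum(w for x, y, w in zip(samples, labels, weights) if h(x) != y)
    return wrong / total


def select_target_classifiers(candidates, samples, labels, weights, max_error):
    """Keep the names of candidates whose error rate is below max_error."""
    return [
        name
        for name, h in candidates.items()
        if error_rate(h, samples, labels, weights) < max_error
    ]


# Hypothetical candidates and toy data.
candidates = {
    "below_45": lambda x: 1 if x < 45 else 0,
    "below_60": lambda x: 1 if x < 60 else 0,
    "above_45": lambda x: 1 if x > 45 else 0,
}
samples = [20, 35, 50, 70]
labels = [1, 1, 0, 0]
weights = [0.25, 0.25, 0.25, 0.25]
selected = select_target_classifiers(candidates, samples, labels, weights, max_error=0.3)
```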
  • the disease model is as follows:
  • t is the number of target weak classifiers
  • f_k is a target weak classifier
  • F is the set of all target weak classifiers
  • by selecting among the trained weak classifiers, target weak classifiers with higher accuracy are obtained, and aggregating these higher-accuracy weak classifiers into the disease model improves the accuracy of the disease model.
  • the user data set to be determined may be stored in a blockchain node.
  • this application may use a pre-written Java statement to retrieve the user data set to be judged from the nodes of one or more blockchains.
  • the user data set to be judged includes, but is not limited to: information about the user to be judged (such as gender and age), the historical illnesses of the user to be judged, the historical medication of the user to be judged, and the historical disease symptoms of the user to be judged. The number of users to be judged may be one or more.
  • the preprocessing includes, but is not limited to: data filling, data correction, data deletion, and data standardization.
  • preprocessing the user data set to be judged to obtain the target data set includes:
  • the embodiment of the present application can use a pre-written Java statement to perform length detection on the user data to be judged in the user data set to be judged.
  • the user data to be judged includes multiple items of attribute data of the user to be judged and the numerical value corresponding to each item; for example, the user data set to be judged contains the age data of the user to be judged and the value corresponding to that age data.
  • during detection, the numerical value corresponding to each item of attribute data in the user data to be judged is checked: when the length of the value is neither 0 nor null, the value of the attribute data is determined not to be missing and detection continues; when the length of the value is 0 or null, the value of the attribute data is determined to be missing. All attribute data with missing values, together with their corresponding values, form the missing data set.
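The length-based missing-value check described above can be sketched in a few lines. The text mentions a Java statement; Python is used here purely for illustration, and the record layout and the helper `find_missing` are hypothetical.

```python
def find_missing(user_records):
    """Return (record_index, attribute) pairs whose value is None or has length 0."""
    missing = []
    for idx, record in enumerate(user_records):
        for attribute, value in record.items():
            # A value counts as missing when it is null or its text length is 0.
            if value is None or len(str(value)) == 0:
                missing.append((idx, attribute))
    return missing


# Made-up records for two users to be judged.
users = [
    {"age": 42, "gender": "F"},
    {"age": None, "gender": ""},
]
missing_data_set = find_missing(users)
```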
  • generating the prediction data of the missing data in the missing data set includes:
  • using the mice function to select the adjacent data of any missing data in the missing data set
  • the embodiment of the application uses the mice function to set a length threshold centered on the position, in the user data set to be judged, of any missing data in the missing data set, selects the adjacent data within the length threshold, and uses the following averaging algorithm to calculate the average value of the adjacent data to obtain the predicted data Avg: $\mathrm{Avg} = \frac{1}{V}\sum_{v=1}^{V} D_v$
  • V is the number of adjacent data
  • D_v is any adjacent data
  • the embodiment of the present application fills in the missing data in the user data to be judged, which makes the data to be judged more complete and helps improve the accuracy of model training.
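Filling a missing value with the mean of its neighbors inside a fixed window can be sketched as below. This is plain Python and does not call any `mice` library; the function name `fill_with_neighbor_mean`, the `window` parameter (standing in for the length threshold), and the sample values are assumptions made for illustration.

```python
def fill_with_neighbor_mean(values, window=2):
    """Replace each None with the mean of the non-missing values within `window` positions."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is None:
            lo = max(0, i - window)
            hi = min(len(values), i + window + 1)
            neighbors = [v for v in values[lo:hi] if v is not None]
            if neighbors:  # leave the gap when no neighbor is available
                filled[i] = sum(neighbors) / len(neighbors)
    return filled


readings = [120, None, 130, 140]
completed = fill_with_neighbor_mean(readings, window=1)
```

Neighbors are read from the original list, so one filled value never feeds into another.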
  • establishing an index relationship between the target data set and the classification data set in the disease database includes:
  • an index relationship is established between the target data in the target data set and the classification data set according to the target category.
  • establishing the index relationship according to the target category means that any of the data contained in the target data set, such as the user's historical diseases, historical medication, or historical disease symptoms, is searched for in the category data table, and the target data in the target data set is assigned to the corresponding category according to the search result.
  • the search is performed according to the historical disease symptoms of the user contained in the target data set.
  • the category shared by the target data in the target data set and the classification data in the classification data set is the index relationship.
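One way to picture the index relationship is a lookup from each target record to the category found in a category table. This is a sketch under assumptions: the table contents, the `symptom` key, and the helper `build_index` are hypothetical.

```python
def build_index(target_data, category_table, key="symptom"):
    """Map each target record's position to the category found in the category table."""
    index = {}
    for position, record in enumerate(target_data):
        category = category_table.get(record.get(key))
        if category is not None:  # records with no matching category are not indexed
            index[position] = category
    return index


# Made-up category table and target data.
category_table = {"cough": "respiratory", "chest pain": "cardiovascular"}
target_data = [
    {"user": "u1", "symptom": "cough"},
    {"user": "u2", "symptom": "rash"},
]
index = build_index(target_data, category_table)
```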
  • the method described in the embodiment of the present application further includes: transmitting the target data set to the disease database through the TCP/IP protocol.
  • TCP/IP is a data transmission protocol suite; the data transmission interface of the disease database can be called over TCP/IP, which facilitates efficient transmission of the target data set to the disease database.
  • matching the target data set with the classified data set according to the index relationship to obtain a matching data set includes:
  • the multiple character data sets are matched with the classification data set through the index relationship to generate a matching data set.
  • a preset character grabber can be used to extract characters from the target data in the target data set, where the character grabber is a Python statement used for character extraction.
  • matching the multiple character data sets with the classification data set through the index relationship to generate a matching data set means finding, according to the index relationship, the category shared by the character data set and the classification data in the classification data set.
  • the matching data set includes target data in the target data set and classification data in a classification data set corresponding to the target data.
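Given an index from target records to categories, the matching data set pairs each target record with the classification data of its category. The following is only a sketch: the index layout (record position to category name), the field names, and the sample disease entry are assumptions for illustration.

```python
def match_by_index(target_data, index, classification_data):
    """Pair each indexed target record with the classification data of its category."""
    matching = []
    for position, category in index.items():
        matching.append({
            "target": target_data[position],
            "category": category,
            "classification": classification_data.get(category, []),
        })
    return matching


# Made-up inputs; `index` maps a record position to its category.
target_data = [
    {"user": "u1", "symptom": "cough"},
    {"user": "u2", "symptom": "rash"},
]
index = {0: "respiratory"}
classification_data = {
    "respiratory": [{"disease": "bronchitis", "symptom": "cough"}],
}
matching_data = match_by_index(target_data, index, classification_data)
```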
  • the present application further includes using the following array aggregation algorithm to perform array aggregation on the character data set to generate an array data set:
  • J is the array data set
  • x_i is a character in the character data set
  • m is the number of characters in the character data set.
  • performing array aggregation on the character data set gathers the data into a single array data set, which can further speed up subsequent data processing.
  • the analysis result is the probability that the user to be judged corresponding to the target data in the target data set suffers from the disease corresponding to the classification data in the classification data set.
  • the embodiment of the present application uses the following analysis algorithm to perform the analysis calculation to obtain the analysis result:
  • x_i is the matching data in the matching data set
  • t is the number of weak classifiers in the disease model
  • f_t(x_i) is the output of the weak classifier
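An ensemble of weak classifiers can turn its individual outputs into a single score for a piece of matching data. The exact analysis formula is not shown in the text above, so the sketch below assumes the simplest combination, the mean of the 0/1 outputs; the function `disease_probability` and the three example classifiers are hypothetical.

```python
def disease_probability(target_classifiers, x):
    """Mean of the weak-classifier outputs for x (assumed aggregation rule)."""
    outputs = [f(x) for f in target_classifiers]
    return sum(outputs) / len(outputs)


# Three made-up weak classifiers acting on a single numeric feature.
model = [
    lambda x: 1 if x > 40 else 0,
    lambda x: 1 if x > 45 else 0,
    lambda x: 1 if x > 60 else 0,
]
score = disease_probability(model, 50)
```

A weighted sum, as in boosting methods, would be another plausible reading; the uniform mean is used here only because it needs no extra parameters.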
  • the embodiment of the present application further includes sending a treatment plan reminder according to the analysis result of the disease.
  • the method further includes: comparing the disease analysis result with a preset result threshold;
  • the reminder can be sent directly to the user to be judged corresponding to the data set of the user to be judged.
  • the first treatment plan and the second treatment plan may be different treatment plans corresponding to different disease severity.
  • the reminder of the treatment plan includes an analysis of the cause of the disease.
  • sending reminders of the treatment plan based on the results of the disease analysis is beneficial for relevant personnel to quickly obtain personalized demand information.
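Comparing the analysis result with a preset result threshold to pick a reminder can be sketched as follows. Which plan goes with which side of the threshold is not stated above, so the mapping here (result above the threshold selects the first plan) is an assumption, as are the function name and the default threshold of 0.5.

```python
def choose_reminder(analysis_result, result_threshold=0.5):
    """Pick a treatment plan reminder by comparing the result with the threshold.

    Assumed mapping: above the threshold -> first plan, otherwise -> second plan.
    """
    if analysis_result > result_threshold:
        return "first treatment plan"
    return "second treatment plan"


reminder = choose_reminder(0.72)
```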
  • in the embodiment of the present application, the obtained training data set is classified; the resulting classification data set is used to train multiple pre-built weak classifiers; multiple target weak classifiers are selected from the trained weak classifiers and aggregated into a disease model; after the user data set to be judged is obtained, it is preprocessed to obtain the target data set, and an index relationship is established between the target data set and the classification data set; the target data set is matched with the classification data set according to the index relationship to obtain a matching data set; and the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
  • selecting and aggregating the target weak classifiers improves the accuracy of the disease model, which in turn improves the accuracy of its analysis.
  • indexing the target data set, obtained by preprocessing the user data set to be judged, against the classification data set allows the category corresponding to the target data set to be found quickly and accurately in the classification data set.
  • determining the category corresponding to the target data set helps the disease model quickly and accurately identify, based on that category, the disease risk corresponding to the user data to be judged.
  • Referring to FIG. 2, a schematic diagram of the modules of the disease risk analysis device of the present application is shown.
  • the disease risk analysis apparatus 100 described in this application can be installed in an electronic device.
  • the disease risk analysis device may include a data classification module 101, a model training module 102, a target data acquisition module 103, an index relationship establishment module 104, a data matching module 105, and an analysis calculation module 106.
  • the module described in the present invention can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of an electronic device and can complete fixed functions, and are stored in the memory of the electronic device.
  • each module/unit is as follows:
  • the data classification module 101 is configured to obtain a training data set, and classify the training data set to obtain a classification data set;
  • the model training module 102 is configured to use the classification data set to train multiple pre-built weak classifiers, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model;
  • the target data acquisition module 103 is configured to acquire a user data set to be judged, and preprocess the user data set to be judged to obtain a target data set;
  • the index relationship establishment module 104 is configured to establish an index relationship between the target data set and the classification data set;
  • the data matching module 105 is configured to match the target data set with the classified data set according to the index relationship to obtain a matching data set;
  • the analysis and calculation module 106 is configured to analyze and calculate the matching data set by using the disease model to obtain a disease analysis result.
  • each module of the disease risk analysis device is as follows:
  • the data classification module 101 is configured to obtain a training data set, and classify the training data set to obtain a classification data set.
  • the training data set is a data set that records the user's disease history information.
  • the training data set includes, but is not limited to: information related to physical condition (such as gender, age, and allergy history), disease name, disease type, disease symptoms, and disease medication.
  • the training data can be obtained from a database used for storing patient data in various hospitals, and the data stored in the database is the data after desensitizing the patient data.
  • when the training data set is classified, it is classified according to different characteristics. For example, the training data in the training data set is grouped by disease type, or grouped by disease symptoms, so that the classification data in the classification data set corresponds to different diseases.
  • the embodiment of the application stores the classification data set in a pre-built disease database.
  • the disease database may be a MySQL database, an Oracle database, or the like.
  • the model training module 102 is configured to use the classification data set to train multiple pre-built weak classifiers, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model.
  • the weak classifier is:
  • h(x_i, p, θ) is the classification result of the weak classifier
  • x_i is the classification data in the classification data set
  • p is the indicator parameter of the preset inequality sign direction
  • θ is the preset classification threshold
  • multiple different classification thresholds are preset for θ to obtain multiple weak classifiers.
  • Using multiple weak classifiers to classify the classification data set can obtain a variety of classification results, where each weak classifier corresponds to a classification result.
  • an error rate function is used to select, according to the classification results, multiple target weak classifiers from the plurality of trained weak classifiers.
  • the error rate function is:
  • w_i is the classification data set
  • y_i is the classification result of the classification data in the classification data set.
  • multiple weak classifiers whose error rate is less than a preset error threshold are selected as the target weak classifiers.
  • the number of the target weak classifiers is consistent with the number of categories of the classification data in the classification data set.
  • the disease model is as follows:
  • t is the number of target weak classifiers
  • f_k is a target weak classifier
  • F is the set of all target weak classifiers
  • by selecting among the trained weak classifiers, target weak classifiers with higher accuracy are obtained, and aggregating these higher-accuracy weak classifiers into the disease model improves the accuracy of the disease model.
  • the target data acquisition module 103 is configured to acquire a user data set to be judged, and preprocess the user data set to be judged to obtain a target data set.
  • the user data set to be determined may be stored in a blockchain node.
  • this application may use a pre-written Java statement to retrieve the user data set to be judged from the nodes of one or more blockchains.
  • the user data set to be judged includes, but is not limited to: information about the user to be judged (such as gender and age), the historical illnesses of the user to be judged, the historical medication of the user to be judged, and the historical disease symptoms of the user to be judged. The number of users to be judged may be one or more.
  • the preprocessing includes, but is not limited to: data filling, data correction, data deletion, and data standardization.
  • the target data acquisition module 103 is specifically configured to:
  • the embodiment of the present application can use a pre-written Java statement to perform length detection on the user data to be judged in the user data set to be judged.
  • the user data to be judged includes multiple items of attribute data of the user to be judged and the numerical value corresponding to each item; for example, the user data set to be judged contains the age data of the user to be judged and the value corresponding to that age data.
  • during detection, the numerical value corresponding to each item of attribute data in the user data to be judged is checked: when the length of the value is neither 0 nor null, the value of the attribute data is determined not to be missing and detection continues; when the length of the value is 0 or null, the value of the attribute data is determined to be missing. All attribute data with missing values, together with their corresponding values, form the missing data set.
  • generating the prediction data of the missing data in the missing data set includes:
  • using the mice function to select the adjacent data of any missing data in the missing data set
  • the embodiment of the application uses the mice function to set a length threshold centered on the position, in the user data set to be judged, of any missing data in the missing data set, selects the adjacent data within the length threshold, and uses the following averaging algorithm to calculate the average value of the adjacent data to obtain the predicted data Avg: $\mathrm{Avg} = \frac{1}{V}\sum_{v=1}^{V} D_v$
  • V is the number of adjacent data
  • D_v is any adjacent data
  • the embodiment of the present application fills in the missing data in the user data to be judged, which makes the data to be judged more complete and helps improve the accuracy of model training.
  • the index relationship establishment module 104 is configured to establish an index relationship between the target data set and the classification data set.
  • the index relationship establishment module 104 is specifically configured to:
  • an index relationship is established between the target data in the target data set and the classification data set according to the target category.
  • establishing the index relationship according to the target category means that any of the data contained in the target data set, such as the user's historical diseases, historical medication, or historical disease symptoms, is searched for in the category data table, and the target data in the target data set is assigned to the corresponding category according to the search result.
  • the search is performed according to the historical disease symptoms of the user contained in the target data set.
  • the category shared by the target data in the target data set and the classification data in the classification data set is the index relationship.
  • the method described in the embodiment of the present application further includes: transmitting the target data set to the disease database through the TCP/IP protocol.
  • TCP/IP is a data transmission protocol suite; the data transmission interface of the disease database can be called over TCP/IP, which facilitates efficient transmission of the target data set to the disease database.
  • the data matching module 105 is configured to match the target data set with the classified data set according to the index relationship to obtain a matching data set.
  • the data matching module 105 is specifically configured to: extract characters from multiple target data in the target data set to generate multiple character data sets corresponding to the multiple target data;
  • the multiple character data sets are matched with the classification data set through the index relationship to generate a matching data set.
  • a preset character grabber can be used to extract characters from the target data in the target data set, where the character grabber is a Python statement used for character extraction.
  • matching the multiple character data sets with the classification data set through the index relationship to generate a matching data set means finding, according to the index relationship, the category shared by the character data set and the classification data in the classification data set.
  • the matching data set includes target data in the target data set and classification data in a classification data set corresponding to the target data.
  • the present application further includes using the following array aggregation algorithm to perform array aggregation on the character data set to generate an array data set:
  • J is the array data set
  • ⁇ i is the character in the character data set
  • m is the number of characters in the character data set.
  • aggregating the character data set into an array data set gathers the data together, which can speed up subsequent data processing.
  • the analysis and calculation module 106 is configured to analyze and calculate the matching data set by using the disease model to obtain a disease analysis result.
  • the analysis result is the probability that the user to be judged corresponding to the target data in the target data set suffers from the disease corresponding to the classification data in the classification data set.
  • the embodiment of the present application uses the following analysis algorithm to perform the analysis calculation to obtain the analysis result
  • $x_i$ is the matching data in the matching data set
  • $t$ is the number of weak classifiers in the disease model
  • $f_t(x_i)$ is the output of the weak classifier
  • the embodiment of the present application further includes sending a treatment plan reminder according to the analysis result of the disease.
  • the device further includes a message sending module, and the message sending module is configured to:
  • the reminder can be sent directly to the user to be judged corresponding to the data set of the user to be judged.
  • the first treatment plan and the second treatment plan may be different treatment plans corresponding to different disease severity.
  • the reminder of the treatment plan includes an analysis of the cause of the disease.
  • sending treatment plan reminders based on the disease analysis result helps the relevant personnel quickly obtain personalized information.
  • the acquired training data set is classified; the resulting classification data set is used to train multiple pre-built weak classifiers; multiple target weak classifiers are selected from the trained weak classifiers and aggregated into a disease model; after the user data set to be judged is obtained, it is preprocessed to obtain the target data set; an index relationship is established between the target data set and the classification data set; the target data set is matched with the classification data set according to the index relationship to obtain a matching data set; and the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
  • the accuracy of the disease model can be improved, which is conducive to improving the accuracy of disease model analysis.
  • the target data set obtained by the preprocessing of the user data set to be judged and the classification data set are indexed, so that the target data set and the classification data set can be found quickly and accurately.
  • determining the category corresponding to the target data set is beneficial to quickly and accurately identify the disease risk corresponding to the user data to be judged based on the category through the disease model.
  • FIG. 3 is a schematic diagram of the structure of an electronic device implementing the disease risk analysis method of the present application.
  • the electronic device 1 may include a processor 10, a memory 11, and a bus, and may also include a computer program stored in the memory 11 and running on the processor 10, such as a disease risk analysis program 12.
  • the memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, mobile hard disk, multimedia card, card-type memory (such as SD or DX memory), magnetic memory, magnetic disk, optical disc, etc.
  • the memory 11 may be an internal storage unit of the electronic device 1 in some embodiments, for example, a mobile hard disk of the electronic device 1.
  • the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a smart media card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 1.
  • the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device.
  • the memory 11 can be used not only to store application software and various data installed in the electronic device 1, such as the code of the disease risk analysis program 12, etc., but also to temporarily store data that has been output or will be output.
  • the processor 10 may be composed of integrated circuits in some embodiments; for example, it may be composed of a single packaged integrated circuit, or of multiple packaged integrated circuits with the same or different functions, including one or more combinations of a central processing unit (CPU), microprocessor, digital processing chip, graphics processor, and various control chips.
  • the processor 10 is the control unit of the electronic device; it uses various interfaces and lines to connect the components of the entire electronic device, runs or executes programs or modules stored in the memory 11 (such as the disease risk analysis program), and calls data stored in the memory 11 to execute the various functions of the electronic device 1 and to process data.
  • the bus may be a peripheral component interconnect standard (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the bus is configured to implement connection and communication between the memory 11 and at least one processor 10 and the like.
  • FIG. 3 only shows an electronic device with certain components. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device 1, which may include fewer or more components than shown in the figure, combinations of certain components, or different component arrangements.
  • the electronic device 1 may also include a power source (such as a battery) for supplying power to various components.
  • the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device.
  • the power supply may also include any components such as one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, and power status indicators.
  • the electronic device 1 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the electronic device 1 may also include a network interface.
  • the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is usually used to establish a communication connection between the electronic device 1 and other electronic devices.
  • the electronic device 1 may also include a user interface.
  • the user interface may be a display (Display) and an input unit (such as a keyboard (Keyboard)).
  • the user interface may also be a standard wired interface or a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch device, etc.
  • the display can also be appropriately called a display screen or a display unit, which is used to display the information processed in the electronic device 1 and to display a visualized user interface.
  • the disease risk analysis program 12 stored in the memory 11 in the electronic device 1 is a combination of multiple instructions. When running in the processor 10, it can realize:
  • the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
  • the integrated module/unit of the electronic device 1 may be stored in a computer-readable storage medium, and the computer-readable storage medium can be volatile or non-volatile.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U disk, mobile hard disk, magnetic disk, optical disk, computer memory, read-only memory (ROM, Read-Only Memory).
  • the computer-usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function, etc.; the storage data area may store data created according to the use of a blockchain node, etc.
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.
  • the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms.
  • a blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying blockchain platform, the platform product service layer, and the application service layer.
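
The index-relationship and matching steps described earlier in this list (looking up target data in a category data table, then pairing each target record with the classification data of its category) can be sketched as follows. This is only an illustrative sketch: the field names `user_id` and `symptom`, the dictionary-based category table, and both function names are hypothetical and do not come from the application.

```python
def build_index(target_data, category_table):
    """Look up each target record's symptom in the category table.

    Returns a mapping {user_id: category}; records whose symptom is not
    found in the table are left out of the index.
    """
    index = {}
    for record in target_data:
        category = category_table.get(record["symptom"])
        if category is not None:
            index[record["user_id"]] = category
    return index


def match(target_data, classification_data, index):
    """Pair each indexed target record with the classification data of its category."""
    matched = []
    for record in target_data:
        category = index.get(record["user_id"])
        if category is not None:
            matched.append({
                "target": record,
                "classification": classification_data[category],
            })
    return matched


# Hypothetical example data.
category_table = {"chest pain": "cardiovascular", "cough": "respiratory"}
classification_data = {
    "cardiovascular": ["record A", "record B"],
    "respiratory": ["record C"],
}
target_data = [
    {"user_id": 1, "symptom": "chest pain"},
    {"user_id": 2, "symptom": "rash"},
]

index = build_index(target_data, category_table)
matching_data_set = match(target_data, classification_data, index)
```

In this sketch a target record with no category in the table is simply dropped from the matching data set; the application does not say how such records are handled.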

Abstract

Provided is a disease risk analysis method, relating to data processing technology, comprising: obtaining a training data set and performing classification to obtain a classified data set (S1); using the classified data set to train a plurality of pre-built weak classifiers, and selecting a plurality of target weak classifiers from the plurality of trained weak classifiers and aggregating the target weak classifiers into a disease model (S2); obtaining a user data set to be determined, and pre-processing the user data set to be determined to obtain a target data set (S3); establishing an index relationship between the target data set and the classified data set (S4); matching the target data set with the classified data set according to the index relationship to obtain a matched data set (S5); using the disease model to analyze and calculate the matched data set to obtain a disease analysis result (S6). In addition, the invention also relates to blockchain technology; basic data and/or feature data can be stored in a blockchain node. The invention can solve the problems of low analysis efficiency and low accuracy of disease risk analysis.

Description

Disease risk analysis method, apparatus, electronic device, and computer storage medium
This application claims priority to Chinese patent application No. CN202010459737.5, filed with the Chinese Patent Office on May 26, 2020 and entitled "Disease risk analysis method, apparatus, electronic device, and computer storage medium", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of data processing technology, and in particular to a disease risk analysis method, apparatus, electronic device, and computer-readable storage medium.
Background
With the rise of big data, data processing technology has been applied in many fields. Today, as people pay increasing attention to physical health, the medical field also makes wide use of data processing technology that analyzes people's health status and disease risk from patients' information and thereby evaluates their health.
In the prior art, disease risk is mainly analyzed with highly interpretable methods such as logistic regression and decision trees. The inventor realized that these methods involve many subjective human factors, which leads to low prediction accuracy and low prediction efficiency. How to analyze disease risk with high accuracy and high efficiency has therefore become an urgent problem.
Summary of the invention
A disease risk analysis method provided by this application includes:
Acquiring a training data set, and classifying the training data set to obtain a classification data set;
Training multiple pre-built weak classifiers with the classification data set, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;
Acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;
Establishing an index relationship between the target data set and the classification data set;
Matching the target data set with the classification data set according to the index relationship to obtain a matching data set;
Analyzing and calculating the matching data set with the disease model to obtain a disease analysis result.
This application also provides a disease risk analysis apparatus, which includes:
A data classification module, configured to acquire a training data set and classify the training data set to obtain a classification data set;
A model training module, configured to train multiple pre-built weak classifiers with the classification data set, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model;
A target data acquisition module, configured to acquire a user data set to be judged and preprocess the user data set to be judged to obtain a target data set;
An index relationship establishment module, configured to establish an index relationship between the target data set and the classification data set;
A data matching module, configured to match the target data set with the classification data set according to the index relationship to obtain a matching data set;
An analysis and calculation module, configured to analyze and calculate the matching data set with the disease model to obtain a disease analysis result.
This application also provides an electronic device, which includes:
A memory, storing at least one instruction; and
A processor, executing the instructions stored in the memory to implement the following steps:
Acquiring a training data set, and classifying the training data set to obtain a classification data set;
Training multiple pre-built weak classifiers with the classification data set, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;
Acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;
Establishing an index relationship between the target data set and the classification data set;
Matching the target data set with the classification data set according to the index relationship to obtain a matching data set;
Analyzing and calculating the matching data set with the disease model to obtain a disease analysis result.
This application also provides a computer-readable storage medium, including a storage data area and a storage program area. The storage data area stores data created according to the use of blockchain nodes, and the storage program area stores a computer program, wherein the computer program implements the following steps when executed by a processor:
Acquiring a training data set, and classifying the training data set to obtain a classification data set;
Training multiple pre-built weak classifiers with the classification data set, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;
Acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;
Establishing an index relationship between the target data set and the classification data set;
Matching the target data set with the classification data set according to the index relationship to obtain a matching data set;
Analyzing and calculating the matching data set with the disease model to obtain a disease analysis result.
Description of the drawings
FIG. 1 is a schematic flowchart of a disease risk analysis method provided by an embodiment of this application;
FIG. 2 is a schematic diagram of the modules of a disease risk analysis apparatus provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the internal structure of an electronic device implementing a disease risk analysis method provided by an embodiment of this application;
The realization of the purpose, the functional characteristics, and the advantages of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed description
It should be understood that the specific embodiments described here are only used to explain the present application and are not used to limit it.
The execution subject of the disease risk analysis method provided in the embodiment of the present application includes, but is not limited to, at least one of the electronic devices that can be configured to execute the method, such as a server or a terminal. In other words, the disease risk analysis method can be executed by software or hardware installed on a terminal device or a server device, and the software can be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, or a cloud server cluster.
Blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain can include the underlying blockchain platform, the platform product service layer, and the application service layer.
The underlying blockchain platform may include processing modules such as user management, basic services, smart contracts, and operation monitoring. The user management module is responsible for managing the identity information of all blockchain participants, including maintaining public and private key generation (account management), key management, and the correspondence between users' real identities and blockchain addresses (authority management); where authorized, it supervises and audits the transactions of certain real identities and provides rule configuration for risk control (risk control audit). The basic service module is deployed on all blockchain node devices to verify the validity of business requests and to record valid requests to storage after consensus is reached; for a new business request, the basic service first performs interface adaptation, parsing, and authentication (interface adaptation), then encrypts the business information through a consensus algorithm (consensus management), transmits it completely and consistently to the shared ledger after encryption (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering, and contract execution; developers can define contract logic in a programming language and publish it to the blockchain (contract registration), where, according to the logic of the contract terms, a key or another event triggers execution to complete the contract logic; the module also provides functions for upgrading and canceling contracts. The operation monitoring module is mainly responsible for deployment, configuration changes, contract settings, and cloud adaptation during product release, and for the visual output of real-time status during product operation, such as alarms, network monitoring, and monitoring of node device health.
This application provides a disease risk analysis method. FIG. 1 is a schematic flowchart of a disease risk analysis method provided by an embodiment of this application. The method can be executed by an apparatus, and the apparatus can be implemented in software and/or hardware.
In this embodiment, the disease risk analysis method includes:
S1. Acquire a training data set, and classify the training data set to obtain a classification data set.
In the embodiment of the present application, the training data set is a data set that records users' disease history. It includes, but is not limited to: information related to physical condition (such as gender, age, and allergy history), disease name, disease type, disease symptoms, and disease medication. The training data can be obtained from the databases that hospitals use to store patient data; the data stored in these databases has been desensitized.
Further, in the embodiment of the present application, the training data set is classified according to different features. For example, the training data in the training data set are grouped by disease type, or grouped by disease symptom; the classification data in the classification data set correspond to different diseases.
Preferably, to make better use of the classification data set later, the embodiment of the present application stores the classification data set in a pre-built disease database, which may be a MySQL database, an Oracle database, or the like.
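
The grouping described in S1 can be illustrated with a short Python sketch. It assumes each training record is a dictionary; the field names (`disease_type`, `symptom`) and the function name are hypothetical and chosen only for illustration, and the sketch leaves out storage in the disease database.

```python
from collections import defaultdict


def classify_training_data(records, key="disease_type"):
    """Group training records by one feature.

    Returns a dict mapping each feature value to the list of records
    that share it (the "classification data set" in this sketch).
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# Hypothetical, already desensitized training records.
records = [
    {"age": 54, "disease_type": "cardiovascular", "symptom": "chest pain"},
    {"age": 61, "disease_type": "respiratory", "symptom": "cough"},
    {"age": 47, "disease_type": "cardiovascular", "symptom": "palpitations"},
]

by_type = classify_training_data(records)
by_symptom = classify_training_data(records, key="symptom")
```

Passing a different `key` switches between the two groupings mentioned above (by disease type or by disease symptom).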
S2. Train multiple pre-built weak classifiers with the classification data set, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model.
Specifically, in an optional embodiment of the present application, the weak classifier is:
$$h(\delta_i, p, \theta) = \left(p\,\delta_i < p\,\theta\right)$$
where $h(\delta_i, p, \theta)$ is the classification result of the weak classifier, $\delta_i$ is a piece of classification data in the classification data set, $p$ is a preset parameter indicating the direction of the inequality sign, and $\theta$ is a preset classification threshold.
In a specific implementation, multiple different classification thresholds are preset for $\theta$, which yields multiple weak classifiers. Classifying the classification data set with these weak classifiers produces multiple classification results, one per weak classifier.
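
A threshold-based weak classifier of this form can be sketched in Python as follows. The sketch assumes the inequality result is mapped to 1 (true) or 0 (false) and that $p$ takes the value +1 or -1; neither detail is stated in the text, and the function names are hypothetical.

```python
def weak_classifier(delta, p, theta):
    """Return 1 if p * delta < p * theta, else 0.

    p = +1 tests delta < theta; p = -1 flips the inequality to delta > theta.
    """
    return 1 if p * delta < p * theta else 0


def build_weak_classifiers(thresholds, p=1):
    """Create one weak classifier per preset threshold."""
    return [
        (lambda delta, theta=theta: weak_classifier(delta, p, theta))
        for theta in thresholds
    ]


# Hypothetical scalar feature values and two preset thresholds.
values = [0.2, 0.5, 0.9]
classifiers = build_weak_classifiers([0.3, 0.6])

# One list of classification results per weak classifier.
results = [[clf(v) for v in values] for clf in classifiers]
```

Each threshold yields its own row in `results`, matching the statement that every weak classifier corresponds to one classification result.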
Optionally, in the embodiment of the present application, after the multiple weak classifiers are obtained and used for classification, an error rate function is applied to the classification results to select among the trained weak classifiers, yielding multiple target weak classifiers.
The error rate function is:
Figure PCTCN2020112330-appb-000001
where $w_i$ is the classification data set and $y_i$ is the classification result of the classification data in the classification data set.
Preferably, in the embodiment of the present application, the weak classifiers whose error rate is less than a preset error threshold are selected as the target weak classifiers.
Preferably, the number of target weak classifiers equals the number of categories of classification data in the classification data set.
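
The selection step can be sketched in Python as follows. The error rate formula itself is only available as an image, so the sketch substitutes an assumed weighted misclassification error, the sum over samples of `weight * |h(x) - y|`; this is a common choice for threshold classifiers, not necessarily the formula in the application. Treating $w_i$ as a per-sample weight is likewise an assumption, and all names are hypothetical.

```python
def error_rate(classifier, samples, labels, weights):
    """Assumed error: weighted sum of |prediction - label| over all samples."""
    return sum(
        w * abs(classifier(x) - y)
        for x, y, w in zip(samples, labels, weights)
    )


def select_target_thresholds(candidates, samples, labels, weights, max_error):
    """Keep the thresholds whose classifier has an error below max_error."""
    return [
        theta
        for theta, clf in candidates.items()
        if error_rate(clf, samples, labels, weights) < max_error
    ]


# Hypothetical data: three samples with equal weights.
samples = [0.2, 0.5, 0.9]
labels = [1, 1, 0]
weights = [1 / 3, 1 / 3, 1 / 3]

# One candidate weak classifier per preset threshold.
candidates = {
    theta: (lambda x, theta=theta: 1 if x < theta else 0)
    for theta in (0.3, 0.6, 1.0)
}

selected = select_target_thresholds(candidates, samples, labels, weights, max_error=0.2)
```

Here only the classifier with threshold 0.6 labels every sample correctly, so it is the only one kept under the hypothetical error threshold of 0.2.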
In detail, the disease model is as follows:
Figure PCTCN2020112330-appb-000002
where $t$ is the number of target weak classifiers, $f_k$ is a target weak classifier, $F$ is the set of all target weak classifiers, and
Figure PCTCN2020112330-appb-000003
is the output result of the disease model.
By training multiple weak classifiers and selecting among the trained weak classifiers with the error rate function, target weak classifiers with higher accuracy are obtained. Aggregating these more accurate weak classifiers into the disease model improves the accuracy of the disease model.
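
The aggregation can be sketched in Python under a stated assumption: the model formula is only available as an image, so the sketch assumes an additive form in which the model output is the sum of the outputs of the $t$ selected classifiers $f_k$. The function name and the example classifiers are hypothetical.

```python
def make_disease_model(target_classifiers):
    """Aggregate target weak classifiers into one model.

    Assumed additive form: the model output for x is the sum of f_k(x)
    over all selected classifiers.
    """
    def model(x):
        return sum(f(x) for f in target_classifiers)
    return model


# Two hypothetical threshold classifiers standing in for the selected f_k.
target_classifiers = [
    lambda x: 1 if x < 0.3 else 0,
    lambda x: 1 if x < 0.6 else 0,
]

disease_model = make_disease_model(target_classifiers)
scores = [disease_model(x) for x in (0.2, 0.5, 0.9)]
```

A summed score such as this would still need to be mapped to a probability before it could serve as the disease analysis result; that mapping is not described in this part of the text.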
S3. Acquire a user data set to be judged, and preprocess the user data set to be judged to obtain a target data set.
In a preferred embodiment of the present application, the user data set to be judged may be stored in a blockchain node.
Specifically, this application may use pre-written Java statements to retrieve the user data set to be judged from the nodes of one or more blockchains.
In the embodiment of this application, the user data set to be judged includes, but is not limited to: information about the user to be judged (such as gender and age), the user's historical diseases, the user's historical medication, and the user's historical disease symptoms. There may be one or more users to be judged.
The preprocessing includes, but is not limited to: data filling, data correction, data deletion, and data standardization.
Further, in an optional embodiment of the present application, preprocessing the user data set to be judged to obtain the target data set includes:
Identifying the data with missing values in the user data set to be judged to obtain a missing data set;
Generating prediction data for the missing data in the missing data set;
将所述预测数据填充至所述待判断用户数据集,得到所述目标数据集。较佳地,本申请实施例可利用预先编辑好的java语句,对所述待判断用户数据集中的待判断用户数据进行长度检测,待判断用户数据中包含多个带判断用户的属性数据和对应数值,如,所述待判断用户数据集中存在着待判断用户的年龄数据,以及年龄数据对应的数值;在具体检测时,检测待判断用户数据中的各个属性数据对应的数值,当检测到数值长度不为0或不为null时,则确定该属性数据的值未缺失,继续进行检测;当检测到数值长度为0或为null时,则确定该属性数据的值缺失,将所有缺失数值的属性数据和对应数值得到缺失数据的集合,即缺失数据集。优选的,本申请实施例中,所述生成所述缺失数据集中缺失数据的预测数据,包括:Filling the prediction data into the user data set to be judged to obtain the target data set. Preferably, the embodiment of the present application can use a pre-edited java sentence to perform length detection on the user data to be judged in the user data set to be judged. The user data to be judged includes multiple attribute data and corresponding data with the judged user. Numerical value, for example, the age data of the user to be judged and the value corresponding to the age data exist in the data set of the user to be judged; in specific detection, the numerical value corresponding to each attribute data in the user data to be judged is detected, and when the value is detected When the length is not 0 or null, it is determined that the value of the attribute data is not missing, and the detection is continued; when the value length is detected as 0 or null, the value of the attribute data is determined to be missing, and all missing values The attribute data and the corresponding value get the set of missing data, that is, the missing data set. Preferably, in the embodiment of the present application, said generating the prediction data of the missing data in the missing data set includes:
Using the mice function to select the adjacent data of any missing data in the missing data set;
Calculating the mean of the adjacent data to obtain the prediction data.
In detail, the embodiment of the present application uses the mice function to take the position, in the user data set to be judged, of any missing data in the missing data set as the center point, sets a length threshold, selects the adjacent data within the length threshold, and uses the following mean algorithm to calculate the mean of the adjacent data, obtaining the prediction data Avg:
Figure PCTCN2020112330-appb-000004
Where V is the number of adjacent data and D_v is any one of the adjacent data.
By filling in the missing values of the user data to be judged, the embodiment of the present application makes the data to be judged more complete, which helps improve the accuracy of model training.
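The missing-value detection and neighbour-mean filling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the mean formula is only available as a figure, so the sketch assumes the ordinary arithmetic mean Avg = (1/V) Σ D_v over the V adjacent values; it is written in plain Python and does not call any `mice` implementation; and the names `is_missing`, `find_missing`, `neighbour_mean`, `fill_missing`, and the `window` parameter (standing in for the length threshold) are hypothetical.

```python
from typing import Optional, Sequence


def is_missing(value: Optional[object]) -> bool:
    """A value is treated as missing when it is None (null) or has length 0."""
    return value is None or len(str(value)) == 0


def find_missing(values: Sequence[Optional[object]]) -> list:
    """Return the positions of all missing values (the 'missing data set')."""
    return [i for i, v in enumerate(values) if is_missing(v)]


def neighbour_mean(values: Sequence[Optional[object]], index: int, window: int):
    """Mean of the non-missing values within `window` positions of `index`.

    Assumption: Avg = (1/V) * sum(D_v); returns None if no neighbour exists.
    """
    lo = max(0, index - window)
    hi = min(len(values), index + window + 1)
    neighbours = [
        float(values[j])
        for j in range(lo, hi)
        if j != index and not is_missing(values[j])
    ]
    if not neighbours:
        return None
    return sum(neighbours) / len(neighbours)


def fill_missing(values: Sequence[Optional[object]], window: int = 2) -> list:
    """Fill every missing position with the mean of its original neighbours."""
    filled = list(values)
    for i in find_missing(values):
        filled[i] = neighbour_mean(values, i, window)
    return filled


ages = [30, None, 40, "", 50]
print(fill_missing(ages, window=1))
```

Neighbours are always read from the original sequence, so a filled value never influences another filled value; whether that matches the intended behaviour is a design choice the source does not settle.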
S4. Establish an index relationship between the target data set and the classification data set.
Preferably, establishing the index relationship between the target data set and the classification data set in the disease database includes:
Creating a category data table in the disease database according to the categories contained in the classification data set;
Determining the target category to which the target data in the target data set belongs in the category data table;
Establishing an index relationship between the target data in the target data set and the classification data set according to the target category.
Preferably, establishing the index relationship between the target data in the target data set and the classification data set according to the target category means searching the category data table using any item of data contained in the target data set, such as the user's historical diseases, historical medication, or historical disease symptoms, and assigning the target data in the target data set to the corresponding category according to the search result.
For example, when the classification data set is classified according to the disease symptoms it contains, the search is performed according to the user's historical disease symptoms contained in the target data set; the category found, in which the target data in the target data set corresponds to classification data in the classification data set, is the index relationship.
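As a rough sketch of the symptom-based example above, the category data table can be modelled as a lookup from symptom to categories, and the index relationship as a mapping from each user to the categories found by that lookup. The in-memory dictionaries, the record layout with a `symptoms` field, and the function names are assumptions made for illustration; the source stores the table in a disease database.

```python
from collections import defaultdict


def build_category_table(classified: dict) -> dict:
    """Build a symptom -> set-of-categories table from the classification data set.

    `classified` maps a category name to a list of records, each holding a
    list of symptoms (hypothetical layout).
    """
    table = defaultdict(set)
    for category, records in classified.items():
        for record in records:
            for symptom in record["symptoms"]:
                table[symptom].add(category)
    return dict(table)


def index_targets(targets: dict, table: dict) -> dict:
    """Map each user id to the sorted list of categories its symptoms hit."""
    index = {}
    for user_id, record in targets.items():
        categories = set()
        for symptom in record["symptoms"]:
            categories |= table.get(symptom, set())
        index[user_id] = sorted(categories)
    return index


classified = {
    "respiratory": [{"symptoms": ["cough", "fever"]}],
    "digestive": [{"symptoms": ["nausea"]}],
}
targets = {
    "u1": {"symptoms": ["cough"]},
    "u2": {"symptoms": ["nausea", "fever"]},
    "u3": {"symptoms": ["rash"]},
}
index = index_targets(targets, build_category_table(classified))
print(index)
```

A user whose symptoms match nothing in the table simply receives an empty category list in this sketch.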
Further, before the index relationship between the target data set and the classification data set is established, the method of the embodiment of the present application further includes: transmitting the target data set to the disease database through the TCP/IP protocol. TCP/IP is a data transmission protocol; the data transmission interface of the disease database can be called according to the TCP/IP protocol, which facilitates efficient transmission of the target data set to the disease database.
S5. Match the target data set with the classification data set according to the index relationship to obtain a matching data set.
Further, in another optional embodiment of the present application, matching the target data set with the classification data set according to the index relationship to obtain the matching data set includes:
Performing character extraction on multiple target data in the target data set to generate multiple character data sets corresponding to the multiple target data;
Matching the multiple character data sets with the classification data set through the index relationship to generate the matching data set.
In detail, a preset character grabber can be used to perform character extraction on the target data in the target data set, where the character grabber is a Python statement used for character grabbing.
Specifically, after the multiple character data sets are obtained, the embodiment of the present application matches the multiple character data sets with the classification data set through the index relationship to generate the matching data set; that is, it finds, according to the index relationship, the category in which each character data set corresponds to classification data in the classification data set. The matching data set includes the target data in the target data set and the classification data, in the classification data set, that corresponds to the target data.
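The extraction and matching steps can be illustrated with a small sketch. The source does not specify the character grabber beyond calling it a Python statement, so the regular-expression tokenizer below, the token-overlap matching rule, and the names `extract_characters` and `match` are all assumptions chosen for illustration.

```python
import re


def extract_characters(text: str) -> list:
    """Hypothetical 'character grabber': split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def match(targets: dict, classified: dict, index: dict) -> list:
    """Pair each user's target data with classification data in its indexed categories.

    A classification entry is considered a match when it shares at least one
    token with the user's target data (assumed matching rule).
    """
    matched = []
    for user_id, text in targets.items():
        tokens = set(extract_characters(text))
        for category in index.get(user_id, []):
            for entry in classified.get(category, []):
                if tokens & set(extract_characters(entry)):
                    matched.append((user_id, category, entry))
    return matched


targets = {"u1": "Dry cough, mild fever"}
classified = {"respiratory": ["cough and fever", "wheezing"]}
index = {"u1": ["respiratory"]}
result = match(targets, classified, index)
print(result)
```

Restricting the comparison to the categories listed in the index is what keeps the search small; without the index, every target record would be compared against every classification entry.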
Preferably, the present application further includes performing array aggregation on the character data set with the following array aggregation algorithm to generate an array data set:
Figure PCTCN2020112330-appb-000005
Where J is the array data set, β_i is a character in the character data set, and m is the number of characters in the character data set.
Aggregating the character data set into an array data set brings the data together and can further speed up subsequent data processing.
S6. Use the disease model to analyze and calculate the matching data set to obtain a disease analysis result.
In the embodiment of the present application, the analysis result is the probability that the user to be judged, to whom the target data in the target data set corresponds, suffers from the disease corresponding to the classification data in the classification data set.
Preferably, the embodiment of the present application uses the following analysis algorithm to perform the analysis calculation and obtain the analysis result (the symbol shown in Figure PCTCN2020112330-appb-000006):
Figure PCTCN2020112330-appb-000007
Where x_i is the matching data in the matching data set, t is the number of weak classifiers in the disease model, and f_t(x_i) is the output of the weak classifier.
Further, the embodiment of the present application further includes sending a treatment plan reminder according to the disease analysis result.
In detail, after the disease analysis result is obtained, the method further includes: comparing the disease analysis result with a preset result threshold;
When the disease analysis result is less than or equal to the result threshold, sending a first treatment plan reminder;
When the disease analysis result is greater than the result threshold, sending a second treatment plan reminder.
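The threshold comparison above amounts to a two-way branch. A minimal sketch, assuming the analysis result is a probability and using a hypothetical default threshold of 0.5 and hypothetical reminder labels:

```python
def choose_reminder(risk: float, threshold: float = 0.5) -> str:
    """Pick the reminder according to the comparison with the result threshold."""
    if risk <= threshold:
        return "first treatment plan reminder"
    return "second treatment plan reminder"


print(choose_reminder(0.5))
print(choose_reminder(0.8))
```

A result exactly equal to the threshold falls on the first-plan side, matching the "less than or equal to" wording of the step.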
When a treatment plan reminder is sent, it can be sent directly to the user to be judged who corresponds to the user data set to be judged.
In this embodiment, the first treatment plan and the second treatment plan may be different treatment plans corresponding to different degrees of disease severity.
Further, the treatment plan reminder includes an analysis of the cause of the disease.
In this embodiment, sending a treatment plan reminder based on the disease analysis result helps the relevant personnel quickly obtain the personalized information they need.
In the embodiment of the present application, the acquired training data set is classified; the classification data set obtained by the classification is used to train multiple pre-built weak classifiers; multiple target weak classifiers are selected from the trained weak classifiers; and the target weak classifiers are aggregated into a disease model. After the user data set to be judged is acquired, it is preprocessed to obtain a target data set, and an index relationship is established between the target data set and the classification data set; the target data set is matched with the classification data set according to the index relationship to obtain a matching data set; and the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result. Classifying the training data set before training the model improves the efficiency of model training, and training the base models with classification data sets of different categories improves the precision of the disease model, which in turn helps improve the accuracy of the disease model's analysis. In addition, when the user data set to be judged is analyzed, establishing an index relationship between the target data set obtained by preprocessing and the classification data set makes it easy to quickly and accurately find the correspondence between the target data set and the classification data set and to determine the category corresponding to the target data set, which helps the disease model quickly and accurately identify, according to that category, the disease risk corresponding to the user data to be judged.
Figure 2 is a schematic diagram of the modules of the disease risk analysis apparatus of the present application.
The disease risk analysis apparatus 100 described in the present application can be installed in an electronic device. According to the functions implemented, the disease risk analysis apparatus may include a data classification module 101, a model training module 102, a target data acquisition module 103, an index relationship establishment module 104, a data matching module 105, and an analysis calculation module 106. The modules described in the present application may also be called units; a module refers to a series of computer program segments that can be executed by the processor of the electronic device, perform a fixed function, and are stored in the memory of the electronic device.
In this embodiment, the functions of the modules/units are as follows:
The data classification module 101 is configured to obtain a training data set and classify the training data set to obtain a classification data set;
The model training module 102 is configured to use the classification data set to train multiple pre-built weak classifiers, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model;
The target data acquisition module 103 is configured to acquire a user data set to be judged and preprocess the user data set to be judged to obtain a target data set;
The index relationship establishment module 104 is configured to establish an index relationship between the target data set and the classification data set;
The data matching module 105 is configured to match the target data set with the classification data set according to the index relationship to obtain a matching data set;
The analysis calculation module 106 is configured to use the disease model to analyze and calculate the matching data set to obtain a disease analysis result.
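One way to picture how the six modules fit together is a small pipeline object whose fields stand in for modules 101 to 106. This is only a structural sketch: the class name `DiseaseRiskAnalyzer`, the field names, and the call signatures are hypothetical, and each module is reduced to a plain callable.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DiseaseRiskAnalyzer:
    classify: Callable[[Any], Any]             # data classification module 101
    train: Callable[[Any], Any]                # model training module 102
    acquire: Callable[[Any], Any]              # target data acquisition module 103
    build_index: Callable[[Any, Any], Any]     # index relationship establishment module 104
    match: Callable[[Any, Any, Any], Any]      # data matching module 105
    analyze: Callable[[Any, Any], Any]         # analysis calculation module 106

    def run(self, training_data: Any, user_data: Any) -> Any:
        """Run the modules in the order the description lists them."""
        classified = self.classify(training_data)
        model = self.train(classified)
        target = self.acquire(user_data)
        index = self.build_index(target, classified)
        matched = self.match(target, classified, index)
        return self.analyze(model, matched)


calls = []


def step(name: str, result: Any) -> Callable[..., Any]:
    """Build a stub module that records its name and returns a fixed result."""
    def module(*args: Any) -> Any:
        calls.append(name)
        return result
    return module


analyzer = DiseaseRiskAnalyzer(
    step("101", "classified"), step("102", "model"), step("103", "target"),
    step("104", "index"), step("105", "matched"), step("106", 0.7),
)
outcome = analyzer.run([], [])
print(outcome, calls)
```

Keeping each module behind a callable makes it possible to swap in stubs, as the usage lines do, without touching the orchestration.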
In detail, the specific implementation of each module of the disease risk analysis apparatus is as follows:
The data classification module 101 is configured to obtain a training data set and classify the training data set to obtain a classification data set.
In the embodiment of the present application, the training data set is a data set that records users' disease history information. The training data set includes, but is not limited to: information related to physical condition (e.g., gender, age, allergy history), disease name, disease type, disease symptoms, and disease medication. The training data can be obtained from the databases that hospitals use to store patient data; the data stored in such a database is patient data that has been desensitized.
Further, in the embodiment of the present application, when the training data set is classified, it is classified according to different features. For example, the training data in the training data set is grouped by disease type, or the training data set is grouped by disease symptom. The classification data in the classification data set corresponds to different diseases.
Preferably, to make better subsequent use of the classification data set, the embodiment of the present application stores the classification data set in a pre-built disease database; the disease database may be a MySQL database, an Oracle database, or the like.
The model training module 102 is configured to use the classification data set to train multiple pre-built weak classifiers, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model.
Specifically, in an optional embodiment of the present application, the weak classifier is:
h(δ_i, p, θ) = pδ_i < pθ
Where h(δ_i, p, θ) is the classification result of the weak classifier, δ_i is the classification data in the classification data set, p is a preset parameter indicating the direction of the inequality sign, and θ is a preset classification threshold.
In a specific implementation, multiple different classification thresholds are preset for θ, yielding multiple weak classifiers. Classifying the classification data set with the multiple weak classifiers yields multiple classification results, with each weak classifier corresponding to one classification result.
Optionally, in the embodiment of the present application, after the multiple weak classifiers are obtained and classification is performed with them, the error rate function is used to select among the trained weak classifiers according to the classification results, obtaining multiple target weak classifiers.
The error rate function is:
Figure PCTCN2020112330-appb-000008
Where w_i is the classification data set, and y_i is the classification result of the classification data in the classification data set.
Preferably, in the embodiment of the present application, the weak classifiers whose error rate is less than a preset error threshold are selected as the target weak classifiers.
Preferably, the number of target weak classifiers is equal to the number of categories of the classification data in the classification data set.
In detail, the disease model is as follows:
Figure PCTCN2020112330-appb-000009
Where t is the number of target weak classifiers, f_k is a target weak classifier, F is the set of all target weak classifiers, and the symbol shown in Figure PCTCN2020112330-appb-000010 is the output result of the disease model.
By training multiple weak classifiers and selecting among the trained weak classifiers with the error rate function, target weak classifiers with higher accuracy can be obtained; aggregating these more accurate weak classifiers into the disease model improves the accuracy of the disease model.
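The weak classifier, the error-rate selection, and the aggregation described for the model training module can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the error rate function and the disease model are only available as figures, so the sketch assumes a weighted error of the form Σ w_i·|h(δ_i) − y_i| (treating w_i as per-sample weights, which differs from the wording of the source) and an additive model that sums the outputs of the selected classifiers. The names `Stump`, `error_rate`, `select_targets`, and `disease_model` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stump:
    """Threshold weak classifier: outputs 1 when p * delta < p * theta, else 0."""
    p: int        # +1 or -1, direction of the inequality sign
    theta: float  # preset classification threshold

    def predict(self, delta: float) -> int:
        return 1 if self.p * delta < self.p * self.theta else 0


def error_rate(stump: Stump, data: list, labels: list, weights: list) -> float:
    """Assumed weighted error: sum of w_i * |h(delta_i) - y_i|."""
    return sum(
        w * abs(stump.predict(d) - y)
        for d, y, w in zip(data, labels, weights)
    )


def select_targets(stumps: list, data: list, labels: list,
                   weights: list, max_error: float) -> list:
    """Keep the weak classifiers whose error rate is below the preset threshold."""
    return [s for s in stumps if error_rate(s, data, labels, weights) < max_error]


def disease_model(targets: list, delta: float) -> int:
    """Assumed additive aggregation: sum of the selected classifiers' outputs."""
    return sum(s.predict(delta) for s in targets)


data = [1.0, 2.0, 3.0, 4.0]
labels = [1, 1, 0, 0]
weights = [0.25, 0.25, 0.25, 0.25]
candidates = [Stump(1, 2.5), Stump(1, 0.5), Stump(-1, 2.5)]
selected = select_targets(candidates, data, labels, weights, max_error=0.3)
print(selected, disease_model(selected, 2.0))
```

Varying `theta` over several preset values while keeping the same form is what produces the family of candidate classifiers the description refers to.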
The target data acquisition module 103 is configured to acquire a user data set to be judged and preprocess the user data set to be judged to obtain a target data set.
In a preferred embodiment of the present application, the user data set to be judged may be stored in a blockchain node.
Specifically, the present application may use a pre-written Java statement to retrieve the user data set to be judged from the nodes of one or more blockchains.
In the embodiment of the present application, the user data set to be judged includes, but is not limited to: information about the user to be judged (e.g., gender, age), the historical diseases of the user to be judged, the historical medication of the user to be judged, and the historical disease symptoms of the user to be judged. The number of users to be judged may be one or more.
The preprocessing includes, but is not limited to: data filling, data correction, data deletion, and data standardization.
Further, in an optional embodiment of the present application, the target data acquisition module 103 is specifically configured to:
Acquire the user data set to be judged, and identify the data with missing values in the user data set to be judged to obtain a missing data set;
Generate prediction data for the missing data in the missing data set;
Fill the prediction data into the user data set to be judged to obtain the target data set.
Preferably, the embodiment of the present application can use a pre-written Java statement to perform length detection on the user data to be judged in the user data set to be judged. The user data to be judged contains multiple items of attribute data of the user to be judged and their corresponding values; for example, the user data set to be judged contains the age data of the user to be judged and the value corresponding to the age data. During detection, the value corresponding to each item of attribute data is checked: when the length of the value is neither 0 nor null, the value of that attribute data is determined not to be missing and detection continues; when the length of the value is 0 or the value is null, the value of that attribute data is determined to be missing. All attribute data with missing values, together with their corresponding values, are collected into the set of missing data, that is, the missing data set.
Preferably, in the embodiment of the present application, generating the prediction data for the missing data in the missing data set includes:
Using the mice function to select the adjacent data of any missing data in the missing data set;
Calculating the mean of the adjacent data to obtain the prediction data.
In detail, the embodiment of the present application uses the mice function to take the position, in the user data set to be judged, of any missing data in the missing data set as the center point, sets a length threshold, selects the adjacent data within the length threshold, and uses the following mean algorithm to calculate the mean of the adjacent data, obtaining the prediction data Avg:
Figure PCTCN2020112330-appb-000011
Where V is the number of adjacent data and D_v is any one of the adjacent data.
By filling in the missing values of the user data to be judged, the embodiment of the present application makes the data to be judged more complete, which helps improve the accuracy of model training.
The index relationship establishment module 104 is configured to establish an index relationship between the target data set and the classification data set.
Preferably, the index relationship establishment module 104 is specifically configured to:
Create a category data table in the disease database according to the categories contained in the classification data set;
Determine the target category to which the target data in the target data set belongs in the category data table;
Establish an index relationship between the target data in the target data set and the classification data set according to the target category.
Preferably, establishing the index relationship between the target data in the target data set and the classification data set according to the target category means searching the category data table using any item of data contained in the target data set, such as the user's historical diseases, historical medication, or historical disease symptoms, and assigning the target data in the target data set to the corresponding category according to the search result.
For example, when the classification data set is classified according to the disease symptoms it contains, the search is performed according to the user's historical disease symptoms contained in the target data set; the category found, in which the target data in the target data set corresponds to classification data in the classification data set, is the index relationship.
Further, before the index relationship between the target data set and the classification data set is established, the method of the embodiment of the present application further includes: transmitting the target data set to the disease database through the TCP/IP protocol. TCP/IP is a data transmission protocol; the data transmission interface of the disease database can be called according to the TCP/IP protocol, which facilitates efficient transmission of the target data set to the disease database.
The data matching module 105 is configured to match the target data set with the classification data set according to the index relationship to obtain a matching data set.
Further, in another optional embodiment of the present application, the data matching module 105 is specifically configured to: perform character extraction on multiple target data in the target data set to generate multiple character data sets corresponding to the multiple target data;
Match the multiple character data sets with the classification data set through the index relationship to generate the matching data set.
In detail, a preset character grabber can be used to perform character extraction on the target data in the target data set, where the character grabber is a Python statement used for character grabbing.
Specifically, after the multiple character data sets are obtained, the embodiment of the present application matches the multiple character data sets with the classification data set through the index relationship to generate the matching data set; that is, it finds, according to the index relationship, the category in which each character data set corresponds to classification data in the classification data set. The matching data set includes the target data in the target data set and the classification data, in the classification data set, that corresponds to the target data.
Preferably, the present application further includes performing array aggregation on the character data set with the following array aggregation algorithm to generate an array data set:
Figure PCTCN2020112330-appb-000012
Where J is the array data set, β_i is a character in the character data set, and m is the number of characters in the character data set.
Aggregating the character data set into an array data set brings the data together and can further speed up subsequent data processing.
The analysis calculation module 106 is configured to use the disease model to analyze and calculate the matching data set to obtain a disease analysis result.
In the embodiment of the present application, the analysis result is the probability that the user to be judged, to whom the target data in the target data set corresponds, suffers from the disease corresponding to the classification data in the classification data set.
Preferably, the embodiment of the present application uses the following analysis algorithm to perform the analysis calculation and obtain the analysis result (the symbol shown in Figure PCTCN2020112330-appb-000013):
Figure PCTCN2020112330-appb-000014
Where x_i is the matching data in the matching data set, t is the number of weak classifiers in the disease model, and f_t(x_i) is the output of the weak classifier.
Further, this embodiment also includes sending a treatment plan reminder according to the disease analysis result.
In detail, after the disease analysis result is obtained, the apparatus further includes a message sending module, which is configured to:
compare the disease analysis result with a preset result threshold;
send a first treatment plan reminder when the disease analysis result is less than or equal to the result threshold;
send a second treatment plan reminder when the disease analysis result is greater than the result threshold.
A treatment plan reminder can be sent directly to the user to be judged who corresponds to the user data set to be judged.
In this embodiment, the first and second treatment plans may be different treatment plans corresponding to different degrees of disease severity.
Further, the treatment plan reminder includes an analysis of the cause of the disease.
By sending treatment plan reminders based on the disease analysis result, this embodiment helps the relevant personnel quickly obtain the personalized information they need.
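The threshold comparison performed by the message sending module can be sketched as follows. This is only an illustration: the function name `choose_reminder` and the returned strings are hypothetical, and the source does not say how the threshold is chosen or how the reminder is delivered.

```python
def choose_reminder(analysis_result: float, result_threshold: float) -> str:
    """Pick which treatment plan reminder to send.

    A result less than or equal to the threshold selects the first plan;
    a result greater than the threshold selects the second plan.
    """
    if analysis_result <= result_threshold:
        return "first treatment plan reminder"
    return "second treatment plan reminder"
```

Note that the boundary case (result equal to the threshold) falls under the first treatment plan, matching the "less than or equal to" wording above.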
In this embodiment of the present application, the acquired training data set is classified; the resulting classification data set is used to train multiple pre-built weak classifiers; multiple target weak classifiers are selected from the trained weak classifiers; and the target weak classifiers are aggregated into a disease model. After the user data set to be judged is acquired, it is preprocessed to obtain a target data set, and an index relationship is established between the target data set and the classification data set. The target data set is then matched against the classification data set according to the index relationship to obtain a matching data set, and the disease model analyzes the matching data set to produce a disease analysis result. Classifying the training data set before training improves training efficiency, and training the base models on classification data sets of different categories improves the precision of the disease model and, in turn, the accuracy of its analysis. In addition, when the user data set to be judged is analyzed, the index relationship between the target data set (obtained by preprocessing the user data set to be judged) and the classification data set makes it quick and accurate to look up the correspondence between the two and determine the category of the target data set, which helps the disease model quickly and accurately identify the disease risk corresponding to the user data to be judged based on that category.
FIG. 3 is a schematic structural diagram of an electronic device that implements the disease risk analysis method of the present application.
The electronic device 1 may include a processor 10, a memory 11, and a bus, and may further include a computer program stored in the memory 11 and executable on the processor 10, such as a disease risk analysis program 12.
The memory 11 includes at least one type of readable storage medium, including flash memory, removable hard disks, multimedia cards, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disks, and optical discs. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. In other embodiments, the memory 11 may be an external storage device of the electronic device 1, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store application software installed on the electronic device 1 and various kinds of data, such as the code of the disease risk analysis program 12, but also to temporarily store data that has been output or is about to be output.
In some embodiments, the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control unit of the electronic device: it connects the components of the entire electronic device through various interfaces and lines, and it performs the functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 11 (for example, the disease risk analysis program) and by calling the data stored in the memory 11.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to provide connection and communication between the memory 11, the at least one processor 10, and other components.
FIG. 3 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the electronic device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
For example, although not shown, the electronic device 1 may also include a power supply (such as a battery) that powers the components. Preferably, the power supply is logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power supply may also include any of one or more DC or AC power sources, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and similar components. The electronic device 1 may also include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described further here.
Further, the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may also include a user interface, which may be a display or an input unit (such as a keyboard). Optionally, the user interface may also be a standard wired or wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be called a display screen or display unit, as appropriate, and is used to display the information processed in the electronic device 1 and a visual user interface.
It should be understood that the described embodiments are for illustration only, and the scope of the patent application is not limited by this structure.
The disease risk analysis program 12 stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run on the processor 10, can implement:
acquiring a training data set, and classifying the training data set to obtain a classification data set;
training multiple pre-built weak classifiers with the classification data set, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;
acquiring a user data set to be judged, and preprocessing it to obtain a target data set;
establishing an index relationship between the target data set and the classification data set;
matching the target data set with the classification data set according to the index relationship to obtain a matching data set;
analyzing the matching data set with the disease model to obtain a disease analysis result.
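The six steps above can be sketched end to end in a few dozen lines. Everything concrete in this sketch is an assumption made for illustration and is not taken from the source: weak classifiers are modeled as one-feature threshold rules (decision stumps), target weak classifiers are selected by a training-accuracy cutoff, preprocessing fills missing values with the mean of the values present, the index relationship is a lookup by a `category` field, and the final score uses a logistic function. All function and field names are hypothetical.

```python
import math
from collections import defaultdict


def classify(training_set):
    """Step 1: group training records by category (the classification data set)."""
    groups = defaultdict(list)
    for record in training_set:
        groups[record["category"]].append(record)
    return dict(groups)


def train_stump(records, feature):
    """Fit one weak classifier: predict 'diseased' when feature >= threshold."""
    best = None
    for threshold in sorted({r["features"][feature] for r in records}):
        correct = sum(
            (r["features"][feature] >= threshold) == bool(r["label"])
            for r in records
        )
        accuracy = correct / len(records)
        if best is None or accuracy > best["accuracy"]:
            best = {"feature": feature, "threshold": threshold, "accuracy": accuracy}
    return best


def build_disease_model(classified, n_features, min_accuracy=0.6):
    """Step 2: train stumps per category, keep the accurate ones, aggregate them."""
    model = defaultdict(list)
    for category, records in classified.items():
        for feature in range(n_features):
            stump = train_stump(records, feature)
            if stump["accuracy"] >= min_accuracy:  # assumed selection rule
                model[category].append(stump)
    return dict(model)


def preprocess(record):
    """Step 3: fill missing values (None) with the mean of the present values."""
    present = [v for v in record["features"] if v is not None]
    mean = sum(present) / len(present) if present else 0.0
    filled = [mean if v is None else v for v in record["features"]]
    return {**record, "features": filled}


def analyze(model, classified, user_records):
    """Steps 4-6: index by category, match, and score each matched record."""
    results = []
    for raw in user_records:
        target = preprocess(raw)
        category = target["category"]      # step 4: index key (assumed)
        if category not in classified:     # step 5: unmatched records are skipped
            continue
        score = sum(
            1.0 if target["features"][s["feature"]] >= s["threshold"] else -1.0
            for s in model.get(category, [])
        )
        results.append((target["id"], 1.0 / (1.0 + math.exp(-score))))  # step 6
    return results
```

A caller would run `classify`, then `build_disease_model`, then `analyze`; user records whose category has no counterpart in the classification data set simply produce no result in this sketch.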
Further, if the modules/units integrated in the electronic device 1 are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium, which may be volatile or non-volatile. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, or read-only memory (ROM).
Further, the computer-usable storage medium may mainly include a program storage area and a data storage area. The program storage area may store an operating system, an application program required by at least one function, and the like; the data storage area may store data created according to the use of blockchain nodes, and the like.
In the several embodiments provided in this application, it should be understood that the disclosed device, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware combined with software functional modules.
It is evident to those skilled in the art that this application is not limited to the details of the exemplary embodiments above, and that it can be implemented in other specific forms without departing from its spirit or essential characteristics.
The embodiments should therefore be regarded in every respect as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than by the description above, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced by this application. No reference sign in the claims should be construed as limiting the claim concerned.
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, where each block contains information about a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
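To make the "blocks linked by cryptographic methods" idea concrete, here is a minimal sketch of a hash-linked chain. It is a generic illustration, not the storage scheme of this application: it assumes SHA-256 over a JSON encoding of each block, and it omits consensus, networking, and signatures entirely. The function names are hypothetical.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full contents with SHA-256 (assumed hash function)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transactions: list) -> dict:
    """Create the next block; it stores the hash of the block before it."""
    previous_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {
        "index": len(chain),
        "transactions": transactions,
        "previous_hash": previous_hash,
    }
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Check that every block still points at the true hash of its predecessor."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )
```

Because each block records the hash of the previous one, changing the transactions in an earlier block makes `is_valid` fail for the chain, which is the anti-counterfeiting property the paragraph above describes.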
In addition, the word "including" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in the system claims may also be implemented by a single unit or device through software or hardware. Words such as "second" are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of this application. Although the application has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of this application may be modified or replaced with equivalents without departing from their spirit and scope.

Claims (20)

  1. A disease risk analysis method, wherein the method comprises:
    acquiring a training data set, and classifying the training data set to obtain a classification data set;
    training multiple pre-built weak classifiers with the classification data set, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;
    acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;
    establishing an index relationship between the target data set and the classification data set;
    matching the target data set with the classification data set according to the index relationship to obtain a matching data set;
    analyzing the matching data set with the disease model to obtain a disease analysis result.
  2. The disease risk analysis method according to claim 1, wherein preprocessing the user data set to be judged to obtain the target data set comprises:
    identifying the data in the user data set to be judged that has missing values, to obtain a missing data set;
    generating prediction data for the missing data in the missing data set;
    filling the prediction data into the user data set to be judged to obtain the target data set.
  3. The disease risk analysis method according to claim 1, wherein establishing the index relationship between the target data set and the classification data set comprises:
    creating a category data table in a disease database according to the categories contained in the classification data set;
    determining the target category in the category data table to which the target data in the target data set belongs;
    establishing, according to the target category, an index relationship between the target data in the target data set and the classification data set.
  4. The disease risk analysis method according to claim 1, wherein matching the target data set with the classification data set according to the index relationship to obtain the matching data set comprises:
    performing character extraction on multiple pieces of target data in the target data set to generate multiple character data sets corresponding to the multiple pieces of target data;
    matching the multiple character data sets with the classification data set through the index relationship to generate the matching data set.
  5. The disease risk analysis method according to any one of claims 1 to 4, wherein after the disease analysis result is obtained, the method further comprises:
    comparing the disease analysis result with a preset result threshold;
    sending a first treatment plan reminder when the disease analysis result is less than or equal to the result threshold;
    sending a second treatment plan reminder when the disease analysis result is greater than the result threshold.
  6. The disease risk analysis method according to claim 2, wherein generating the prediction data for the missing data in the missing data set comprises:
    using the mice function to select the data adjacent to any missing data in the missing data set;
    calculating the mean of the adjacent data to obtain the prediction data.
  7. A disease risk analysis apparatus, wherein the apparatus comprises:
    a data classification module, configured to acquire a training data set and classify the training data set to obtain a classification data set;
    a model training module, configured to train multiple pre-built weak classifiers with the classification data set, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model;
    a target data acquisition module, configured to acquire a user data set to be judged and preprocess the user data set to be judged to obtain a target data set;
    an index relationship establishment module, configured to establish an index relationship between the target data set and the classification data set;
    a data matching module, configured to match the target data set with the classification data set according to the index relationship to obtain a matching data set;
    an analysis and calculation module, configured to analyze the matching data set with the disease model to obtain a disease analysis result.
  8. The disease risk analysis apparatus according to claim 7, wherein the target data acquisition module is specifically configured to:
    acquire the user data set to be judged, and identify the data in the user data set to be judged that has missing values, to obtain a missing data set;
    generate prediction data for the missing data in the missing data set;
    fill the prediction data into the user data set to be judged to obtain the target data set.
  9. The disease risk analysis apparatus according to claim 7, wherein the index relationship establishment module is specifically configured to:
    create a category data table in the disease database according to the categories contained in the classification data set;
    determine the target category in the category data table to which the target data in the target data set belongs;
    establish, according to the target category, an index relationship between the target data in the target data set and the classification data set.
  10. An electronic device, wherein the electronic device comprises:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the following steps:
    acquiring a training data set, and classifying the training data set to obtain a classification data set;
    training multiple pre-built weak classifiers with the classification data set, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;
    acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;
    establishing an index relationship between the target data set and the classification data set;
    matching the target data set with the classification data set according to the index relationship to obtain a matching data set;
    analyzing the matching data set with the disease model to obtain a disease analysis result.
  11. The electronic device according to claim 10, wherein preprocessing the user data set to be judged to obtain the target data set comprises:
    identifying the data in the user data set to be judged that has missing values, to obtain a missing data set;
    generating prediction data for the missing data in the missing data set;
    filling the prediction data into the user data set to be judged to obtain the target data set.
  12. The electronic device according to claim 10, wherein establishing the index relationship between the target data set and the classification data set comprises:
    creating a category data table in a disease database according to the categories contained in the classification data set;
    determining the target category in the category data table to which the target data in the target data set belongs;
    establishing, according to the target category, an index relationship between the target data in the target data set and the classification data set.
  13. The electronic device according to claim 10, wherein matching the target data set with the classification data set according to the index relationship to obtain the matching data set comprises:
    performing character extraction on multiple pieces of target data in the target data set to generate multiple character data sets corresponding to the multiple pieces of target data;
    matching the multiple character data sets with the classification data set through the index relationship to generate the matching data set.
  14. The electronic device according to any one of claims 10 to 13, wherein after the disease analysis result is obtained, the instructions, when executed by the at least one processor, further implement the following steps:
    comparing the disease analysis result with a preset result threshold;
    sending a first treatment plan reminder when the disease analysis result is less than or equal to the result threshold;
    sending a second treatment plan reminder when the disease analysis result is greater than the result threshold.
  15. The electronic device according to claim 11, wherein generating the prediction data for the missing data in the missing data set comprises:
    using the mice function to select the data adjacent to any missing data in the missing data set;
    calculating the mean of the adjacent data to obtain the prediction data.
  16. 一种计算机可读存储介质,包括存储数据区和存储程序区,存储数据区存储根据区块链节点的使用所创建的数据,存储程序区存储有计算机程序;其中,所述计算机程序被处理器执行时实现如下步骤:A computer-readable storage medium includes a storage data area and a storage program area. The storage data area stores data created according to the use of blockchain nodes, and the storage program area stores a computer program; wherein the computer program is stored by a processor The following steps are implemented during execution:
    获取训练数据集,将所述训练数据集进行分类得到分类数据集;Acquiring a training data set, and classifying the training data set to obtain a classification data set;
    利用所述分类数据集对预先构建的多个弱分类器进行训练,并从训练后的所述多个弱分类器中选取多个目标弱分类器,将所述目标弱分类器聚合为疾病模型;Use the classification data set to train multiple pre-built weak classifiers, select multiple target weak classifiers from the multiple weak classifiers after training, and aggregate the target weak classifiers into a disease model ;
    获取待判断用户数据集,对所述待判断用户数据集进行预处理,得到目标数据集;Acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;
    将所述目标数据集与所述分类数据集建立索引关系;Establishing an index relationship between the target data set and the classification data set;
    根据所述索引关系将所述目标数据集与所述分类数据集进行匹配,得到匹配数据集;Matching the target data set with the classification data set according to the index relationship to obtain a matching data set;
    利用所述疾病模型对所述匹配数据集进行分析计算,得到疾病分析结果。The disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
  17. 如权利要求16所述的计算机可读存储介质,其中,所述对所述待判断用户数据集进行预处理,得到目标数据集,包括:15. The computer-readable storage medium of claim 16, wherein the preprocessing the user data set to be judged to obtain the target data set comprises:
    识别所述待判断用户数据集中存在数据缺失的缺失数据,得到缺失数据集;Identify missing data with missing data in the user data set to be judged, and obtain a missing data set;
    生成所述缺失数据集中缺失数据的预测数据;Generating prediction data of missing data in the missing data set;
    将所述预测数据填充至所述待判断用户数据集,得到所述目标数据集。Filling the prediction data into the user data set to be judged to obtain the target data set.
  18. 如权利要求16所述的计算机可读存储介质,其中,所述将所述目标数据集与所述分类数据集建立索引关系,包括:15. The computer-readable storage medium according to claim 16, wherein said establishing an index relationship between said target data set and said classification data set comprises:
    根据所述分类数据集包含的类别在疾病数据库中创建类别数据表;Creating a category data table in the disease database according to the categories contained in the classification data set;
    确定所述目标数据集中目标数据在所述类别数据表中所属的目标类别;Determine the target category to which the target data in the target data set belongs in the category data table;
    按照所述目标类别将所述目标数据集中目标数据与所述分类数据集建立索引关系。An index relationship is established between the target data in the target data set and the classified data set according to the target category.
  19. The computer-readable storage medium according to claim 16, wherein the matching of the target data set with the classification data set according to the index relationship to obtain a matching data set comprises:
    performing character extraction on multiple items of target data in the target data set to generate multiple character data sets corresponding to the multiple items of target data; and
    matching the multiple character data sets with the classification data set through the index relationship to generate the matching data set.
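The character extraction and matching of claim 19 might look like the sketch below. Everything concrete here is an assumption made for illustration: "character extraction" is interpreted as splitting a text record into lowercase alphanumeric tokens, the index relationship is modeled as a dict from record position to category name, and the classification data set is modeled as a dict from category name to a set of reference terms.

```python
import re


def extract_characters(text):
    """Split a raw target record into a set of lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match(target_data, index, classification):
    """Return (record, category, shared_terms) for every record that matches.

    index maps a record's position to its category name; classification maps
    a category name to a set of reference terms. Both shapes are hypothetical.
    """
    matched = []
    for position, text in enumerate(target_data):
        category = index.get(position)
        if category is None:
            continue  # no index relationship was established for this record
        shared = extract_characters(text) & classification.get(category, set())
        if shared:
            matched.append((text, category, sorted(shared)))
    return matched
```

Because each record is compared only against the category it is indexed to, the matching cost grows with the size of one category rather than with the whole classification data set.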
  20. The computer-readable storage medium according to any one of claims 16 to 19, wherein, after the disease analysis result is obtained, the computer program, when executed by the processor, further implements the following steps:
    comparing the disease analysis result with a preset result threshold;
    when the disease analysis result is less than or equal to the result threshold, sending a first treatment plan reminder; and
    when the disease analysis result is greater than the result threshold, sending a second treatment plan reminder.
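The threshold comparison of claim 20 reduces to a single branch, sketched below. The function name, the message strings, and the `send` callback are hypothetical; the claim states only that a result at or below the threshold triggers the first reminder and a result above it triggers the second.

```python
def send_treatment_reminder(result, threshold, send=print):
    """Compare the analysis result with the threshold and send one reminder."""
    if result <= threshold:
        message = "first treatment plan reminder"
    else:
        message = "second treatment plan reminder"
    send(message)  # delivery channel is left to the caller
    return message
```

Note that a result exactly equal to the threshold falls into the first branch, matching the "less than or equal to" wording of the claim.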
PCT/CN2020/112330 2020-05-26 2020-08-30 Disease risk analysis method, apparatus, electronic device, and computer storage medium WO2021151291A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010459737.5 2020-05-26
CN202010459737.5A CN111696663A (en) 2020-05-26 2020-05-26 Disease risk analysis method and device, electronic equipment and computer storage medium

Publications (1)

Publication Number Publication Date
WO2021151291A1 (en) 2021-08-05

Family

ID=72478392

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/112330 WO2021151291A1 (en) 2020-05-26 2020-08-30 Disease risk analysis method, apparatus, electronic device, and computer storage medium

Country Status (2)

Country Link
CN (1) CN111696663A (en)
WO (1) WO2021151291A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113936183A (en) * 2021-09-10 2022-01-14 南方电网深圳数字电网研究院有限公司 Data prediction method and device based on model training

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448954B (en) * 2021-06-29 2024-02-06 平安证券股份有限公司 Service data execution method and device, electronic equipment and computer storage medium
CN114496264B (en) * 2022-04-14 2022-07-19 深圳市瑞安医疗服务有限公司 Health index analysis method, device, equipment and medium based on multidimensional data
CN116130110A (en) * 2022-05-11 2023-05-16 云南升玥信息技术有限公司 Biological big data analysis, disease precise identification, classification and prediction system based on algorithm and blockchain and application

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109493972A (en) * 2018-10-30 2019-03-19 平安医疗健康管理股份有限公司 Data processing method, device, server and storage medium based on prediction model
CN109935326A (en) * 2019-02-28 2019-06-25 生活空间(沈阳)数据技术服务有限公司 A kind of probability of illness prediction meanss and storage medium
CN110211690A (en) * 2019-04-19 2019-09-06 平安科技(深圳)有限公司 Disease risks prediction technique, device, computer equipment and computer storage medium
CN110363090A (en) * 2019-06-14 2019-10-22 平安科技(深圳)有限公司 Intelligent heart disease detection method, device and computer readable storage medium
US20190384863A1 (en) * 2018-06-13 2019-12-19 Stardog Union System and method for providing prediction-model-based generation of a graph data model

Also Published As

Publication number Publication date
CN111696663A (en) 2020-09-22

Similar Documents

Publication Publication Date Title
WO2021151291A1 (en) Disease risk analysis method, apparatus, electronic device, and computer storage medium
US20240006038A1 (en) Team-based tele-diagnostics blockchain-enabled system
WO2021189904A1 (en) Data anomaly detection method and apparatus, and electronic device and storage medium
WO2021218336A1 (en) User information discrimination method and apparatus, and device and computer readable storage medium
CN111652280B (en) Behavior-based target object data analysis method, device and storage medium
CN112562836A (en) Doctor recommendation method and device, electronic equipment and storage medium
WO2021189855A1 (en) Image recognition method and apparatus based on ct sequence, and electronic device and medium
WO2021120688A1 (en) Medical misdiagnosis detection method and apparatus, electronic device and storage medium
WO2021012904A1 (en) Data updating method and related device
WO2022222943A1 (en) Department recommendation method and apparatus, electronic device and storage medium
US20180210925A1 (en) Reliability measurement in data analysis of altered data sets
CN112216361A (en) Follow-up plan list generation method, device, terminal and medium based on artificial intelligence
WO2022194062A1 (en) Disease label detection method and apparatus, electronic device, and storage medium
CN112016905B (en) Information display method and device based on approval process, electronic equipment and medium
CN111950625A (en) Risk identification method and device based on artificial intelligence, computer equipment and medium
CN114220541A (en) Disease prediction method, disease prediction device, electronic device, and storage medium
WO2019085464A1 (en) Violation document scoring method and device, and computer readable storage medium
WO2022247007A1 (en) Medical image grading method and apparatus, electronic device, and readable storage medium
CN116483976A (en) Registration department recommendation method, device, equipment and storage medium
CN112634017A (en) Remote card opening activation method and device, electronic equipment and computer storage medium
CN111651452A (en) Data storage method and device, computer equipment and storage medium
CN116843481A (en) Knowledge graph analysis method, device, equipment and storage medium
CN116779184A (en) Method, system and equipment for quasi-real-time monitoring of vaccine safety and storage medium
CN116719891A (en) Clustering method, device, equipment and computer storage medium for traditional Chinese medicine information packet
CN116434934A (en) Message queue-based patient waiting method and device, electronic equipment and medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 20916921; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24/01/2023)

122 EP: PCT application non-entry in European phase
Ref document number: 20916921; Country of ref document: EP; Kind code of ref document: A1