WO2021151291A1

WO2021151291A1 - Disease risk analysis method, apparatus, electronic device, and computer storage medium

Info

Publication number: WO2021151291A1
Application number: PCT/CN2020/112330
Authority: WO
Inventors: 李映雪
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-05-26
Filing date: 2020-08-30
Publication date: 2021-08-05
Also published as: CN111696663A

Abstract

Provided is a disease risk analysis method, relating to data processing technology, comprising: obtaining a training data set and performing classification to obtain a classified data set (S1); using the classified data set to train a plurality of pre-built weak classifiers, and selecting a plurality of target weak classifiers from the plurality of trained weak classifiers and aggregating the target weak classifiers into a disease model (S2); obtaining a user data set to be determined, and pre-processing the user data set to be determined to obtain a target data set (S3); establishing an index relationship between the target data set and the classified data set (S4); matching the target data set with the classified data set according to the index relationship to obtain a matched data set (S5); using the disease model to analyze and calculate the matched data set to obtain a disease analysis result (S6). In addition, the invention also relates to blockchain technology; basic data and/or feature data can be stored in a blockchain node. The invention can solve the problems of low analysis efficiency and low accuracy of disease risk analysis.

Description

Disease risk analysis method, device, electronic equipment and computer storage medium

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 26, 2020, with application number CN202010459737.5 and entitled "Disease Risk Analysis Methods, Devices, Electronic Equipment, and Computer Storage Media", the entire content of which is incorporated in this application by reference.

Technical field

This application relates to the field of data processing technology, and in particular to a disease risk analysis method, device, electronic equipment, and computer-readable storage medium.

Background art

With the rise of big data, data processing technology has been applied in many fields. Now that people pay more and more attention to physical health, the medical field holds abundant patient information and data, and data processing technology is used to analyze these data and evaluate people's health status and disease risk.

In the prior art, disease risk is analyzed mainly with interpretable methods such as logistic regression and decision trees. The inventor realized that these methods involve many subjective human factors, and that both their prediction accuracy and their prediction efficiency are low. Therefore, how to analyze disease risk with high accuracy and high efficiency has become an urgent problem to be solved.

Summary of the invention

A disease risk analysis method provided by this application includes:

Acquiring a training data set, and classifying the training data set to obtain a classification data set;

Using the classification data set to train multiple pre-built weak classifiers, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;

Acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;

Establishing an index relationship between the target data set and the classification data set;

Matching the target data set with the classification data set according to the index relationship to obtain a matching data set;

The disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.

The present application also provides a disease risk analysis device, which includes:

The data classification module is used to obtain a training data set, and classify the training data set to obtain a classification data set;

The model training module is used to train a plurality of pre-built weak classifiers using the classification data set, select a plurality of target weak classifiers from the plurality of trained weak classifiers, and aggregate the target weak classifiers into a disease model;

The target data acquisition module is used to acquire the user data set to be judged, and preprocess the user data set to be judged to obtain the target data set;

An index relationship establishment module, configured to establish an index relationship between the target data set and the classification data set;

A data matching module, configured to match the target data set with the classified data set according to the index relationship to obtain a matching data set;

The analysis and calculation module is used to analyze and calculate the matching data set by using the disease model to obtain a disease analysis result.

This application also provides an electronic device, which includes:

Memory, storing at least one instruction; and

A processor, which executes the instructions stored in the memory to implement the steps of the disease risk analysis method described above.

This application also provides a computer-readable storage medium, including a storage data area and a storage program area. The storage data area stores data created according to the use of blockchain nodes, and the storage program area stores a computer program; when the computer program is executed by a processor, the steps of the disease risk analysis method described above are implemented.

Description of the drawings

FIG. 1 is a schematic flowchart of a disease risk analysis method provided by an embodiment of the application;

FIG. 2 is a schematic diagram of modules of a disease risk analysis device provided by an embodiment of the application;

FIG. 3 is a schematic diagram of the internal structure of an electronic device for implementing a disease risk analysis method provided by an embodiment of the application;

The realization of the purpose, the functional characteristics, and the advantages of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed description

It should be understood that the specific embodiments described here are only used to explain the present application, and are not used to limit the present application.

The execution subject of the disease risk analysis method provided in the embodiment of the present application includes, but is not limited to, at least one of the electronic devices that can be configured to execute the method provided in the embodiment of the present application, such as a server and a terminal. In other words, the disease risk analysis method can be executed by software or hardware installed in a terminal device or a server device, and the software can be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, etc.

Blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

The underlying platform of the blockchain can include processing modules such as user management, basic services, smart contracts, and operation monitoring. The user management module is responsible for managing the identity information of all blockchain participants, including maintaining public and private key generation (account management), key management, and maintaining the correspondence between a user's real identity and blockchain address (authority management); where authorized, it supervises and audits the transactions of certain real identities and provides risk control rule configuration (risk control audit). The basic service module is deployed on all blockchain node devices to verify the validity of business requests and, after consensus is reached on a valid request, to record it to storage. For a new business request, the basic service first performs interface adaptation, parsing, and authentication (interface adaptation), then encrypts the business information through the consensus algorithm (consensus management), transmits it completely and consistently to the shared ledger after encryption (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering, and contract execution. Developers can define contract logic in a programming language and publish it to the blockchain (contract registration); execution is triggered by keys or other events according to the logic of the contract terms to complete the contract logic, and the module also provides contract upgrade and cancellation functions. The operation monitoring module is mainly responsible for deployment during product release, configuration modification, contract settings, cloud adaptation, and visual output of real-time status during product operation, such as alarms, monitoring of network conditions, and monitoring of node device health.

This application provides a disease risk analysis method. Referring to FIG. 1, it is a schematic flowchart of a disease risk analysis method provided by an embodiment of this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.

In this embodiment, the disease risk analysis method includes:

S1. Obtain a training data set, and classify the training data set to obtain a classification data set.

In the embodiment of the present application, the training data set is a data set that records users' disease history information. The training data set includes, but is not limited to: information related to physical condition (such as gender, age, and allergy history), disease name, disease type, disease symptoms, and disease medication. The training data can be obtained from databases that hospitals use to store patient data, and the data stored in these databases is patient data that has been desensitized.

Further, in the embodiment of the present application, when the training data set is classified, the training data set is classified according to different characteristics. For example, the training data in the training data set is classified according to the same disease type; or, the training data set is classified according to the same disease symptoms, and the classification data in the classification data set corresponds to different diseases.

Preferably, so that the classification data set can be used more easily later, the embodiment of the present application stores the classification data set in a pre-built disease database, which may be a MySQL database, an Oracle database, or the like.
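The classification in step S1 can be illustrated with a minimal sketch, assuming the training data are held as a list of dictionaries. The field names (`disease_type`, `symptom`, `age`) and the sample records are hypothetical and chosen only for illustration; the application does not specify a record layout.

```python
from collections import defaultdict


def classify_records(records, key="disease_type"):
    """Group training records by the value of one feature.

    `key` is a hypothetical field name; any shared feature
    (disease type, disease symptom, ...) could be used instead.
    """
    classified = defaultdict(list)
    for record in records:
        classified[record[key]].append(record)
    return dict(classified)


# Illustrative, made-up training records.
training_data = [
    {"age": 54, "disease_type": "cardiovascular", "symptom": "chest pain"},
    {"age": 37, "disease_type": "respiratory", "symptom": "cough"},
    {"age": 61, "disease_type": "cardiovascular", "symptom": "palpitations"},
]

classified = classify_records(training_data)
for category, rows in classified.items():
    print(category, len(rows))
```

Passing `key="symptom"` would group the same records by symptom instead, which corresponds to the second classification option mentioned in the text.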

S2. Use the classification data set to train multiple pre-built weak classifiers, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model.

Specifically, in an optional embodiment of the present application, the weak classifier is:

$$h(\delta_i, p, \theta) = \big(p\,\delta_i < p\,\theta\big)$$

where h(δ_i, p, θ) is the classification result of the weak classifier, δ_i is the classification data in the classification data set, p is the parameter indicating the direction of the inequality sign, and θ is the preset classification threshold.

During specific implementation, multiple different classification thresholds are preset for θ to obtain multiple weak classifiers. Using multiple weak classifiers to classify the classification data set can obtain a variety of classification results, where each weak classifier corresponds to a classification result.
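As a rough sketch of how several weak classifiers arise from several thresholds, the snippet below builds one threshold classifier per value of θ. It assumes that the truth value of the inequality p·δ < p·θ is encoded as 1 or 0 and that p takes the value +1 or -1; neither encoding is stated in the text. The threshold and input values are made up.

```python
def make_stump(p, theta):
    """Build a weak classifier h(delta) that tests p * delta < p * theta.

    Assumption: the result is encoded as 1 (inequality holds) or 0.
    With p = 1 the test is delta < theta; with p = -1 it is delta > theta.
    """
    def h(delta):
        return 1 if p * delta < p * theta else 0
    return h


# Several preset thresholds give several weak classifiers (illustrative values).
thresholds = [30.0, 45.0, 60.0]
stumps = [make_stump(p=1, theta=t) for t in thresholds]

# Each weak classifier produces its own classification result for the same data.
values = [25, 40, 70]
results = [[h(v) for v in values] for h in stumps]
print(results)
```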

Optionally, in the embodiment of the present application, after the multiple weak classifiers are obtained and used for classification, an error rate function is used to select among the trained weak classifiers according to their classification results to obtain multiple target weak classifiers.

The error rate function is:

Wherein, w_i is the classification data set, and y_i is the classification result of the classification data in the classification data set.

Preferably, in the embodiment of the present application, multiple weak classifiers whose error rate is less than a preset error threshold are selected as the target weak classifiers.

Preferably, the number of the target weak classifiers is consistent with the number of categories of the classification data in the classification data set.
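The selection step can be sketched as follows. Because the error rate function itself is not reproduced in this text, the sketch assumes a weighted misclassification rate, a common choice in boosting methods, and treats the weights as per-sample values; both are assumptions and not the application's definition. The data, labels, and error threshold are made up.

```python
def make_stump(p, theta):
    """Weak classifier testing p * delta < p * theta, encoded as 1 or 0 (assumed)."""
    def h(delta):
        return 1 if p * delta < p * theta else 0
    return h


def error_rate(h, samples, labels, weights):
    """ASSUMED error rate: weighted share of samples that h misclassifies."""
    wrong = sum(w for x, y, w in zip(samples, labels, weights) if h(x) != y)
    return wrong / sum(weights)


def select_target_classifiers(candidates, samples, labels, weights, max_error):
    """Keep only the candidates whose error rate is below the preset threshold."""
    return {
        theta: h
        for theta, h in candidates.items()
        if error_rate(h, samples, labels, weights) < max_error
    }


# Illustrative data: one numeric feature and a binary label per sample.
samples = [25, 40, 70]
labels = [1, 1, 0]
weights = [1.0, 1.0, 1.0]

candidates = {theta: make_stump(1, theta) for theta in (30.0, 45.0, 60.0)}
targets = select_target_classifiers(candidates, samples, labels, weights, max_error=0.2)
print(sorted(targets))
```

Here the classifier with threshold 30.0 misclassifies one of three samples and is dropped, while the other two classify all samples correctly and are kept.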

In detail, the disease model is as follows:

Where t is the number of target weak classifiers, f_k is a target weak classifier, F is the set of all target weak classifiers, and the value given by the formula is the output result of the disease model.

Training multiple weak classifiers and using the error rate function to select among the trained weak classifiers yields target weak classifiers with higher accuracy; aggregating these more accurate weak classifiers into the disease model improves the accuracy of the disease model.
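One way to picture the aggregation is the sketch below, which combines the selected weak classifiers by averaging their outputs. The averaging rule is an assumption made for illustration: the disease model formula is not reproduced in this text, and the application may combine the classifiers differently (for example, as a plain sum). The thresholds are made up.

```python
def aggregate(classifiers):
    """Combine weak classifiers into one model.

    ASSUMPTION: the model output is the mean of the weak classifier
    outputs, which yields a score between 0 and 1.
    """
    def model(x):
        return sum(f(x) for f in classifiers) / len(classifiers)
    return model


# Three illustrative threshold classifiers (1 if x is below the threshold, else 0).
target_classifiers = [
    (lambda x, th=th: 1 if x < th else 0) for th in (45.0, 60.0, 75.0)
]

disease_model = aggregate(target_classifiers)
for x in (20, 50, 80):
    print(x, disease_model(x))
```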

S3. Obtain a user data set to be judged, and preprocess the user data set to be judged to obtain a target data set.

In a preferred embodiment of the present application, the user data set to be judged may be stored in a blockchain node.

Specifically, this application may use a pre-written Java statement to retrieve the user data set to be judged from nodes of one or more blockchains.

In the embodiment of this application, the user data set to be judged includes but is not limited to: information about the user to be judged (such as gender and age), and the historical diseases, historical medication, and historical disease symptoms of the user to be judged. The number of users to be judged may be one or more.

The preprocessing includes, but is not limited to: data filling, data correction, data deletion, and data standardization.

Further, in an optional embodiment of the present application, the preprocessing the user data set to be judged to obtain the target data set includes:

Identifying the data with missing values in the user data set to be judged to obtain a missing data set;

Generating prediction data of missing data in the missing data set;

Filling the prediction data into the user data set to be judged to obtain the target data set.

Preferably, the embodiment of the present application can use a pre-written Java statement to perform length detection on the user data to be judged in the user data set to be judged. The user data to be judged includes multiple attribute data and the values corresponding to them for the user to be judged; for example, the user data set to be judged contains the age data of the user to be judged and the value corresponding to the age data. During detection, the value corresponding to each attribute data in the user data to be judged is checked. When the length of the value is not 0 and the value is not null, the value of the attribute data is determined not to be missing, and detection continues; when the length of the value is 0 or the value is null, the value of the attribute data is determined to be missing. All attribute data with missing values, together with their corresponding values, form the set of missing data, that is, the missing data set.

Preferably, in the embodiment of the present application, generating the prediction data of the missing data in the missing data set includes:

Using the mice function to select the adjacent data of any missing data in the missing data set;

Calculate the mean value of the adjacent data to obtain predicted data.

In detail, the embodiment of the present application uses the mice function to set a length threshold centered on the position, in the user data set to be judged, of any missing data in the missing data set, selects the adjacent data within the length threshold, and uses the following averaging algorithm to calculate the mean of the adjacent data to obtain the prediction data Avg:

$$\mathrm{Avg} = \frac{1}{V} \sum_{v=1}^{V} D_v$$

where V is the number of adjacent data, and D_v is any adjacent data.

The embodiment of the present application fills the missing values in the user data to be judged, which makes that data more complete and helps improve the accuracy of model training.
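The missing-value detection and neighbour-mean filling described above can be sketched in plain Python. This is a stand-in for illustration only: it does not use the mice function named in the text, and the function names, the `window` parameter (playing the role of the length threshold), and the sample values are all hypothetical.

```python
def find_missing(record):
    """Return the attribute names whose value is null or has length 0."""
    return [name for name, value in record.items()
            if value is None or len(str(value)) == 0]


def neighbour_mean(values, index, window):
    """Mean of the non-missing values within `window` positions of `index`."""
    start = max(0, index - window)
    stop = min(len(values), index + window + 1)
    neighbours = [
        v for i, v in enumerate(values[start:stop], start=start)
        if i != index and v is not None
    ]
    if not neighbours:
        return None  # nothing nearby to average; leave the value missing
    return sum(neighbours) / len(neighbours)


def fill_missing(values, window=2):
    """Replace each missing (None) entry with the mean of its neighbours."""
    return [neighbour_mean(values, i, window) if v is None else v
            for i, v in enumerate(values)]


# Illustrative record with an empty and a null attribute value.
record = {"age": "", "gender": "F", "allergy_history": None}
print(find_missing(record))

# Illustrative numeric column with one missing entry.
ages = [30, 34, None, 40, 44]
print(fill_missing(ages))
```

With `window=2`, the missing entry is replaced by the mean of its four neighbours, (30 + 34 + 40 + 44) / 4 = 37.0.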

S4. Establish an index relationship between the target data set and the classified data set.

Preferably, said establishing an index relationship between the target data set and the classification data set in the disease database includes:

Creating a category data table in the disease database according to the categories contained in the classification data set;

Determine the target category to which the target data in the target data set belongs in the category data table;

Establishing an index relationship between the target data in the target data set and the classification data set according to the target category.

Preferably, establishing the index relationship between the target data in the target data set and the classification data set according to the target category means searching the category data table using any of the data contained in the target data set, such as the user's historical diseases, historical medication, or historical disease symptoms, and assigning the target data in the target data set to the corresponding category according to the search result.

For example, when the classification data set is classified according to the disease symptoms it contains, the search is performed according to the historical disease symptoms of the user contained in the target data set. The shared category that links the target data in the target data set to the classified data in the classification data set is the index relationship.
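The index relationship can be pictured with in-memory dictionaries standing in for the category data table of the disease database. This is only a sketch: the field names (`user_id`, `symptom`), the category names, and the sample records are hypothetical, and a real implementation would query the database instead.

```python
def build_category_table(classified):
    """Map each symptom found in the classified data to its category name."""
    table = {}
    for category, records in classified.items():
        for record in records:
            table[record["symptom"]] = category
    return table


def build_index(targets, table):
    """Look up each target record's symptom in the category table.

    Returns {user_id: category} for the targets that have a matching category.
    """
    return {
        target["user_id"]: table[target["symptom"]]
        for target in targets
        if target["symptom"] in table
    }


# Illustrative classified data set (category -> records) and target data set.
classified = {
    "cardiovascular": [{"symptom": "chest pain"}],
    "respiratory": [{"symptom": "cough"}],
}
targets = [
    {"user_id": "u1", "symptom": "cough"},
    {"user_id": "u2", "symptom": "rash"},
]

category_table = build_category_table(classified)
index = build_index(targets, category_table)
print(index)
```

The user `u2` has no entry in the index because the symptom "rash" does not appear in the category table.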

Further, before the index relationship between the target data set and the classification data set is established, the method described in the embodiment of the present application further includes transmitting the target data set to the disease database through the TCP/IP protocol. TCP/IP is a data transmission protocol through which the data transmission interface of the disease database can be called, which facilitates efficient transmission of the target data set to the disease database.

S5. Match the target data set with the classified data set according to the index relationship to obtain a matched data set.

Further, in another optional embodiment of the present application, matching the target data set with the classified data set according to the index relationship to obtain a matching data set includes:

Extracting characters from multiple target data in the target data set to generate multiple character data sets corresponding to the multiple target data;

The multiple character data sets are matched with the classification data sets through the index relationship to generate a matching data set.

In detail, a preset character grabber can be used to extract characters from the target data in the target data set, where the character grabber is a Python statement used for character extraction.

Preferably, after the multiple character data sets are obtained, the embodiment of the present application matches the multiple character data sets with the classification data set through the index relationship to generate a matching data set; that is, the index relationship is used to find the category shared by the character data set and the classification data in the classification data set. The matching data set includes the target data in the target data set and the classification data in the classification data set that corresponds to that target data.
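A minimal sketch of the extraction and matching steps follows. It assumes that "extracting characters" means splitting text fields into word tokens with a regular expression, which is one possible reading; the text does not show the actual character grabber. The field names, index contents, and records are hypothetical.

```python
import re


def extract_tokens(text):
    """Split a text field into lowercase word tokens (assumed extraction rule)."""
    return re.findall(r"\w+", text.lower())


def match(targets, index, classified):
    """Pair each target record with the classified data of its indexed category."""
    matched = []
    for target in targets:
        category = index.get(target["user_id"])
        if category is None:
            continue  # no index relationship for this target
        matched.append({
            "target": target,
            "tokens": extract_tokens(target["symptom"]),
            "classified": classified[category],
        })
    return matched


# Illustrative inputs; `index` maps a user to a category.
classified = {"respiratory": [{"symptom": "cough"}]}
targets = [
    {"user_id": "u1", "symptom": "Dry cough, mild"},
    {"user_id": "u2", "symptom": "rash"},
]
index = {"u1": "respiratory"}

matching_data = match(targets, index, classified)
print(matching_data)
```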

Preferably, the present application further includes using the following array aggregation algorithm to perform array aggregation on the character data set to generate an array data set:

Wherein, J is the array data set, β _i is the character in the character data set, and m is the number of characters in the character data set.

Aggregating the character data set into an array data set gathers the data together, which can further speed up subsequent data processing.

S6. Use the disease model to analyze and calculate the matching data set to obtain a disease analysis result.

In the embodiment of the present application, the analysis result is the probability that the user to be judged corresponding to the target data in the target data set suffers from the disease corresponding to the classification data in the classification data set.

Preferably, the embodiment of the present application uses the following analysis algorithm to perform the analysis calculation to obtain the analysis result

Wherein, x _i is the matching data in the matching data set, t is the number of weak classifiers in the disease model, and f _t (x _i ) is the output of the weak classifier.

Further, the embodiment of the present application further includes sending a treatment plan reminder according to the analysis result of the disease.

In detail, after the disease analysis result is obtained, the method further includes: comparing the disease analysis result with a preset result threshold;

When the disease analysis result is less than or equal to the result threshold, sending a first treatment plan reminder;

When the disease analysis result is greater than the result threshold, a second treatment plan reminder is sent.

When sending a treatment plan reminder, the reminder can be sent directly to the user to be judged corresponding to the data set of the user to be judged.

In this embodiment, the first treatment plan and the second treatment plan may be different treatment plans corresponding to different disease severity.
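The threshold comparison above amounts to a single branch, sketched below. The function name, the returned strings, and the threshold value are placeholders; the application does not specify the content of either reminder.

```python
def choose_reminder(analysis_result, result_threshold):
    """Pick the reminder according to the comparison described in the text.

    Results less than or equal to the threshold trigger the first treatment
    plan reminder; results greater than the threshold trigger the second.
    """
    if analysis_result <= result_threshold:
        return "first treatment plan reminder"
    return "second treatment plan reminder"


# Illustrative threshold and analysis results.
threshold = 0.5
for result in (0.2, 0.5, 0.8):
    print(result, choose_reminder(result, threshold))
```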

Further, the reminder of the treatment plan includes an analysis of the cause of the disease.

In this embodiment, sending reminders of the treatment plan based on the results of the disease analysis is beneficial for relevant personnel to quickly obtain personalized demand information.

In the embodiment of this application, the obtained training data set is classified, the resulting classification data set is used to train multiple pre-built weak classifiers, multiple target weak classifiers are selected from the trained weak classifiers, and the target weak classifiers are aggregated into a disease model. After the user data set to be judged is obtained, it is preprocessed to obtain the target data set, and an index relationship is established between the target data set and the classification data set. The target data set is matched with the classification data set according to the index relationship to obtain a matching data set, and the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result. Classifying the training data set before training the model improves the efficiency of model training. Training the base models on classification data sets of different types improves the accuracy of the disease model and therefore of its analysis. In addition, when the user data set to be judged is analyzed, indexing the target data set obtained by preprocessing against the classification data set allows the correspondence between the two to be found quickly and accurately; determining the category corresponding to the target data set in turn helps the disease model identify, quickly and accurately, the disease risk corresponding to the user data to be judged.

As shown in Figure 2, it is a schematic diagram of modules of the disease risk analysis device of the present application.

The disease risk analysis apparatus 100 described in this application can be installed in an electronic device. According to the realized functions, the disease risk analysis device may include a data classification module 101, a model training module 102, a target data acquisition module 103, an index relationship establishment module 104, a data matching module 105, and an analysis calculation module 106. The module described in the present invention can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of an electronic device and can complete fixed functions, and are stored in the memory of the electronic device.

In this embodiment, the functions of each module/unit are as follows:

The data classification module 101 is configured to obtain a training data set, and classify the training data set to obtain a classification data set;

The model training module 102 is configured to use the classification data set to train multiple pre-built weak classifiers, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model;

The target data acquisition module 103 is configured to acquire a user data set to be judged, and preprocess the user data set to be judged to obtain a target data set;

The index relationship establishment module 104 is configured to establish an index relationship between the target data set and the classified data set;

The data matching module 105 is configured to match the target data set with the classified data set according to the index relationship to obtain a matching data set;

The analysis and calculation module 106 is configured to analyze and calculate the matching data set by using the disease model to obtain a disease analysis result.

In detail, the specific implementation of each module of the disease risk analysis device is as follows:

The data classification module 101 is configured to obtain a training data set, and classify the training data set to obtain a classification data set.

Preferably, in order to better utilize the classification data set later, the embodiment of the application stores the classification data set in a pre-built disease database, and the disease database may be a mysql database, an Oracle database, or the like.

The model training module 102 is configured to use the classification data set to train multiple pre-built weak classifiers, select multiple target weak classifiers from the trained weak classifiers, and aggregate the target weak classifiers into a disease model.

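Specifically, the weak classifier is the same as in step S2 of the method:

$$h(\delta_i, p, \theta) = \big(p\,\delta_i < p\,\theta\big)$$

The error rate function used to select the target weak classifiers and the form of the disease model are likewise as described for step S2 of the method.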

The target data acquisition module 103 is configured to acquire a user data set to be judged, and preprocess the user data set to be judged to obtain a target data set.

Further, in an optional embodiment of the present application, the target data acquisition module 103 is specifically configured to:

Acquiring a user data set to be judged, identifying the data with missing values in the user data set to be judged, and obtaining a missing data set;

Generating prediction data of missing data in the missing data set;

Filling the prediction data into the user data set to be judged to obtain the target data set.

Preferably, the embodiment of the present application can use a pre-written Java statement to perform length detection on the user data to be judged in the user data set to be judged. The user data to be judged includes multiple attribute data and the values corresponding to them for the user to be judged; for example, the user data set to be judged contains the age data of the user to be judged and the value corresponding to the age data. During detection, the value corresponding to each attribute data in the user data to be judged is checked. When the length of the value is not 0 and the value is not null, the value of the attribute data is determined not to be missing, and detection continues; when the length of the value is 0 or the value is null, the value of the attribute data is determined to be missing. All attribute data with missing values, together with their corresponding values, form the set of missing data, that is, the missing data set.

Preferably, in the embodiment of the present application, generating the prediction data of the missing data in the missing data set includes:

Calculating the mean value of the adjacent data of any missing data to obtain the prediction data,

where V is the number of adjacent data, and D_v is any adjacent data.

The index relationship establishment module 104 is configured to establish an index relationship between the target data set and the classification data set.

Preferably, the index relationship establishment module 104 is specifically configured to:

An index relationship is established between the target data in the target data set and the classified data set according to the target category.

Preferably, establishing the index relationship between the target data in the target data set and the classification data set according to the target category means searching the category data table using any of the data contained in the target data set, such as the user's historical diseases, historical medication, or historical disease symptoms, and assigning the target data in the target data set to the corresponding category according to the search result.

The data matching module 105 is configured to match the target data set with the classified data set according to the index relationship to obtain a matching data set.

Further, in another optional embodiment of the present application, the data matching module 105 is specifically configured to: extract characters from multiple target data in the target data set to generate multiple character data sets corresponding to the multiple target data;

In detail, after the disease analysis result is obtained, the device further includes a message sending module, and the message sending module is configured to:

Comparing the result of the disease analysis with a preset result threshold;

In the embodiment of this application, the acquired training data set is classified, the resulting classification data set is used to train multiple pre-built weak classifiers, multiple target weak classifiers are selected from the trained weak classifiers, and the target weak classifiers are aggregated into a disease model. After the user data set to be judged is obtained, it is preprocessed to obtain the target data set, and an index relationship is established between the target data set and the classification data set. The target data set is matched with the classification data set according to the index relationship to obtain a matching data set, and the disease model is used to analyze and calculate the matching data set to obtain a disease analysis result. Classifying the training data set before training the model improves the efficiency of model training. Training the base models on classification data sets of different types improves the accuracy of the disease model and therefore of its analysis. In addition, when the user data set to be judged is analyzed, indexing the target data set obtained by preprocessing against the classification data set allows the correspondence between the two to be found quickly and accurately; determining the category corresponding to the target data set in turn helps the disease model identify, quickly and accurately, the disease risk corresponding to the user data to be judged.

FIG. 3 is a schematic diagram of the structure of an electronic device implementing the disease risk analysis method of the present application.

The electronic device 1 may include a processor 10, a memory 11, and a bus, and may also include a computer program stored in the memory 11 and running on the processor 10, such as a disease risk analysis program 12.

The memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, mobile hard disk, multimedia card, card-type memory (such as SD or DX memory), magnetic memory, magnetic disk, CD, etc. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, for example, a mobile hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 1. Further, the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 can be used not only to store application software installed in the electronic device 1 and various data, such as the code of the disease risk analysis program 12, but also to temporarily store data that has been output or will be output.

In some embodiments, the processor 10 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple packaged integrated circuits with the same function or different functions, including one or more combinations of a central processing unit (CPU), a microprocessor, a digital processing chip, a graphics processor, and various control chips. The processor 10 is the control unit of the electronic device. It uses various interfaces and lines to connect the components of the entire electronic device, runs or executes programs or modules stored in the memory 11 (such as the disease risk analysis program), and calls data stored in the memory 11 to execute the various functions of the electronic device 1 and process data.

The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to implement connection and communication between the memory 11, the at least one processor 10, and other components.

FIG. 3 only shows an electronic device with certain components. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device 1, which may include fewer or more components than shown in the figure, a combination of certain components, or a different arrangement of components.

For example, although not shown, the electronic device 1 may also include a power source (such as a battery) for supplying power to the various components. Preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power source may also include any components such as one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, and power status indicators. The electronic device 1 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.

Further, the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface, a Bluetooth interface, etc.), which is usually used to establish a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may also include a user interface. The user interface may be a display and an input unit (such as a keyboard). Optionally, the user interface may also be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (organic light-emitting diode) touch device, etc. The display may also be called a display screen or a display unit, and is used to display the information processed in the electronic device 1 and to display a visualized user interface.

It should be understood that the embodiments are only for illustrative purposes, and the scope of the patent application is not limited by this structure.

The disease risk analysis program 12 stored in the memory 11 in the electronic device 1 is a combination of multiple instructions. When running in the processor 10, it can realize:

Further, if the integrated module/unit of the electronic device 1 is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium, and the computer-readable storage medium can be volatile or non-volatile. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a mobile hard disk, a magnetic disk, an optical disk, computer memory, and read-only memory (ROM).

Further, the computer-usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function, etc., and the storage data area may store data created according to the use of blockchain nodes, etc.

In the several embodiments provided in this application, it should be understood that the disclosed equipment, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.

The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.

For those skilled in the art, it is obvious that the present application is not limited to the details of the foregoing exemplary embodiments, and the present application can be implemented in other specific forms without departing from the spirit or basic characteristics of the present application.

Therefore, from every point of view, the embodiments should be regarded as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and scope of equivalents of the claims are therefore intended to be included in this application. Any reference signs in the claims should not be regarded as limiting the claims involved.

The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain can include the underlying blockchain platform, the platform product service layer, and the application service layer.

In addition, it is obvious that the word "including" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices stated in the system claims can also be implemented by one unit or device through software or hardware. Words such as "second" are used to indicate names and do not indicate any specific order.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the application and not to limit them. Although the application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements can be made to the technical solutions of the application without departing from the spirit and scope of the technical solutions of the present application.

Claims

A disease risk analysis method, wherein the method includes:

Acquiring a training data set, and classifying the training data set to obtain a classification data set;

Using the classification data set to train multiple pre-built weak classifiers, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;

Acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;

Establishing an index relationship between the target data set and the classification data set;

Matching the target data set with the classification data set according to the index relationship to obtain a matching data set;

The disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
The disease risk analysis method of claim 1, wherein the preprocessing the user data set to be judged to obtain the target data set comprises:

Identifying data with missing values in the user data set to be judged to obtain a missing data set;

Generating prediction data of missing data in the missing data set;

Filling the prediction data into the user data set to be judged to obtain the target data set.
The disease risk analysis method according to claim 1, wherein said establishing an index relationship between said target data set and said classification data set comprises:

Creating a category data table in the disease database according to the categories contained in the classification data set;

Determining, in the category data table, the target category to which the target data in the target data set belongs;

Establishing an index relationship between the target data in the target data set and the classification data set according to the target category.
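By way of a non-limiting sketch, the three steps of this claim could look as follows in Python. It assumes, purely for illustration, that each record is a dictionary carrying a `category` field, that the category data table is an in-memory mapping rather than a table in a disease database, and that the index maps each target record's position to its category; the names `create_category_table` and `build_index` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def create_category_table(classification_data: List[dict]) -> Dict[str, List[dict]]:
    """Group classification records by category (stand-in for a database table)."""
    table = defaultdict(list)
    for record in classification_data:
        table[record["category"]].append(record)
    return dict(table)


def build_index(target_data: List[dict],
                category_table: Dict[str, List[dict]]) -> Dict[int, str]:
    """Map the position of each target record to its category, if the table has it."""
    index = {}
    for position, item in enumerate(target_data):
        category = item.get("category")
        if category in category_table:
            index[position] = category
    return index


# Hypothetical example data.
classification = [
    {"category": "cardiac", "value": 1},
    {"category": "renal", "value": 2},
    {"category": "cardiac", "value": 3},
]
targets = [{"category": "renal"}, {"category": "unknown"}, {"category": "cardiac"}]

category_table = create_category_table(classification)
index = build_index(targets, category_table)
```

Target records whose category is absent from the table are simply left out of the index in this sketch; a real system would need a policy for such records.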
The disease risk analysis method of claim 1, wherein the matching the target data set with the classification data set according to the index relationship to obtain a matching data set comprises:

Extracting characters from multiple target data in the target data set to generate multiple character data sets corresponding to the multiple target data;

The multiple character data sets are matched with the classification data sets through the index relationship to generate a matching data set.
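The matching step of this claim can be illustrated, without limitation, by the Python sketch below. It assumes that target data and classification data are text strings, that a "character data set" is the set of alphanumeric characters in a string, and that matching picks the classification entry with the highest Jaccard overlap inside the indexed category. None of these choices is stated in the claim; the names `extract_characters`, `jaccard`, and `match_targets` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def extract_characters(text: str) -> FrozenSet[str]:
    """Character data set: the lower-cased alphanumeric characters of the text."""
    return frozenset(ch.lower() for ch in text if ch.isalnum())


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Overlap of two character sets, in the range 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_targets(targets: List[str],
                  category_table: Dict[str, List[str]],
                  index: Dict[int, str]) -> List[Tuple[str, str, str]]:
    """Return (target, category, best classification entry) for indexed targets."""
    matched = []
    for position, text in enumerate(targets):
        category = index.get(position)
        if category is None:
            continue  # no index relationship was established for this target
        chars = extract_characters(text)
        best = max(category_table[category],
                   key=lambda ref: jaccard(chars, extract_characters(ref)))
        matched.append((text, category, best))
    return matched


# Hypothetical example data.
category_table = {
    "cardiac": ["chest pain", "high blood pressure"],
    "renal": ["kidney stone"],
}
targets = ["Chest pains", "kidney stones", "n/a"]
index = {0: "cardiac", 1: "renal"}
matched = match_targets(targets, category_table, index)
```

Restricting the comparison to the indexed category is what keeps the lookup fast: each target is compared only against entries of its own category rather than the whole classification data set.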
The disease risk analysis method according to any one of claims 1 to 4, wherein, after the disease analysis result is obtained, the method further comprises:

Comparing the result of the disease analysis with a preset result threshold;

When the disease analysis result is less than or equal to the result threshold, sending a first treatment plan reminder;

When the disease analysis result is greater than the result threshold, a second treatment plan reminder is sent.
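A minimal Python sketch of the comparison in this claim follows. It assumes the disease analysis result and the preset result threshold are plain numbers and represents each reminder as a string; how the reminder is actually delivered is not modeled, and the function name `choose_reminder` is hypothetical.

```python
def choose_reminder(result: float, threshold: float) -> str:
    """Select the reminder to send based on the preset result threshold."""
    if result <= threshold:
        return "first treatment plan reminder"
    return "second treatment plan reminder"


# Hypothetical values: a threshold of 0.5 and three analysis results.
reminders = [choose_reminder(r, 0.5) for r in (0.2, 0.5, 0.9)]
```

Note that a result exactly equal to the threshold falls in the first branch, as the claim specifies "less than or equal to".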
The disease risk analysis method according to claim 2, wherein said generating prediction data of missing data in said missing data set comprises:

Using the mice function to select the adjacent data of any missing data in the missing data set;

Calculating the mean value of the adjacent data to obtain the prediction data.
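The neighbor-mean step of this claim can be sketched in plain Python as below. The sketch does not call the mice function named in the claim; it only illustrates the stated idea of averaging adjacent data. It assumes the data is an ordered sequence in which missing entries are `None`, and that "adjacent data" means the nearest non-missing value before and after the gap. The name `fill_missing` is hypothetical.

```python
from typing import List, Optional


def fill_missing(values: List[Optional[float]]) -> List[float]:
    """Replace each None with the mean of its nearest non-missing neighbors."""
    filled = []
    for i, value in enumerate(values):
        if value is not None:
            filled.append(value)
            continue
        # Nearest original (not yet filled) value on each side of the gap.
        before = next((values[j] for j in range(i - 1, -1, -1)
                       if values[j] is not None), None)
        after = next((values[j] for j in range(i + 1, len(values))
                      if values[j] is not None), None)
        neighbors = [n for n in (before, after) if n is not None]
        if not neighbors:
            raise ValueError("cannot fill: no non-missing data available")
        filled.append(sum(neighbors) / len(neighbors))
    return filled


# Hypothetical example: gaps in the middle and at the end of a series.
target = fill_missing([1.0, None, 3.0, None])
```

A gap at either end of the series has only one neighbor, so it simply takes that neighbor's value.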
A disease risk analysis device, wherein the device includes:

The data classification module is used to obtain a training data set, and classify the training data set to obtain a classification data set;

The model training module is used to train a plurality of pre-built weak classifiers using the classification data set, select a plurality of target weak classifiers from the plurality of trained weak classifiers, and aggregate the target weak classifiers into a disease model;

The target data acquisition module is used to acquire the user data set to be judged, and preprocess the user data set to be judged to obtain the target data set;

An index relationship establishment module, configured to establish an index relationship between the target data set and the classification data set;

A data matching module, configured to match the target data set with the classified data set according to the index relationship to obtain a matching data set;

The analysis and calculation module is used to analyze and calculate the matching data set by using the disease model to obtain a disease analysis result.
The disease risk analysis device according to claim 7, wherein the target data acquisition module is specifically used for:

Acquiring a user data set to be judged, and identifying data with missing values in the user data set to be judged to obtain a missing data set;

Generating prediction data of missing data in the missing data set;

Filling the prediction data into the user data set to be judged to obtain the target data set.
The disease risk analysis device according to claim 7, wherein the index relationship establishment module is specifically configured to:

Creating a category data table in the disease database according to categories included in the classification data set;

Determining, in the category data table, the target category to which the target data in the target data set belongs;

Establishing an index relationship between the target data in the target data set and the classification data set according to the target category.
An electronic device, wherein the electronic device includes:

At least one processor; and,

A memory communicatively connected with the at least one processor; wherein,

The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the following steps:

Acquiring a training data set, and classifying the training data set to obtain a classification data set;

Using the classification data set to train multiple pre-built weak classifiers, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;

Acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;

Establishing an index relationship between the target data set and the classification data set;

Matching the target data set with the classification data set according to the index relationship to obtain a matching data set;

The disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
The electronic device according to claim 10, wherein said preprocessing said user data set to be judged to obtain a target data set comprises:

Identifying data with missing values in the user data set to be judged to obtain a missing data set;

Generating prediction data of missing data in the missing data set;

Filling the prediction data into the user data set to be judged to obtain the target data set.
The electronic device according to claim 10, wherein said establishing an index relationship between said target data set and said classification data set comprises:

Creating a category data table in the disease database according to the categories contained in the classification data set;

Determining, in the category data table, the target category to which the target data in the target data set belongs;

Establishing an index relationship between the target data in the target data set and the classification data set according to the target category.
The electronic device according to claim 10, wherein said matching said target data set with said classification data set according to said index relationship to obtain a matching data set comprises:

Extracting characters from multiple target data in the target data set to generate multiple character data sets corresponding to the multiple target data;

The multiple character data sets are matched with the classification data sets through the index relationship to generate a matching data set.
The electronic device according to any one of claims 10 to 13, wherein, after the disease analysis result is obtained, when the instruction is executed by the at least one processor, the following steps are further implemented:

Comparing the result of the disease analysis with a preset result threshold;

When the disease analysis result is less than or equal to the result threshold, sending a first treatment plan reminder;

When the disease analysis result is greater than the result threshold, a second treatment plan reminder is sent.
The electronic device according to claim 11, wherein said generating prediction data of missing data in said missing data set comprises:

Using the mice function to select the adjacent data of any missing data in the missing data set;

Calculating the mean value of the adjacent data to obtain the prediction data.
A computer-readable storage medium, comprising a storage data area and a storage program area, wherein the storage data area stores data created according to the use of blockchain nodes, and the storage program area stores a computer program; wherein the following steps are implemented when the computer program is executed by a processor:

Acquiring a training data set, and classifying the training data set to obtain a classification data set;

Using the classification data set to train multiple pre-built weak classifiers, selecting multiple target weak classifiers from the trained weak classifiers, and aggregating the target weak classifiers into a disease model;

Acquiring a user data set to be judged, and preprocessing the user data set to be judged to obtain a target data set;

Establishing an index relationship between the target data set and the classification data set;

Matching the target data set with the classification data set according to the index relationship to obtain a matching data set;

The disease model is used to analyze and calculate the matching data set to obtain a disease analysis result.
The computer-readable storage medium of claim 16, wherein the preprocessing the user data set to be judged to obtain the target data set comprises:

Identifying data with missing values in the user data set to be judged to obtain a missing data set;

Generating prediction data of missing data in the missing data set;

Filling the prediction data into the user data set to be judged to obtain the target data set.
The computer-readable storage medium according to claim 16, wherein said establishing an index relationship between said target data set and said classification data set comprises:

Creating a category data table in the disease database according to the categories contained in the classification data set;

Determining, in the category data table, the target category to which the target data in the target data set belongs;

Establishing an index relationship between the target data in the target data set and the classification data set according to the target category.
The computer-readable storage medium according to claim 16, wherein said matching said target data set with said classification data set according to said index relationship to obtain a matching data set comprises:

Extracting characters from multiple target data in the target data set to generate multiple character data sets corresponding to the multiple target data;

The multiple character data sets are matched with the classification data sets through the index relationship to generate a matching data set.
The computer-readable storage medium according to any one of claims 16 to 19, wherein, after the disease analysis result is obtained, the following steps are further implemented when the computer program is executed by the processor:

Comparing the result of the disease analysis with a preset result threshold;

When the disease analysis result is less than or equal to the result threshold, sending a first treatment plan reminder;

When the disease analysis result is greater than the result threshold, a second treatment plan reminder is sent.