CN112364896B - Method and device for determining health information distribution based on machine learning - Google Patents

Method and device for determining health information distribution based on machine learning

Info

Publication number
CN112364896B
Authority
CN
China
Prior art keywords
spectrum data
data
saliva
health
blood
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011153516.1A
Other languages
Chinese (zh)
Other versions
CN112364896A (en
Inventor
曾振
王健宗
程宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011153516.1A priority Critical patent/CN112364896B/en
Priority to PCT/CN2020/136368 priority patent/WO2021189982A1/en
Publication of CN112364896A publication Critical patent/CN112364896A/en
Application granted granted Critical
Publication of CN112364896B publication Critical patent/CN112364896B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses a method and a device for determining health information distribution based on machine learning, relates to the technical field of data processing, and mainly aims to solve the problem that existing approaches determine health information distribution images inefficiently and therefore cannot meet the need for convenient and rapid data processing in health care. The method comprises: acquiring spectrum data; classifying the spectrum data based on a trained spectrum classification model to obtain classification results containing the health features with which the spectrum data are respectively labeled, wherein the spectrum classification model is a hybrid model built by combining machine learning models at different levels; and integrating the health features in the classification results according to preset spectrum integration weights to obtain a distribution image of the health information. The method is mainly used for determining health information distribution based on machine learning.

Description

Method and device for determining health information distribution based on machine learning
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for determining health information distribution based on machine learning.
Background
As people pay increasing attention to their own health and the health of others, intelligent health examination has gradually become a focus of medical care schemes. Intelligent health examination refers to acquiring a user's basic health data through simple medical tests such as blood sampling, blood pressure, blood sugar and ultrasound imaging, and analyzing that data with an accurate data processing method to obtain the user's health indexes or the distribution of the user's health information.
At present, health information distribution is usually determined by comparing each single index in the basic health data against international medical standards. This cannot meet the need for comprehensive analysis of health information, and the index-by-index comparison makes the processed results redundant. Moreover, with basic monitoring data treated as a medical resource, index-by-index comparison cannot produce a health information distribution image adapted to changing medical scenarios, so the distribution image is determined inefficiently and the need of health care for convenient and rapid data processing is not met.
Disclosure of Invention
In view of the above, the invention provides a method and a device for determining health information distribution based on machine learning, mainly aiming to solve the problem that existing approaches determine health information distribution images inefficiently and cannot meet the need for convenient and rapid data processing in health care.
According to one aspect of the present invention, there is provided a method for determining a health information distribution based on machine learning, including:
acquiring spectrum data;
classifying the spectrum data based on a trained spectrum classification model to obtain classification results containing the health features with which the spectrum data are respectively labeled, wherein the spectrum classification model is a hybrid model built by combining machine learning models at different levels;
and integrating the health features in the classification processing result according to a preset spectrum integration weight to obtain a distribution image of the health information.
Further, before the classifying the spectral data based on the trained spectral classification model, the method further comprises:
acquiring a spectrum training data set, wherein the spectrum training data set comprises spectrum data corresponding to health features of different classifications;
constructing, by combination, a spectrum classification model comprising at least two decision tree models and one neural network model, wherein the combination takes the at least two decision tree models as the input layer and the one neural network model as the output layer;
and training the spectrum classification model which is built by combining based on the spectrum training data set.
Further, the acquiring spectral data includes:
and acquiring spectral data at least comprising blood infrared spectral data, blood ultraviolet spectral data, saliva infrared spectral data and saliva ultraviolet spectral data.
Further, the integrating the health features in the classification result according to the preset spectrum integration weight to obtain a distribution image of the health information includes:
determining, by weighted summation with the preset spectrum integration weights, the integration interval of the health features with which the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data are labeled in the classification result;
and drawing, by superposition, a distribution image of the health information containing the integration interval.
Further, before the classifying the spectral data based on the trained spectral classification model, the method further comprises:
judging, respectively, whether the wavelength values and amplitude values in the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data are in a distortion state;
and if a distortion state exists, filtering out the wavelength values and amplitude values in the distortion state, and taking the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data as the spectrum data to be classified, wherein the filtering deletes the portions of the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data that correspond to the wavelength values and amplitude values in the distortion state.
Further, after the drawing of the distribution image containing the health information of the integration interval in the superimposed manner, the method further includes:
after receiving a query request for the distribution image of the health information, extracting a historical image matching the distribution image;
and rendering the distribution image and the historical image in different colors, and combining them by semitransparent overlay for output.
Further, the health features are feature data that characterize different health states.
According to another aspect of the present invention, there is provided a health information distribution determining apparatus based on machine learning, including:
the acquisition module is used for acquiring spectrum data;
the classification processing module is used for classifying the spectrum data based on a trained spectrum classification model to obtain classification results containing the health features with which the spectrum data are respectively labeled, wherein the spectrum classification model is a hybrid model built by combining machine learning models at different levels;
and the integration processing module is used for integrating the health features in the classification processing result according to a preset spectrum integration weight value to obtain a distribution image of the health information.
Further, the apparatus further comprises a building module and a training module;
the acquisition module is further used for acquiring a spectrum training data set, wherein the spectrum training data set comprises spectrum data corresponding to health features with different classification marks;
the building module is used for building, by combination, a spectrum classification model comprising at least two decision tree models and one neural network model, wherein the combination takes the at least two decision tree models as the input layer and the one neural network model as the output layer;
and the training module is used for training the spectrum classification model which is built by combining based on the spectrum training data set.
Further, the acquisition module is specifically configured to acquire spectral data including at least blood infrared spectral data, blood ultraviolet spectral data, saliva infrared spectral data, saliva ultraviolet spectral data.
Further, the integrated processing module includes:
the statistical unit is used for determining, by weighted summation with the preset spectrum integration weights, the integration interval of the health features with which the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data are labeled in the classification result;
and the drawing unit is used for drawing, by superposition, a distribution image of the health information containing the integration interval.
Further, the apparatus further comprises:
the judging module is used for judging, respectively, whether the wavelength values and amplitude values in the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data are in a distortion state;
the filtering processing module is used for filtering out the wavelength values and amplitude values in the distortion state if a distortion state exists, and taking the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data as the spectrum data to be classified, wherein the filtering deletes the portions of the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data that correspond to the wavelength values and amplitude values in the distortion state.
Further, the integrated processing module includes:
the extraction unit is used for extracting a historical image matching the distribution image after a query request for the distribution image of the health information is received;
and the output unit is used for rendering the distribution image and the historical image in different colors, and combining them by semitransparent overlay for output.
Further, the health features are feature data that characterize different health states.
According to still another aspect of the present invention, there is provided a storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the above-described method for determining a distribution of health information based on machine learning.
According to still another aspect of the present invention, there is provided a computer apparatus including: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is configured to store at least one executable instruction, where the executable instruction causes the processor to perform operations corresponding to the method for determining a health information distribution based on machine learning.
The technical solution provided by the embodiments of the invention has at least the following advantages:
Compared with the prior art, the embodiment of the invention acquires spectrum data; classifies the spectrum data based on a trained spectrum classification model to obtain classification results containing the health features with which the spectrum data are respectively labeled, wherein the spectrum classification model is a hybrid model built by combining machine learning models at different levels; and integrates the health features in the classification results according to preset spectrum integration weights to obtain a distribution image of the health information. This meets the need to determine health information distribution in health examinations, determines the user's health features more efficiently and accurately, and satisfies the need of the health care field for convenient and rapid data processing.
The foregoing is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the description, and so that the above and other objects, features and advantages of the invention are more readily apparent, specific embodiments of the invention are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a flowchart of a method for determining a health information distribution based on machine learning according to an embodiment of the present invention;
FIG. 2 is a flowchart of another method for determining a health information distribution based on machine learning according to an embodiment of the present invention;
FIG. 3 is a block diagram of an apparatus for determining health information distribution based on machine learning according to an embodiment of the present invention;
FIG. 4 is a block diagram of another apparatus for determining health information distribution based on machine learning according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the invention provides a method for determining health information distribution based on machine learning, which comprises the following steps:
101. Spectral data is acquired.
During a health examination, a spectrometer performs spectral detection on the user's blood sample and saliva sample to obtain spectrum data comprising at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data, and the spectrum data are then used to determine the distribution image of the health information.
102. The spectrum data are classified based on the trained spectrum classification model to obtain classification results containing the health features with which the spectrum data are respectively labeled.
The spectrum classification model is a hybrid model built by combining machine learning models at different levels. To classify the spectrum data, the hybrid model in the embodiment of the invention comprises two machine learning models with different classification functions, combined at different levels. Specifically, the hybrid model may be built by using the first classification model as the input layer of the second classification model, where the first and second classification models are different machine learning models. For example, if the first classification model is a decision tree model, the second classification model may be any classification model other than a decision tree model. The embodiment of the invention does not specifically limit the output layer of the hybrid model.
103. The health features in the classification results are integrated according to preset spectrum integration weights to obtain a distribution image of the health information.
In the embodiment of the invention, to determine the distribution image of the health information intelligently, the classification results produced by the spectrum classification model include the probability with which the spectrum data are labeled with each health feature. A health feature is feature data that characterizes a particular health state; for example, a uric acid value can be feature data characterizing rheumatic disease, and human chorionic gonadotropin can be feature data characterizing pregnancy. To make the classified health features usable in a health examination, so that the user's health information can be determined quickly, conveniently and accurately, the spectrum data classified under the different health features are integrated to obtain the distribution image of the health information, meeting the need of health care for convenient and rapid data processing.
It should be noted that the preset spectrum integration weights are weights configured in advance for different health features according to the spectral characteristic distribution. For example, if the weight of the myocarditis health feature classified from infrared spectrum data is 0.2 and the weight of the myocarditis health feature classified from ultraviolet spectrum data is 0.4, the myocarditis health features are integrated using the weights 0.2 and 0.4 to obtain a distribution image for myocarditis.
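The weighted integration in the myocarditis example can be sketched as follows. This is a minimal illustration, assuming the classifier returns one probability per spectrum type for a given health feature; the function name `integrate_feature`, the dictionary layout and the probability values are hypothetical, and only the weights 0.2 and 0.4 come from the text.

```python
def integrate_feature(probabilities, weights):
    """Weighted sum of per-spectrum probabilities for one health feature."""
    return sum(weights[name] * p for name, p in probabilities.items())

# Weights taken from the myocarditis example in the text.
weights = {"infrared": 0.2, "ultraviolet": 0.4}
# Hypothetical classifier outputs for the myocarditis feature.
probabilities = {"infrared": 0.5, "ultraviolet": 0.75}

score = integrate_feature(probabilities, weights)
print(round(score, 3))
```

Repeating the same weighted sum for every health feature would give the per-feature values from which a distribution image could be drawn.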
Compared with the prior art, the embodiment of the invention acquires spectrum data; classifies the spectrum data based on a trained spectrum classification model to obtain classification results containing the health features with which the spectrum data are respectively labeled, wherein the spectrum classification model is a hybrid model built by combining machine learning models at different levels; and integrates the health features in the classification results according to preset spectrum integration weights to obtain a distribution image of the health information. This meets the need to determine health information distribution in health examinations, determines the user's health features more efficiently and accurately, and satisfies the need of the health care field for convenient and rapid data processing.
The embodiment of the invention provides another method for determining health information distribution based on machine learning, as shown in fig. 2, the method comprises the following steps:
201. A spectral training dataset is acquired.
In the embodiment of the invention, to train the hybrid model so that it can classify spectrum data accurately, a spectrum training data set is acquired and training data are drawn from it. The spectrum training data set comprises spectrum data corresponding to health features of different classes, where a health feature is feature data that characterizes a particular health state. The spectrum data comprise at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data, and are characterized by light wavelength and amplitude. For example, for the blood infrared spectrum data, if the radiation amount within a small wavelength interval $d\lambda$ centered on the wavelength $\lambda$ is $dX$, the radiation amount per unit wavelength interval is called the spectral density $X_\lambda$, that is, $X_\lambda = dX/d\lambda$, where the radiation amount may be radiant flux, radiant intensity, radiance, irradiance, or the like. In general, different wavelengths have different spectral densities; when the relationship between the spectral density of the light source and the wavelength is expressed as a function, that function is called the spectral distribution $X_\lambda(\lambda)$ of the light source, which here constitutes the blood infrared spectrum data. The embodiment of the invention does not specifically limit this.
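As a rough numerical illustration of the spectral density (radiation amount per unit wavelength interval), the derivative can be approximated by finite differences between adjacent samples. The function name, the sample values and the assumption that the radiation amount is available as a cumulative quantity over wavelength are all hypothetical.

```python
def spectral_density(wavelengths, cumulative_radiation):
    """Approximate dX/d(lambda) by finite differences between adjacent samples."""
    densities = []
    for i in range(len(wavelengths) - 1):
        d_lambda = wavelengths[i + 1] - wavelengths[i]
        d_x = cumulative_radiation[i + 1] - cumulative_radiation[i]
        densities.append(d_x / d_lambda)
    return densities

# Hypothetical samples: wavelengths in nm, cumulative radiation amount in arbitrary units.
wavelengths = [800.0, 810.0, 820.0, 830.0]
cumulative = [0.0, 2.0, 5.0, 6.0]
print(spectral_density(wavelengths, cumulative))
```

Each returned value pairs with the wavelength interval it was computed over, so the list is one entry shorter than the input.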
202. The spectral classification model is built by combining at least two decision tree models and a neural network model.
In the embodiment of the invention, to improve the capability and efficiency of spectrum data classification, a spectrum classification model comprising at least two decision tree models and one neural network model is built by combination. Because the spectrum data may comprise at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data, at least two decision tree models are used as the input layer and one neural network model is used as the output layer of the hybrid spectrum classification model, in order to optimize classification accuracy.
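The two-level structure can be sketched as a forward pass in which each decision tree scores the input and a small dense layer with softmax combines those scores. This is only an illustration under stated assumptions: the two lambda "trees", the weights, the biases and the two output classes are hypothetical stand-ins, not trained models or the patent's actual architecture.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hybrid_predict(spectrum, trees, weights, biases):
    """Input layer: each tree scores the spectrum. Output layer: one dense layer plus softmax."""
    tree_outputs = [tree(spectrum) for tree in trees]
    scores = [
        sum(w * t for w, t in zip(row, tree_outputs)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)

# Hypothetical stand-ins for two trained decision trees (threshold tests on amplitude and wavelength).
trees = [
    lambda s: 1.0 if s["amplitude"] > 0.5 else 0.0,
    lambda s: 1.0 if s["wavelength"] > 900.0 else 0.0,
]
# Hypothetical output-layer parameters: two output classes, one weight per tree.
weights = [[1.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0]

probs = hybrid_predict({"wavelength": 950.0, "amplitude": 0.8}, trees, weights, biases)
print(probs)
```

In a real system the output-layer parameters would be learned from the tree outputs on the training set rather than set by hand.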
It should be noted that different spectrum data may lead to different health features being determined. The first level therefore builds at least two decision tree models, with the following specific steps: using the bootstrap method, m samples are drawn at random with replacement from the original training set; this sampling is repeated n_tree times to generate n_tree training sets, and one decision tree model is trained on each of them, giving n_tree decision tree models. For a single decision tree model with a given number of training sample features, no pruning is needed while the tree is split. The procedure is as follows:
A. The feature set is D' = {z1, z2, z3, z4}, and the health feature is a two-class problem whose classification result is "yes" or "no". For example, a decision tree for the cold feature can be built in which the first-level test judges whether spectrum data a matches the cold-feature spectral distribution between c and b; if so, the second-level test judges whether spectrum data a matches the viral-cold spectral distribution between f and t; and so on. Given the training set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, the j-th variable $x_j$ and a value s that it takes are chosen as the splitting variable and splitting point, defining two regions $R_1(j,s) = \{x \mid x_j \le s\}$ and $R_2(j,s) = \{x \mid x_j > s\}$. The optimal splitting variable $x_j$ and the optimal splitting point s are then found by solving
$$\min_{j,s}\left[\min_{c_1}\sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2}\sum_{x_i \in R_2(j,s)} (y_i - c_2)^2\right]$$
where $c_m$ is the decision tree output on $R_m$, namely the average of the outputs $y_i$ corresponding to all input instances $x_i$ in region $R_m$:
$$c_m = \frac{1}{N_m}\sum_{x_i \in R_m(j,s)} y_i, \quad m = 1, 2$$
with $N_m$ the number of instances in $R_m$. The above procedure is repeated for each of the regions $R_1$ and $R_2$ until a stop condition is met, dividing the input space into M regions $R_1, R_2, \ldots, R_M$ and generating the decision tree:
$$f(x) = \sum_{m=1}^{M} c_m\, I(x \in R_m)$$
B. For a two-class classification problem, if the probability that a sample point belongs to class 1 is p, the Gini index of the probability distribution is:
$$\mathrm{Gini}(p) = 2p(1 - p)$$
For a sample set D, its Gini index is:
$$\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left(\frac{|C_k|}{|D|}\right)^2$$
where $C_k$ is the subset of samples in D belonging to the k-th class and K is the number of classes. If the sample set D is divided into two parts D1 and D2 according to whether the feature D' takes a certain possible value z, namely $D_1 = \{(x, y) \mid D'(x) = z\}$ and $D_2 = D - D_1$, then under the condition of the feature D' the Gini index of the set D is:
$$\mathrm{Gini}(D, D') = \frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2)$$
C. Generating a decision tree: 1. Let the training data set at the node be D. For each feature D' and each value in {z1, z2, z3, z4} that it may take, divide D into two parts D1 and D2 according to whether the sample point's test against that value is "yes" or "no", and calculate the Gini index $\mathrm{Gini}(D, D')$ for that value. 2. Among all possible features D' and all possible splitting points {z1, z2, z3, z4}, select the feature with the smallest Gini index and its corresponding splitting point as the optimal feature and the optimal splitting point. Generate two child nodes from the current node accordingly, and distribute the training data set to the two child nodes according to the feature. 3. Recursively apply steps 1 and 2 to the two child nodes until a stop condition is met, generating one of the n_tree CART decision trees. This is the generation process of a single decision tree model; the at least two decision tree models are generated in the same way, which is not repeated here.
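The sampling and splitting steps above can be sketched in a few lines. This is a minimal illustration, assuming categorical features and yes/no labels; the impurity measure in steps B and C is taken to be the standard CART Gini index, and the function names and toy data are hypothetical. Recursion, stop conditions and the regression-style split of step A are left out.

```python
import random
from collections import Counter

def bootstrap_training_sets(samples, m, n_tree, seed=0):
    """Draw n_tree training sets of m samples each, sampling with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(samples) for _ in range(m)] for _ in range(n_tree)]

def gini(labels):
    """Gini(D) = 1 - sum over classes k of (|C_k| / |D|)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def choose_split(rows, labels):
    """Return the (feature index, value) whose yes/no test gives the lowest weighted Gini index."""
    best = None
    for j in range(len(rows[0])):
        for z in sorted(set(row[j] for row in rows)):
            d1 = [y for row, y in zip(rows, labels) if row[j] == z]  # test answers "yes"
            d2 = [y for row, y in zip(rows, labels) if row[j] != z]  # test answers "no"
            score = (len(d1) * gini(d1) + len(d2) * gini(d2)) / len(labels)
            if best is None or score < best[0]:
                best = (score, j, z)
    return best[1], best[2]

# Hypothetical toy data: two categorical features per sample, yes/no label.
rows = [("high", "a"), ("high", "b"), ("low", "a"), ("low", "b")]
labels = ["yes", "yes", "no", "no"]

training_sets = bootstrap_training_sets(list(zip(rows, labels)), m=4, n_tree=3)
print(len(training_sets))          # one training set per tree
print(gini(labels))                # impurity before splitting
print(choose_split(rows, labels))  # feature 0 separates the two classes
```

A full tree builder would call `choose_split` on each bootstrap training set, partition the samples by the chosen test, and recurse on both parts until a stop condition holds.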
In addition, after the decision tree models are built, the label probabilities with which the health-feature classification results match the different spectrum data in the training set are obtained and, in vector form, used as training sample data for the neural network. For example, suppose the decision trees classify spectrum data a as having the viral type 1 cold feature, the rheumatism feature and the meningitis feature. In a vector covering all disease features, for example a vector of 150 diseases, the entries corresponding to these three health features are set to 1 and the rest to 0. When the neural network training samples are constructed, the input sample data are the label probabilities, which determine the risk distribution of the classified health features; the label probabilities include weight configurations for at least 30 severe disease features, 80 moderate disease features and 40 mild disease features. The neural network is then trained to output a determination of the health features as low risk, medium risk or high risk. The embodiment of the invention does not specifically limit this.
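The 150-entry disease vector from the example can be built as a simple 0/1 encoding. The index positions assigned to the three features below are hypothetical; the text only states that their entries are 1 and all others are 0.

```python
def encode_features(active_indices, size=150):
    """Build a 0/1 vector with 1 at each health feature the decision trees reported."""
    vector = [0] * size
    for i in active_indices:
        vector[i] = 1
    return vector

# Hypothetical positions of the three features named in the example.
FEATURE_INDEX = {"viral_cold_type_1": 12, "rheumatism": 47, "meningitis": 101}

vector = encode_features(FEATURE_INDEX.values())
print(sum(vector), len(vector))
```

Vectors of this form, one per training spectrum, would then serve as input samples for the neural network.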
203. The combined spectrum classification model is trained based on the spectrum training data set.
In the embodiment of the invention, to train spectrum data classification, the combined spectrum classification model is trained on the training data in the spectrum training data set, yielding a trained spectrum classification model suited to classifying health features.
204. Acquiring spectral data.
Further, for further definition and explanation, step 204 may specifically include: acquiring spectral data comprising at least blood infrared spectral data, blood ultraviolet spectral data, saliva infrared spectral data and saliva ultraviolet spectral data.
In the embodiment of the invention, in order to simplify the steps for determining the health information distribution and make it more convenient to collect health samples from the user, once the blood sample and the saliva sample have been collected, a spectrometer performs spectrum analysis on them and yields spectrum data comprising at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data.
The spectral data produced by the spectrometer are determined from the wavelengths and amplitudes of different kinds of light in the blood and saliva, which yields spectral data for different kinds of light such as ultraviolet and infrared. The embodiment of the invention is not particularly limited in this respect.
205. Classifying the spectrum data based on the trained spectrum classification model to obtain a classification processing result containing the health features respectively marked for the spectrum data.
Further, in order to optimize the spectrum data and prevent abnormal data in the analyzed spectrum data from affecting the classification processing, the spectrum data need to be preprocessed. In the embodiment of the invention, before the spectrum data are classified based on the trained spectrum classification model, the method further includes: judging, for each of the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data, whether a distortion state exists in its wavelength values and amplitude values; and if a distortion state exists, filtering the wavelength values and amplitude values in the distortion state, and taking the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data as the spectrum data to be classified.
For the embodiment of the invention, because the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data are all expressed in terms of wavelength values and amplitude values, abnormal data are filtered by judging whether a distortion state exists in the wavelength values and amplitude values of each of these four kinds of spectrum data. A distortion state is a sharp increase or decrease of a wavelength value and an amplitude value. In general, a distortion range matched to the normal wavelength values and amplitude values of the spectrum is configured, and a distortion state is determined to exist when that range is exceeded. Filtering the wavelength values and amplitude values in the distortion state means deleting the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data or saliva ultraviolet spectrum data to which those values belong. If any one kind of spectrum data contains distorted wavelength values and amplitude values, that spectrum data is deleted. If all of the spectrum data are distorted, this indicates an acquisition error of the spectrometer, and all of the spectrum data can be deleted so that spectrum analysis is performed on the blood and saliva samples again.
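The filtering rule can be sketched as below. The sketch assumes that a "distortion range" can be modelled as lower and upper bounds on wavelength and amplitude, and all names and numeric ranges are hypothetical.

```python
def is_distorted(samples, wavelength_range, amplitude_range):
    """True if any (wavelength, amplitude) pair falls outside its range."""
    wl_lo, wl_hi = wavelength_range
    amp_lo, amp_hi = amplitude_range
    return any(
        not (wl_lo <= wl <= wl_hi) or not (amp_lo <= amp <= amp_hi)
        for wl, amp in samples
    )

def filter_spectra(spectra, ranges):
    """Delete every spectrum that contains distorted values.

    spectra: name -> list of (wavelength, amplitude) pairs
    ranges:  name -> (wavelength_range, amplitude_range), hypothetical bounds
    Raises ValueError when every spectrum is distorted, which the text
    treats as a spectrometer acquisition error requiring re-analysis.
    """
    kept = {
        name: samples
        for name, samples in spectra.items()
        if not is_distorted(samples, *ranges[name])
    }
    if spectra and not kept:
        raise ValueError("all spectra distorted: repeat the spectrum analysis")
    return kept
```

Raising an exception is one possible way to signal that the blood and saliva samples must be analyzed again; a caller could equally return a status flag.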
206. Integrating the health features in the classification processing result according to preset spectrum integration weights to obtain a distribution image of the health information.
For further explanation and refinement of the embodiment of the present invention, step 206 may specifically include: using weighted summation with the preset spectrum integration weights to compute the integration intervals of the health features marked for the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data in the classification processing result; and drawing, in a superimposed manner, a distribution image containing the health information of the integration intervals.
In the embodiment of the invention, since the classification processing result contains health features at low risk, medium risk and high risk, in order to obtain the health information distribution accurately, weighted summation with the preset spectrum integration weights is used to compute the integration intervals of the classified health features at the different risk levels, and a distribution image of the health information of the integration intervals is drawn. The preset spectrum integration weights are weights configured in advance for different health features according to the distribution of spectrum characteristics. For example, suppose the blood ultraviolet spectrum data are classified as a medium-risk virus type 1 cold feature and a low-risk meningitis feature, and the blood infrared spectrum data are classified as a high-risk virus type 1 cold feature and a medium-risk meningitis feature. The corresponding weights are 0.1 for the virus type 1 cold feature and 0.6 for the meningitis feature of the blood ultraviolet spectrum data, and 0.3 for the virus type 1 cold feature and 0.3 for the meningitis feature of the blood infrared spectrum data. The weights are then used for weighted summation, for example 0.1 × medium risk + 0.3 × high risk for the virus type 1 cold feature. Each risk level is digitized in advance, and a numerical interval is configured for each risk level after weighted summation, so that the health information of the weighted virus type 1 cold feature is obtained, for example that 0.1 × medium risk + 0.3 × high risk falls in the high-risk interval.
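The weighted summation in this example can be worked through in a short sketch. The weights and classifications are taken from the example above; the numeric value of each risk level and the interval bounds are hypothetical, since the embodiment only says that they are configured in advance.

```python
# Hypothetical digitization of the risk levels.
RISK_VALUE = {"low": 1, "medium": 2, "high": 3}

# Hypothetical upper bounds of the intervals after weighted summation.
INTERVALS = [(0.5, "low"), (1.0, "medium"), (float("inf"), "high")]

# (spectrum, feature) -> preset integration weight, from the example.
WEIGHTS = {
    ("blood_uv", "cold"): 0.1, ("blood_uv", "meningitis"): 0.6,
    ("blood_ir", "cold"): 0.3, ("blood_ir", "meningitis"): 0.3,
}

# (spectrum, feature) -> classified risk level, from the example.
CLASSIFIED = {
    ("blood_uv", "cold"): "medium", ("blood_uv", "meningitis"): "low",
    ("blood_ir", "cold"): "high", ("blood_ir", "meningitis"): "medium",
}

def integrate(classified, weights):
    """Weighted sum of digitized risk levels, accumulated per health feature."""
    scores = {}
    for (spectrum, feature), level in classified.items():
        term = weights[(spectrum, feature)] * RISK_VALUE[level]
        scores[feature] = scores.get(feature, 0.0) + term
    return scores

def to_interval(score):
    """Map a weighted score to the first interval whose upper bound exceeds it."""
    for upper, name in INTERVALS:
        if score < upper:
            return name

scores = integrate(CLASSIFIED, WEIGHTS)
risk = {feature: to_interval(s) for feature, s in scores.items()}
```

With these assumed values the virus type 1 cold feature scores 0.1 × 2 + 0.3 × 3 = 1.1, which the assumed bounds place in the high-risk interval, in line with the outcome stated in the example.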
It should be noted that, when the distribution image containing the health information of the integration intervals is drawn in a superimposed manner, an integration interval is the risk interval in which a health feature lies. The health information determined for a user may include one health feature or several. To visualize them uniformly, the distribution image is therefore drawn in a superimposed manner, and each risk region in the distribution image can display the distribution of different health features on top of one another. For example, the medium-risk region may contain both a rheumatism feature and a meningitis feature, which makes it convenient to display the health information distribution.
Further, in order to meet the visualization requirements of the health information distribution, after step 206 the embodiment of the present invention further includes: after receiving a query request for the distribution image of the health information, extracting a historical image matching the distribution image; and rendering the distribution image and the historical image in different colors, and combining them in a semitransparent overlay for output.
For the embodiment of the invention, in order to meet the management requirements of the health information, after a query request for the distribution image of the health information is received, the historical image matching the distribution image is extracted, that is, a distribution image generated from the user's historical health features. The distribution image and the historical image are rendered in different colors and combined in a semitransparent overlay for output, so that the user can view the differently colored historical image and distribution image in a single semitransparent rendering. If there are several historical images, they can be marked according to their time information and rendered in several colors to improve the visual effect.
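A semitransparent overlay of a historical image and the current distribution image can be sketched with simple alpha blending. The representation used here (0/1 masks and RGB tuples), the colors and the 0.5 opacity are assumptions for illustration only.

```python
def blend(bottom, top, alpha=0.5):
    """Alpha-blend one RGB pixel over another."""
    return tuple(round((1 - alpha) * b + alpha * t) for b, t in zip(bottom, top))

def overlay(history_mask, current_mask,
            history_color=(0, 0, 255), current_color=(255, 0, 0),
            background=(255, 255, 255), alpha=0.5):
    """Render two 0/1 masks in different colors, semitransparently stacked.

    history_mask / current_mask: equally sized lists of rows of 0/1 cells
    marking where each image has content (hypothetical representation).
    """
    out = []
    for h_row, c_row in zip(history_mask, current_mask):
        row = []
        for h, c in zip(h_row, c_row):
            pixel = background
            if h:
                pixel = blend(pixel, history_color, alpha)
            if c:
                pixel = blend(pixel, current_color, alpha)
            row.append(pixel)
        out.append(row)
    return out

# One row: empty cell, history only, current only, both overlapping.
image = overlay([[0, 1, 0, 1]], [[0, 0, 1, 1]])
```

Several historical images could be handled by calling `blend` once per image with a color chosen from its time information.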
Compared with the prior art, the embodiment of the invention acquires spectrum data; classifies the spectrum data based on a trained spectrum classification model to obtain a classification processing result containing the health features respectively marked for the spectrum data, the spectrum classification model being a hybrid model built by combining machine learning models of different levels; and integrates the health features in the classification processing result according to preset spectrum integration weights to obtain a distribution image of the health information. This meets the need to determine the health information distribution in a health examination, determines the user's health features more efficiently and accurately, and meets the health care field's requirements for convenient and rapid data processing.
Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides a device for determining a health information distribution based on machine learning, as shown in fig. 3, where the device includes:
an acquisition module 31 for acquiring spectral data;
the classification processing module 32 is configured to perform classification processing on the spectral data based on a trained spectral classification model, so as to obtain classification processing results that include the respective health features marked by the spectral data, where the spectral classification model is a hybrid model built based on a combination of machine learning models at different levels;
The integration processing module 33 is configured to integrate the health features in the classification result according to a preset spectrum integration weight, so as to obtain a distribution image of the health information.
Compared with the prior art, the embodiment of the invention acquires spectrum data; classifies the spectrum data based on a trained spectrum classification model to obtain a classification processing result containing the health features respectively marked for the spectrum data, the spectrum classification model being a hybrid model built by combining machine learning models of different levels; and integrates the health features in the classification processing result according to preset spectrum integration weights to obtain a distribution image of the health information. This meets the need to determine the health information distribution in a health examination, determines the user's health features more efficiently and accurately, and meets the health care field's requirements for convenient and rapid data processing.
Further, as an implementation of the method shown in fig. 2, another apparatus for determining a health information distribution based on machine learning is provided in an embodiment of the present invention, as shown in fig. 4, where the apparatus includes:
An acquisition module 41 for acquiring spectral data;
the classification processing module 42 is configured to perform classification processing on the spectral data based on a trained spectral classification model, to obtain classification processing results including health features marked by the spectral data respectively, where the spectral classification model is a hybrid model built based on a combination of machine learning models at different levels;
the integration processing module 43 is configured to integrate the health features in the classification result according to a preset spectrum integration weight, so as to obtain a distribution image of the health information.
Further, the apparatus further comprises: a building module 44 and a training module 45,
the acquiring module 41 is further configured to acquire a spectrum training dataset, where the spectrum training dataset includes spectrum data corresponding to health features labeled with different classifications;
the building module 44 is configured to build a spectrum classification model including at least two decision tree models and one neural network model in a combined manner, where the combined construction is implemented by using the at least two decision tree models as an input layer and the one neural network model as an output layer;
the training module 45 is configured to train the spectrum classification model that is built by combining based on the spectrum training data set.
Further, the acquiring module 41 is specifically configured to acquire spectral data including at least blood infrared spectral data, blood ultraviolet spectral data, saliva infrared spectral data, saliva ultraviolet spectral data.
Further, the integration processing module 43 includes:
the statistics unit 4301 is configured to use a weighted summation manner to combine a preset spectrum integration weight to count integration intervals of health features marked by the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data in the classification result;
a drawing unit 4302 for drawing a distribution image containing the health information of the integration section in a superimposed manner.
Further, the apparatus further comprises:
the judging module 46 is configured to judge, for each of the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data, whether a distortion state exists in its wavelength values and amplitude values;
the filtering module 47 is configured to filter the wavelength value and the amplitude value in the distortion state if the distortion state exists, and take the filtered blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data as spectrum data to be classified, where the filtering process is to delete the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data corresponding to the wavelength value and the amplitude value in the distortion state.
Further, the integration processing module 43 includes:
an extraction unit 4303 configured to extract a history image that matches a distribution image of health information after receiving the distribution image query request;
and an output unit 4304, configured to render the distribution image and the history image according to different colors, and combine and render the distribution image and the history image in a semitransparent overlapping manner for output.
Further, the health features are feature data on the basis of which different health states are characterized.
Compared with the prior art, the embodiment of the invention acquires spectrum data; classifies the spectrum data based on a trained spectrum classification model to obtain a classification processing result containing the health features respectively marked for the spectrum data, the spectrum classification model being a hybrid model built by combining machine learning models of different levels; and integrates the health features in the classification processing result according to preset spectrum integration weights to obtain a distribution image of the health information. This meets the need to determine the health information distribution in a health examination, determines the user's health features more efficiently and accurately, and meets the health care field's requirements for convenient and rapid data processing.
According to an embodiment of the present invention, there is provided a storage medium storing at least one executable instruction for performing the method for determining a health information distribution based on machine learning in any of the above method embodiments.
Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention. The specific embodiments of the present invention do not limit the specific implementation of the computer device.
As shown in fig. 5, the computer device may include: a processor 502, a communication interface (Communications Interface) 504, a memory 506, and a communication bus 508.
Wherein: processor 502, communication interface 504, and memory 506 communicate with each other via communication bus 508.
A communication interface 504 for communicating with network elements of other devices, such as clients or other servers.
The processor 502 is configured to execute the program 510, and may specifically perform relevant steps in the above-described embodiment of a method for determining a health information distribution based on machine learning.
In particular, program 510 may include program code including computer-operating instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the computer device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
A memory 506 for storing a program 510. Memory 506 may comprise high-speed RAM memory or may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The program 510 may be specifically operable to cause the processor 502 to:
acquiring spectrum data;
classifying the spectrum data based on a trained spectrum classification model to obtain classification processing results containing health features marked by the spectrum data respectively, wherein the spectrum classification model is a hybrid model established based on different-level machine learning model combinations;
and integrating the health features in the classification processing result according to a preset spectrum integration weight to obtain a distribution image of the health information.
It will be apparent to those skilled in the art that the modules or steps of the invention described above may be implemented on a general-purpose computing device. They may be concentrated on a single computing device or distributed across a network of computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that shown or described here. They may also be fabricated as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description covers only the preferred embodiments of the present invention and is not intended to limit the present invention; those skilled in the art can make various modifications and variations to the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A method for determining a distribution of health information based on machine learning, comprising:
acquiring spectrum data;
classifying the spectrum data based on a trained spectrum classification model to obtain classification processing results containing health features marked by the spectrum data respectively, wherein the spectrum classification model is a hybrid model established based on different-level machine learning model combinations;
integrating the health features in the classification processing result according to a preset spectrum integration weight to obtain a distribution image of health information;
wherein the acquiring spectral data comprises:
acquiring spectrum data at least comprising blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data;
the step of integrating the health features in the classification processing result according to the preset spectrum integration weight value to obtain a distribution image of the health information comprises the following steps:
Counting the integration interval of the health features of the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data marks in the classification processing result by combining a weighted summation mode with a preset spectrum integration weight;
drawing a distribution image containing the health information of the integration interval in a superposition mode;
before classifying the spectral data based on the trained spectral classification model, the method further comprises:
judging, for each of the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data, whether a distortion state exists in its wavelength values and amplitude values;
and if the distortion state exists, filtering the wavelength value and the amplitude value which are in the distortion state, and taking the filtered blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data as spectrum data to be classified, wherein the filtering is to delete the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data which are corresponding to the wavelength value and the amplitude value which are in the distortion state.
2. The method of claim 1, wherein prior to classifying the spectral data based on the trained spectral classification model, the method further comprises:
acquiring a spectrum training data set, wherein the spectrum training data set comprises spectrum data corresponding to health features of different classifications;
constructing a spectrum classification model comprising at least two decision tree models and a neural network model in a combined way, wherein the combined construction is realized by taking the at least two decision tree models as input layers and the one neural network model as output layers;
and training the spectrum classification model which is built by combining based on the spectrum training data set.
3. The method according to claim 2, wherein after drawing, in a superimposed manner, the distribution image containing the health information of the integration interval, the method further comprises:
after receiving a distributed image query request of health information, extracting a historical image matched with the distributed image;
and rendering the distribution image and the history image according to different colors, and combining and rendering the distribution image and the history image in a semitransparent overlapping mode for output.
4. A method according to any one of claims 1-3, wherein the health features are feature data on which different health states are based.
5. A machine learning based health information distribution determining apparatus, comprising:
the acquisition module is used for acquiring spectrum data;
the classification processing module is used for carrying out classification processing on the spectrum data based on a trained spectrum classification model to obtain classification processing results containing the spectrum data and respectively marking health features, and the spectrum classification model is a hybrid model established based on different-level machine learning model combinations;
the integration processing module is used for integrating the health features in the classification processing result according to a preset spectrum integration weight to obtain a distribution image of the health information;
the acquisition module is specifically used for acquiring spectrum data at least comprising blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data and saliva ultraviolet spectrum data;
wherein, the integration processing module includes:
the statistical unit is used for counting the integration interval of the health features marked by the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data in the classification processing result by combining a weighted summation mode with a preset spectrum integration weight;
A drawing unit for drawing a distribution image containing the health information of the integration interval in a superimposed manner;
the apparatus further comprises:
the judging module is used for judging, for each of the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data, whether a distortion state exists in its wavelength values and amplitude values;
the filtering processing module is used for filtering the wavelength value and the amplitude value which are in the distortion state if the distortion state exists, and taking the filtered blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data as spectrum data to be classified, wherein the filtering processing is to delete the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data and the saliva ultraviolet spectrum data which are corresponding to the wavelength value and the amplitude value which are in the distortion state.
6. A storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the method of determining a distribution of health information based on machine learning of any one of claims 1-4.
7. A computer device, comprising: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to the method for determining a health information distribution based on machine learning of any one of claims 1-4.
CN202011153516.1A 2020-10-26 2020-10-26 Method and device for determining health information distribution based on machine learning Active CN112364896B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011153516.1A CN112364896B (en) 2020-10-26 2020-10-26 Method and device for determining health information distribution based on machine learning
PCT/CN2020/136368 WO2021189982A1 (en) 2020-10-26 2020-12-15 Health information distribution determination method and apparatus based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011153516.1A CN112364896B (en) 2020-10-26 2020-10-26 Method and device for determining health information distribution based on machine learning

Publications (2)

Publication Number Publication Date
CN112364896A CN112364896A (en) 2021-02-12
CN112364896B true CN112364896B (en) 2023-10-24

Family

ID=74512157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011153516.1A Active CN112364896B (en) 2020-10-26 2020-10-26 Method and device for determining health information distribution based on machine learning

Country Status (2)

Country Link
CN (1) CN112364896B (en)
WO (1) WO2021189982A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008154024A1 (en) * 2007-06-11 2008-12-18 Hartley, Frank Mid-ir spectral measurements for real-time identification of analytes in an industrial and laboratory setting
RU2009104267A (en) * 2009-02-09 2010-08-20 Оксана Анатольевна Гусякова (RU) METHOD FOR ESTIMATING EFFECTIVENESS OF TREATMENT OF CHRONIC GENERALIZED PARADONTITIS
CN107045637A (en) * 2016-12-16 2017-08-15 中国医学科学院生物医学工程研究所 A kind of blood species identification instrument and recognition methods based on spectrum
CN107423549A (en) * 2016-04-21 2017-12-01 唯亚威解决方案股份有限公司 Healthy tracking equipment
CN108542402A (en) * 2018-05-17 2018-09-18 吉林求是光谱数据科技有限公司 Blood sugar detecting method based on Self-organizing Competitive Neutral Net model and infrared spectrum
CN110852227A (en) * 2019-11-04 2020-02-28 中国科学院遥感与数字地球研究所 Hyperspectral image deep learning classification method, device, equipment and storage medium
CN111444965A (en) * 2020-03-27 2020-07-24 泰康保险集团股份有限公司 Data processing method based on machine learning and related equipment
CN111523593A (en) * 2020-04-22 2020-08-11 北京百度网讯科技有限公司 Method and apparatus for analyzing medical images

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7899625B2 (en) * 2006-07-27 2011-03-01 International Business Machines Corporation Method and system for robust classification strategy for cancer detection from mass spectrometry data
US20190247650A1 (en) * 2018-02-14 2019-08-15 Bao Tran Systems and methods for augmenting human muscle controls
CN110298396B (en) * 2019-06-25 2022-02-08 北京工业大学 Hyperspectral image classification method based on deep learning multi-feature fusion
CN110946552B (en) * 2019-10-30 2022-04-08 南京航空航天大学 Cervical cancer pre-lesion screening method combining spectrum and image

Also Published As

Publication number Publication date
CN112364896A (en) 2021-02-12
WO2021189982A1 (en) 2021-09-30

CN114022698A (en) Multi-tag behavior identification method and device based on binary tree structure
Kang et al. CST-YOLO: A Novel Method for Blood Cell Detection Based on Improved YOLOv7 and CNN-Swin Transformer
Siddiqui et al. Attention based covid-19 detection using generative adversarial network

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40041463; Country of ref document: HK)

SE01 Entry into force of request for substantive examination
GR01 Patent grant