WO2021189982A1 - Health information distribution determination method and apparatus based on machine learning - Google Patents

Health information distribution determination method and apparatus based on machine learning

Info

Publication number
WO2021189982A1
Authority
WO
WIPO (PCT)
Prior art keywords
spectrum data
spectral
data
health
saliva
Prior art date
Application number
PCT/CN2020/136368
Other languages
French (fr)
Chinese (zh)
Inventor
曾振
王健宗
程宁
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021189982A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 - Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • This application relates to the field of data processing technology, in particular to a method and device for determining the distribution of health information based on machine learning.
  • Intelligent health checkup refers to obtaining a user's basic health data through simple medical examination methods such as blood tests, blood pressure, blood sugar, and ultrasound images, and analyzing that data with accurate data processing methods to obtain the user's health indicators or the distribution of various health information.
  • The inventor realized that the distribution of health information is currently determined by comparing individual indicators in the basic health data against international medical standards. This cannot meet the need for comprehensive analysis of health information, and a single comparison method makes the data processing results more redundant.
  • Moreover, with basic monitoring data serving as a medical resource, a single comparison method cannot determine the appropriate health information distribution image as medical scenarios change. The distribution image is therefore determined inefficiently, and health care's need for convenient and fast data processing is not met.
  • In view of this, the present application provides a method and device for determining the distribution of health information based on machine learning.
  • The main purpose is to solve the problem that existing health information distribution images are determined inefficiently, which fails to meet the demand for convenient and rapid data processing in health care.
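  • A method for determining the distribution of health information based on machine learning, which includes:
  • obtaining spectral data;
  • classifying the spectral data based on a trained spectrum classification model to obtain a classification processing result in which the spectral data is respectively labeled with health features, the spectrum classification model being a hybrid model established based on a combination of machine learning models of different levels;
  • integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.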
  • A device for determining the distribution of health information based on machine learning, including:
  • an obtaining module, used to obtain spectral data;
  • a classification processing module, used to classify the spectral data based on the trained spectrum classification model to obtain a classification processing result in which the spectral data is respectively labeled with health features, the spectrum classification model being a hybrid model established based on a combination of machine learning models of different levels;
  • an integration processing module, used to integrate the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
  • A storage medium storing at least one executable instruction, where the executable instruction causes a processor to execute a method for determining the distribution of health information based on machine learning.
  • The method for determining the distribution of health information based on machine learning includes the following steps:
  • obtaining spectral data;
  • classifying the spectral data based on a trained spectrum classification model to obtain a classification processing result in which the spectral data is respectively labeled with health features, the spectrum classification model being a hybrid model established based on a combination of machine learning models of different levels;
  • integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
  • a computer device including: a processor, a memory, a communication interface, and a communication bus.
  • the processor, the memory, and the communication interface complete mutual communication through the communication bus.
  • The memory is used to store at least one executable instruction that causes the processor to execute a method for determining the distribution of health information based on machine learning, wherein the method includes the following steps:
  • obtaining spectral data;
  • classifying the spectral data based on a trained spectrum classification model to obtain a classification processing result in which the spectral data is respectively labeled with health features, the spectrum classification model being a hybrid model established based on a combination of machine learning models of different levels;
  • integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
  • This application provides a method and device for determining the distribution of health information based on machine learning. Compared with the prior art, it satisfies the demand for determining the distribution of health information in a health checkup, and determines the health characteristics of users more efficiently and accurately. This greatly meets the needs of the health and medical field for the convenience and speed of data processing.
  • FIG. 1 shows a flowchart of a method for determining the distribution of health information based on machine learning provided by an embodiment of the present application;
  • FIG. 2 shows a flowchart of another method for determining the distribution of health information based on machine learning provided by an embodiment of the present application;
  • FIG. 3 shows a block diagram of a device for determining the distribution of health information based on machine learning provided by an embodiment of the present application;
  • FIG. 4 shows a block diagram of another device for determining the distribution of health information based on machine learning provided by an embodiment of the present application;
  • FIG. 5 shows a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • the embodiment of the application provides a method for determining the distribution of health information based on machine learning. As shown in FIG. 1, the method includes:
  • Spectral data is obtained. The spectral data includes at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
  • The spectral data is used to determine the distribution image of health information.
  • the spectral classification model is a hybrid model established based on a combination of machine learning models at different levels.
  • The hybrid model in the embodiment of the present application is a machine learning model that includes two different classification functions and is built by combining different types of machine learning models at different levels. Specifically, the hybrid model can be established by using the first classification model as the input layer of the second classification model, where the first classification model and the second classification model are different machine learning models.
  • For example, the first classification model is a decision tree model,
  • and the second classification model can be a classification model other than a decision tree model, which serves as the output layer in the hybrid model. This embodiment does not impose specific restrictions.
  • The classification processing result includes the health features with which the spectral data is labeled, together with their label probabilities. The health features are the feature data used to characterize different health states.
  • For example, the uric acid value may be feature data that characterizes rheumatic diseases, and human chorionic gonadotropin may be feature data that characterizes pregnancy. The embodiments of the present application do not specifically limit this.
  • The spectral data classified under different health features is integrated to obtain the distribution image of the health information, so as to meet health care's need for convenient and fast data processing.
  • The preset spectral integration weights are configured in advance for different health features according to the distribution of spectral characteristics.
  • For example, the weight of the myocarditis health feature classified from infrared spectrum data is 0.2,
  • and the weight of the myocarditis health feature classified from ultraviolet spectrum data is 0.4, so the health information of the myocarditis health feature is integrated using the weights 0.2 and 0.4 to obtain the distribution image for myocarditis. The embodiments of the present application do not specifically limit this.
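  • As a worked illustration of the myocarditis weighting, suppose the model labels myocarditis with probability $p_{\mathrm{IR}}$ from the infrared data and $p_{\mathrm{UV}}$ from the ultraviolet data. The symbols and the sample probabilities 0.5 and 0.75 are assumptions made only for this illustration; the weights 0.2 and 0.4 are the ones given in the text.

```latex
\[
S_{\text{myocarditis}}
  = 0.2\,p_{\mathrm{IR}} + 0.4\,p_{\mathrm{UV}}
  = 0.2 \times 0.5 + 0.4 \times 0.75
  = 0.1 + 0.3
  = 0.4
\]
```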
  • This application provides a method for determining the distribution of health information based on machine learning.
  • The embodiment of this application obtains spectral data; classifies the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data is respectively labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels; and integrates the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information. This meets the need to determine the distribution of health information in health checkups and determines users' health features more efficiently and accurately, thereby meeting the health care field's need for convenient and rapid data processing.
  • the embodiment of the present application provides another method for determining the distribution of health information based on machine learning. As shown in FIG. 2, the method includes:
  • a spectral training data set is acquired, so as to obtain training data from the spectral training data set to train the hybrid model.
  • the spectral training data set includes spectral data corresponding to health features of different classifications, and the health features are feature data used to characterize different health states.
  • the spectrum data includes at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, saliva ultraviolet spectrum data, and the spectrum data is characterized by light wavelength and amplitude.
  • For example, if the radiant quantity within a small wavelength interval $\mathrm{d}\lambda$ centered on the wavelength $\lambda$ is $\mathrm{d}X$, the function $X_\lambda(\lambda) = \mathrm{d}X / \mathrm{d}\lambda$ is called the spectral distribution of the light source, and this distribution is the blood infrared spectrum data. The embodiments of this application do not specifically limit this.
  • A spectral classification model comprising at least two decision tree models and one neural network model is constructed by combination.
  • The combined construction is realized by using the at least two decision tree models as the input layer and the one neural network model as the output layer.
  • Because the spectral data may include at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data, at least two decision tree models are used as the input layer and one neural network model is used as the output layer to construct the hybrid spectral classification model, in order to optimize the accuracy of the classification processing.
  • the first level is to establish at least two decision tree models.
  • The specific steps include: using the bootstrapping method, m samples are drawn from the original training set by random sampling with replacement; sampling is performed n_tree times in total, generating n_tree training sets. For the n_tree training sets, n_tree decision tree models are trained respectively. For a single decision tree model, the number of training sample features is …, and no pruning is required during the splitting process of the decision tree. The steps for generating a single decision tree are as follows.
  • For example, a decision tree for cold features can be constructed in which the first-level feature judgment is whether spectral data a conforms to the cold-feature spectral distribution between c and b; if so, the second-level feature judgment is whether spectral data a conforms to the viral-cold spectral distribution between f and t, and so on.
  • 1. Select a splitting feature $j$ and a split point $s$, which divide the data into the two regions $R_1(j,s) = \{x \mid x^{(j)} \le s\}$ and $R_2(j,s) = \{x \mid x^{(j)} > s\}$.
  • 2. Generate two child nodes from the current node and allocate the training data set to the two child nodes according to the feature. 3. Recursively apply steps 1 and 2 to the two child nodes until the stop condition is met, which generates the CART decision tree.
  • The above process generates a single decision tree model. Each of the at least two decision tree models is generated in the same way, so the process is not repeated here.
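  • The bootstrap sampling and the recursive, unpruned splitting described above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the text does not state the split criterion, so weighted Gini impurity is assumed here, and the function names and toy samples are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sets(samples, m, n_tree, seed=0):
    """Draw n_tree training sets of m samples each, sampling with replacement."""
    rng = random.Random(seed)
    return [rng.choices(samples, k=m) for _ in range(n_tree)]

def gini(rows):
    """Gini impurity of the labels in rows; each row is (features, label)."""
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())

def best_split(rows):
    """Return the (j, s) that most lowers weighted Gini, or None if none does."""
    best, best_score = None, gini(rows)
    for j in range(len(rows[0][0])):
        for s in sorted({x[j] for x, _ in rows}):
            r1 = [r for r in rows if r[0][j] <= s]   # region R1(j, s)
            r2 = [r for r in rows if r[0][j] > s]    # region R2(j, s)
            if not r1 or not r2:
                continue
            score = (len(r1) * gini(r1) + len(r2) * gini(r2)) / len(rows)
            if score < best_score:
                best, best_score = (j, s), score
    return best

def build_tree(rows):
    """Split recursively, without pruning, until no split lowers the impurity."""
    split = best_split(rows)
    if split is None:  # stop condition: return the majority label as a leaf
        return Counter(label for _, label in rows).most_common(1)[0][0]
    j, s = split
    left = build_tree([r for r in rows if r[0][j] <= s])
    right = build_tree([r for r in rows if r[0][j] > s])
    return (j, s, left, right)

def predict(tree, x):
    """Walk from the root to a leaf; leaves are plain string labels."""
    while isinstance(tree, tuple):
        j, s, left, right = tree
        tree = left if x[j] <= s else right
    return tree

# Toy samples: (two hypothetical spectral features, health-feature label).
data = [((0.1, 1.0), "cold"), ((0.2, 0.9), "cold"),
        ((0.8, 0.2), "none"), ((0.9, 0.1), "none")]
training_sets = bootstrap_sets(data, m=4, n_tree=3)
forest = [build_tree(rows) for rows in training_sets]
```

  • Each entry of `forest` is trained on its own bootstrap sample, so the trees can differ even though they share one generation procedure.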
  • For example, spectral data a is a feature of virus type 1 cold,
  • and spectral data a is also a feature of rheumatism and a feature of meningitis.
  • A vector covering all disease features is then constructed, for example a 150-dimensional disease vector in which the entries corresponding to the above three health features are set to 1 and the rest are 0.
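  • The full disease-feature vector can be sketched as a simple 0/1 encoding. The vector length of 150 comes from the text; the index assigned to each disease and the function name are assumptions for illustration only.

```python
N_DISEASES = 150  # size of the full disease-feature vector given in the text

# Hypothetical slot assignment; the source does not say which index each disease uses.
FEATURE_INDEX = {"virus_type_1_cold": 0, "rheumatism": 1, "meningitis": 2}

def encode_features(detected, index=FEATURE_INDEX, size=N_DISEASES):
    """Return a 0/1 vector with a 1 in the slot of every detected health feature."""
    vector = [0] * size
    for name in detected:
        vector[index[name]] = 1
    return vector

vec = encode_features(["virus_type_1_cold", "rheumatism", "meningitis"])
```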
  • The input sample data is the label probability of the classified health features.
  • The label probability includes the weight configuration of at least 30 major disease features, the weight configuration of 80 medium disease features, and the weight configuration of 40 mild disease features. Neural network training is then performed to obtain
  • determination results for the health features, including low risk, medium risk, and high risk. The embodiments of the present application do not specifically limit this.
  • The spectral classification model constructed by combination is trained on the training data in the spectral training data set, so that after training it is suitable for classifying health features.
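  • The two-level structure described above, with decision trees as the input layer and a neural network as the output layer, can be sketched as a forward pass. This is only a data-flow illustration under stated assumptions: one-split stumps stand in for trained decision trees, the output layer is a single dense softmax layer, and the weights and biases are hand-picked placeholders rather than trained values.

```python
import math

def stump(feature, threshold):
    """A one-split decision tree: outputs 1 if x[feature] > threshold, else 0."""
    return lambda x: 1 if x[feature] > threshold else 0

def softmax(zs):
    """Numerically stable softmax over a list of logits."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

class HybridClassifier:
    """Decision trees form the input layer; one dense softmax layer is the output."""

    def __init__(self, trees, weights, biases, labels):
        self.trees = trees        # callables mapping a sample to 0 or 1
        self.weights = weights    # one weight row per output label
        self.biases = biases      # one bias per output label
        self.labels = labels

    def predict_proba(self, x):
        hidden = [tree(x) for tree in self.trees]      # input layer: tree votes
        logits = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.weights, self.biases)]
        return softmax(logits)                         # output layer

    def predict(self, x):
        probs = self.predict_proba(x)
        return self.labels[probs.index(max(probs))]

# Placeholder parameters chosen by hand for the illustration.
model = HybridClassifier(
    trees=[stump(0, 0.5), stump(1, 0.5)],
    weights=[[-2.0, -2.0], [1.0, 1.0], [3.0, 3.0]],
    biases=[1.0, 0.0, -3.0],
    labels=["low", "medium", "high"],
)
risk = model.predict((0.9, 0.9))
```

  • In a trained model the tree outputs would be the full disease-feature vector and the output-layer parameters would be learned from the spectral training data set.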
  • step 204 may specifically include: acquiring spectrum data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
  • A spectrometer is used to perform spectral analysis on the blood samples and saliva samples
  • and to output spectrum data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
  • The analyzed spectrum data can also include spectrum data other than ultraviolet and infrared.
  • The spectral data analyzed by the spectrometer is determined from the wavelength and amplitude of different kinds of light in blood and saliva, yielding spectral data for different kinds of light such as ultraviolet and infrared. The embodiments of the present application do not specifically limit this.
  • Before the spectral data is classified based on the trained spectral classification model, the method further includes: separately determining whether the wavelength values and amplitude values in the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data are in a distorted state; and, if a distorted state exists, filtering the wavelength values and amplitude values in the distorted state and using the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data as the spectrum data to be classified.
  • The distorted state is a sharp increase or sharp decrease in the wavelength value or amplitude value.
  • A distortion range matching the normal wavelength and amplitude values of the spectrum is configured for such sharp increases or decreases. If a change exceeds this distortion range, a distorted state is determined to exist, and the distorted wavelength and amplitude values are filtered.
  • The filtering process deletes the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, or saliva ultraviolet spectrum data to which the distorted wavelength and amplitude values belong.
  • If the wavelength value or amplitude value of any one of the spectrum data is distorted, that spectrum data is deleted. If all the spectrum data is distorted, the spectrometer has made a collection error, and all of the data can be deleted so that the blood and saliva samples are analyzed again.
  • The embodiments of this application do not specifically limit this.
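  • The distortion check and filtering can be sketched as follows. The text does not define how a sharp increase or decrease is measured, so this sketch assumes it is the absolute change between adjacent readings compared against a configured limit; the function names, thresholds, and sample readings are hypothetical.

```python
def is_distorted(values, max_jump):
    """True if any two adjacent readings differ by more than max_jump."""
    return any(abs(b - a) > max_jump for a, b in zip(values, values[1:]))

def filter_spectra(spectra, max_wavelength_jump, max_amplitude_jump):
    """Delete every spectrum whose wavelength or amplitude series is distorted.

    spectra maps a name such as "blood_ir" to a (wavelengths, amplitudes) pair.
    Returns (kept, needs_reanalysis). needs_reanalysis is True when every
    spectrum was distorted, which the text treats as a spectrometer error.
    """
    kept = {
        name: (wavelengths, amplitudes)
        for name, (wavelengths, amplitudes) in spectra.items()
        if not is_distorted(wavelengths, max_wavelength_jump)
        and not is_distorted(amplitudes, max_amplitude_jump)
    }
    return kept, len(spectra) > 0 and len(kept) == 0

# Hypothetical readings: blood_uv has an amplitude spike, saliva_uv a wavelength jump.
spectra = {
    "blood_ir": ([800, 810, 820], [0.50, 0.52, 0.51]),
    "blood_uv": ([200, 210, 220], [0.40, 3.90, 0.41]),
    "saliva_ir": ([800, 810, 820], [0.30, 0.31, 0.33]),
    "saliva_uv": ([200, 210, 400], [0.20, 0.20, 0.21]),
}
kept, needs_reanalysis = filter_spectra(spectra, max_wavelength_jump=20,
                                        max_amplitude_jump=1.0)
```

  • Only the spectra left in `kept` would be passed on to classification; when `needs_reanalysis` is true the samples would be measured again.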
  • step 206 may specifically include: using a weighted summation method combined with the preset spectral integration weights to compute the integration intervals of the health features labeled for the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data in the classification processing result; and drawing, in a superimposed manner, the distribution image containing the health information of the integration intervals.
  • The weighted summation method, combined with the preset spectral integration weights, is used to compute
  • the integration intervals of the classified health features at different risk levels, and the distribution image of the health information of those integration intervals is drawn.
  • The preset spectral integration weights are weights configured in advance for different health features according to the distribution of spectral characteristics. For example, the blood ultraviolet spectrum data is classified into a medium-risk virus type 1 cold feature and a low-risk meningitis feature, and the blood infrared spectrum data is classified into a high-risk virus type 1 cold feature and a medium-risk meningitis feature.
  • For the blood ultraviolet spectrum data, the weight of the virus type 1 cold feature is 0.1
  • and the weight of the meningitis feature is 0.6;
  • for the blood infrared spectrum data, the weight of the virus type 1 cold feature is 0.3.
  • Each weight is used in the weighted summation, for example 0.1 * medium risk + 0.3 * high risk, where each risk level is assigned a numerical value in advance and numerical intervals are configured for the result of the weighted summation.
  • The weighted and summed health information for the virus type 1 cold feature is thus obtained; for example, 0.1 * medium risk + 0.3 * high risk falls in the high-risk interval, meaning the virus type 1 cold feature is high risk. The embodiments of this application do not specifically limit the numerical values or the numerical intervals.
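  • The weighted summation over digitized risk levels can be sketched as follows. The weights 0.1 and 0.3 come from the example above; the numerical value assigned to each risk level and the interval boundaries are assumptions chosen so that the example reproduces the stated high-risk result, since the text leaves both unspecified.

```python
RISK_VALUE = {"low": 1, "medium": 2, "high": 3}  # assumed digitization of risk levels

# Assumed integration intervals as (upper bound, level); the last one is open-ended.
INTERVALS = [(0.4, "low"), (0.8, "medium"), (float("inf"), "high")]

def integrate(classified, weights):
    """Weighted sum of digitized risk levels, mapped to an integration interval.

    classified: {spectrum name: risk level} for one health feature.
    weights:    {spectrum name: preset spectral integration weight}.
    """
    score = sum(weights[name] * RISK_VALUE[level]
                for name, level in classified.items())
    for upper, level in INTERVALS:
        if score < upper:
            return score, level

# Virus type 1 cold: medium risk from blood UV (0.1), high risk from blood IR (0.3).
score, level = integrate({"blood_uv": "medium", "blood_ir": "high"},
                         {"blood_uv": 0.1, "blood_ir": 0.3})
```

  • With these assumed values the score is 0.1 * 2 + 0.3 * 3, which lands in the open-ended top interval and is reported as high risk.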
  • the integration interval is the risk interval in which different health characteristics are located.
  • The health information determined for the user may include one health feature or multiple health features. Therefore, to visualize them in a unified manner, the distribution image is drawn in a superimposed manner.
  • Each risk area in the distribution image can show the distribution of different health characteristics in an overlapping manner.
  • the medium risk area may include rheumatism.
  • The embodiment of the present application further includes: after receiving a query request for a distribution image of health information, extracting a historical image matching the distribution image; rendering the distribution image and the historical image in different colors; and combining the rendered distribution image and historical image in a semi-transparent overlapping manner for output.
  • The historical image matching the distribution image is a distribution image generated from the user's historical health features. The distribution image and the historical image are rendered in different colors and combined in a semi-transparent overlapping manner for output, so that the user can view the differently colored historical image and distribution image through the semi-transparent rendering.
  • Multiple colors can be rendered after labeling according to time information, thereby improving the visualization effect.
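  • The semi-transparent overlap of a current and a historical distribution image can be sketched with plain alpha blending. The representation of each image as a 0/1 mask, the colors, and the 50% opacity are assumptions for illustration; the text does not prescribe a rendering method.

```python
def blend_pixel(current, history, alpha=0.5):
    """Blend two RGB pixels; alpha is the opacity of the current image."""
    return tuple(round(alpha * c + (1 - alpha) * h)
                 for c, h in zip(current, history))

def tint(mask, color, background=(255, 255, 255)):
    """Render a 0/1 mask as RGB: color where the mask is 1, background elsewhere."""
    return [[color if cell else background for cell in row] for row in mask]

def overlay(current_mask, history_mask, current_color=(255, 0, 0),
            history_color=(0, 0, 255), alpha=0.5):
    """Render both masks in different colors and overlap them semi-transparently."""
    current = tint(current_mask, current_color)
    history = tint(history_mask, history_color)
    return [[blend_pixel(c, h, alpha) for c, h in zip(cur_row, old_row)]
            for cur_row, old_row in zip(current, history)]

# 2x2 toy masks: the top-left cell is set in both the current and historical image.
image = overlay([[1, 0], [1, 0]], [[1, 1], [0, 0]])
```

  • Cells set in both images take a mixed color, cells set in only one image keep a lighter shade of that image's color, and cells set in neither stay white.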
  • This application provides another method for determining the distribution of health information based on machine learning.
  • The embodiment of this application obtains spectral data; classifies the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data is respectively labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels; and integrates the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information. This meets the need to determine the distribution of health information in health checkups and determines users' health features more efficiently and accurately, thereby meeting the health care field's need for convenient and rapid data processing.
  • an embodiment of the present application provides a device for determining the distribution of health information based on machine learning.
  • the device includes:
  • the obtaining module 31 is used to obtain spectral data
  • the classification processing module 32 is configured to classify the spectrum data based on the trained spectrum classification model to obtain a classification processing result in which the spectrum data is respectively labeled with health features, the spectrum classification model being a hybrid model established based on a combination of machine learning models of different levels;
  • the integration processing module 33 is configured to perform integration processing on the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
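  • The way the three modules hand data to one another can be sketched as a small pipeline. The class name, type aliases, and stand-in callables are hypothetical; in particular the stand-in integration step returns a single number rather than a distribution image, and the weights 0.2 and 0.4 are borrowed from the myocarditis example only to make the sketch runnable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Spectra = Dict[str, List[float]]   # spectrum name -> raw readings
Labels = Dict[str, float]          # spectrum name -> label probability

@dataclass
class DistributionDevice:
    """Chains the three modules in the order the text lists them."""
    obtain: Callable[[], Spectra]           # obtaining module 31
    classify: Callable[[Spectra], Labels]   # classification processing module 32
    integrate: Callable[[Labels], float]    # integration processing module 33

    def run(self) -> float:
        spectra = self.obtain()
        labels = self.classify(spectra)
        return self.integrate(labels)

# Stand-in callables so the sketch runs; real modules would wrap a spectrometer,
# the trained hybrid model, and the weighted integration step.
device = DistributionDevice(
    obtain=lambda: {"blood_ir": [0.50, 0.52], "blood_uv": [0.40, 0.41]},
    classify=lambda spectra: {name: max(vals) for name, vals in spectra.items()},
    integrate=lambda labels: 0.2 * labels["blood_ir"] + 0.4 * labels["blood_uv"],
)
result = device.run()
```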
  • This application provides a device for determining the distribution of health information based on machine learning.
  • The embodiment of this application obtains spectral data; classifies the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data is respectively labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels; and integrates the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information. This meets the need to determine the distribution of health information in health checkups and determines users' health features more efficiently and accurately, thereby meeting the health care field's need for convenient and rapid data processing.
  • an embodiment of the present application provides another device for determining the distribution of health information based on machine learning.
  • the device includes:
  • the obtaining module 41 is used to obtain spectral data
  • the classification processing module 42 is configured to classify the spectrum data based on the trained spectrum classification model to obtain a classification processing result in which the spectrum data is respectively labeled with health features, the spectrum classification model being a hybrid model established based on a combination of machine learning models of different levels;
  • the integration processing module 43 is configured to perform integration processing on the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
  • the device further includes: a construction module 44 and a training module 45;
  • the acquiring module 41 is further configured to acquire a spectral training data set, the spectral training data set including the spectral data corresponding to the health characteristics of different classifications;
  • the construction module 44 is configured to construct, by combination, a spectral classification model including at least two decision tree models and one neural network model, wherein the combined construction is implemented by using the at least two decision tree models as the input layer and the one neural network model as the output layer;
  • the training module 45 is configured to train the combined spectral classification model based on the spectral training data set.
  • the acquisition module 41 is specifically configured to acquire spectrum data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
  • the integration processing module 43 includes:
  • the statistics unit 4301 is configured to use a weighted summation method combined with the preset spectral integration weights to compute the integration intervals of the health features labeled for the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data in the classification processing result;
  • the drawing unit 4302 is configured to draw a distribution image containing the health information of the integration interval in a superimposed manner.
  • the device further includes:
  • the judging module 46 is configured to respectively judge whether the wavelength value and the amplitude value in the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data are in a distorted state;
  • the filter processing module 47 is configured to filter the wavelength values and amplitude values in the distorted state if a distorted state exists, and to use the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data as the spectrum data to be classified, where the filtering process deletes the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, or saliva ultraviolet spectrum data to which the distorted wavelength and amplitude values belong.
  • the integration processing module 43 includes:
  • the extracting unit 4303 is configured to extract a historical image matching the distribution image after receiving a query request for a distribution image of health information;
  • the output unit 4304 is configured to render the distribution image and the historical image in different colors, and to combine the rendered distribution image and historical image in a semi-transparent overlapping manner for output.
  • the health characteristics are characteristic data used to characterize different health states.
  • This application provides another device for determining the distribution of health information based on machine learning.
  • The embodiment of this application obtains spectral data; classifies the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data is respectively labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels; and integrates the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information. This meets the need to determine the distribution of health information in health checkups and determines users' health features more efficiently and accurately, thereby meeting the health care field's need for convenient and rapid data processing.
  • a storage medium is provided.
  • the storage medium may be non-volatile or volatile.
  • the storage medium stores at least one executable instruction, and the executable instruction can execute the foregoing method for determining the distribution of health information based on machine learning, which includes the following steps:
  • acquiring spectral data;
  • classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels;
  • integrating, according to preset spectral integration weights, the health features in the classification processing result to obtain a distribution image of health information.
  • FIG. 5 shows a schematic structural diagram of a computer device according to an embodiment of the present application, and the specific embodiment of the present application does not limit the specific implementation of the computer device.
  • the computer device may include: a processor 502, a communications interface 504, a memory 506, and a communication bus 508.
  • the processor 502, the communication interface 504, and the memory 506 communicate with each other through the communication bus 508.
  • the communication interface 504 is used to communicate with network elements of other devices, such as clients or other servers.
  • the processor 502 is configured to execute the program 510, and specifically can execute the relevant steps in the foregoing embodiment of the method for determining the distribution of health information based on machine learning.
  • the program 510 may include program code, and the program code includes a computer operation instruction.
  • the processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • the one or more processors included in the computer device may be the same type of processor, such as one or more CPUs, or different types of processors, such as one or more CPUs and one or more ASICs.
  • the memory 506 is used to store the program 510.
  • the memory 506 may include high-speed RAM, and may also include non-volatile memory, for example, at least one disk memory.
  • the program 510 may be specifically used to cause the processor 502 to perform the following operations:
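  • acquiring spectral data;
  • classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels;
  • integrating, according to preset spectral integration weights, the health features in the classification processing result to obtain a distribution image of health information.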
  • the modules or steps of this application can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed across a network composed of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be executed in an order different from the one given here, or they can be fabricated as individual integrated circuit modules, or multiple modules or steps among them can be fabricated as a single integrated circuit module. In this way, this application is not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Primary Health Care (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Epidemiology (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

A health information distribution determination method and apparatus based on machine learning, which relate to the technical field of data processing, and mainly aim to solve the problem of it being impossible to meet the convenience and rapidity requirements of health care for data processing due to the low determination efficiency of an existing health information distribution image. The method comprises: acquiring spectral data (101); on the basis of a trained spectral classification model, performing classification processing on the spectral data, so as to obtain a classification processing result that includes health features respectively marked with the spectral data (102), wherein the spectral classification model is a hybrid model established on the basis of a combination of different levels of machine learning models; and according to a preset spectral integration weight, performing integration processing on the health features in the classification processing result, so as to obtain a health information distribution image (103). The method is mainly used for determining a health information distribution on the basis of machine learning.

Description

Method and device for determining health information distribution based on machine learning
This application claims priority to Chinese patent application No. 202011153516.1, filed with the Chinese Patent Office on October 26, 2020 and entitled "Method and Apparatus for Determining Health Information Distribution Based on Machine Learning", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of data processing technology, and in particular to a method and device for determining the distribution of health information based on machine learning.
Background
As people pay more and more attention to their own health and the health of others, intelligent health examinations have gradually become a focus of medical insurance programs. An intelligent health examination obtains the user's basic health data through simple medical examinations such as blood sampling, blood pressure, blood glucose, and ultrasound imaging, and analyzes the basic health data through precise data processing to obtain the user's health indicators or the distribution of each item of health information.
The inventors have realized that the distribution of health information is currently usually determined by comparing each single indicator in the basic health data with international medical standards. This cannot meet the need for comprehensive analysis of health information, and the single comparison method makes the data processing results rather redundant. Moreover, because basic monitoring data is a medical resource, a single comparison method cannot determine a health information distribution image adapted to changing medical scenarios. As a result, health information distribution images are determined inefficiently, which fails to meet health care's need for convenient and fast data processing.
Technical problem
In view of this, this application provides a method and device for determining the distribution of health information based on machine learning. The main purpose is to solve the problem that existing health information distribution images are determined inefficiently and therefore cannot meet health care's need for convenient and fast data processing.
Technical solution
According to one aspect of this application, a method for determining the distribution of health information based on machine learning is provided, including:
acquiring spectral data;
classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels;
integrating, according to preset spectral integration weights, the health features in the classification processing result to obtain a distribution image of health information.
According to another aspect of this application, a device for determining the distribution of health information based on machine learning is provided, including:
an acquisition module, configured to acquire spectral data;
a classification processing module, configured to classify the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels;
an integration processing module, configured to integrate, according to preset spectral integration weights, the health features in the classification processing result to obtain a distribution image of health information.
According to yet another aspect of this application, a storage medium is provided. The storage medium stores at least one executable instruction, and the executable instruction causes a processor to execute a method for determining the distribution of health information based on machine learning, where the method includes the following steps:
acquiring spectral data;
classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels;
integrating, according to preset spectral integration weights, the health features in the classification processing result to obtain a distribution image of health information.
According to a further aspect of this application, a computer device is provided, including a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with each other through the communication bus;
the memory is used to store at least one executable instruction, and the executable instruction causes the processor to execute a method for determining the distribution of health information based on machine learning, where the method includes the following steps:
acquiring spectral data;
classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels;
integrating, according to preset spectral integration weights, the health features in the classification processing result to obtain a distribution image of health information.
Beneficial effects
This application provides a method and device for determining the distribution of health information based on machine learning. Compared with the prior art, it meets the need to determine the distribution of health information in a health examination and determines the user's health features more efficiently and accurately, thereby meeting the health care field's need for convenient and fast data processing.
Description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the detailed description of the preferred embodiments below. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of this application. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 shows a flowchart of a method for determining the distribution of health information based on machine learning provided by an embodiment of this application;
FIG. 2 shows a flowchart of another method for determining the distribution of health information based on machine learning provided by an embodiment of this application;
FIG. 3 shows a block diagram of a device for determining the distribution of health information based on machine learning provided by an embodiment of this application;
FIG. 4 shows a block diagram of another device for determining the distribution of health information based on machine learning provided by an embodiment of this application;
FIG. 5 shows a schematic structural diagram of a computer device provided by an embodiment of this application.
Best mode of the invention
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope can be conveyed completely to those skilled in the art.
An embodiment of this application provides a method for determining the distribution of health information based on machine learning. As shown in FIG. 1, the method includes:
101. Acquire spectral data.
During a health examination, blood and saliva samples are taken from the user, and a spectrometer performs spectral detection on the blood sample and the saliva sample to obtain spectral data that includes at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data. The spectral data is then used to determine the distribution image of health information.
102. Classify the spectral data based on the trained spectral classification model to obtain a classification processing result in which the spectral data are labeled with health features.
The spectral classification model is a hybrid model established based on a combination of machine learning models of different levels. To classify the spectral data, the hybrid model in this embodiment contains machine learning models with two different classification functions, combined at different levels. Specifically, the hybrid model can be built by using the first classification model as the input layer of the second classification model, where the first and second classification models are different machine learning models. For example, the first classification model is a decision tree model, and the second classification model, which serves as the output layer of the hybrid model, can be any classification model other than a decision tree model; this is not specifically limited in the embodiments of this application.
103. Integrate the health features in the classification processing result according to the preset spectral integration weights to obtain a distribution image of health information.
In this embodiment, to determine the distribution image of health information intelligently, the classification processing result produced by the spectral classification model includes the health features with which the spectral data are labeled, together with their probabilities. Health features are the feature data used to characterize different health states. For example, the uric acid value can characterize rheumatic diseases, and human chorionic gonadotropin can characterize pregnancy; this is not specifically limited in the embodiments of this application. Therefore, to make the classified health features usable in a health examination, so that the user's health information can be determined quickly, conveniently, and accurately, the spectral data of the different classified health features are integrated to obtain the distribution image of health information, which meets health care's need for convenient and fast data processing.
It should be noted that the preset spectral integration weights are weights configured in advance for different health features according to the distribution of spectral characteristics. For example, the weight of the myocarditis health feature classified from infrared spectrum data is 0.2, and the weight of the myocarditis health feature classified from ultraviolet spectrum data is 0.4, so that the health information for the myocarditis health feature is integrated using the weights 0.2 and 0.4 to obtain the distribution image for myocarditis; this is not specifically limited in the embodiments of this application.
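The weighted integration described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the 0.2 and 0.4 weights come from the myocarditis example, while the dictionary layout, the function name `integrate_feature`, and the sample classifier scores are assumptions made for the sketch.

```python
# Preset integration weights keyed by (health feature, spectrum type).
# The 0.2 / 0.4 values are the myocarditis example from the text; the
# dictionary layout itself is an assumption made for this sketch.
PRESET_WEIGHTS = {
    ("myocarditis", "infrared"): 0.2,
    ("myocarditis", "ultraviolet"): 0.4,
}


def integrate_feature(feature, scores, weights=PRESET_WEIGHTS):
    """Weighted sum of the per-spectrum classification scores for one health feature."""
    return sum(
        weights[(feature, spectrum)] * score
        for spectrum, score in scores.items()
    )


# Hypothetical classifier outputs (probabilities in [0, 1]) for one user.
myocarditis_scores = {"infrared": 0.5, "ultraviolet": 0.75}
integrated = integrate_feature("myocarditis", myocarditis_scores)
print(round(integrated, 3))
```

The integrated value per health feature is what a plotting step would then draw into the distribution image.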
This application provides a method for determining the distribution of health information based on machine learning. Compared with the prior art, the embodiment of this application acquires spectral data; classifies the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are labeled with health features, the spectral classification model being a hybrid model established based on a combination of machine learning models of different levels; and integrates the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information. This meets the need to determine the distribution of health information in a health examination and determines the user's health features more efficiently and accurately, thereby meeting the health care field's need for convenient and fast data processing.
An embodiment of this application provides another method for determining the distribution of health information based on machine learning. As shown in FIG. 2, the method includes:
201. Acquire a spectral training data set.
In this embodiment, to train the hybrid model so that it can classify spectral data accurately, a spectral training data set is acquired, and training data is taken from it to train the hybrid model. The spectral training data set includes spectral data labeled with health features of different classes, where the health features are the feature data used to characterize different health states. The spectral data includes at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data, and is characterized by light wavelength and amplitude. For example, for blood infrared spectrum data, if the radiation quantity within a small wavelength interval dλ centered on wavelength λ is dX, then the radiation quantity per unit wavelength interval is called the spectral density Xλ, that is, Xλ = dX/dλ, where the radiation quantity can be radiant flux, radiant intensity, radiance, irradiance, and so on. In general, different wavelengths have different spectral densities. When the correspondence between the spectral density of a light source and the wavelength is expressed as a function, that function is called the spectral distribution Xλ(λ) of the light source, which here is the blood infrared spectrum data; this is not specifically limited in the embodiments of this application.
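As a small illustration of the spectral density relation Xλ = dX/dλ, the sketch below computes a discrete spectral distribution from band measurements. The band tuple layout, the function name `spectral_density`, and the sample values are assumptions introduced for illustration; the source does not specify a data format.

```python
def spectral_density(bands):
    """Return (center wavelength, X_lambda) pairs, where X_lambda = dX / d_lambda.

    Each band is (center_wavelength, d_lambda, d_x): the band center, the band
    width, and the radiation quantity measured inside that band.
    """
    distribution = []
    for center, d_lambda, d_x in bands:
        if d_lambda <= 0:
            raise ValueError("band width d_lambda must be positive")
        distribution.append((center, d_x / d_lambda))
    return distribution


# Hypothetical infrared bands: wavelengths in nm, radiation in arbitrary units.
blood_infrared_bands = [
    (800.0, 10.0, 2.0),
    (810.0, 10.0, 3.0),
    (820.0, 10.0, 1.0),
]
blood_infrared_spectrum = spectral_density(blood_infrared_bands)
```

The resulting list of (wavelength, density) pairs is a sampled version of the spectral distribution Xλ(λ) that the text treats as the spectrum data.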
202. Build, by combination, a spectral classification model containing at least two decision tree models and one neural network model.
In this embodiment, to improve the ability to classify spectral data and thus perform classification efficiently, a spectral classification model is built by combining at least two decision tree models and one neural network model. The combination uses the at least two decision tree models as the input layer and the one neural network model as the output layer. Because the spectral data can include at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data, at least two decision tree models are used as the input level and one neural network model is used as the output layer of the hybrid spectral classification model in order to improve classification accuracy.
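The layered structure described above, with decision trees feeding a neural network, can be sketched in a few lines. This is only a structural illustration under strong simplifications: each "tree" is a one-level threshold test standing in for a trained decision tree, the "neural network" is a single sigmoid neuron, and the weights, bias, and thresholds are hypothetical fixed values rather than trained parameters.

```python
import math


def make_stump(feature_index, threshold):
    """Stand-in for a trained decision tree: a single threshold test on one feature."""
    def predict(sample):
        return 1.0 if sample[feature_index] > threshold else 0.0
    return predict


def neural_output(tree_outputs, weights, bias):
    """Single sigmoid neuron standing in for the neural-network output layer."""
    z = sum(w * x for w, x in zip(weights, tree_outputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def hybrid_predict(sample, trees, weights, bias):
    """Input layer: every tree scores the sample. Output layer: the neuron combines them."""
    tree_outputs = [tree(sample) for tree in trees]
    return neural_output(tree_outputs, weights, bias)


# Hypothetical model: two stumps, e.g. one per spectrum type, and fixed weights.
trees = [make_stump(0, 0.5), make_stump(1, 0.3)]
weights, bias = [2.0, 2.0], -3.0

score_high = hybrid_predict([0.8, 0.4], trees, weights, bias)  # both stumps fire
score_low = hybrid_predict([0.1, 0.1], trees, weights, bias)   # neither stump fires
```

In a real system the trees would be trained first and the output layer would then be trained on their outputs, as the later steps describe.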
It should be noted that different spectral data may lead to different health features being determined. The first level therefore consists of at least two decision tree models, built as follows: m samples are drawn from the original training set at random with replacement using the bootstrapping method; this sampling is performed n_tree times to generate n_tree training sets, and one decision tree model is trained on each of the n_tree training sets. For a single decision tree model, the number of training sample features is […], and no pruning is required while the decision tree is split. The steps are:
A. The feature set is D' = {z1, z2, z3, z4}, and health features are classified into two classes, with the classification result being yes or no. For example, a decision tree for the cold feature can be built so that the first-level test is whether spectral data a matches the cold-feature spectral distribution in the range c-b; if so, the second-level test is whether spectral data a matches the viral-cold spectral distribution in the range f-t, and so on. Given the training set D = {(x1, y1), (x2, y2), ..., (xN, yN)}, the j-th variable xj and its value s can be chosen as the splitting variable and splitting point, defining two regions R1(j, s) = {x | xj ≤ s} and R2(j, s) = {x | xj > s}. The optimal splitting variable xj and optimal splitting point s are then found by solving
Figure PCTCN2020136368-appb-000001
where cm is the decision tree output on Rm, namely the mean of the outputs yi corresponding to all input instances xi in region Rm.
Figure PCTCN2020136368-appb-000002
The above process is repeated for each of the regions R1 and R2 until the stop condition is met, dividing the input space into M regions R1, R2, ..., RM and generating the decision tree:
Figure PCTCN2020136368-appb-000003
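The formula images referenced in step A are not reproduced in this text. Assuming the step follows the standard least-squares CART regression tree formulation, which matches the definitions of R1, R2, and cm given above, the three formulas would read as follows. The symbol N_m (the number of training samples falling in R_m) and the indicator function I are notation introduced here, not taken from the source.

```latex
% Optimal splitting variable j and splitting point s
\[
\min_{j,s}\left[\min_{c_1}\sum_{x_i\in R_1(j,s)}(y_i-c_1)^2
              +\min_{c_2}\sum_{x_i\in R_2(j,s)}(y_i-c_2)^2\right]
\]

% Output on each region: the mean of the targets in that region
\[
c_m=\frac{1}{N_m}\sum_{x_i\in R_m(j,s)}y_i,\qquad m=1,2
\]

% Resulting decision tree over the M regions
\[
f(x)=\sum_{m=1}^{M}c_m\,I(x\in R_m)
\]
```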
B. For a two-class classification problem, if the probability that a sample point belongs to the first class is p, the Gini index of the probability distribution is:
Figure PCTCN2020136368-appb-000004
For the sample set D, the Gini index is:
Figure PCTCN2020136368-appb-000005
where Ck is the subset of samples in D that belong to the k-th class and K is the number of classes. If the sample set D is split into two parts D1 and D2 according to whether feature D' takes a certain possible value z, that is, D1 = {(x, y) | D'(x) = z} and D2 = D - D1, then under the condition of feature D' the Gini index of the set D is:
Figure PCTCN2020136368-appb-000006
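The three formula images in step B are likewise not reproduced in this text. Assuming they are the standard CART Gini index definitions, which agree with the symbols p, Ck, K, D1, and D2 defined above, they would read:

```latex
% Gini index of a two-class distribution with first-class probability p
\[
\mathrm{Gini}(p)=2p(1-p)
\]

% Gini index of the sample set D with K classes
\[
\mathrm{Gini}(D)=1-\sum_{k=1}^{K}\left(\frac{|C_k|}{|D|}\right)^{2}
\]

% Gini index of D after splitting on feature D' into D_1 and D_2
\[
\mathrm{Gini}(D,D')=\frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1)+\frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2)
\]
```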
C. Decision tree generation steps:
1. Let the training data set at the node be D. For each feature D' = {z1, z2, z3, z4} and each value in {z1, z2, z3, z4} that it can take, split D into two parts D1 and D2 according to whether a sample point's test against that value is "yes" or "no", and compute
Figure PCTCN2020136368-appb-000007
2. Among all possible features D' and all their possible splitting points {z1, z2, z3, z4}, select the feature with the smallest Gini index and its corresponding splitting point as the optimal feature and optimal splitting point. Generate two child nodes from the current node accordingly, and distribute the training data set to the two child nodes according to the feature.
3. Recursively apply steps 1 and 2 to the two child nodes until the stop condition is met, generating the CART decision tree n_tree. The above is the generation process for a single decision tree model; the process is the same for each of the at least two decision tree models and is not repeated here.
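The bootstrap sampling and the Gini-based split selection in steps B and C can be sketched together as below. This is a minimal sketch that assumes the standard CART Gini definitions; the function names, the dictionary-based sample layout, and the toy feature names `band` and `peak` are hypothetical and only serve the illustration. It selects one split per training set and does not build full trees.

```python
import random
from collections import Counter


def bootstrap_training_sets(dataset, m, n_tree, seed=0):
    """Draw n_tree training sets of m samples each, sampling with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(dataset) for _ in range(m)] for _ in range(n_tree)]


def gini(labels):
    """Gini(D) = 1 - sum_k (|C_k| / |D|)^2 (standard CART definition, assumed here)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_after_split(samples, feature, value):
    """Weighted Gini index of D after splitting on `features[feature] == value`."""
    d1 = [label for feats, label in samples if feats[feature] == value]
    d2 = [label for feats, label in samples if feats[feature] != value]
    n = len(samples)
    return len(d1) / n * gini(d1) + len(d2) / n * gini(d2)


def best_split(samples):
    """Return the (feature, value) pair with the smallest Gini index after the split."""
    candidates = {(f, v) for feats, _ in samples for f, v in feats.items()}
    # Ties are broken by the (feature, value) pair itself so the result is deterministic.
    return min(candidates, key=lambda fv: (gini_after_split(samples, *fv), fv))


# Hypothetical toy data: each sample is (feature dict, "yes"/"no" health-feature label).
samples = [
    ({"band": "ir", "peak": "high"}, "yes"),
    ({"band": "ir", "peak": "low"}, "no"),
    ({"band": "uv", "peak": "high"}, "yes"),
    ({"band": "uv", "peak": "low"}, "no"),
]

training_sets = bootstrap_training_sets(samples, m=4, n_tree=3)
root_splits = [best_split(training_set) for training_set in training_sets]
```

On the full toy set, splitting on `peak` separates the two labels perfectly, so it is preferred over `band`. Applying `best_split` recursively to the two resulting subsets would grow a full tree as in step 3.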
In addition, after the multiple decision tree models are built, the label probabilities matching the classification results of health features for the different spectral data in the training set are obtained and used, in vector form, as training sample data to train the neural network. For example, the decision trees classify spectral data a as having the type 1 viral cold feature, the rheumatism feature, and the meningitis feature. A vector over all disease features is built, for example a vector of 150 diseases, in which the entries for the above three health features are set to 1 and the rest to 0. When the neural network training samples are built, the input sample data is the label probability, which is used to determine the risk distribution of the classified health features. The label probabilities include weight configurations for at least 30 major disease features, 80 moderate disease features, and 40 mild disease features. Neural network training therefore yields determination results for health features covering low risk, medium risk, and high risk; this is not specifically limited in the embodiments of this application.
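The 150-entry disease vector described above can be sketched as a simple 0/1 encoding. The vector length and the 30/80/40 severity split come from the text; the index layout (major first, then moderate, then mild), the function names, and the three example indices are assumptions made only for this sketch.

```python
NUM_DISEASE_FEATURES = 150  # vector length used in the example in the text

# Assumed index layout (not given in the source): 30 major, 80 moderate, 40 mild.
TIER_BOUNDARIES = [(30, "major"), (110, "moderate"), (150, "mild")]


def encode_health_features(active_indices, size=NUM_DISEASE_FEATURES):
    """Build a 0/1 vector with 1 at each health feature reported by the decision trees."""
    vector = [0] * size
    for index in active_indices:
        if not 0 <= index < size:
            raise IndexError(f"feature index {index} outside 0..{size - 1}")
        vector[index] = 1
    return vector


def severity_tier(index):
    """Map a feature index to its severity tier under the assumed layout."""
    if index < 0:
        raise IndexError(f"feature index {index} is negative")
    for upper_bound, tier in TIER_BOUNDARIES:
        if index < upper_bound:
            return tier
    raise IndexError(f"feature index {index} outside the vector")


# Hypothetical indices for the three features named in the example.
active = {"viral cold type 1": 120, "rheumatism": 45, "meningitis": 7}
vector = encode_health_features(active.values())
tiers = {name: severity_tier(index) for name, index in active.items()}
```

A vector like this, paired with per-tier weights, is the kind of input sample the neural network stage would be trained on.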
203. Train the combined spectral classification model based on the spectral training data set.
In the embodiments of the present application, to train the classification of spectral data, the combined spectral classification model is trained on the training data in the spectral training data set, yielding a trained spectral classification model suitable for classifying health features.
204. Acquire spectral data.
Further, for further definition and explanation, step 204 may specifically include: acquiring spectral data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
In the embodiments of the present application, to simplify the steps for determining the distribution of health information and make collecting a user's health samples more convenient, once blood and saliva samples have been collected, a spectrometer performs spectral analysis on them and outputs spectral data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data. Of course, as spectrometers develop, the analyzed spectral data may also include spectral data for rays other than ultraviolet and infrared, which is not specifically limited in the embodiments of the present application.
The spectral data analyzed by the spectrometer are determined from the wavelengths and amplitudes of different light rays in blood and saliva, yielding spectral data for different rays such as ultraviolet and infrared, which is not specifically limited in the embodiments of the present application.
205. Classify the spectral data based on the trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features.
Further, to optimize the spectral data and prevent abnormal data in the analyzed spectral data from affecting the classification, the spectral data need to be preprocessed. In the embodiments of the present application, before the spectral data are classified based on the trained spectral classification model, the method further includes: separately determining whether the wavelength values and amplitude values in the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data are in a distorted state; and, if a distorted state exists, filtering the distorted wavelength values and amplitude values, and using the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data as the spectral data to be classified.
In the embodiments of the present application, the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data are all expressed in terms of wavelength values and amplitude values. Therefore, to filter out abnormal data, it is determined whether the wavelength values and amplitude values in each of these spectra are in a distorted state. A distorted state is a sharp increase or sharp decrease in a wavelength value or amplitude value. In general, a distortion range matching the normal wavelength and amplitude values of the spectrum is configured for such increases and decreases; a value beyond this range is determined to be in a distorted state. Wavelength values and amplitude values in a distorted state are filtered, where filtering means deleting the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, or saliva ultraviolet spectrum data to which the distorted values belong. If the wavelength or amplitude values of any one spectrum are distorted, that spectrum is deleted; if all of the spectral data are distorted, the spectrometer acquisition has failed, and all of the data can be deleted so that the blood and saliva samples can be analyzed again, which is not specifically limited in the embodiments of the present application.
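The distortion check and filtering described above could look roughly like the following sketch. The dictionary layout, the spectrum names, and the representation of the distortion range as a simple (low, high) interval are all assumptions made for illustration; the application does not fix a data format.

```python
def is_distorted(values, low, high):
    """True if any value falls outside the configured normal range [low, high]."""
    return any(v < low or v > high for v in values)


def filter_spectra(spectra, ranges):
    """Drop every spectrum whose wavelength or amplitude values are distorted.

    spectra: {name: {"wavelength": [...], "amplitude": [...]}}
    ranges:  {name: {"wavelength": (low, high), "amplitude": (low, high)}}
    Returns (kept, needs_reacquisition); the flag is True only when every
    spectrum was dropped, which the text treats as a spectrometer failure.
    """
    kept = {}
    for name, data in spectra.items():
        limits = ranges[name]
        distorted = any(
            is_distorted(data[key], *limits[key])
            for key in ("wavelength", "amplitude")
        )
        if not distorted:
            kept[name] = data
    return kept, (len(spectra) > 0 and not kept)
```

When `needs_reacquisition` is true, the blood and saliva samples would be analyzed again rather than passed to the classifier.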
206. Integrate the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
In the embodiments of the present application, for further explanation and refinement, step 206 may specifically include: using a weighted sum combined with the preset spectral integration weights to compute the integration interval of the health features labeled for the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data in the classification processing result; and drawing, in a superimposed manner, a distribution image of the health information that contains the integration interval.
In the embodiments of the present application, because the classification processing result includes health features at low, medium, and high risk, to obtain the distribution of health information accurately, a weighted sum combined with the preset spectral integration weights is used to compute the integration interval of the classified health features at the different risk levels, and a distribution image of the health information over the integration intervals is drawn. The preset spectral integration weights are weights configured in advance for the different health features according to the distribution of spectral characteristics. For example, suppose the blood ultraviolet spectrum data are classified as a medium-risk virus type 1 cold feature and a low-risk meningitis feature, and the blood infrared spectrum data are classified as a high-risk virus type 1 cold feature and a medium-risk meningitis feature. Correspondingly, the weights for the blood ultraviolet spectrum data are 0.1 for the virus type 1 cold feature and 0.6 for the meningitis feature, and the weights for the blood infrared spectrum data are 0.3 for the virus type 1 cold feature and 0.3 for the meningitis feature. The weights are then used in a weighted sum, for example 0.1 * medium risk + 0.3 * high risk, where each risk level has been assigned a numeric value in advance and a numeric range has been configured for each risk level after weighted summation. This yields the weighted health information for the virus type 1 cold feature, for example 0.1 * medium risk + 0.3 * high risk → high risk, that is, the virus type 1 cold feature is high risk. The numeric values and numeric ranges are not specifically limited in the embodiments of the present application.
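The worked example above can be reproduced with a small sketch. The application leaves the numeric value of each risk level and the resulting numeric ranges unspecified, so the scale in `RISK_VALUE`, the cut-offs in `to_level`, and the normalization by the total weight are assumptions chosen only so that the example yields "high risk" for the virus type 1 cold feature.

```python
RISK_VALUE = {"low": 1.0, "medium": 2.0, "high": 3.0}  # assumed numeric scale


def to_level(score):
    """Map a normalized score back to a risk level (assumed cut-offs)."""
    if score >= 2.5:
        return "high"
    if score >= 1.5:
        return "medium"
    return "low"


def integrate(classified, weights):
    """Weighted sum of risk levels per health feature across spectra.

    classified: {spectrum: {feature: level}}
    weights:    {spectrum: {feature: weight}}
    """
    totals, weight_sums = {}, {}
    for spectrum, features in classified.items():
        for feature, level in features.items():
            w = weights[spectrum][feature]
            totals[feature] = totals.get(feature, 0.0) + w * RISK_VALUE[level]
            weight_sums[feature] = weight_sums.get(feature, 0.0) + w
    # Dividing by the summed weights is an assumption, not stated in the text.
    return {f: to_level(totals[f] / weight_sums[f]) for f in totals}


classified = {
    "blood_uv": {"virus_type_1_cold": "medium", "meningitis": "low"},
    "blood_ir": {"virus_type_1_cold": "high", "meningitis": "medium"},
}
weights = {
    "blood_uv": {"virus_type_1_cold": 0.1, "meningitis": 0.6},
    "blood_ir": {"virus_type_1_cold": 0.3, "meningitis": 0.3},
}
result = integrate(classified, weights)
```

Under these assumptions the cold feature scores (0.1 * 2 + 0.3 * 3) / 0.4 = 2.75, which falls in the assumed high-risk range; a different scale or set of cut-offs would change the outcome for other features.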
It should be noted that, when the distribution image of the health information containing the integration intervals is drawn in a superimposed manner, an integration interval is the risk interval in which a given health feature lies. The health information determined for a user may include one health feature or several. Therefore, to visualize them uniformly, the distribution image is drawn in a superimposed manner, so that each risk region in the image can show the distribution of different health features overlapping one another. For example, the medium-risk region may contain both the rheumatism feature and the meningitis feature, which makes it convenient to present the distribution of health information.
Further, to meet the need for visualizing the distribution of health information, after step 206 the embodiments of the present application further include: upon receiving a query request for the distribution image of the health information, extracting the historical images that match the distribution image; and rendering the distribution image and the historical images in different colors, combining them by semi-transparent overlay, and outputting the result.
In the embodiments of the present application, to meet the need for managing health information, when a request for the distribution image of the health information is received, the historical images matching the distribution image are extracted, that is, the distribution images generated from the user's historical health features. The distribution image and the historical images are rendered in different colors and combined by semi-transparent overlay for output, so that the user can view the differently colored historical images and the current distribution image in one semi-transparent rendering. If there are multiple historical images, they can be labeled by time information and rendered in multiple colors, improving the visualization.
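One way to picture the semi-transparent overlay is plain alpha blending of two equally sized images. The sketch below assumes images are represented as rows of RGB tuples and uses a fixed opacity; the application does not specify the rendering technique, so this is only an illustration, and `blend` and `overlay` are hypothetical names.

```python
def blend(top, bottom, alpha):
    """Composite one RGB colour over another with opacity alpha in [0, 1]."""
    return tuple(round(alpha * t + (1 - alpha) * b) for t, b in zip(top, bottom))


def overlay(current, history, alpha=0.5):
    """Overlay the current distribution image on a historical image.

    Both images are lists of rows, each row a list of (r, g, b) tuples,
    and are assumed to have identical dimensions.
    """
    return [
        [blend(c, h, alpha) for c, h in zip(cur_row, hist_row)]
        for cur_row, hist_row in zip(current, history)
    ]
```

With several historical images, the same function could be applied repeatedly, giving each image its own colour so that older and newer distributions remain distinguishable.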
The present application provides another method for determining the distribution of health information based on machine learning. Compared with the prior art, the embodiments of the present application acquire spectral data; classify the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models at different levels; and integrate the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information. This meets the need to determine the distribution of health information in health examinations and determines the user's health features more efficiently and accurately, thereby largely meeting the healthcare field's need for convenient and fast data processing.
Further, as an implementation of the method shown in FIG. 1, an embodiment of the present application provides a device for determining the distribution of health information based on machine learning. As shown in FIG. 3, the device includes:
an acquisition module 31, configured to acquire spectral data;
a classification processing module 32, configured to classify the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models at different levels;
an integration processing module 33, configured to integrate the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
The present application provides a device for determining the distribution of health information based on machine learning. Compared with the prior art, the embodiments of the present application acquire spectral data; classify the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models at different levels; and integrate the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information. This meets the need to determine the distribution of health information in health examinations and determines the user's health features more efficiently and accurately, thereby largely meeting the healthcare field's need for convenient and fast data processing.
Further, as an implementation of the method shown in FIG. 2, an embodiment of the present application provides another device for determining the distribution of health information based on machine learning. As shown in FIG. 4, the device includes:
an acquisition module 41, configured to acquire spectral data;
a classification processing module 42, configured to classify the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models at different levels;
an integration processing module 43, configured to integrate the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
Further, the device also includes a construction module 44 and a training module 45.
The acquisition module 41 is further configured to acquire a spectral training data set, the spectral training data set including spectral data corresponding to health features labeled with different classifications;
the construction module 44 is configured to build, by combination, a spectral classification model containing at least two decision tree models and one neural network model, where the combination uses the at least two decision tree models as the input layer and the one neural network model as the output layer;
the training module 45 is configured to train the combined spectral classification model based on the spectral training data set.
Further, the acquisition module 41 is specifically configured to acquire spectral data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
Further, the integration processing module 43 includes:
a statistics unit 4301, configured to use a weighted sum combined with the preset spectral integration weights to compute the integration interval of the health features labeled for the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data in the classification processing result;
a drawing unit 4302, configured to draw, in a superimposed manner, a distribution image of the health information that contains the integration interval.
Further, the device also includes:
a judging module 46, configured to separately determine whether the wavelength values and amplitude values in the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data are in a distorted state;
a filtering module 47, configured to, if a distorted state exists, filter the distorted wavelength values and amplitude values and use the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data as the spectral data to be classified, where filtering means deleting the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, or saliva ultraviolet spectrum data to which the distorted wavelength values and amplitude values belong.
Further, the integration processing module 43 includes:
an extraction unit 4303, configured to extract, upon receiving a query request for the distribution image of the health information, the historical images that match the distribution image;
an output unit 4304, configured to render the distribution image and the historical images in different colors, combine them by semi-transparent overlay, and output the result.
Further, the health features are the feature data on which the characterization of different health states is based.
The present application provides another device for determining the distribution of health information based on machine learning. Compared with the prior art, the embodiments of the present application acquire spectral data; classify the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models at different levels; and integrate the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information. This meets the need to determine the distribution of health information in health examinations and determines the user's health features more efficiently and accurately, thereby largely meeting the healthcare field's need for convenient and fast data processing.
According to an embodiment of the present application, a storage medium is provided. The storage medium may be non-volatile or volatile and stores at least one executable instruction, which can execute the method for determining the distribution of health information based on machine learning in any of the foregoing method embodiments. The method includes the following steps:
acquiring spectral data;
classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models at different levels;
integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
FIG. 5 shows a schematic structural diagram of a computer device according to an embodiment of the present application. The specific embodiments of the present application do not limit the specific implementation of the computer device.
As shown in FIG. 5, the computer device may include a processor 502, a communications interface 504, a memory 506, and a communication bus 508.
The processor 502, the communications interface 504, and the memory 506 communicate with one another through the communication bus 508.
The communications interface 504 is used to communicate with network elements of other devices, such as clients or other servers.
The processor 502 is configured to execute a program 510, and may specifically execute the relevant steps in the foregoing embodiments of the method for determining the distribution of health information based on machine learning.
Specifically, the program 510 may include program code, and the program code includes computer operation instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application. The one or more processors included in the computer device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 506 is used to store the program 510. The memory 506 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
The program 510 may specifically be used to cause the processor 502 to perform the following operations:
acquiring spectral data;
classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models at different levels;
integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
Those skilled in the art should understand that the modules or steps of the present application described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed across a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be executed in an order different from the one given here, or the modules or steps can each be fabricated as an individual integrated circuit module, or several of them can be fabricated as a single integrated circuit module. The present application is therefore not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present application and are not intended to limit it. For those skilled in the art, the present application may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (20)

  1. A method for determining the distribution of health information based on machine learning, comprising:
    acquiring spectral data;
    classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which the spectral data are respectively labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models at different levels;
    integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
  2. The method according to claim 1, wherein, before the spectral data are classified based on the trained spectral classification model, the method further comprises:
    acquiring a spectral training data set, the spectral training data set including spectral data corresponding to health features labeled with different classifications;
    building, by combination, a spectral classification model containing at least two decision tree models and one neural network model, wherein the combination uses the at least two decision tree models as the input layer and the one neural network model as the output layer;
    training the combined spectral classification model based on the spectral training data set.
  3. The method according to claim 1 or 2, wherein acquiring spectral data comprises:
    acquiring spectral data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
  4. The method according to claim 3, wherein integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information comprises:
    using a weighted sum combined with the preset spectral integration weights to compute the integration interval of the health features labeled for the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data in the classification processing result;
    drawing, in a superimposed manner, a distribution image of the health information that contains the integration interval.
  5. The method according to claim 4, wherein, before the spectral data are classified based on the trained spectral classification model, the method further comprises:
    separately determining whether the wavelength values and amplitude values in the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data are in a distorted state;
    if a distorted state exists, filtering the distorted wavelength values and amplitude values, and using the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data as the spectral data to be classified, wherein the filtering deletes the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, or saliva ultraviolet spectrum data to which the distorted wavelength values and amplitude values belong.
  6. The method according to claim 4, wherein, after the distribution image of health information containing the integration intervals is drawn by superposition, the method further comprises:
    upon receiving a query request for a distribution image of health information, extracting a historical image that matches the distribution image;
    rendering the distribution image and the historical image in different colors, combining them by semi-transparent overlap, and outputting the result.
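One common way to realize a semi-transparent overlap is alpha blending, shown in the sketch below. This is an illustrative choice, not a technique named by the claim. Images are assumed to be small grids of intensities in [0, 1]; the colors, the 0.5 alpha, and the function names are hypothetical.

```python
def tint(mask, color):
    """Render a 2D grid of intensities in [0, 1] as RGB pixels of a single color."""
    return [[tuple(round(c * v) for c in color) for v in row] for row in mask]


def blend(current, history, alpha=0.5):
    """Alpha-blend two equally sized RGB images: alpha*current + (1-alpha)*history."""
    return [
        [
            tuple(round(alpha * a + (1 - alpha) * b) for a, b in zip(p, q))
            for p, q in zip(row_c, row_h)
        ]
        for row_c, row_h in zip(current, history)
    ]


current = tint([[1.0, 0.0]], (200, 0, 0))  # distribution image rendered in red
history = tint([[1.0, 1.0]], (0, 0, 200))  # historical image rendered in blue
combined = blend(current, history, alpha=0.5)
```

Where both images have content the blended pixel mixes red and blue, and where only the historical image has content the pixel stays blue at half intensity, so the two can be told apart in one output.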
  7. The method according to any one of claims 1 to 6, wherein the health features are feature data on which the characterization of different health states is based.
  8. An apparatus for determining the distribution of health information based on machine learning, comprising:
    an acquisition module, configured to acquire spectral data;
    a classification processing module, configured to classify the spectral data based on a trained spectral classification model to obtain a classification processing result in which each item of the spectral data is labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models of different levels;
    an integration processing module, configured to integrate the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
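The three modules of the apparatus can be pictured as stages wired in sequence. The sketch below is only a structural illustration: the class name, the field names, and the stand-in lambdas are hypothetical, and the integration stage returns a single number in place of the distribution image that the claim describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Spectrum = List[float]


@dataclass
class HealthDistributionApparatus:
    """Three modules wired in sequence: acquire -> classify -> integrate."""

    acquire: Callable[[], Dict[str, Spectrum]]
    classify: Callable[[Dict[str, Spectrum]], Dict[str, float]]
    integrate: Callable[[Dict[str, float]], float]

    def run(self) -> float:
        spectra = self.acquire()
        labelled = self.classify(spectra)
        return self.integrate(labelled)


# Stand-in modules for illustration only; real modules would wrap the data
# source, the trained hybrid model, and the weighted image-drawing step.
apparatus = HealthDistributionApparatus(
    acquire=lambda: {"blood_ir": [0.2, 0.4], "saliva_uv": [0.6, 0.8]},
    classify=lambda spectra: {k: sum(v) / len(v) for k, v in spectra.items()},
    integrate=lambda scores: sum(scores.values()) / len(scores),
)
result = apparatus.run()
```

Keeping each module behind a plain callable makes it possible to replace one stage, for example the classifier, without touching the other two.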
  9. A storage medium storing at least one executable instruction, the executable instruction causing a processor to execute a method for determining the distribution of health information based on machine learning,
    wherein the method for determining the distribution of health information based on machine learning comprises:
    acquiring spectral data;
    classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which each item of the spectral data is labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models of different levels;
    integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
  10. The storage medium according to claim 9, wherein, before the spectral data is classified based on the trained spectral classification model, the method further comprises:
    acquiring a spectral training data set, the spectral training data set including spectral data corresponding to labeled health features of different classes;
    constructing, by combination, a spectral classification model comprising at least two decision tree models and one neural network model, wherein the combination uses the at least two decision tree models as the input layer and the one neural network model as the output layer;
    training the combined spectral classification model based on the spectral training data set.
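The combination of decision trees as input layer and a neural network as output layer can be sketched in plain Python. This is a deliberately reduced illustration under stated assumptions: each tree is a fixed depth-1 stump with a hand-picked threshold, the output layer is a single logistic neuron, and only that neuron is trained, whereas the claim trains the combined model as a whole. All names, thresholds, and data are hypothetical.

```python
import math


def stump(feature_index, threshold):
    """A depth-1 decision tree: outputs 1.0 if the feature exceeds the threshold."""
    return lambda x: 1.0 if x[feature_index] > threshold else 0.0


class HybridClassifier:
    """Decision trees form the input layer; one logistic neuron is the output layer."""

    def __init__(self, trees):
        self.trees = trees
        self.weights = [0.0] * len(trees)
        self.bias = 0.0

    def _tree_outputs(self, x):
        return [tree(x) for tree in self.trees]

    def predict_proba(self, x):
        outputs = self._tree_outputs(x)
        z = self.bias + sum(w * t for w, t in zip(self.weights, outputs))
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, samples, labels, epochs=200, lr=0.5):
        """Train only the output neuron by stochastic gradient descent on log loss."""
        for _ in range(epochs):
            for x, y in zip(samples, labels):
                error = self.predict_proba(x) - y
                outputs = self._tree_outputs(x)
                self.weights = [
                    w - lr * error * t for w, t in zip(self.weights, outputs)
                ]
                self.bias -= lr * error
        return self


# Toy training set: two-value "spectra" with hypothetical binary health labels.
X = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.7]]
y = [0, 0, 1, 1]
model = HybridClassifier([stump(0, 0.5), stump(1, 0.5)]).fit(X, y)
predictions = [round(model.predict_proba(x)) for x in X]
```

In a fuller implementation the trees would themselves be fitted to the spectral training data set and the output layer could have several neurons, one per health feature class.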
  11. The storage medium according to claim 9 or 10, wherein acquiring the spectral data comprises:
    acquiring spectral data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
  12. The storage medium according to claim 11, wherein integrating the health features in the classification processing result according to the preset spectral integration weights to obtain the distribution image of health information comprises:
    computing, by weighted summation with the preset spectral integration weights, the integration intervals of the health features labeled by the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data in the classification processing result;
    drawing, by superposition, the distribution image of health information containing the integration intervals.
  13. The storage medium according to claim 12, wherein, before the spectral data is classified based on the trained spectral classification model, the method further comprises:
    determining, separately for the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data, whether any wavelength value or amplitude value is in a distorted state;
    if a distorted state exists, filtering the wavelength values and amplitude values in the distorted state, and using the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data as the spectral data to be classified, wherein the filtering deletes the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data corresponding to the wavelength values and amplitude values in the distorted state.
  14. The storage medium according to claim 12, wherein, after the distribution image of health information containing the integration intervals is drawn by superposition, the method further comprises:
    upon receiving a query request for a distribution image of health information, extracting a historical image that matches the distribution image;
    rendering the distribution image and the historical image in different colors, combining them by semi-transparent overlap, and outputting the result.
  15. The storage medium according to any one of claims 9 to 14, wherein the health features are feature data on which the characterization of different health states is based.
  16. A computer device, comprising a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
    the memory is configured to store at least one executable instruction, the executable instruction causing the processor to execute a method for determining the distribution of health information based on machine learning, wherein the method comprises the following steps:
    acquiring spectral data;
    classifying the spectral data based on a trained spectral classification model to obtain a classification processing result in which each item of the spectral data is labeled with health features, the spectral classification model being a hybrid model built by combining machine learning models of different levels;
    integrating the health features in the classification processing result according to preset spectral integration weights to obtain a distribution image of health information.
  17. The computer device according to claim 16, wherein, before the spectral data is classified based on the trained spectral classification model, the method further comprises:
    acquiring a spectral training data set, the spectral training data set including spectral data corresponding to labeled health features of different classes;
    constructing, by combination, a spectral classification model comprising at least two decision tree models and one neural network model, wherein the combination uses the at least two decision tree models as the input layer and the one neural network model as the output layer;
    training the combined spectral classification model based on the spectral training data set.
  18. The computer device according to claim 16 or 17, wherein acquiring the spectral data comprises:
    acquiring spectral data including at least blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data.
  19. The computer device according to claim 18, wherein integrating the health features in the classification processing result according to the preset spectral integration weights to obtain the distribution image of health information comprises:
    computing, by weighted summation with the preset spectral integration weights, the integration intervals of the health features labeled by the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data in the classification processing result;
    drawing, by superposition, the distribution image of health information containing the integration intervals.
  20. The computer device according to claim 19, wherein, before the spectral data is classified based on the trained spectral classification model, the method further comprises:
    determining, separately for the blood infrared spectrum data, the blood ultraviolet spectrum data, the saliva infrared spectrum data, and the saliva ultraviolet spectrum data, whether any wavelength value or amplitude value is in a distorted state;
    if a distorted state exists, filtering the wavelength values and amplitude values in the distorted state, and using the filtered blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data as the spectral data to be classified, wherein the filtering deletes the blood infrared spectrum data, blood ultraviolet spectrum data, saliva infrared spectrum data, and saliva ultraviolet spectrum data corresponding to the wavelength values and amplitude values in the distorted state.
PCT/CN2020/136368 2020-10-26 2020-12-15 Health information distribution determination method and apparatus based on machine learning WO2021189982A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011153516.1 2020-10-26
CN202011153516.1A CN112364896B (en) 2020-10-26 2020-10-26 Method and device for determining health information distribution based on machine learning

Publications (1)

Publication Number Publication Date
WO2021189982A1 (en) 2021-09-30

Family

ID=74512157

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/136368 WO2021189982A1 (en) 2020-10-26 2020-12-15 Health information distribution determination method and apparatus based on machine learning

Country Status (2)

Country Link
CN (1) CN112364896B (en)
WO (1) WO2021189982A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080187207A1 (en) * 2006-07-27 2008-08-07 International Business Machines Corporation Method and system for robust classification strategy for cancer detection from mass spectrometry data
CN108542402A (en) * 2018-05-17 2018-09-18 吉林求是光谱数据科技有限公司 Blood sugar detecting method based on Self-organizing Competitive Neutral Net model and infrared spectrum
CN110298396A (en) * 2019-06-25 2019-10-01 北京工业大学 Hyperspectral image classification method based on deep learning multiple features fusion
CN110852227A (en) * 2019-11-04 2020-02-28 中国科学院遥感与数字地球研究所 Hyperspectral image deep learning classification method, device, equipment and storage medium
CN110946552A (en) * 2019-10-30 2020-04-03 南京航空航天大学 Cervical cancer pre-lesion screening method combining spectrum and image
CN111523593A (en) * 2020-04-22 2020-08-11 北京百度网讯科技有限公司 Method and apparatus for analyzing medical images

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008154024A1 (en) * 2007-06-11 2008-12-18 Hartley, Frank Mid-ir spectral measurements for real-time identification of analytes in an industrial and laboratory setting
RU2402772C1 (en) * 2009-02-09 2010-10-27 Оксана Анатольевна Гусякова Method of estimating efficiency of chronic generalised periodontitis treatment
US10251597B2 (en) * 2016-04-21 2019-04-09 Viavi Solutions Inc. Health tracking device
CN107045637B (en) * 2016-12-16 2020-07-24 中国医学科学院生物医学工程研究所 Blood species identification instrument and method based on spectrum
US20190247650A1 (en) * 2018-02-14 2019-08-15 Bao Tran Systems and methods for augmenting human muscle controls
CN111444965B (en) * 2020-03-27 2024-03-12 泰康保险集团股份有限公司 Data processing method based on machine learning and related equipment

Also Published As

Publication number Publication date
CN112364896A (en) 2021-02-12
CN112364896B (en) 2023-10-24

Similar Documents

Publication Publication Date Title
Castiglione et al. COVID-19: automatic detection of the novel coronavirus disease from CT images using an optimized convolutional neural network
Sun et al. Cervical cancer diagnosis based on random forest
Wais Gender prediction methods based on first names with genderizeR.
WO2021196632A1 (en) Intelligent analysis system and method for panoramic digital pathological image
CN108346145A (en) The recognition methods of unconventional cell in a kind of pathological section
WO2019085064A1 (en) Medical claim denial determination method, device, terminal apparatus, and storage medium
WO2016205286A1 (en) Automatic entity resolution with rules detection and generation system
WO2022134466A1 (en) Data processing method and related device
CN112365939B (en) Data management method and system based on medical health big data
WO2018121145A1 (en) Method and device for vectorizing paragraph
WO2021120587A1 (en) Method and apparatus for retina classification based on oct, computer device, and storage medium
CN112434718B (en) New coronary pneumonia multi-modal feature extraction fusion method and system based on depth map
US11817215B2 (en) Artificial intelligence cloud diagnosis platform
CN110135506A (en) A kind of seven paracutaneous neoplasm detection methods applied to web
CN114356989A (en) Audit abnormal data detection method and device
WO2023134060A1 (en) Information pushing method and apparatus based on drug molecule image classification
CN111724269A (en) Machine learning-based settlement data processing method and device
CN110533102A (en) Single class classification method and classifier based on fuzzy reasoning
Kloeckner et al. Multi-categorical classification using deep learning applied to the diagnosis of gastric cancer
CN114140437A (en) Fundus hard exudate segmentation method based on deep learning
WO2021189982A1 (en) Health information distribution determination method and apparatus based on machine learning
CN115471856A (en) Invoice image information identification method and device and storage medium
JP4125951B2 (en) Text automatic classification method and apparatus, program, and recording medium
Siddiqui et al. Attention based covid-19 detection using generative adversarial network
CN113257429A (en) System, equipment and storage medium for recognizing fever diseases based on association rules

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 20926755; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 20926755; Country of ref document: EP; Kind code of ref document: A1