CN113643812A

CN113643812A - Tumor risk multiple calculation method and system based on blood examination indexes

Info

Publication number: CN113643812A
Application number: CN202110973360.XA
Authority: CN
Inventors: 季凯; 王正
Original assignee: Individual
Current assignee: Individual
Priority date: 2021-08-24
Filing date: 2021-08-24
Publication date: 2021-11-12

Abstract

The invention discloses a tumor risk multiple calculation method and system based on blood examination indexes, belonging to the field of medical data processing. The system comprises a data acquisition module and a tumor risk multiple calculation module. The data acquisition module acquires pictures of routine blood test sheets, blood biochemistry test sheets and tumor marker test sheets and identifies the detection indexes, age and sex recorded in them, or imports the relevant data directly from electronic records. The tumor risk multiple calculation module predicts by what multiple the tumor risk of the person to be tested is increased relative to the average population risk. Its calculation process is as follows: classified learning samples are established for non-tumor and tumor-affected populations; a tumor risk model is trained on the learning samples; the model is applied to an average population sample to calculate an average population tumor risk value; and the tumor risk value of the person to be tested is divided by the average population tumor risk value to obtain the tumor risk increase multiple. The invention uses routine blood, blood biochemistry and tumor marker detection indexes to predict the tumor risk multiple, providing a new mode of tumor risk early warning.

Description

Tumor risk multiple calculation method and system based on blood examination indexes

Technical Field

The invention belongs to the field of medical data processing, and relates to a tumor risk multiple calculation method and system based on blood examination indexes.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

At present, susceptibility gene testing is often used on the market to predict cancer risk. However, susceptibility gene testing can only predict lifetime cancer risk and cannot dynamically reflect whether a tumor is present in the body. Moreover, gene testing is expensive and therefore difficult to popularize.

Routine blood, blood biochemistry and tumor marker tests contain a large amount of information about human health. Many of their indexes are general sensitive indexes that respond to many pathological changes in the body, and many patients undergo blood tests as an aid to diagnosis when the cause of disease is unknown. If tumor risk could be accurately predicted from the results of routine blood, blood biochemistry and tumor marker tests, more tumor risks could be found and flagged without increasing the cost of examination.

Disclosure of Invention

In order to solve the above problems, the invention provides a tumor risk multiple calculation method and system based on blood examination indexes.

To achieve this purpose, the invention adopts the following technical scheme, which comprises:

(1) A data acquisition module, used for acquiring pictures of routine blood test sheets, blood biochemistry test sheets and tumor marker test sheets and identifying the detection indexes, age and sex recorded in them, or for importing the relevant data directly from electronic records.

(2) A tumor risk multiple calculation module, used for predicting by what multiple the tumor risk of the person to be tested is increased relative to the average population risk.

The calculation process of the tumor risk multiple calculation module is as follows: classified learning samples are established for non-tumor and tumor-affected populations; a tumor risk model is trained on the learning samples; the model is applied to an average population sample to calculate an average population tumor risk value; the same model is used to calculate the tumor risk value of the person to be tested; and the tumor risk value of the person to be tested is divided by the average population tumor risk value to obtain the tumor risk increase multiple.
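Reading the multiple as the tested person's risk relative to the average population risk, as in Step 3 of the detailed description, the calculation can be written compactly. The symbols are assumptions introduced here for illustration: f is the trained tumor risk model, x_1, ..., x_N are the feature vectors of the average population sample, and x is the feature vector of the person to be tested. Taking the average population risk value as the arithmetic mean of the model outputs is also an assumption, since the text does not state how that value is aggregated.

```latex
\bar{r} = \frac{1}{N} \sum_{i=1}^{N} f(x_i),
\qquad
M = \frac{f(x)}{\bar{r}}
```

Under this reading, M greater than 1 indicates a risk above the population average and M less than 1 a risk below it.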

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.

Fig. 1 is a schematic structural diagram of a tumor risk multiple calculation method and system based on blood examination indexes according to an embodiment of the present invention.

Detailed Description

The invention is further described with reference to the following figures and examples.

It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

Fig. 1 shows the tumor risk multiple calculation method and system based on blood examination indexes of the present embodiment, which includes a data acquisition module and a tumor risk multiple calculation module.

The routine blood indexes include white blood cell count (WBC), red blood cell count (RBC), hemoglobin (Hb), hematocrit (Hct), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), platelet count (PLT), lymphocyte percentage (Lymph%), monocyte percentage (Mono%), neutrophil percentage (Neut%), eosinophil percentage (Eos%), basophil percentage (Baso%), lymphocyte count (Lymph#), monocyte count (Mono#), neutrophil count (Neut#), eosinophil count (Eos#), basophil count (Baso#), red cell distribution width CV (RDW-CV), red cell distribution width SD (RDW-SD), platelet distribution width (PDW), mean platelet volume (MPV), large platelet ratio (P-LCR%), and plateletcrit (PCT).

The blood biochemistry indexes include aspartate aminotransferase (AST), alanine aminotransferase (ALT), the AST/ALT ratio (S/L), gamma-glutamyl transpeptidase (GGT), alkaline phosphatase (ALP), total protein (TP), albumin (ALB), globulin (GLO), the albumin/globulin ratio (A/G), total bilirubin (TBIL), direct bilirubin (DBIL), indirect bilirubin (IBIL), total cholesterol (CHOL), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), triglycerides (TG), glucose (GLU), urea nitrogen (BUN), creatinine (CREA), the urea nitrogen/creatinine ratio (BUN/CREA), and uric acid (URIC).

The tumor markers are AFP, CEA, Cyfra21-1, CA199, CA242, CA125, SCC and PSA.
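As an illustration of the "import from electronic data" path of the data acquisition module, the following is a minimal sketch of how one person's age, sex and detection indexes might be loaded into a record. The `BloodExamRecord` class, the `load_records` function and the CSV column names `age` and `sex` are hypothetical and not taken from the patent; the index abbreviations follow the lists above. Recognition of test sheet pictures is not covered by this sketch.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BloodExamRecord:
    """One person's demographics plus whichever indexes were reported (hypothetical layout)."""
    age: int
    sex: str
    indexes: Dict[str, float] = field(default_factory=dict)


def load_records(csv_text: str) -> List[BloodExamRecord]:
    """Parse CSV text with 'age' and 'sex' columns and one column per index."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        age = int(row.pop("age"))
        sex = row.pop("sex").strip().upper()
        # Empty cells are skipped, so a record may carry only part of the indexes.
        indexes = {
            name: float(value)
            for name, value in row.items()
            if name is not None and value not in ("", None)
        }
        records.append(BloodExamRecord(age=age, sex=sex, indexes=indexes))
    return records


# Made-up values for illustration only.
SAMPLE = """age,sex,WBC,Hb,ALT,CEA
52,M,6.1,142,31,2.4
47,F,5.3,,18,1.1
"""

records = load_records(SAMPLE)
for record in records:
    print(record.age, record.sex, sorted(record.indexes))
```

Keeping the indexes in a dictionary rather than fixed fields lets the same record type hold partial panels, which matches the statement that the model may use some or all of the indexes.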

Step 1: Classified learning samples are established for non-tumor and tumor-affected populations, and a tumor risk model is trained on these learning samples. The machine learning algorithm may be a preset algorithm, such as a support vector machine (SVM), random forest, LightGBM or XGBoost. Alternatively, it may be the best-performing algorithm selected after several algorithms have been compared.

The features used by the tumor risk model may be some or all of the routine blood, blood biochemistry and tumor marker indexes.

Populations with lung cancer, liver cancer, stomach cancer, esophageal cancer, colorectal cancer, breast cancer, cervical cancer, kidney cancer, pancreatic cancer, thyroid cancer, prostate cancer, ovarian cancer and nasopharyngeal carcinoma are modeled separately, yielding a separate tumor risk model for each cancer type.
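The comparison of candidate models mentioned in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the candidates are hypothetical, already-trained scoring functions standing in for fitted SVM, random forest, LightGBM or XGBoost models, and the helper names `evaluate` and `select_best` are invented here. The four metrics are those named in the claims (accuracy, AUC, sensitivity, specificity), and selection by highest accuracy follows claim 7. In a per-cancer setup, the same selection would be repeated once for each cancer type.

```python
from typing import Callable, Dict, Sequence, Tuple

Sample = Dict[str, float]
Scorer = Callable[[Sample], float]  # hypothetical trained model: features -> risk in [0, 1]


def evaluate(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and AUC for binary labels (1 = tumor)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both tumor and non-tumor samples")
    tp = sum(1 for s in pos if s >= threshold)  # tumor samples flagged as tumor
    tn = sum(1 for s in neg if s < threshold)   # non-tumor samples flagged as non-tumor
    # AUC as the fraction of (tumor, non-tumor) pairs ranked correctly; ties count one half.
    ranked = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / len(pos),
        "specificity": tn / len(neg),
        "auc": ranked / (len(pos) * len(neg)),
    }


def select_best(candidates: Dict[str, Scorer], samples: Sequence[Sample],
                labels: Sequence[int],
                metric: str = "accuracy") -> Tuple[str, Dict[str, Dict[str, float]]]:
    """Score every candidate on a validation set and return the best name plus all reports."""
    reports = {
        name: evaluate(labels, [model(x) for x in samples])
        for name, model in candidates.items()
    }
    best = max(reports, key=lambda name: reports[name][metric])
    return best, reports


# Made-up validation data and toy scoring rules, for illustration only.
samples = [
    {"CEA": 1.0, "AFP": 2.0},
    {"CEA": 2.0, "AFP": 3.0},
    {"CEA": 30.0, "AFP": 4.0},
    {"CEA": 8.0, "AFP": 400.0},
]
labels = [0, 0, 1, 1]
candidates = {
    "cea_only": lambda x: min(1.0, x["CEA"] / 20.0),
    "cea_or_afp": lambda x: min(1.0, max(x["CEA"] / 20.0, x["AFP"] / 40.0)),
}

best, reports = select_best(candidates, samples, labels)
print(best, reports[best])
```

Evaluating on samples held out from training is assumed here; the text does not say which data the comparison uses.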

Step 2: The tumor risk model is applied to the average population sample to calculate the average population tumor risk value, and the same model is used to calculate the tumor risk value of the person to be tested.

Step 3: The tumor risk value of the person to be tested is divided by the average population tumor risk value to obtain the tumor risk increase multiple.
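Steps 2 and 3 can be sketched as follows. The function names, the `toy_model` stand-in and the sample values are hypothetical; a real system would pass in the model trained in Step 1. Aggregating the average population risk as the arithmetic mean of the model outputs is an assumption, since the text does not specify the aggregation.

```python
from statistics import fmean
from typing import Callable, Dict, Sequence

Features = Dict[str, float]
RiskModel = Callable[[Features], float]  # hypothetical: trained model returning a risk value


def average_population_risk(model: RiskModel, population: Sequence[Features]) -> float:
    """Step 2: mean model output over the average population sample (assumed aggregation)."""
    if not population:
        raise ValueError("average population sample is empty")
    return fmean([model(person) for person in population])


def risk_multiple(model: RiskModel, person: Features,
                  population: Sequence[Features]) -> float:
    """Step 3: risk value of the tested person divided by the average population risk value."""
    baseline = average_population_risk(model, population)
    if baseline <= 0:
        raise ValueError("average population risk must be positive")
    return model(person) / baseline


def toy_model(features: Features) -> float:
    """Toy stand-in for a trained model: risk grows linearly with CEA, capped at 1.0."""
    return min(1.0, features["CEA"] / 40.0)


# Made-up values for illustration only.
population = [{"CEA": 5.0}, {"CEA": 10.0}, {"CEA": 15.0}]
multiple = risk_multiple(toy_model, {"CEA": 20.0}, population)
print(f"risk multiple: {multiple:.2f}")
```

With these toy numbers the average population risk is 0.25 and the tested person's risk is 0.5, so the multiple is 2. When each cancer type has its own model, the same two functions would be called once per model to give one multiple per cancer type.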

Various modifications and changes may be made by those skilled in the art without departing from the spirit and scope of the invention, and it is intended to cover in the appended claims all such modifications, equivalents, and improvements as fall within the true spirit and scope of the invention.

Claims

1. A tumor risk multiple calculation method and system based on blood examination indexes, characterized by comprising:

a data acquisition module, used for acquiring pictures of routine blood test sheets, blood biochemistry test sheets and tumor marker test sheets and identifying the detection indexes, age and sex recorded in them, or for importing the relevant data directly from electronic records;

a tumor risk multiple calculation module, used for predicting by what multiple the tumor risk of the person to be tested is increased relative to the average population risk.

2. The tumor risk multiple calculation method and system based on blood examination indexes according to claim 1, wherein lung cancer, liver cancer, stomach cancer, esophageal cancer, colorectal cancer, breast cancer, cervical cancer, kidney cancer, pancreatic cancer, thyroid cancer, prostate cancer, ovarian cancer and nasopharyngeal carcinoma are modeled separately to calculate the tumor risk multiple.

3. The tumor risk multiple calculation method and system based on blood examination indexes according to claim 1, wherein the non-tumor population, which excludes the population with the specific tumor, consists of healthy people and people with other, non-tumor diseases.

4. The tumor risk multiple calculation method and system based on blood examination indexes according to claim 1, wherein the average population sample comprises a healthy population and a common-disease population, and the proportion of each disease within the common-disease population matches the incidence rate of that disease.

5. The tumor risk multiple calculation method and system based on blood examination indexes according to claim 4, wherein the machine learning model can be trained using some or all of the routine blood, blood biochemistry and tumor marker indexes, and any model so trained can serve as the tumor risk model provided it meets the requirements of the model evaluation indexes.

6. The tumor risk multiple calculation method and system based on blood examination indexes according to claim 5, wherein the evaluation indexes of the tumor risk model include prediction accuracy, AUC, sensitivity and specificity.

7. The tumor risk multiple calculation method and system based on blood examination indexes according to claim 6, wherein, in training the tumor risk model, a plurality of machine learning algorithm models are trained on a sample set; all the trained models are compared by their prediction errors; and the model with the highest accuracy is used as the optimal tumor risk model.