CN113160994A - Construction method, prediction system, device and storage medium for noninvasive screening of non-alcoholic steatohepatitis model - Google Patents
- Publication number
- CN113160994A (application CN202011634864.0A)
- Authority
- CN
- China
- Prior art keywords
- model
- screening
- nash
- value
- ultrasonic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/40—Analysis of texture
- G06T7/41—Analysis of texture based on statistical description of texture
- G06T7/45—Analysis of texture based on statistical description of texture using co-occurrence matrix computation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10132—Ultrasound image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30056—Liver; Hepatic
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Physics & Mathematics (AREA)
- Public Health (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Biomedical Technology (AREA)
- General Physics & Mathematics (AREA)
- Pathology (AREA)
- Primary Health Care (AREA)
- Data Mining & Analysis (AREA)
- Epidemiology (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Ultrasonic Diagnosis Equipment (AREA)
Abstract
A construction method, a prediction system, a device and a storage medium for a noninvasive screening model of non-alcoholic steatohepatitis (NASH), belonging to the field of fatty liver detection. Existing methods for diagnosing steatohepatitis suffer from low accuracy, high cost and other drawbacks. The construction method comprises: collecting serum, imaging and pathological data of steatohepatitis patients; preprocessing the serum feature data; statistically screening the serum features; extracting and screening B-mode ultrasound features; integrating the effective B-mode ultrasound features with a support vector machine (SVM); combining the resulting serum index features with the B-mode ultrasound feature R-NASH and obtaining a non-alcoholic steatohepatitis diagnosis model by multivariate logistic regression; and, through model comparison and efficiency analysis, determining the NASH model threshold range from the maximum Youden index of the ROC curve, so as to obtain a model with fixed final parameters for prediction.
Description
Technical Field
The invention relates to the construction of a model for screening steatohepatitis, and in particular to a construction method, a prediction system, a device and a storage medium for a noninvasive screening model of non-alcoholic steatohepatitis.
Background
At present, steatohepatitis is diagnosed by methods such as B-mode ultrasound, liver density on plain (non-contrast) CT, or blood tests. B-mode ultrasound examination is an important and practical means of diagnosing steatohepatitis, with a diagnostic accuracy of about 70%. On plain CT, liver density is generally reduced; a liver/spleen density ratio less than or equal to 1 confirms the diagnosis, and the degree of steatohepatitis can be judged from that ratio. The accuracy of CT is slightly better than that of B-mode ultrasound, but the method has drawbacks such as high cost and radiation exposure.
Disclosure of Invention
The invention provides a construction method, a prediction system, a device and a storage medium for a noninvasive screening model of non-alcoholic steatohepatitis, aiming to solve the problems of low accuracy, high cost and the like in existing methods for diagnosing steatohepatitis.
A construction method for a noninvasive screening model of non-alcoholic steatohepatitis is implemented by the following steps:
step one, collecting serum, imaging and pathological data of steatohepatitis patients;
the collected data include basic patient information, NASH and non-NASH pathological biopsy data, liver B-mode ultrasound images and serum laboratory examination data, together with the corresponding liver ultrasound images of each patient in BMP format;
wherein,
NASH stands for non-alcoholic steatohepatitis;
the basic information comprises age, sex, height, weight, calculated BMI, drinking history, diabetes history and hepatitis history; BMI stands for Body Mass Index;
the serum laboratory examination data include white blood cell count, platelet count, alanine aminotransferase (glutamic-pyruvic transaminase), aspartate aminotransferase (glutamic-oxaloacetic transaminase), glutamyl transpeptidase, total bilirubin, direct bilirubin, blood coagulation time, alkaline phosphatase, albumin, blood cholesterol, international normalized ratio (INR) of blood coagulation, serum ferritin, fasting blood glucose, uric acid, blood lipid, fasting insulin and HOMA index;
step two, preprocessing the serum feature data;
excluding the NASH and non-NASH pathological biopsy data, the remaining serum variables are uniformly standardized as follows:
(1) each numerical variable is sorted in ascending order; values below the 2.5th percentile or above the 97.5th percentile are treated as outliers and replaced by the 2.5th and 97.5th percentile values, respectively;
(2) all numerical variables are normalized by the formula $X_n^{\text{normalized}} = (X_n - X_{\min}) / (X_{\max} - X_{\min})$, where $X_n$ denotes any numerical variable, $X_n^{\text{normalized}}$ denotes the normalized value of $X_n$, $X_{\max}$ denotes the maximum value of the variable and $X_{\min}$ denotes its minimum value;
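The two preprocessing operations of step two can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, and the use of linear interpolation to locate the 2.5th and 97.5th percentiles is an assumption, since the text does not say how the percentiles are computed.

```python
def percentile(sorted_vals, p):
    """Percentile p (0-100) of an ascending list, by linear interpolation (assumed)."""
    if len(sorted_vals) == 1:
        return float(sorted_vals[0])
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def winsorize(values, lower=2.5, upper=97.5):
    """Replace values outside the [lower, upper] percentiles by those percentiles."""
    ordered = sorted(values)
    low_v = percentile(ordered, lower)
    high_v = percentile(ordered, upper)
    return [min(max(v, low_v), high_v) for v in values]


def min_max_normalize(values):
    """(X - X_min) / (X_max - X_min); a constant variable maps to 0.0."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [0.0 for _ in values]
    return [(v - v_min) / (v_max - v_min) for v in values]


def preprocess(values):
    """Step (1) followed by step (2) for one numerical serum variable."""
    return min_max_normalize(winsorize(values))
```

Clipping before normalizing keeps a single extreme laboratory value from compressing all other values of that variable toward 0.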
step three, statistical screening of serum features;
a part of the collected data is selected as the training group; within the training group, indexes with significant correlation are first screened out by Spearman correlation analysis; statistical screening is then performed by forward conditional multivariate logistic regression to obtain the serum indexes related to NASH; the resulting serum indexes comprise aspartate aminotransferase (glutamic-oxaloacetic transaminase), platelet count, blood lipid, calculated BMI and uric acid;
step four, B-mode ultrasound feature extraction and screening;
B-mode ultrasound feature extraction refers to calculating and extracting the gray-level texture features of the liver tissue; the gray-level texture features comprise the gray-level co-occurrence matrix, the gray-level run-length matrix, the intensity histogram and invariant moments;
the screening operation comprises the following three steps:
(1) retaining the features whose intra- and inter-observer consistency test exceeds 0.8;
(2) performing a preliminary screening on the result of step (1) by the variance threshold method, with the threshold set to 1.0;
(3) performing a final screening on the result of step (2) by the Lasso regression method, specifically:
first, the effective B-mode ultrasound features are screened with a Lasso regression model, whose cost function is:
$$\hat{\beta} = \arg\min_{\beta} \left\{ \sum_{i=1}^{m} \Big( y_i - \sum_{j=1}^{q} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{q} |\beta_j| \right\}$$
where $x_{ij}$ is the $j$-th B-mode ultrasound feature variable of the $i$-th patient record, $y_i$ is the response variable of the $i$-th patient record, $q$ is the total number of B-mode ultrasound feature variables, $m$ is the total number of patient records, $\lambda$ is the penalty coefficient, $\beta_j$ is the regression coefficient of the $j$-th feature and $\hat{\beta}$ is the optimal result;
then, the variables with non-zero coefficients retained by the Lasso regression are used as the effective B-mode ultrasound features for the final modeling;
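Screening steps (2) and (3) can be sketched as below. The sketch assumes the population variance for the variance threshold and solves the Lasso cost function by cyclic coordinate descent; the patent names neither choice, and all function names are hypothetical.

```python
def variance(col):
    """Population variance of one feature column."""
    mean = sum(col) / len(col)
    return sum((v - mean) ** 2 for v in col) / len(col)


def variance_filter(columns, threshold=1.0):
    """columns: dict name -> list of values. Keep features with variance > threshold."""
    return {name: col for name, col in columns.items() if variance(col) > threshold}


def soft_threshold(rho, limit):
    """Shrink rho toward zero by limit; values within [-limit, limit] become 0."""
    if rho > limit:
        return rho - limit
    if rho < -limit:
        return rho + limit
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise sum_i (y_i - sum_j x_ij b_j)^2 + lam * sum_j |b_j| (no intercept)."""
    m, q = len(X), len(X[0])
    beta = [0.0] * q
    for _ in range(n_iter):
        for j in range(q):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(m):
                others = sum(X[i][k] * beta[k] for k in range(q) if k != j)
                rho += X[i][j] * (y[i] - others)
                z += X[i][j] ** 2
            # Setting the subgradient to zero gives b_j = S(rho, lam / 2) / z.
            beta[j] = soft_threshold(rho, lam / 2.0) / z if z > 0 else 0.0
    return beta


def selected_features(names, beta):
    """Names of the features whose Lasso coefficient is non-zero."""
    return [n for n, b in zip(names, beta) if b != 0.0]
```

Because the soft-threshold step returns exactly 0 for weak features, the surviving names are the "non-zero coefficient" variables that the text carries forward into modeling.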
step five, integrating the effective B-mode ultrasound features with a support vector machine (SVM);
the effective B-mode ultrasound radiomics features are fed to a support vector machine (SVM) model to obtain the predicted value R-NASH; the SVM prediction comprises the following steps:
(1) selecting a Gaussian kernel function: $K(x, z) = \exp\left(-\dfrac{\|x - z\|^2}{2\sigma^2}\right)$, where $x$ and $z$ are any two points in the feature space and $\sigma$ is the width parameter of the function;
(2) constructing the support vector machine model: $f(x) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b\right)$, where $\alpha_i$ and $b$ are the coefficients to be solved; the $\alpha_i$ are obtained by solving the following dual problem:
$$\min_{\alpha} \; \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{N} \alpha_i$$
$$\text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \qquad 0 \le \alpha_i, \quad i = 1, 2, \ldots, N$$
where $y_i$ is the R-NASH value corresponding to input $x_i$; $i$ and $j$ index the $i$-th and $j$-th samples; $\alpha_i$ and $\alpha_j$ are the coefficients to be solved for the $i$-th and $j$-th samples; $x_i$ and $x_j$ are the B-mode ultrasound input features of the $i$-th and $j$-th samples; $y_i$ and $y_j$ are the integrated B-mode ultrasound feature R-NASH of the $i$-th and $j$-th samples; and $N$ is the number of samples; the optimal solution $\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_N^*)^T$ is obtained; a component $\alpha_j^*$ of $\alpha^*$ satisfying $\alpha_j^* > 0$ is selected, giving the optimal $b^* = y_j - \sum_{i=1}^{N} \alpha_i^* y_i K(x_i, x_j)$;
(3) applying the obtained SVM model to the B-mode ultrasound radiomics features to obtain the predicted value R-NASH: $f(x) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i^* y_i K(x, x_i) + b^*\right)$,
where $x$ is the vector formed by the effective B-mode ultrasound radiomics features;
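Once the dual problem has been solved, evaluating the SVM reduces to a kernel-weighted sum over the training samples. The sketch below assumes the multipliers are already available from a quadratic-programming solver, which is not shown; the function names are hypothetical, and labels are assumed to be coded as +1 and -1.

```python
import math


def gaussian_kernel(x, z, sigma):
    """K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def bias_from_support_vector(alphas, ys, xs, j, sigma):
    """b* = y_j - sum_i alpha_i y_i K(x_i, x_j), for an index j with alpha_j > 0."""
    if alphas[j] <= 0:
        raise ValueError("sample j must be a support vector (alpha_j > 0)")
    total = sum(a * y * gaussian_kernel(xi, xs[j], sigma)
                for a, y, xi in zip(alphas, ys, xs))
    return ys[j] - total


def decision_value(x, alphas, ys, xs, b, sigma):
    """sum_i alpha_i y_i K(x, x_i) + b, before taking the sign."""
    return sum(a * y * gaussian_kernel(x, xi, sigma)
               for a, y, xi in zip(alphas, ys, xs)) + b


def predict_r_nash(x, alphas, ys, xs, b, sigma):
    """Sign of the decision value: +1 or -1."""
    return 1 if decision_value(x, alphas, ys, xs, b, sigma) >= 0 else -1
```

Only samples with a non-zero multiplier contribute to the sum, so in practice the prediction needs just the support vectors rather than the whole training group.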
step six, combining the obtained serum index features with the B-mode ultrasound feature R-NASH and obtaining the non-alcoholic steatohepatitis diagnosis model by multivariate logistic regression, as follows:
(1) let $x_1, x_2, \ldots, x_n$ be the predicted value R-NASH and the other serum index features, and let $g$ be the fibrosis index; the regression model is:
$$g(z) = \frac{1}{1 + e^{-z}}$$
$$z = h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$
where $\theta_0, \theta_1, \theta_2, \ldots, \theta_n$ are the model parameters of the multivariate logistic regression, $n$ is the number of variables, and the features used are: R-NASH, PLT, AST, BMI, Lipid and Uric acid;
(2) solving for the model parameters gives the diagnosis model:
$$Y_{\text{NASH}} = 2.145 \times \text{R-NASH} - 0.306 \times \text{PLT} + 0.267 \times \text{AST} + 1.152 \times \text{BMI} + 1.185 \times \text{Lipid} + 0.114 \times \text{Uric acid} - 3.375;$$
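The fitted diagnosis model is a linear score that can be evaluated directly. The sketch below uses the coefficients stated above; it assumes the serum inputs have already been preprocessed as in step two, and it offers the logistic transform only as an optional convenience, since the text does not say whether the threshold is applied to the raw score or to a probability. The function names are hypothetical.

```python
import math

# Coefficients and intercept as stated in the diagnosis model.
COEFFICIENTS = {
    "R-NASH": 2.145,
    "PLT": -0.306,
    "AST": 0.267,
    "BMI": 1.152,
    "Lipid": 1.185,
    "Uric acid": 0.114,
}
INTERCEPT = -3.375


def y_nash(features):
    """Linear score Y_NASH for one patient; features maps each name to a number."""
    missing = [name for name in COEFFICIENTS if name not in features]
    if missing:
        raise KeyError("missing features: " + ", ".join(missing))
    return INTERCEPT + sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)


def logistic(z):
    """g(z) = 1 / (1 + exp(-z)), mapping the score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))
```

For example, `y_nash` returns the intercept, -3.375, when every feature is 0, and the score rises fastest with R-NASH, which carries the largest coefficient.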
step seven, confirming the model;
through model comparison and efficiency analysis, the NASH model threshold range is determined from the maximum Youden index of the ROC curve, so as to obtain a model with fixed final parameters for prediction.
Advantageous effects:
the invention relates to a non-alcoholic steatohepatitis screening model, that is, a prediction model built from the serum, imaging and pathological data of a plurality of steatohepatitis patients. Because the model screens for non-alcoholic steatohepatitis on the basis of several clinical indexes, the strengths and weaknesses of serology and imaging complement each other, the prediction speed and accuracy of the model are improved, and a reference can be provided for the clinical diagnosis of liver fibrosis. Preliminary NASH screening can be completed rapidly without liver puncture, so that the degree of a patient's liver lesions can be monitored efficiently at any time and corresponding clinical interventions can be formulated. Noninvasive prediction not only reduces the economic burden on patients and the time cost for doctors, but also spares patients the pain and complications of unnecessary invasive examinations and lightens the medical burden.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram comparing the model of the present invention with the existing model, using ROC curves for effect evaluation;
FIG. 3 is a schematic diagram of delineating the liver tissue region and calculating the extracted gray-level texture features;
FIG. 4 is a schematic illustration of the gray-level co-occurrence matrix calculation method;
FIG. 5 is a schematic illustration of the gray-level run-length matrix calculation method;
FIG. 6 is a schematic illustration of the calculation method for intensity-histogram features;
FIG. 7 is a schematic illustration of the invariant moment calculation method.
Detailed Description
The first embodiment is as follows:
in the construction method for a noninvasive screening model of non-alcoholic steatohepatitis according to this embodiment, as shown in FIG. 1, the method is implemented by the following steps:
step one, collecting serum, imaging and pathological data of steatohepatitis patients;
data are collected from the Hepatobiliary and Pancreatic Center of Nanjing Drum Tower Hospital, including basic patient information, NASH (non-alcoholic steatohepatitis) and non-NASH pathological biopsy data, liver B-mode ultrasound images and serum laboratory examination data, together with the corresponding liver ultrasound images of each patient in BMP format; for example, the ultrasound department of Nanjing Drum Tower Hospital copies out the ultrasound images of the corresponding patients in BMP format.
Wherein,
NASH stands for non-alcoholic steatohepatitis;
the basic information comprises age, sex, height, weight, calculated BMI, drinking history, diabetes history and hepatitis history; BMI stands for Body Mass Index;
the serum laboratory examination data include white blood cell count, platelet count, alanine aminotransferase (glutamic-pyruvic transaminase), aspartate aminotransferase (glutamic-oxaloacetic transaminase), glutamyl transpeptidase, total bilirubin, direct bilirubin, blood coagulation time, alkaline phosphatase, albumin, blood cholesterol, international normalized ratio (INR) of blood coagulation, serum ferritin, fasting blood glucose, uric acid, blood lipid, fasting insulin and HOMA index;
step two, preprocessing the serum feature data;
excluding the NASH and non-NASH pathological biopsy data, the serum variables are uniformly standardized as follows:
(1) each numerical variable is sorted in ascending order; values below the 2.5th percentile or above the 97.5th percentile are treated as outliers and replaced by the 2.5th and 97.5th percentile values, respectively;
(2) all numerical variables are normalized by the formula $X_n^{\text{normalized}} = (X_n - X_{\min}) / (X_{\max} - X_{\min})$, where $X_n$ denotes any numerical variable, $X_n^{\text{normalized}}$ denotes the normalized value of $X_n$, $X_{\max}$ denotes the maximum value of the variable and $X_{\min}$ denotes its minimum value;
step three, statistical screening of serum features;
a part of the collected data is selected as the training group; for example, the cases from the Hepatobiliary and Pancreatic Center of Nanjing Drum Tower Hospital are used as the training group. Within the training group, indexes with significant correlation are first screened out by Spearman correlation analysis; statistical screening is then performed by forward conditional multivariate logistic regression to obtain the final meaningful serum indexes most relevant to NASH; the resulting serum indexes comprise aspartate aminotransferase (glutamic-oxaloacetic transaminase), platelet count, blood lipid, calculated BMI and uric acid, abbreviated as AST, PLT, Lipid, BMI and Uric acid, respectively;
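The Spearman screening in step three amounts to ranking each serum index and the outcome, then computing the Pearson correlation of the ranks. The sketch below is illustrative only: it keeps indexes whose absolute coefficient reaches a cutoff, which is a simplifying assumption standing in for the significance test the text refers to, and the cutoff value and function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def spearman_rho(x, y):
    """Spearman rank correlation, computed as Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def screen_by_spearman(features, labels, min_abs_rho=0.3):
    """features: dict name -> list. Keep names with |rho| >= min_abs_rho (assumed cutoff)."""
    kept = {}
    for name, column in features.items():
        rho = spearman_rho(column, labels)
        if abs(rho) >= min_abs_rho:
            kept[name] = rho
    return kept
```

Working on ranks makes the screening insensitive to the skewed distributions common in laboratory values, since only the ordering of patients matters.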
step four, B-mode ultrasound feature extraction and screening;
B-mode ultrasound feature extraction refers to calculating and extracting the gray-level texture features of the liver tissue; the gray-level texture features comprise the gray-level co-occurrence matrix, the gray-level run-length matrix, the intensity histogram and invariant moments;
the screening operation comprises the following three steps:
(1) retaining the features whose intra- and inter-observer consistency (ICC) exceeds 0.8;
(2) performing a preliminary screening by the variance threshold method, with the threshold set to 1.0;
(3) performing a final screening by the Lasso regression method, specifically:
first, the effective B-mode ultrasound features are screened with a Lasso regression model, thereby eliminating the features with little effect; the cost function of the Lasso regression model is:
$$\hat{\beta} = \arg\min_{\beta} \left\{ \sum_{i=1}^{m} \Big( y_i - \sum_{j=1}^{q} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{q} |\beta_j| \right\}$$
where $x_{ij}$ is the $j$-th B-mode ultrasound feature variable of the $i$-th patient record, $y_i$ is the response variable of the $i$-th patient record, $q$ is the total number of B-mode ultrasound feature variables, $m$ is the total number of patient records, $\lambda$ is the penalty coefficient, $\beta_j$ is the regression coefficient of the $j$-th feature and $\hat{\beta}$ is the optimal result;
then, the variables with non-zero coefficients retained by the Lasso regression are used as the effective B-mode ultrasound features for the final modeling;
step five, integrating the effective B-mode ultrasound features with a support vector machine (SVM);
the effective B-mode ultrasound radiomics features are fed to a support vector machine (SVM) model to obtain the predicted value R-NASH; the SVM prediction comprises the following steps:
(1) selecting a Gaussian kernel function: $K(x, z) = \exp\left(-\dfrac{\|x - z\|^2}{2\sigma^2}\right)$, where $x$ and $z$ are any two points in the feature space and $\sigma$ is the width parameter of the function;
(2) constructing the support vector machine model: $f(x) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b\right)$, where $\alpha_i$ and $b$ are the coefficients to be solved; the $\alpha_i$ are obtained by solving the following dual problem:
$$\min_{\alpha} \; \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{N} \alpha_i$$
$$\text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \qquad 0 \le \alpha_i, \quad i = 1, 2, \ldots, N$$
where $y_i$ is the R-NASH value corresponding to input $x_i$; $i$ and $j$ index the $i$-th and $j$-th samples; $\alpha_i$ and $\alpha_j$ are the coefficients to be solved for the $i$-th and $j$-th samples; $x_i$ and $x_j$ are the B-mode ultrasound input features of the $i$-th and $j$-th samples; $y_i$ and $y_j$ are the integrated B-mode ultrasound feature R-NASH of the $i$-th and $j$-th samples; and $N$ is the number of samples; the optimal solution $\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_N^*)^T$ is obtained; a component $\alpha_j^*$ of $\alpha^*$ satisfying $\alpha_j^* > 0$ is selected, giving the optimal $b^* = y_j - \sum_{i=1}^{N} \alpha_i^* y_i K(x_i, x_j)$;
(3) applying the obtained SVM model to the B-mode ultrasound radiomics features to obtain the predicted value R-NASH: $f(x) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i^* y_i K(x, x_i) + b^*\right)$,
where $x$ is the vector formed by the effective B-mode ultrasound radiomics features;
step six, combining the obtained serum index features with the B-mode ultrasound feature R-NASH and obtaining the non-alcoholic steatohepatitis diagnosis model by multivariate logistic regression; the multivariate logistic regression proceeds as follows:
(1) let $x_1, x_2, \ldots, x_n$ be the predicted value R-NASH and the other serum index features, and let $g$ be the fibrosis index; the regression model is:
$$g(z) = \frac{1}{1 + e^{-z}}$$
$$z = h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$
where $\theta_0, \theta_1, \theta_2, \ldots, \theta_n$ are the model parameters of the multivariate logistic regression, $n$ is the number of variables, and the features used are: R-NASH, PLT, AST, BMI, Lipid and Uric acid;
(2) solving for the model parameters gives the diagnosis model:
$$Y_{\text{NASH}} = 2.145 \times \text{R-NASH} - 0.306 \times \text{PLT} + 0.267 \times \text{AST} + 1.152 \times \text{BMI} + 1.185 \times \text{Lipid} + 0.114 \times \text{Uric acid} - 3.375.$$
step seven, confirming the model;
through model comparison and efficiency analysis, the model threshold range for evaluating NASH is determined from the maximum Youden index (sensitivity + specificity - 1) of the ROC curve; the reasonable parameter range of the final model is obtained from the determined threshold, yielding a model with fixed final parameters for prediction, which is then applied to clinical validation.
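Choosing the threshold by the maximum Youden index can be sketched as a scan over candidate cutoffs. The sketch assumes a patient is called positive when the model score is at or above the threshold, that labels are coded 1 for NASH and 0 for non-NASH, and that both classes are present; the function name is hypothetical.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    A score >= threshold counts as a positive prediction. On ties the
    lowest such threshold is kept.
    """
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both NASH and non-NASH cases are required")
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        true_neg = sum(1 for s, l in zip(scores, labels) if s < t and l == 0)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j
```

Each candidate threshold corresponds to one point on the ROC curve, and J is the vertical distance of that point above the chance diagonal.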
The preliminary NASH evaluation model, based on several kinds of clinical data, can rapidly complete preliminary NASH screening without liver puncture, so that the degree of a patient's liver lesions can be monitored efficiently at any time and corresponding clinical interventions can be formulated. Noninvasive diagnosis not only reduces the economic burden on patients and the time cost for doctors, but also spares patients the pain and complications of unnecessary invasive examinations and lightens the medical burden.
The second embodiment is as follows:
this embodiment differs from the first embodiment in that, in the construction method for a noninvasive screening model of non-alcoholic steatohepatitis, a model verification and comparison step is added between step six and step seven, specifically:
the model of the invention is compared with the existing ultrasound FLI examination model. ROC curves are used for effect evaluation and for comparison between the models, as shown in FIG. 2.
The abscissa of the ROC curve represents specificity (conventionally plotted as 1 - specificity) and the ordinate represents sensitivity; the higher the curve, the better the classification capability of the model. The solid curve is the new model and the dotted curve is the existing model; plotting the two together makes their relative strengths and weaknesses clear.
The third concrete implementation mode:
different from the first or second specific embodiment, in the fourth step of the construction method for non-invasive screening of the non-alcoholic steatohepatitis model according to the present embodiment, the features used in the step of B-ultrasonic feature extraction are as follows: (type-B ultrasonic characteristics in raw Material)
B ultrasonic characteristic extraction:
reading a B-mode ultrasonic image by using screening software (liveradomics), standardizing the image through a gray level stretching step, and calculating and extracting the gray level texture characteristics of the liver; the gray texture features comprise a gray level co-occurrence matrix, a gray level walking run-length matrix, an intensity histogram and an invariant moment; as shown in fig. 3;
wherein the algorithms used by the screening software to extract the features are as follows.
The gray-level co-occurrence matrix describes texture according to the spatial correlation of gray levels and is a commonly used method. The features calculated from it comprise:
Autocorrelation: reflects the correlation among the expected values of the random error terms and is used to evaluate the definition (sharpness) of the image;
Correlation (COR): measures the degree of similarity of the elements of the spatial gray-level co-occurrence matrix in the row or column direction, so the magnitude of the correlation value reflects the local gray-level correlation in the image. When the matrix element values are uniform and equal, the correlation value is large; conversely, if the matrix element values differ greatly, the correlation value is small. If the image has horizontal texture, the COR of the horizontal-direction matrix is larger than the COR values of the other matrices;
Cluster prominence (prom): when an object stands out abruptly in the image, the greater the contrast between lines and patterns, the larger the prom value;
Cluster shade (shade): the shade value may be related to picture quality; more intuitively, it relates to the degree of wrinkling, as of clothing: the flatter the surface, the smaller the shade value. It measures the skewness of the matrix and reflects uniformity; a high value indicates that the image is asymmetric;
Dissimilarity: when contrast is calculated, the weights grow as a power (the square) of the distance between the matrix element and the diagonal; if the weights instead grow linearly, dissimilarity is obtained.
Energy, also called the angular second moment (ASM): the sum of squares of the gray-level co-occurrence matrix element values; it reflects the uniformity of the gray-level distribution of the image and the coarseness of the texture. If all the values of the co-occurrence matrix are equal, the ASM value is small; conversely, if some values are large and others are small, the ASM value is large. When the elements of the co-occurrence matrix are concentrated, the ASM value is large. A large ASM value indicates a more uniform and regularly varying texture pattern.
Homogeneity: measures the local uniformity of the image; the value is lower for a non-uniform image and higher for a uniform image. In contrast to contrast or dissimilarity, the weight in homogeneity decreases nonlinearly as the element moves away from the diagonal.
Maximum probability: represents the texture feature that appears most frequently in the image.
The calculation methods are shown in fig. 4. In the figure, i represents the gray value of pixel x, j represents the gray value of pixel y, p represents the probability, and M and N are the dimensions of the M × N matrix.
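The co-occurrence features above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of the screening software: it uses a single pixel offset, an unsymmetrized matrix, and the textbook definitions of energy, contrast, dissimilarity, homogeneity and maximum probability. The function names `glcm` and `glcm_features` are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one offset (dx, dy).

    image: 2D list of integer gray levels in [0, levels).
    Entry p[i][j] is the probability that a pixel of level i has a
    neighbor of level j at the given offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Textbook GLCM features computed from a normalized matrix p."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "energy": sum(v * v for _, _, v in cells),              # ASM
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "max_probability": max(v for _, _, v in cells),
    }


# Toy 3-level image; horizontal neighbor offset.
toy = [[0, 0, 1],
       [0, 0, 1],
       [2, 2, 2]]
feats = glcm_features(glcm(toy, levels=3))
flat = glcm_features(glcm([[1, 1], [1, 1]], levels=2))
```

For the perfectly uniform image the energy is 1 and the contrast is 0, which matches the statement that concentrated co-occurrence values give a large ASM.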
The fourth embodiment:
This embodiment differs from the third embodiment in that, in the method for constructing a non-invasive screening non-alcoholic steatohepatitis model, the gray-level run-length matrix is calculated as shown in fig. 5. The gray-level run-length matrix is a matrix formed from the lengths of gray-level runs. It only measures and counts image pixel information; in actual use, the generated gray-level run-length matrix must be further calculated to obtain image feature information based on the gray-level co-occurrence matrix. The features include short run emphasis (SRE), long run emphasis (LRE), run length non-uniformity (RLN), gray level non-uniformity (GLN), run percentage (RP), low gray-level run emphasis (LGRE), high gray-level run emphasis (HGRE), short run low gray-level emphasis (SRLGE), short run high gray-level emphasis (SRHGE), long run low gray-level emphasis (LRLGE) and long run high gray-level emphasis (LRHGE).
In fig. 5, the (i, j) element represents the number of times a pixel with gray level i occurs j times consecutively in a given direction of the image, p represents the probability, M represents the number of gray levels in the image, N represents the number of different run lengths in the image, and $n_r$ represents the number of pixel points in the image;
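A minimal sketch of a horizontal gray-level run-length matrix and three of the listed features follows. The function names `glrlm` and `run_length_features`, the restriction to the horizontal direction, and the textbook formulas for SRE, LRE and RP are assumptions for illustration; the formulas of fig. 5 are not reproduced here.

```python
def glrlm(image, levels):
    """Horizontal gray-level run-length matrix.

    R[i][j-1] counts runs of gray level i with length j along the rows.
    image: 2D list of integer gray levels in [0, levels), rows non-empty.
    """
    max_len = max(len(row) for row in image)
    R = [[0] * max_len for _ in range(levels)]
    for row in image:
        run_val, run_len = row[0], 1
        for p in row[1:]:
            if p == run_val:
                run_len += 1
            else:
                R[run_val][run_len - 1] += 1
                run_val, run_len = p, 1
        R[run_val][run_len - 1] += 1  # close the last run of the row
    return R


def run_length_features(R, n_pixels):
    """Short run emphasis, long run emphasis and run percentage."""
    n_runs = sum(sum(row) for row in R)
    sre = sum(c / (j + 1) ** 2 for row in R for j, c in enumerate(row)) / n_runs
    lre = sum(c * (j + 1) ** 2 for row in R for j, c in enumerate(row)) / n_runs
    return {"SRE": sre, "LRE": lre, "RP": n_runs / n_pixels}


toy = [[0, 0, 1],
       [1, 1, 1],
       [2, 0, 0]]
R = glrlm(toy, levels=3)
rl = run_length_features(R, n_pixels=9)
```

Short runs raise SRE and long runs raise LRE, so the two features respond in opposite directions to fine and coarse texture.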
The fifth embodiment:
This embodiment differs from the third or fourth embodiment in that, in the method for constructing a non-invasive screening non-alcoholic steatohepatitis model, as shown in fig. 6, the features related to the intensity histogram are first-order features calculated from the intensity histogram. These are the simplest features and include: Mean, the statistical mean of the intensity histogram; Variance, the statistical variance of the intensity histogram; Entropy, the statistical entropy of the intensity histogram; Energy, the sum of the statistical energies of the intensity histogram; and Kurtosis, a characteristic number describing the height of the peak of the probability density curve at the mean, that is, whether the peak of the distribution formed by the samples is abrupt or flat. Intuitively, kurtosis reflects the sharpness of the peak. The sample kurtosis is a statistic defined by comparison with the normal distribution: the kurtosis of a series x is calculated to measure how far x deviates from a given distribution, and the kurtosis of the normal distribution is 3. If the kurtosis is larger than 3, the peak is sharper than that of the normal distribution, and vice versa. In statistics, kurtosis measures the peakedness of the probability distribution of a real random variable, and a high kurtosis means that the variance is driven mainly by infrequent extreme deviations above or below the mean. Skewness describes the symmetry of the distribution formed by the samples.
Specifically, the skewness of the series x is calculated to measure the symmetry of x. If the skewness is negative, the dispersion to the left of the mean of x is stronger than that to the right; if the skewness is positive, the dispersion to the left of the mean of x is weaker than that to the right; for a normal distribution (or any strictly symmetric distribution), the skewness is equal to 0.
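The first-order features described above can be sketched as follows. The function name `first_order_features`, the use of population (not sample-corrected) moments, and the base-2 logarithm for entropy are assumptions for illustration and are not specified in the text.

```python
import math
from collections import Counter


def first_order_features(pixels):
    """First-order intensity features of a flat list of gray values."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    # Standardized moments; defined as 0 for a constant image.
    skewness = m3 / var ** 1.5 if var > 0 else 0.0
    kurtosis = m4 / var ** 2 if var > 0 else 0.0  # normal distribution: 3
    # Histogram probabilities of each distinct gray value.
    probs = [c / n for c in Counter(pixels).values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    energy = sum(p * p for p in probs)
    return {"mean": mean, "variance": var, "skewness": skewness,
            "kurtosis": kurtosis, "entropy": entropy, "energy": energy}


sym = first_order_features([1, 2, 3, 4, 5])      # symmetric values
right = first_order_features([1, 1, 1, 10])      # long right tail
```

The symmetric example has zero skewness, while the example with one large value has positive skewness, in line with the sign convention stated above.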
The sixth embodiment:
This embodiment differs from the fifth embodiment in that, in the method for constructing a non-invasive screening non-alcoholic steatohepatitis model, as shown in fig. 7, the moment is a concept from probability and statistics and is a numerical characteristic of a random variable. The first-order origin moment is the expectation, the first-order central moment μ1 is 0, and the second-order central moment μ2 is the variance Var(X) of X; in statistics, moments of order higher than 4 are rarely used. The third-order central moment μ3 measures whether the distribution is skewed, and the fourth-order central moment μ4 measures how steep the distribution (or density) is near the mean;
seven invariant moments (φ1 to φ7) are derived from the second-order and third-order normalized central moments, and they remain invariant under image translation, rotation and scale change.
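To make the invariance property concrete, the sketch below computes the first two of the seven invariant moments from normalized central moments and checks them under a 90-degree rotation and a translation. The function name `hu_first_two` and the toy image are assumptions for illustration; the remaining five moments and the exact formulas of fig. 7 are not reproduced.

```python
def hu_first_two(image):
    """First two Hu invariant moments (phi1, phi2) of a 2D intensity list.

    x is the column index and y the row index; the image must not be
    all zeros.
    """
    m00 = sum(sum(row) for row in image)
    xbar = sum(c * v for row in image for c, v in enumerate(row)) / m00
    ybar = sum(r * v for r, row in enumerate(image) for v in row) / m00

    def eta(p, q):
        # Central moment mu_pq, normalized for scale.
        mu = sum((c - xbar) ** p * (r - ybar) ** q * v
                 for r, row in enumerate(image) for c, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return phi1, phi2


toy = [[0, 1, 0, 0],
       [0, 2, 3, 0],
       [0, 0, 4, 0]]
rotated = [list(r) for r in zip(*toy[::-1])]            # 90-degree rotation
shifted = [[0] * 5] + [[0] + row for row in toy]         # shift by one pixel
base = hu_first_two(toy)
rot = hu_first_two(rotated)
shf = hu_first_two(shifted)
```

On a pixel grid the invariance is exact for translations and 90-degree rotations; for arbitrary angles and scale changes it holds only approximately because of resampling.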
The seventh embodiment:
the prediction system of this embodiment, applied to the construction method for non-invasive screening of the non-alcoholic steatohepatitis model, comprises:
a data acquisition module, used for preprocessing the collected serum, image and pathological data of steatohepatitis patients;
a serum feature data preprocessing module, used for excluding the NASH and non-NASH pathological biopsy data and applying the following standardization uniformly to the remaining serum variables; the specific processing method is as follows:
(1) sorting each numerical variable in ascending order, treating values below the 2.5th percentile or above the 97.5th percentile as outliers, and replacing them with the 2.5th and 97.5th percentile values, respectively;
(2) normalizing all numerical variables according to the formula $X_{n,\mathrm{normalized}} = (X_n - X_{\min}) / (X_{\max} - X_{\min})$, where $X_n$ represents any numerical variable, $X_{n,\mathrm{normalized}}$ represents the normalized value of the numerical variable $X_n$, $X_{\max}$ represents the maximum value of the numerical variable, and $X_{\min}$ represents the minimum value of the numerical variable;
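The two preprocessing steps can be sketched as follows. The function names, and the use of linear interpolation between order statistics to locate the 2.5th and 97.5th percentiles, are assumptions for illustration; the text does not specify how the percentiles are computed.

```python
def percentile(sorted_vals, q):
    """q-th quantile (0 <= q <= 1) by linear interpolation (an assumption)."""
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def clip_outliers(values, lower=0.025, upper=0.975):
    """Step (1): replace values outside the percentile band by its bounds."""
    s = sorted(values)
    lo_v, hi_v = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo_v), hi_v) for v in values]


def min_max(values):
    """Step (2): (X_n - X_min) / (X_max - X_min), mapping values to [0, 1]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [0.0 for _ in values]  # constant variable carries no information
    return [(v - vmin) / (vmax - vmin) for v in values]


raw = list(range(1, 42))           # 41 toy measurements: 1, 2, ..., 41
clipped = clip_outliers(raw)
normalized = min_max(clipped)
```

Clipping before normalization keeps a single extreme laboratory value from compressing all the other measurements into a narrow part of the [0, 1] range.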
a serum feature statistical screening module, used for statistical screening of the serum features; specifically, part of the collected data is selected as a training set; indexes with significant correlation are then screened out in the training set by Spearman correlation analysis; statistical screening is then carried out by forward conditional multivariate logistic regression to obtain the serum indexes related to NASH; wherein the serum indexes comprise glutamic-oxaloacetic transaminase (AST), platelet count, blood lipid, BMI and uric acid;
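The correlation step can be sketched as follows: the Spearman coefficient is the Pearson correlation of the ranks, with tied values given their average rank. The function names are hypothetical, and the significance test (p-value) that the screening relies on is deliberately omitted from this sketch.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; both inputs must have non-zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation between a serum index and the label."""
    return pearson(average_ranks(x), average_ranks(y))


rho_up = spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
rho_down = spearman([1, 2, 3, 4, 5], [25, 16, 9, 4, 1])
tied = average_ranks([10, 20, 20, 30])
```

Because only ranks are used, any monotonic relationship gives a coefficient of ±1, which is why the squared values in the example still yield a perfect correlation.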
a B-mode ultrasound feature extraction and screening module, used for extracting and screening the B-mode ultrasound features: the gray-level texture features of the liver tissue are calculated and extracted, and the variables with non-zero coefficients selected by the lasso regression method are used as the effective B-mode ultrasound features for final modeling; the effective B-mode ultrasound features are screened out with a Lasso regression model, the cost function of which is defined over the following quantities:
$x_{ij}$ represents the j-th B-mode ultrasound feature variable of the i-th patient record, $y_i$ is the response variable of the i-th patient record, q represents the total number of B-mode ultrasound feature variables, m is the total number of patient records, λ is the penalty coefficient, and the coefficient vector that minimizes the cost function is the optimal result;
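The cost function itself is not legible in the text above. As a hedged reconstruction, the standard Lasso objective consistent with the symbols defined there would read as follows; the coefficient symbols $\beta_0$ and $\beta_j$ and the absence of any scaling factor in front of the sum are assumptions, not taken from the source.

```latex
\hat{\beta} \;=\; \arg\min_{\beta_0,\,\beta}\;
\sum_{i=1}^{m}\Bigl(y_i - \beta_0 - \sum_{j=1}^{q}\beta_j x_{ij}\Bigr)^{2}
\;+\; \lambda \sum_{j=1}^{q}\lvert \beta_j \rvert
```

The absolute-value penalty drives some coefficients $\beta_j$ exactly to zero as λ grows, which is what allows the features with non-zero coefficients to be read off as the selected ones.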
a feature integration module, used for integrating the effective B-mode ultrasound features: a support vector machine (SVM) model predicts on the effective B-mode ultrasound radiomics features to obtain the predicted value R-NASH;
a prediction model establishing module, used for combining the obtained serum index features with the B-mode ultrasound R-NASH feature and obtaining the non-alcoholic steatohepatitis screening model by multivariate logistic regression;
a prediction model confirmation module, used for confirming the model threshold range for evaluating NASH so as to obtain a model with fixed final parameters; the model threshold range for evaluating NASH is confirmed from the maximum Youden index of the ROC curve through model comparison and performance analysis, and the model is finally used for prediction.
The eighth embodiment:
the detection device of this embodiment, applied to the construction method for non-invasive screening of the non-alcoholic steatohepatitis model, comprises:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to execute the steps of the method of constructing a non-invasive screening non-alcoholic steatohepatitis model by executing the executable instructions.
The ninth embodiment:
a computer readable storage medium of the present embodiment stores a program that, when executed, implements the steps of the method of constructing a non-invasive screening non-alcoholic steatohepatitis model.
Claims (8)
1. A construction method for non-invasive screening of a non-alcoholic steatohepatitis model, characterized in that the method is realized by the following steps:
step one, collecting serum, image and pathological data of steatohepatitis patients;
the collected data comprise patient basic information, NASH and non-NASH pathological biopsy data, liver B-mode ultrasound images and serum laboratory examination data, together with the corresponding liver ultrasound images of the patient in BMP format;
wherein,
NASH means non-alcoholic steatohepatitis;
the basic information comprises age, sex, height, weight, BMI, drinking history, diabetes and hepatitis history; BMI means Body Mass Index;
the serum laboratory examination data comprise white blood cell count, platelet count, glutamic-pyruvic transaminase, glutamic-oxaloacetic transaminase, glutamyl transpeptidase, total bilirubin, direct bilirubin, blood coagulation time, alkaline phosphatase, albumin, blood cholesterol, blood coagulation international normalized ratio, blood ferritin, fasting blood glucose, uric acid, blood lipid, fasting insulin and HOMA index;
step two, preprocessing the serum feature data;
the NASH and non-NASH pathological biopsy data are excluded, and the following standardization is applied uniformly to the remaining serum variables:
(1) sorting each numerical variable in ascending order, treating values below the 2.5th percentile or above the 97.5th percentile as outliers, and replacing them with the 2.5th and 97.5th percentile values, respectively;
(2) normalizing all numerical variables according to the formula $X_{n,\mathrm{normalized}} = (X_n - X_{\min}) / (X_{\max} - X_{\min})$, where $X_n$ represents any numerical variable, $X_{n,\mathrm{normalized}}$ represents the normalized value of the numerical variable $X_n$, $X_{\max}$ represents the maximum value of the numerical variable, and $X_{\min}$ represents the minimum value of the numerical variable;
step three, statistical screening of the serum features;
part of the collected data is selected as a training set; indexes with significant correlation are then screened out in the training set by Spearman correlation analysis; statistical screening is then carried out by forward conditional multivariate logistic regression to obtain the serum indexes related to NASH; wherein the serum indexes comprise glutamic-oxaloacetic transaminase (AST), platelet count, blood lipid, BMI and uric acid;
step four, B-mode ultrasound feature extraction and screening;
B-mode ultrasound feature extraction refers to calculating and extracting the gray-level texture features of the liver tissue; the gray-level texture features comprise the gray-level co-occurrence matrix, the gray-level run-length matrix, the intensity histogram and the invariant moments;
the screening operation comprises the following three steps:
(1) retaining the features whose intra-observer and inter-observer consistency test values exceed 0.8;
(2) screening the result of step (1) by a preliminary screening with the variance threshold method, the threshold being set to 1.0;
(3) screening the result of step (2) by a final screening with the lasso regression method, specifically:
firstly, the effective B-mode ultrasound features are screened out with a Lasso regression model, the cost function of which is defined over the following quantities:
$x_{ij}$ represents the j-th B-mode ultrasound feature variable of the i-th patient record, $y_i$ is the response variable of the i-th patient record, q represents the total number of B-mode ultrasound feature variables, m is the total number of patient records, λ is the penalty coefficient, and the coefficient vector that minimizes the cost function is the optimal result;
then, the variables with non-zero coefficients selected by the lasso regression method are used as the effective B-mode ultrasound features for final modeling;
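The variance-threshold prescreening in step (2) above can be sketched as follows. The function name `variance_threshold`, the use of the population variance, and the choice to keep only features whose variance strictly exceeds the threshold are assumptions for illustration; the feature names in the example are hypothetical.

```python
def variance_threshold(features, threshold=1.0):
    """Return the names of features whose variance exceeds the threshold.

    features: dict mapping a feature name to its list of values
    across patients.
    """
    kept = []
    for name, values in features.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        if var > threshold:
            kept.append(name)
    return kept


candidates = {
    "glcm_energy": [1, 1, 1, 1],       # variance 0: carries no information
    "glrlm_sre": [0, 2, 0, 2],         # variance exactly 1.0: dropped here
    "hist_mean": [0, 4, 0, 4],         # variance 4.0: kept
}
selected = variance_threshold(candidates, threshold=1.0)
```

Features that barely vary across patients cannot help separate NASH from non-NASH cases, so removing them first reduces the number of variables the lasso step has to consider.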
step five, integrating the effective B-mode ultrasound features using a support vector machine (SVM);
the effective B-mode ultrasound radiomics features are predicted with a support vector machine (SVM) model to obtain the predicted value R-NASH; the prediction with the SVM model comprises the following steps:
(1) selecting a Gaussian kernel function $K(x, z)$, where x and z are any two points in the space and σ is the width parameter of the function;
(2) constructing the support vector machine model, in which $\alpha_i$ and b are the coefficients to be solved; $\alpha_i$ is obtained by solving the dual problem subject to:
$0 \le \alpha_i,\quad i = 1, 2, \ldots, N$
$y_i$ represents the R-NASH value corresponding to the input $x_i$;
i and j denote the i-th and j-th samples; $\alpha_i$ and $\alpha_j$ are the coefficients to be solved for the i-th and j-th samples; $x_i$ and $x_j$ represent the B-mode ultrasound input features of the i-th and j-th samples; $y_i$ and $y_j$ represent the integrated B-mode ultrasound feature R-NASH corresponding to the i-th and j-th samples; N represents the number of samples; the optimal solution $\alpha^*$ is obtained, a component of $\alpha^*$ satisfying the required condition is selected, and the optimal b is obtained;
(3) predicting on the B-mode ultrasound radiomics features with the obtained SVM model to obtain the predicted value R-NASH,
wherein x is the vector formed by the effective B-mode ultrasound radiomics features;
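The formulas of this step are not legible in the text above. As a hedged reconstruction only, the standard kernel SVM formulation that matches the symbols and the constraint $0 \le \alpha_i$ given in the step would read as follows. The exact form used in the patent is an assumption: a soft-margin variant would add an upper bound on $\alpha_i$, and whether R-NASH is the decision value $f(x)$ or its sign is not stated.

```latex
\begin{align}
K(x, z) &= \exp\!\left(-\frac{\lVert x - z\rVert^{2}}{2\sigma^{2}}\right) \\
\min_{\alpha}\quad & \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}
  \alpha_i \alpha_j y_i y_j K(x_i, x_j) \;-\; \sum_{i=1}^{N}\alpha_i \\
\text{s.t.}\quad & \sum_{i=1}^{N}\alpha_i y_i = 0, \qquad
  0 \le \alpha_i,\quad i = 1, 2, \ldots, N \\
b^{*} &= y_j - \sum_{i=1}^{N}\alpha_i^{*} y_i K(x_i, x_j)
  \quad\text{for a component } \alpha_j^{*} > 0 \\
f(x) &= \sum_{i=1}^{N}\alpha_i^{*} y_i K(x, x_i) + b^{*}
\end{align}
```

Only the samples with $\alpha_i^{*} > 0$ (the support vectors) contribute to $f(x)$, so the prediction for a new patient depends on a subset of the training records.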
step six, combining the obtained serum index features with the B-mode ultrasound R-NASH feature and obtaining the non-alcoholic steatohepatitis diagnosis model by multivariate logistic regression, comprising:
(1) let $x_1, x_2, \ldots, x_n$ be the predicted value R-NASH and the other serum index features, and let g be the fibrosis index; the regression model is as follows:
$z = h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$
wherein $\theta_0, \theta_1, \theta_2, \ldots, \theta_n$ are the model parameters of the multivariate logistic regression, n is the number of variables, and the features used comprise: R-NASH, PLT, AST, BMI, Lipid and Uric acid;
(2) obtaining the model parameters, which gives the following diagnosis model:
$Y_{\mathrm{NASH}} = 2.145 \times \text{R-NASH} - 0.306 \times \mathrm{PLT} + 0.267 \times \mathrm{AST} + 1.152 \times \mathrm{BMI} + 1.185 \times \mathrm{Lipid} + 0.114 \times \text{Uric acid} - 3.375$;
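The diagnosis model above can be evaluated as in the following minimal sketch. The coefficients are those of the formula; the function names, the dictionary layout, the assumption that the inputs are the normalized values produced in step two, and the optional logistic (sigmoid) mapping of the score to a probability are illustrative assumptions, not statements from the patent.

```python
import math

# Coefficients copied from the diagnosis model formula.
COEFFS = {
    "R-NASH": 2.145,
    "PLT": -0.306,
    "AST": 0.267,
    "BMI": 1.152,
    "Lipid": 1.185,
    "Uric acid": 0.114,
}
INTERCEPT = -3.375


def y_nash(features):
    """Linear score Y_NASH for one patient (dict keyed as in COEFFS)."""
    return INTERCEPT + sum(COEFFS[name] * features[name] for name in COEFFS)


def sigmoid(z):
    """Standard logistic link; mapping Y_NASH this way is an assumption."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical patients with all inputs at 0 and at 1.
all_zero = y_nash({name: 0.0 for name in COEFFS})
all_one = y_nash({name: 1.0 for name in COEFFS})
```

The score is then compared with the threshold confirmed in step seven to decide whether a patient screens positive.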
step seven, confirming the model;
the model threshold range for evaluating NASH is confirmed from the maximum Youden index of the ROC curve through model comparison and performance analysis, so as to obtain a model with fixed final parameters for prediction.
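Threshold selection by the maximum Youden index (J = sensitivity + specificity − 1) can be sketched as follows. The function name `youden_threshold`, the rule that a score at or above the threshold counts as positive, and the tie-breaking toward the lowest threshold are assumptions for illustration.

```python
def youden_threshold(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 for NASH, 0 for non-NASH; scores: model outputs.
    A patient is called positive when score >= threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        tn = sum(1 for y, s in zip(labels, scores) if s < t and y == 0)
        j = tp / pos + tn / neg - 1
        if j > best_j:  # strict comparison keeps the lowest tied threshold
            best_t, best_j = t, j
    return best_t, best_j


labels = [0, 0, 1, 1]
t_sep, j_sep = youden_threshold(labels, [0.1, 0.2, 0.8, 0.9])
t_mix, j_mix = youden_threshold(labels, [0.1, 0.4, 0.35, 0.8])
```

When the two classes are perfectly separated the index reaches 1; with overlapping scores it is lower and the chosen threshold balances missed cases against false alarms.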
2. The construction method for non-invasive screening of a non-alcoholic steatohepatitis model according to claim 1, characterized in that, in step four, the features used in the B-mode ultrasound feature extraction step are as follows:
B-mode ultrasound feature extraction:
the B-mode ultrasound image is read with the screening software, the image is standardized through a gray-level stretching step, and the gray-level texture features of the liver are then calculated and extracted; the gray-level texture features comprise the gray-level co-occurrence matrix, the gray-level run-length matrix, the intensity histogram and the invariant moments;
wherein the algorithms used by the screening software to extract the features are as follows,
the gray-level co-occurrence matrix describes texture according to the spatial correlation of gray levels, and the features calculated from it comprise:
autocorrelation: reflects the correlation among the expected values of the random error terms and is used to evaluate the definition (sharpness) of the image;
correlation (COR): measures the degree of similarity of the elements of the spatial gray-level co-occurrence matrix in the row or column direction, so the correlation value reflects the local gray-level correlation in the image; when the matrix element values are uniform and equal, the correlation value is large; conversely, if the matrix element values differ greatly, the correlation value is small; if the image has horizontal texture, the COR of the horizontal-direction matrix is larger than the COR values of the other matrices;
cluster prominence (prom): when an object stands out abruptly in the image, the greater the contrast between lines and patterns, the larger the prom value;
cluster shade (shade): the shade value may be related to picture quality and, more intuitively, to the degree of wrinkling, as of clothing: the flatter the surface, the smaller the shade value; it measures the skewness of the matrix and reflects uniformity; a high value indicates that the image is asymmetric;
dissimilarity: when contrast is calculated, the weights grow as a power (the square) of the distance between the matrix element and the diagonal; if the weights instead grow linearly, dissimilarity is obtained;
energy, also called the angular second moment (ASM): reflects the uniformity of the gray-level distribution of the image and the coarseness of the texture; if all the values of the co-occurrence matrix are equal, the ASM value is small; conversely, if some values are large and others are small, the ASM value is large; when the elements of the co-occurrence matrix are concentrated, the ASM value is large; a large ASM value indicates a more uniform and regularly varying texture pattern;
homogeneity: measures the local uniformity of the image; the value is lower for a non-uniform image and higher for a uniform image; in contrast to contrast or dissimilarity, the weight in homogeneity decreases nonlinearly as the element moves away from the diagonal;
maximum probability: represents the texture feature that appears most frequently in the image.
3. The construction method for non-invasive screening of a non-alcoholic steatohepatitis model according to claim 2, characterized in that the gray-level run-length matrix is calculated as follows: the gray-level run-length matrix is a matrix formed from the lengths of gray-level runs; it only measures and counts image pixel information, and in actual use the generated gray-level run-length matrix must be further calculated to obtain image feature information based on the gray-level co-occurrence matrix; the features include short run emphasis, long run emphasis, run length non-uniformity, gray level non-uniformity, run percentage, low gray-level run emphasis, high gray-level run emphasis, short run low gray-level emphasis, short run high gray-level emphasis, long run low gray-level emphasis and long run high gray-level emphasis.
4. The construction method for non-invasive screening of a non-alcoholic steatohepatitis model according to claim 2 or 3, characterized in that the intensity histogram is calculated as follows: the features related to the intensity histogram are first-order features calculated from the intensity histogram, comprising the statistical mean of the intensity histogram, the statistical variance of the intensity histogram, the statistical entropy of the intensity histogram and the sum of the statistical energies of the intensity histogram; kurtosis, a characteristic number describing the height of the peak of the probability density curve at the mean, that is, whether the peak of the distribution formed by the samples is abrupt or flat; and skewness, which describes the symmetry of the distribution formed by the samples;
wherein, to describe the symmetry of the distribution formed by the samples, the skewness of the series x is calculated to measure the symmetry of x; if the skewness is negative, the dispersion to the left of the mean of x is stronger than that to the right; if the skewness is positive, the dispersion to the left of the mean of x is weaker than that to the right; for a normal distribution, the skewness is equal to 0.
5. The construction method for non-invasive screening of a non-alcoholic steatohepatitis model according to claim 2 or 3, characterized in that the invariant moments are calculated as follows:
seven invariant moments (φ1 to φ7) are derived from the second-order and third-order normalized central moments;
the moment is a numerical characteristic of a random variable; the first-order origin moment is the expectation, the first-order central moment μ1 is 0, the second-order central moment μ2 is the variance Var(X) of X, the third-order central moment μ3 measures whether the distribution is skewed, and the fourth-order central moment μ4 measures how steep the distribution is near the mean.
6. A prediction system applied to the construction method for non-invasive screening of the non-alcoholic steatohepatitis model according to any one of claims 1 to 5, characterized in that the system comprises:
a data acquisition module, used for preprocessing the collected serum, image and pathological data of steatohepatitis patients;
a serum feature data preprocessing module, used for excluding the NASH and non-NASH pathological biopsy data and applying the following standardization uniformly to the remaining serum variables; the specific processing method is as follows:
(1) sorting each numerical variable in ascending order, treating values below the 2.5th percentile or above the 97.5th percentile as outliers, and replacing them with the 2.5th and 97.5th percentile values, respectively;
(2) normalizing all numerical variables according to the formula $X_{n,\mathrm{normalized}} = (X_n - X_{\min}) / (X_{\max} - X_{\min})$, where $X_n$ represents any numerical variable, $X_{n,\mathrm{normalized}}$ represents the normalized value of the numerical variable $X_n$, $X_{\max}$ represents the maximum value of the numerical variable, and $X_{\min}$ represents the minimum value of the numerical variable;
a serum feature statistical screening module, used for statistical screening of the serum features; specifically, part of the collected data is selected as a training set; indexes with significant correlation are then screened out in the training set by Spearman correlation analysis; statistical screening is then carried out by forward conditional multivariate logistic regression to obtain the serum indexes related to NASH; wherein the serum indexes comprise glutamic-oxaloacetic transaminase (AST), platelet count, blood lipid, BMI and uric acid;
a B-mode ultrasound feature extraction and screening module, used for extracting and screening the B-mode ultrasound features: the gray-level texture features of the liver tissue are calculated and extracted, and the variables with non-zero coefficients selected by the lasso regression method are used as the effective B-mode ultrasound features for final modeling; the effective B-mode ultrasound features are screened out with a Lasso regression model, the cost function of which is defined over the following quantities:
$x_{ij}$ represents the j-th B-mode ultrasound feature variable of the i-th patient record, $y_i$ is the response variable of the i-th patient record, q represents the total number of B-mode ultrasound feature variables, m is the total number of patient records, λ is the penalty coefficient, and the coefficient vector that minimizes the cost function is the optimal result;
a feature integration module, used for integrating the effective B-mode ultrasound features: a support vector machine (SVM) model predicts on the effective B-mode ultrasound radiomics features to obtain the predicted value R-NASH;
a prediction model establishing module, used for combining the obtained serum index features with the B-mode ultrasound R-NASH feature and obtaining the non-alcoholic steatohepatitis screening model by multivariate logistic regression;
a prediction model confirmation module, used for confirming the model threshold range for evaluating NASH so as to obtain a model with fixed final parameters; the model threshold range for evaluating NASH is confirmed from the maximum Youden index of the ROC curve through model comparison and performance analysis, and the model is finally used for prediction.
7. A detection device applied to the construction method for non-invasive screening of a non-alcoholic steatohepatitis model, characterized by comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of a method of construction for non-invasive screening of a non-alcoholic steatohepatitis model according to any one of claims 1 to 5 via execution of the executable instructions.
8. A computer readable storage medium storing a program which when executed performs the steps of a method of constructing a non-invasive screening model for non-alcoholic steatohepatitis as claimed in any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011634864.0A CN113160994A (en) | 2020-12-31 | 2020-12-31 | Construction method, prediction system, device and storage medium for noninvasive screening of non-alcoholic steatohepatitis model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113160994A true CN113160994A (en) | 2021-07-23 |
Family
ID=76878584
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011634864.0A Pending CN113160994A (en) | 2020-12-31 | 2020-12-31 | Construction method, prediction system, device and storage medium for noninvasive screening of non-alcoholic steatohepatitis model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113160994A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113611410A (en) * | 2021-09-08 | 2021-11-05 | 温州医科大学附属第一医院 | Steatohepatitis risk diagnosis equipment and system and training method of residual error network of steatohepatitis risk diagnosis equipment and system |
CN114141363A (en) * | 2021-12-07 | 2022-03-04 | 川北医学院附属医院 | Severe pancreatitis prediction model construction method based on machine learning method |
CN114242247A (en) * | 2021-12-30 | 2022-03-25 | 吉林大学第一医院 | Non-obese MAFLD prediction system, device and storage medium |
CN116825362A (en) * | 2023-08-29 | 2023-09-29 | 北京回龙观医院(北京心理危机研究与干预中心) | Diagnostic prediction model for alcoholic liver injury and construction method and application method thereof |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060172286A1 (en) * | 2005-02-03 | 2006-08-03 | Thierry Poynard | Diagnosis method of alcholic or non-alcoholic steato-hepatitis using biochemical markers |
Non-Patent Citations (3)
Title |
---|
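Feng Gong et al.: "Construction and analysis of a LASSO regression-based prediction model for advanced liver fibrosis in non-alcoholic fatty liver disease", Journal of Clinical Hepatology, vol. 36, no. 10, 15 October 2020 (2020-10-15) * |
Zhou Wei et al.: "Application value of radiomics based on gadoxetate disodium-enhanced magnetic resonance imaging for quantitative assessment of liver reserve function in patients with cirrhosis", Acta Academiae Medicinae Sinicae, vol. 42, no. 04, 3 September 2020 (2020-09-03) * |
Hu Zhijun et al.: "Construction of a noninvasive diagnostic model for non-alcoholic steatohepatitis", Medical Journal of Chinese People's Liberation Army, vol. 45, no. 07, 28 July 2020 (2020-07-28) * |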
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113611410A (en) * | 2021-09-08 | 2021-11-05 | The First Affiliated Hospital of Wenzhou Medical University | Steatohepatitis risk diagnosis equipment and system and training method of residual error network of steatohepatitis risk diagnosis equipment and system |
CN114141363A (en) * | 2021-12-07 | 2022-03-04 | Affiliated Hospital of North Sichuan Medical College | Severe pancreatitis prediction model construction method based on machine learning method |
CN114141363B (en) * | 2021-12-07 | 2023-09-12 | Affiliated Hospital of North Sichuan Medical College | Machine learning method-based severe pancreatitis prediction model construction method |
CN114242247A (en) * | 2021-12-30 | 2022-03-25 | The First Hospital of Jilin University | Non-obese MAFLD prediction system, device and storage medium |
CN116825362A (en) * | 2023-08-29 | 2023-09-29 | Beijing Huilongguan Hospital (Beijing Psychological Crisis Research and Intervention Center) | Diagnostic prediction model for alcoholic liver injury and construction method and application method thereof |
CN116825362B (en) * | 2023-08-29 | 2024-01-02 | Beijing Huilongguan Hospital (Beijing Psychological Crisis Research and Intervention Center) | Diagnostic prediction model for alcoholic liver injury and construction method and application method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ouyang et al. | Video-based AI for beat-to-beat assessment of cardiac function | |
Miranda et al. | A survey of medical image classification techniques | |
CN113160994A (en) | Construction method, prediction system, device and storage medium for noninvasive screening of non-alcoholic steatohepatitis model | |
Deng et al. | Deep learning-based HCNN and CRF-RRNN model for brain tumor segmentation | |
JP6837376B2 (en) | Image processing equipment and methods and programs | |
US9916658B2 (en) | Disease analysis apparatus, control method, and program | |
US9811904B2 (en) | Method and system for determining a phenotype of a neoplasm in a human or animal body | |
CN113850753B (en) | Medical image information computing method, device, edge computing equipment and storage medium | |
Florez et al. | Emergence of radiomics: novel methodology identifying imaging biomarkers of disease in diagnosis, response, and progression | |
Shamrat et al. | Analysing most efficient deep learning model to detect COVID-19 from computer tomography images | |
Abdulkareem et al. | Predicting post-contrast information from contrast agent free cardiac MRI using machine learning: Challenges and methods | |
Albahli et al. | AI-driven deep and handcrafted features selection approach for Covid-19 and chest related diseases identification | |
Bhan et al. | An assessment of machine learning algorithms in diagnosing cardiovascular disease from right ventricle segmentation of cardiac magnetic resonance images | |
CN111528918B (en) | Tumor volume change trend graph generation device after ablation, equipment and storage medium | |
Somasundaram et al. | Fetal brain extraction from magnetic resonance image (MRI) of human fetus | |
Beetz et al. | 3D shape-based myocardial infarction prediction using point cloud classification networks | |
Luong et al. | A computer-aided detection to intracranial hemorrhage by using deep learning: a case study | |
Benrabha et al. | Automatic ROI detection and classification of the achilles tendon ultrasound images | |
Wan et al. | Ceus-net: Lesion segmentation in dynamic contrast-enhanced ultrasound with feature-reweighted attention mechanism | |
Zhuo et al. | Fine-needle aspiration biopsy evaluation-oriented thyroid carcinoma auxiliary diagnosis | |
Abd Hamid et al. | Incorporating attention mechanism in enhancing classification of alzheimer’s disease | |
Vidya et al. | Automated detection of intracranial hemorrhage in noncontrast head computed tomography | |
Raad et al. | Probabilistic medical image imputation via deep adversarial learning | |
Amini et al. | Application of machine learning methods in diagnosis of alzheimer disease based on fractal feature extraction and convolutional neural network | |
EP3667674A1 (en) | Method and system for evaluating images of different patients, computer program and electronically readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||