WO2022246882A1 - A system and method for diagnosing polycystic ovary syndrome - Google Patents

A system and method for diagnosing polycystic ovary syndrome

Info

Publication number
WO2022246882A1
Authority
WO
WIPO (PCT)
Prior art keywords
subject
days
level
amh
less
Prior art date
Application number
PCT/CN2021/097896
Other languages
English (en)
French (fr)
Inventor
李蓉
徐慧玉
乔杰
冯国双
韩勇
史莉
Original Assignee
北京大学第三医院(北京大学第三临床医学院)
杭州青果医疗科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学第三医院(北京大学第三临床医学院), 杭州青果医疗科技有限责任公司
Publication of WO2022246882A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ... for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H50/70: ... for mining of medical data, e.g. analysing previous cases of other patients
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ... for patient-specific data, e.g. for electronic patient records

Definitions

  • the present invention relates to a system and method for assessing the probability of a subject suffering from polycystic ovary syndrome, and to a system and method for assisting in the diagnosis of polycystic ovary syndrome.
  • the system and method of the present invention can be used to assess the probability of the subject suffering from polycystic ovary syndrome, so as to assist in the diagnosis of polycystic ovary syndrome, and to evaluate whether the condition of a subject suffering from polycystic ovary syndrome has improved after corresponding treatment.
  • PCOS polycystic ovary syndrome
  • PCOS is primarily a hyperandrogenic disorder, as has been validated in various rodent models of androgen-induced PCOS. However, how patients produce excess androgen is still unknown. Recent studies using mouse models suggest that AMH is involved in regulating the hypothalamic-pituitary-ovarian (H-P-O) axis and may stimulate excess androgen production.
  • Administration of recombinant human AMH to mice on gestation days 16.5, 17.5, and 18.5 activates AMH receptors in gonadotropin-releasing hormone (GnRH)-secreting neurons and increases the frequency of luteinizing hormone (LH) pulses, resulting in increased serum LH and testosterone levels and decreased estradiol (E2) and progesterone levels in female mice on gestation day 19.5.
  • GnRH gonadotropin-releasing hormone
  • LH luteinizing hormone
  • E2 estradiol
  • High levels of AMH induce elevated serum LH and testosterone, leading
  • the inventors of this application sought to establish a system and method for diagnosing and predicting PCOS using AMH levels and other indicators, which may be helpful for clinical screening and diagnosis of PCOS and may further clarify the mechanism that causes PCOS.
  • the present invention relates to the following:
  • a system for diagnosing polycystic ovary syndrome comprising:
  • a data collection module, which is used to obtain the anti-Müllerian hormone (AMH) level of the subject, collect the upper limit of the menstrual cycle days provided by the subject, collect the BMI of the subject, and obtain the androstenedione (AND) level data of the subject; and
  • a module for calculating the probability of suffering from polycystic ovary syndrome, which processes the data obtained in the data collection module to calculate the probability (p) of the subject suffering from polycystic ovary syndrome.
  • a grouping module, in which default polycystic ovary syndrome grouping parameters are pre-stored and which groups the calculated probability (p) according to those parameters, thereby grouping the subject's risk of suffering from polycystic ovary syndrome.
  • in the module for calculating the probability of suffering from polycystic ovary syndrome, the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level are used after data transformation.
  • AMH anti-Müllerian hormone
  • AND androstenedione
  • the anti-Müllerian hormone (AMH) level refers to the anti-Müllerian hormone concentration in the venous blood of a female subject on any day of the menstrual cycle
  • the androstenedione (AND) level refers to the subject's androstenedione concentration measured on any day of the subject's menstrual period.
  • the anti-Müllerian hormone (AMH) level is converted into a five-category variable
  • the subject's AMH level is divided into five groups, namely: less than 2.5 ng/ml; 2.5 ng/ml or above and less than 5 ng/ml; 5 ng/ml or above and less than 7.5 ng/ml; 7.5 ng/ml or above and less than 10 ng/ml; and 10 ng/ml or above.
  • the upper limit of the subject's menstrual cycle days is converted into a five-category variable
  • the upper limit of the subject's menstrual cycle days is divided into five groups, namely: less than 35 days; 35 days or more and less than 45 days; 45 days or more and less than 60 days; 60 days or more and less than 90 days; and 90 days or more.
  • the subject's BMI is converted into a four-category variable
  • the subject's BMI is divided into four groups, namely: less than 18.5; 18.5 or above and less than 24; 24 or above and less than 28; and 28 or above.
  • the subject's androstenedione (AND) level is converted into a three-category variable
  • the subject's androstenedione (AND) level is divided into three groups, namely: less than 5 nmol/L; 5 nmol/L or above and less than 10 nmol/L; and 10 nmol/L or above.
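The grouping rules above can be sketched in a few lines of Python. This is only an illustration: the cut points come from the text, while the names `group_index`, `AMH_CUTS`, `CYCLE_CUTS`, `BMI_CUTS`, and `AND_CUTS` are hypothetical and do not appear in the application.

```python
from bisect import bisect_right

# Cut points taken from the text; each list holds the lower bound of
# every group except the first.
AMH_CUTS = [2.5, 5.0, 7.5, 10.0]   # ng/ml  -> five groups
CYCLE_CUTS = [35, 45, 60, 90]      # days   -> five groups
BMI_CUTS = [18.5, 24.0, 28.0]      # kg/m^2 -> four groups
AND_CUTS = [5.0, 10.0]             # nmol/L -> three groups


def group_index(value, cuts):
    """Return 0 for the lowest group, 1 for the next one, and so on.

    bisect_right places a value equal to a cut point in the higher
    group, matching the "X or above and less than Y" wording.
    """
    return bisect_right(cuts, value)


if __name__ == "__main__":
    print(group_index(6.2, AMH_CUTS))   # third AMH group (index 2)
    print(group_index(24.0, BMI_CUTS))  # boundary value goes to the higher group
```

Using a sorted list of cut points keeps the boundary handling in one place, so each variable only differs in its data.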
  • the anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND
  • the formula is the following formula one:
  • p is the calculated probability of the subject suffering from polycystic ovary syndrome, and a, b, c, d, i are unitless parameters;
  • AMH, the upper limit of menstrual cycle days, BMI, and AND each take the value 0 or 1.
  • i is any value selected from -4.91525 to -4.081495, and i is preferably -4.498372;
  • When the subject's AMH level is less than 2.5 ng/ml, the value of AMH is 0;
  • When the subject's AMH level is 2.5 ng/ml or above and less than 5 ng/ml, the value of AMH is 1, a is any value selected from 0.3883373 to 0.8463509, and a is preferably 0.6173441;
  • When the subject's AMH level is 5 ng/ml or above and less than 7.5 ng/ml, the value of AMH is 1, a is any value selected from 1.2694194 to 1.7629597, and a is preferably 1.5161895;
  • When the subject's AMH level is 7.5 ng/ml or above and less than 10 ng/ml, the value of AMH is 1, a is any value selected from 1.8891674 to 2.4887798, and a is preferably 2.1889736;
  • When the subject's AMH level is 10 ng/ml or above, the value of AMH is 1, a is any value selected from 2.1935842 to 2.8082163, and a is preferably 2.5009002;
  • When the upper limit of menstrual cycle days of the subject is less than 35 days, the value of the upper limit of menstrual cycle days is 0;
  • When the upper limit of menstrual cycle days of the subject is 35 days or more and less than 45 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.1669412 to 1.6485894, and b is preferably 1.4077653;
  • When the upper limit of menstrual cycle days of the subject is 45 days or more and less than 60 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.5889245 to 2.0947343, and b is preferably 1.8418294;
  • When the upper limit of menstrual cycle days of the subject is 60 days or more and less than 90 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.6497983 to 2.3668561, and b is preferably 2.0083272;
  • When the upper limit of menstrual cycle days of the subject is 90 days or more, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.8809757 to 2.5707838, and b is preferably 2.2258797;
  • the BMI value is 0;
  • the BMI value is 1, c is any value selected from -0.085964 to 0.6550568, and c is preferably 0.2845466;
  • the BMI value is 1, c is any value selected from 0.3957758 to 1.1728099, and c is preferably 0.7842928;
  • the BMI value is 1, c is any value selected from 0.7922476 to 1.6382346, and c is preferably 1.2152411;
  • d is any value selected from 0.269652 to 0.6809945, and d is preferably 0.4753233;
  • the value of AND is 1, d is any value selected from 0.7579538 to 1.252042, and d is preferably 1.0049979.
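The body of formula one did not survive extraction. Because the description names logistic regression as the fitted model and defines p, the intercept i, the coefficients a, b, c, d, and 0/1 indicator values, a standard logistic form consistent with those definitions would read as follows. This is a reconstruction under that assumption, not the wording of the original formula; "Days" is a placeholder symbol for the upper limit of menstrual cycle days.

$$
p = \frac{1}{1 + e^{-\left(i \;+\; a\cdot\mathrm{AMH} \;+\; b\cdot\mathrm{Days} \;+\; c\cdot\mathrm{BMI} \;+\; d\cdot\mathrm{AND}\right)}}
$$

With every indicator equal to 0 (the reference groups), this reduces to $p = 1/(1+e^{-i})$, so the intercept alone sets the baseline probability.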
  • the basis for grouping prestored in the grouping module is:
  • the risk of the subject suffering from polycystic ovary syndrome is medium risk
  • a subject's risk of suffering from polycystic ovary syndrome is high risk when the calculated probability (p) of the subject suffering from polycystic ovary syndrome is ≥ 50%.
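As a rough illustration of how the preferred parameter values could be combined, the sketch below assumes a standard logistic form, p = 1 / (1 + exp(-z)), with z equal to the intercept i plus one coefficient per variable. Two things here are assumptions rather than statements from the text: the logistic form itself (formula one is not reproduced above), and the assignment of the BMI and AND coefficients to groups in ascending order, since the conditions for those groups are not legible. The function name `pcos_probability` is hypothetical.

```python
import math
from bisect import bisect_right

INTERCEPT = -4.498372  # preferred value of i

# (cut points, preferred coefficient per group); index 0 is the
# reference group, whose indicator value is 0.
AMH = ([2.5, 5.0, 7.5, 10.0],
       [0.0, 0.6173441, 1.5161895, 2.1889736, 2.5009002])
CYCLE = ([35, 45, 60, 90],
         [0.0, 1.4077653, 1.8418294, 2.0083272, 2.2258797])
# ASSUMPTION: BMI and AND coefficients are matched to groups in
# ascending order; the text does not state the conditions.
BMI = ([18.5, 24.0, 28.0],
       [0.0, 0.2845466, 0.7842928, 1.2152411])
AND = ([5.0, 10.0],
       [0.0, 0.4753233, 1.0049979])


def pcos_probability(amh_ng_ml, cycle_upper_days, bmi, and_nmol_l):
    """Return p under the assumed logistic form (illustrative only)."""
    z = INTERCEPT
    for value, (cuts, coefs) in ((amh_ng_ml, AMH),
                                 (cycle_upper_days, CYCLE),
                                 (bmi, BMI),
                                 (and_nmol_l, AND)):
        # Pick the coefficient of the group the value falls into.
        z += coefs[bisect_right(cuts, value)]
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Hypothetical subject: AMH 6.2 ng/ml, cycle up to 50 days,
    # BMI 25, AND 7 nmol/L.
    print(f"p = {pcos_probability(6.2, 50, 25.0, 7.0):.3f}")
```

Storing each variable as a pair of cut points and coefficients keeps the reference group explicit and makes it easy to swap in other values from the stated ranges.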
  • a method for diagnosing polycystic ovary syndrome comprising:
  • a data collection step, which obtains the anti-Müllerian hormone (AMH) level of the subject, collects the upper limit of the menstrual cycle days provided by the subject, collects the BMI of the subject, and obtains the androstenedione (AND) level data of the subject; and
  • a step of calculating the probability of suffering from polycystic ovary syndrome, which processes the data obtained in the data collection step to calculate the probability (p) of the subject suffering from polycystic ovary syndrome.
  • a grouping step, in which default polycystic ovary syndrome grouping parameters are pre-stored and the calculated probability (p) is grouped according to those parameters, thereby grouping the subject's risk of suffering from polycystic ovary syndrome.
  • in the step of calculating the probability of suffering from polycystic ovary syndrome, the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level are used after data transformation.
  • the anti-Müllerian hormone (AMH) level refers to the anti-Müllerian hormone concentration in the venous blood of a female subject on any day of the menstrual cycle
  • the androstenedione (AND) level refers to the subject's androstenedione concentration measured on any day of the subject's menstrual period.
  • the anti-Müllerian hormone (AMH) level is converted into a five-category variable
  • the subject's AMH level is divided into five groups, namely: less than 2.5 ng/ml; 2.5 ng/ml or above and less than 5 ng/ml; 5 ng/ml or above and less than 7.5 ng/ml; 7.5 ng/ml or above and less than 10 ng/ml; and 10 ng/ml or above.
  • the upper limit of the subject's menstrual cycle days is converted into a five-category variable
  • the upper limit of the subject's menstrual cycle days is divided into five groups, namely: less than 35 days; 35 days or more and less than 45 days; 45 days or more and less than 60 days; 60 days or more and less than 90 days; and 90 days or more.
  • the subject's BMI is converted into a four-category variable
  • the subject's BMI is divided into four groups, namely: less than 18.5; 18.5 or above and less than 24; 24 or above and less than 28; and 28 or above.
  • the subject's androstenedione (AND) level is converted into a three-category variable
  • the subject's androstenedione (AND) level is divided into three groups, namely: less than 5 nmol/L; 5 nmol/L or above and less than 10 nmol/L; and 10 nmol/L or above.
  • the anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND
  • the formula is the following formula one:
  • p is the calculated probability of the subject suffering from polycystic ovary syndrome, and a, b, c, d, i are unitless parameters;
  • AMH, the upper limit of menstrual cycle days, BMI, and AND each take the value 0 or 1.
  • i is any value selected from -4.91525 to -4.081495, and i is preferably -4.498372;
  • When the subject's AMH level is less than 2.5 ng/ml, the value of AMH is 0;
  • When the subject's AMH level is 2.5 ng/ml or above and less than 5 ng/ml, the value of AMH is 1, a is any value selected from 0.3883373 to 0.8463509, and a is preferably 0.6173441;
  • When the subject's AMH level is 5 ng/ml or above and less than 7.5 ng/ml, the value of AMH is 1, a is any value selected from 1.2694194 to 1.7629597, and a is preferably 1.5161895;
  • When the subject's AMH level is 7.5 ng/ml or above and less than 10 ng/ml, the value of AMH is 1, a is any value selected from 1.8891674 to 2.4887798, and a is preferably 2.1889736;
  • When the subject's AMH level is 10 ng/ml or above, the value of AMH is 1, a is any value selected from 2.1935842 to 2.8082163, and a is preferably 2.5009002;
  • When the upper limit of menstrual cycle days of the subject is less than 35 days, the value of the upper limit of menstrual cycle days is 0;
  • When the upper limit of menstrual cycle days of the subject is 35 days or more and less than 45 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.1669412 to 1.6485894, and b is preferably 1.4077653;
  • When the upper limit of menstrual cycle days of the subject is 45 days or more and less than 60 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.5889245 to 2.0947343, and b is preferably 1.8418294;
  • When the upper limit of menstrual cycle days of the subject is 60 days or more and less than 90 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.6497983 to 2.3668561, and b is preferably 2.0083272;
  • When the upper limit of menstrual cycle days of the subject is 90 days or more, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.8809757 to 2.5707838, and b is preferably 2.2258797;
  • the BMI value is 0;
  • the BMI value is 1, c is any value selected from -0.085964 to 0.6550568, and c is preferably 0.2845466;
  • the BMI value is 1, c is any value selected from 0.3957758 to 1.1728099, and c is preferably 0.7842928;
  • the BMI value is 1, c is any value selected from 0.7922476 to 1.6382346, and c is preferably 1.2152411;
  • d is any value selected from 0.269652 to 0.6809945, and d is preferably 0.4753233;
  • the value of AND is 1, d is any value selected from 0.7579538 to 1.252042, and d is preferably 1.0049979.
  • the basis for grouping prestored in the grouping step is:
  • the risk of the subject suffering from polycystic ovary syndrome is medium risk
  • a subject's risk of suffering from polycystic ovary syndrome is high risk when the calculated probability (p) of the subject suffering from polycystic ovary syndrome is ≥ 50%.
  • the present invention establishes a mathematical model with 4 parameters, that is, a model that considers AMH, menstrual cycle days, BMI and androstenedione level, thereby replacing the prior-art practice of diagnosing polycystic ovary syndrome with an AMH cutoff value alone.
  • the present invention has the following beneficial effects: first, the system constructed by the present invention does not rely on fixed parameters but screens prediction parameters from multiple variables in order to construct the prediction system, and the prediction accuracy of the constructed system has been thoroughly verified. Second, the applicants employed a large sample size and externally validated the stability of the model.
  • the polycystic ovary syndrome diagnostic model established by the inventor may be helpful for clinical screening and diagnosis of polycystic ovary syndrome, and may further clarify the etiology of polycystic ovary syndrome.
  • the probability (p) of the subject suffering from polycystic ovary syndrome can be calculated, and, according to the default polycystic ovary syndrome grouping parameters stored in the system, that probability (p) is grouped to determine the subject's risk of suffering from polycystic ovary syndrome.
  • menstrual period refers to the number of days that each menstruation lasts, generally 3 to 7 days.
  • the androstenedione (AND) level refers to the androstenedione concentration of the subject detected on any day of the subject's menstrual period, for example, it can be the first day, the second day, the third day of the menstrual period.
  • the androstenedione data are relatively stable during the menstrual period.
  • the menstrual cycle refers to the time interval between the first days of two consecutive menstrual periods, and the upper limit of the subject's menstrual cycle is provided by the subject.
  • for example, if the subject's menstrual cycle based on past experience is usually 30-90 days, then the upper limit of the menstrual cycle days obtained in the present invention is 90 days.
  • Anti-Müllerian hormone is a hormone secreted by the granulosa cells of ovarian small follicles. Female babies in the fetal period start to produce AMH from 36 weeks. The more small follicles in the ovary, the higher the concentration of AMH. Conversely, when follicles are gradually consumed with age and various factors, the concentration of AMH will also decrease, and the closer to menopause, AMH will gradually tend to 0.
  • BMI Body Mass Index
  • BMI, or body mass index, is the number obtained by dividing weight in kilograms by the square of height in meters. It is mainly used for statistical purposes: BMI is a neutral and reliable indicator when comparing and analyzing the health impact of body weight across people of different heights.
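The definition above is a one-line calculation. A minimal sketch, with a hypothetical function name and example values:

```python
def body_mass_index(weight_kg, height_m):
    """BMI = weight in kilograms divided by the square of height in meters."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


if __name__ == "__main__":
    # Hypothetical subject: 60 kg, 1.60 m.
    print(round(body_mass_index(60.0, 1.60), 2))
```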
  • Androstenedione is one of the four male hormones in the female blood circulation and is the main precursor of testosterone. Circulating androstenedione is secreted by the ovary and the adrenal gland. There is a dynamic equilibrium between the androstenedione level and the testosterone level, and a high concentration of androstenedione in the circulating blood indicates hyperandrogenism.
  • Continuous variables: in statistics, variables can be divided into continuous variables and categorical variables according to whether the variable value is continuous or not.
  • a variable that can take any value within a certain interval is called a continuous variable; its values are continuous, and the interval between two adjacent values can be divided indefinitely, that is, it can take an infinite number of values.
  • for example, the specifications and sizes of manufactured parts, and anthropometric measurements such as height, weight, and chest circumference are continuous variables, and their values can only be obtained by measurement.
  • a variable whose value can only be calculated using natural numbers or integer units is a discrete variable. For example, the number of enterprises, the number of employees, the number of equipment, etc., can only be counted by the number of units of measurement, and the value of such variables is generally obtained by counting.
  • Categorical variables are variables such as geographic location or demographic attributes that serve to divide survey respondents into groups. Descriptive variables describe how one customer group differs from other customer groups; most categorical variables are also descriptive variables. Categorical variables fall into two types: unordered and ordered. In an unordered categorical variable, the categories or attributes have no difference of degree or order. Unordered categorical variables can in turn be binary, such as gender (male, female) or drug reaction (negative, positive), or multi-category, with classes such as (..., business, learning, military). In an ordered categorical variable, there is a difference of degree between categories.
  • for example, urine sugar test results are classified as -, ±, +, ++, +++; curative effects are classified as cured, markedly effective, improved, and ineffective.
  • for ordered categorical variables, the data should be grouped in rank order, the number of observation units in each group counted, and a frequency table of the ordered variable (by rank) compiled. The data obtained are called rank data.
  • Variable types are not static, and various types of variables can be transformed according to the needs of the research purpose.
  • for example, the amount of hemoglobin (g/L) is originally a numerical variable. If it is divided into two categories, normal and low hemoglobin, it can be analyzed as binary categorical data; if it is divided into five grades, it can be analyzed as rank data.
  • the categorical data can also be quantified. For example, if the patient's nausea reaction can be represented by 0, 1, 2, 3, it can be analyzed according to the numerical variable data (quantitative data).
  • Logistic regression is a generalized linear regression analysis model, which is often used in data mining, automatic disease diagnosis, economic forecasting and other fields. For example, explore the risk factors that cause diseases, and predict the probability of disease occurrence based on risk factors.
  • for example, to study gastric cancer, two groups of people are selected: a gastric cancer group and a non-gastric cancer group. The two groups will differ in physical signs and lifestyle. The dependent variable is therefore whether the person has gastric cancer, with the value "yes" or "no", and the independent variables can include many factors, such as age, sex, eating habits, and Helicobacter pylori infection. Independent variables can be either continuous or categorical.
  • the dependent variable of logistic regression can be binary or multi-category.
  • the data fitting model used herein is a logistic regression model that penalizes the absolute magnitude of the regression coefficients according to the value of λ. With larger penalties, the estimates for weaker factors tend toward zero, so only the strongest predictors remain in the model.
  • Least absolute shrinkage and selection operator regression (often referred to simply as Lasso regression) is a compression estimation method based on the idea of reducing the variable set (order reduction). By constructing a penalty function, it can compress the coefficients of variables and make some regression coefficients become 0, so as to achieve the purpose of variable selection. It is an algorithm that uses a penalty function to improve the predictive ability of the model.
  • the algorithm uses a 1-norm constraint, which not only addresses high-dimensional and collinear problems but also makes the fitted model "sparse"; that is, the algorithm performs automatic variable (wavelength) selection during modeling.
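The shrink-to-zero behaviour can be seen in the simplest textbook setting. For linear regression with orthonormal predictors, the lasso solution is obtained by soft-thresholding each unpenalized coefficient. The sketch below illustrates only that special case; it is not the penalized logistic fit used in the application, and the coefficient values and names are made up for the example.

```python
def soft_threshold(coef, lam):
    """Shrink coef toward zero by lam; anything within [-lam, lam] becomes 0."""
    if coef > lam:
        return coef - lam
    if coef < -lam:
        return coef + lam
    return 0.0


if __name__ == "__main__":
    # Hypothetical unpenalized coefficients for three predictors.
    unpenalized = {"strong": 2.0, "moderate": -0.8, "weak": 0.1}
    for lam in (0.0, 0.5, 1.0):
        shrunk = {name: soft_threshold(c, lam) for name, c in unpenalized.items()}
        kept = [name for name, c in shrunk.items() if c != 0.0]
        print(f"lambda={lam}: kept {kept}")
```

As the penalty grows, the weak predictor is dropped first and then the moderate one, which is the variable-selection effect described above.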
  • 10-fold cross-validation, also written ten-fold cross-validation, is a commonly used method for testing the accuracy of an algorithm.
  • the data set is divided into ten parts, and 9 parts are used as training data and 1 part is used as test data in turn for experimentation.
  • Each trial will result in a corresponding correct rate (or error rate).
  • the average of the correct rate (or error rate) of the results of 10 times is used as an estimate of the accuracy of the algorithm.
  • in general, multiple rounds of 10-fold cross-validation (for example, 10 rounds of 10-fold cross-validation) are required, and the average value is then used as the estimate of the algorithm's accuracy.
  • ten-fold cross-validation divides the data set into 10 parts because a large number of experiments on many data sets with different learning techniques have shown that 10 folds is an appropriate choice for obtaining the best error estimate, and there is also some theoretical support for this.
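The procedure described above can be sketched with the standard library alone. The function names and the `evaluate` callback are hypothetical; `evaluate(train, test)` stands in for whatever model fitting and scoring is being validated and is expected to return a correct rate for that fold.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_score(n_samples, evaluate, k=10, repeats=1):
    """Average evaluate(train, test) over all folds of every repeat."""
    scores = []
    for r in range(repeats):
        for train, test in k_fold_indices(n_samples, k, seed=r):
            scores.append(evaluate(train, test))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Dummy evaluator: reports the share of the data used for training.
    share = cross_validated_score(100, lambda tr, te: len(tr) / 100, k=10, repeats=10)
    print(share)
```

Passing `repeats=10` corresponds to the "10 rounds of 10-fold cross-validation" mentioned in the text, with a different shuffle per round.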
  • Collinearity here means multicollinearity.
  • Multicollinearity refers to the existence of precise correlation or high correlation between the explanatory variables in the linear regression model, which makes the model estimation distorted or difficult to estimate accurately.
  • the model design is improper, resulting in the general correlation among the explanatory variables in the design matrix.
  • Complete collinearity is rare, and generally collinearity to a certain extent occurs, that is, approximate collinearity.
  • Overfitting is "an analysis result that corresponds too closely or precisely to a particular data set, so that it may not fit other data or reliably predict future observations".
  • An overfitted model is a statistical model that contains more parameters than can be justified by the data. The essence of overfitting is to unknowingly extract some residual variation (i.e. noise) as if that variation represented the underlying model structure. In other words, the model memorizes a large number of examples instead of learning to notice features. The likelihood of overfitting depends not only on the number of parameters and the amount of data, but also on how well the model structure matches the shape of the data, and on the magnitude of the model error compared with the expected level of noise or error in the data. Even if the fitted model does not have an excess of parameters, the fitted relationship can be expected to perform worse on a new data set than on the data set used for fitting (this is sometimes called shrinkage).
  • Receiver operating characteristic curve (ROC curve for short), also known as the sensitivity curve.
  • the present invention provides a system for diagnosing polycystic ovary syndrome, which includes:
  • a data collection module, which is used to obtain the anti-Müllerian hormone (AMH) level of the subject, collect the upper limit of the menstrual cycle days provided by the subject, collect the BMI of the subject, and obtain the androstenedione (AND) level data of the subject; and
  • a module for calculating the probability of suffering from polycystic ovary syndrome, which processes the data obtained in the data collection module to calculate the probability (p) of the subject suffering from polycystic ovary syndrome.
  • in the module for calculating the probability of suffering from polycystic ovary syndrome, the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level are used after data transformation.
  • the anti-Müllerian hormone (AMH) level refers to the anti-Müllerian hormone concentration in the venous blood of a female subject on any day of the menstrual cycle. For example, if the subject's menstrual cycle is 28 days, the AMH level can be the anti-Müllerian hormone concentration in venous blood on day 1, day 10, or day 28 of the menstrual cycle.
  • the androstenedione (AND) level refers to the subject's androstenedione concentration measured on any day of the subject's menstrual period. For example, if the subject's menstrual cycle is 25 days, the AND level can be the subject's androstenedione concentration measured on the 1st day, the 3rd day, or the 25th day of the subject's menstrual period.
  • the inventors of the present application conducted in-depth research and converted the anti-Müllerian hormone (AMH) level into a five-category variable; that is, the AMH level is divided into five groups: less than 2.5 ng/ml; 2.5 ng/ml and above and less than 5 ng/ml; 5 ng/ml and above and less than 7.5 ng/ml; 7.5 ng/ml and above and less than 10 ng/ml; and greater than or equal to 10 ng/ml.
  • the inventors of the present application conducted in-depth research and, by exploring the distribution of the independent variables and the outcome variable, converted the upper limit of the subject's menstrual cycle days into a five-category variable; that is, the upper limit of menstrual cycle days is divided into five groups: less than 35 days; 35 days and above and less than 45 days; 45 days and above and less than 60 days; 60 days and above and less than 90 days; and 90 days and above.
  • the inventors of the present application conducted in-depth research and, by exploring the distribution of the independent variables and the outcome variable, converted the subject's BMI into a four-category variable; that is, the subject's BMI is divided into four groups: less than 18.5; 18.5 and above and less than 24; 24 and above and less than 28; and 28 and above.
  • the inventors of the present application conducted in-depth research and converted the subject's androstenedione (AND) level into a three-category variable; that is, the AND level is divided into three groups: less than 5 nmol/L; 5 nmol/L and above and less than 10 nmol/L; and 10 nmol/L and above.
  • the use of such multi-categorical variables for data analysis can convert the nonlinear relationship between an independent variable and the outcome variable into a linear relationship, so that the subject's probability of suffering from polycystic ovary syndrome is calculated more accurately and the model is more stable.
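The grouping of the four continuous measurements into categories can be sketched as follows. This is a minimal illustration, assuming the cut-offs given in the text; the constant and function names are hypothetical and do not come from the patent.

```python
from bisect import bisect_right

# Lower edges of every category after the first, taken from the text.
AMH_EDGES = [2.5, 5.0, 7.5, 10.0]   # ng/ml  -> 5 categories
CYCLE_EDGES = [35, 45, 60, 90]      # days   -> 5 categories
BMI_EDGES = [18.5, 24.0, 28.0]      # kg/m^2 -> 4 categories
AND_EDGES = [5.0, 10.0]             # nmol/L -> 3 categories


def categorize(value, edges):
    """Return the 0-based category index of value.

    A value equal to an edge falls into the higher category,
    matching the "X and above and less than Y" wording.
    """
    return bisect_right(edges, value)
```

Category 0 is the reference group in each case (for example AMH below 2.5 ng/ml), and the highest index is the open-ended top group.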
  • in the module for calculating the probability of suffering from polycystic ovary syndrome, a formula for calculating the probability (p) of suffering from polycystic ovary syndrome is pre-stored; the formula is obtained by fitting multi-categorical variables converted from data on the anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level of subjects in an existing database. The probability (p) of a subject suffering from polycystic ovary syndrome is then grouped according to the grouping criteria.
  • an existing database refers to a database composed of subjects who are receiving treatment or have previously received treatment and who meet the following inclusion and exclusion criteria. There is no fixed requirement for the sample size of the database, although larger is better: for example, 100 subjects, 200 subjects, 300 subjects, preferably 400 subjects or more, more preferably 500 subjects or more. In a specific embodiment, an existing database consisting of 11,720 samples is used.
  • Analytical data: a total of 21,219 ovulation induction treatment cycles were recorded at the Reproductive Medicine Center of Peking University Third Hospital from January to December 2019. After excluding cycles with missing data on menstrual cycle, BMI, testosterone, androstenedione, or antral follicle count (AFC), a total of 11,720 cycles were included in the final analysis.
  • the number of menstrual cycle days in this study refers to the upper limit of the menstrual cycle duration. For example, if a patient has a menstrual cycle of 30-90 days, 90 days is used for the analysis. Because the data in our analyses were de-identified, informed patient consent was not required, in accordance with the Declaration of Helsinki.
  • the module for calculating the probability of suffering from polycystic ovary syndrome uses the following Formula 1 to calculate the probability (p) of the subject suffering from polycystic ovary syndrome:
$$p = \frac{1}{1 + e^{-(i + a \cdot \text{AMH} + b \cdot \text{upper limit of menstrual cycle days} + c \cdot \text{BMI} + d \cdot \text{AND})}} \quad \text{(Formula 1)}$$
  • where p is the calculated probability of the subject suffering from polycystic ovary syndrome, and a, b, c, d and i are unitless parameters;
  • in the calculation, AMH, the upper limit of menstrual cycle days, BMI and AND each take the value 0 or 1.
  • i is any value selected from -4.91525 to -4.081495, and i is preferably -4.498372; when the subject's AMH level is less than 2.5 ng/ml, the value of AMH is 0; when the subject's AMH level is 2.5 ng/ml and above and less than 5 ng/ml, the value of AMH is 1, a is any value selected from 0.3883373 to 0.8463509, and a is preferably 0.6173441; when the subject's AMH level is 5 ng/ml and above and less than 7.5 ng/ml, the value of AMH is 1, a is any value selected from 1.2694194 to 1.7629597, and a is preferably 1.5161895; when the subject's AMH level is 7.5 ng/ml and above and less than 10 ng/ml, the value of AMH is 1, a is any value selected from 1.8891674 to 2.4887798, and a is preferably 2.1889736; when the subject's AMH level is greater than or equal to 10 ng/ml, the value of AMH is 1, and a is selected from 2.193
  • the default polycystic ovary syndrome grouping parameters are pre-stored in the grouping module of the present application, and the grouping basis pre-stored in the grouping module is: when the calculated probability (p) of the subject suffering from polycystic ovary syndrome is < 10%, the subject's risk of polycystic ovary syndrome is low; when 10% ≤ p < 50%, the risk is medium; and when p ≥ 50%, the risk is high.
  • the present application also relates to a method for diagnosing polycystic ovary syndrome, which includes: a data collection step, which obtains the subject's anti-Müllerian hormone (AMH) level, collects the upper limit of menstrual cycle days provided by the subject, collects the subject's BMI, and obtains data on the subject's androstenedione (AND) level; and a step of calculating the probability of suffering from polycystic ovary syndrome, which performs calculations on the data obtained in the data collection step to calculate the probability (p) that the subject suffers from polycystic ovary syndrome.
  • the upper limit of menstrual cycle days refers to the upper limit of the menstrual cycle duration. For example, if during treatment a subject reports that her past menstrual cycles ranged from 30 to 90 days, 90 days is used as the upper limit of menstrual cycle days.
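As a small illustration of taking the upper limit from a self-reported range, the sketch below extracts the largest day count from a free-text answer. The function name and the assumption that the answer arrives as a string such as "30-90" are hypothetical.

```python
import re


def cycle_upper_limit(reported):
    """Return the largest day count in a self-reported cycle length, e.g. '30-90' -> 90."""
    days = [int(n) for n in re.findall(r"\d+", reported)]
    if not days:
        raise ValueError("no day count found in: %r" % reported)
    return max(days)
```

A single value such as "28" is returned unchanged, so regular cycles need no special handling.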
  • whether the subjects were diagnosed with PCOS was determined according to the 2003 Rotterdam criteria (Group, R.E.A.-S.P.c.w. Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome. Fertil Steril 81, 19-24 (2003)), which require the presence of at least two of the following: (1) ovulatory dysfunction (i.e., oligo-ovulation and/or anovulation); (2) hyperandrogenism (high levels of testosterone or androstenedione in blood tests) or clinical manifestations of androgen excess; (3) polycystic ovaries confirmed by ultrasonography.
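The "at least two of three" rule can be written as a short check. This is only a sketch of the counting logic stated in the text; the function and argument names are hypothetical, and each argument is assumed to have been assessed clinically beforehand.

```python
def meets_rotterdam(ovulatory_dysfunction, hyperandrogenism, polycystic_ovaries):
    """Return True when at least two of the three listed features are present."""
    features = [ovulatory_dysfunction, hyperandrogenism, polycystic_ovaries]
    return sum(bool(f) for f in features) >= 2
```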
  • clinical hyperandrogenism refers to acne, androgenetic alopecia or hirsutism; hyperandrogenemia refers to elevated serum total testosterone or androstenedione levels.
  • hirsutism is diagnosed by a modified Ferriman-Gallwey score > 4, or by hair growth involving the upper lip, thigh and lower abdomen with a hair growth score > 2.
  • Measuring androgen levels can be helpful in rare instances when an androgen-secreting tumor is suspected (e.g., when a subject has overt virilization or a rapid onset of symptoms related to PCOS).
  • a menstrual cycle lasting more than 35 days but less than 6 months is diagnosed as oligomenorrhea.
  • Amenorrhea is the absence of menstruation for more than 6 months after a cyclic pattern develops.
  • Polycystic ovaries on ultrasonography were defined as at least one ovary containing 12 or more follicles with a diameter of 2-9 mm or an ovarian volume greater than 10 mL.
  • a single ovary meeting one or both of the above two definitions can be diagnosed as polycystic ovary.
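The ultrasound definition above can be expressed as a per-ovary check. This is a minimal sketch under the thresholds stated in the text; the function names and the `(follicle_count, volume_ml)` tuple layout are assumptions for illustration.

```python
def ovary_is_polycystic(follicle_count_2_9mm, volume_ml):
    """One ovary qualifies with 12 or more follicles of 2-9 mm, or a volume above 10 mL."""
    return follicle_count_2_9mm >= 12 or volume_ml > 10.0


def has_polycystic_ovaries(left, right):
    """left and right are (follicle_count_2_9mm, volume_ml) tuples; one qualifying ovary is enough."""
    return ovary_is_polycystic(*left) or ovary_is_polycystic(*right)
```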
  • Hyperprolactinemia was diagnosed by two serum prolactin (PRL) levels exceeding 25 ng/mL.
  • the number of antral follicles with a diameter of 2-10 mm in both ovaries of the subject was counted by transvaginal ultrasound scanning.
  • subjects' venous blood was collected to measure the concentrations of prolactin (PRL), luteinizing hormone (LH), testosterone, androstenedione, and serum estradiol ( E2 ).
  • Plasma samples for AMH measurements were taken on any day of the menstrual cycle. Blood samples were collected and immediately inverted five times and centrifuged for further endocrine evaluation.
  • Serum levels of PRL, LH, testosterone, androstenedione and E2 were tested using a Siemens Immulite 2000 immunoassay system (Siemens Healthcare Diagnostics, Shanghai, China). Quality controls for PRL, LH, testosterone, androstenedione, and E2 were provided by Bio-Rad Laboratories (Hercules, CA, USA; Lyphochek Immunoassay Plus Control, Trilevel, Cat. No. 370, Lot No. 40370). Serum AMH concentrations were measured using an ultrasensitive two-site ELISA (Ansh Labs LLC; Webster, TX, USA) using the quality control included with the kit. For the three-level quality controls, the coefficients of variation for AMH, PRL and LH were less than 6%, and those for E2, androstenedione and testosterone were less than 10%.
  • the inventors used L1-penalized least absolute shrinkage and selection operator (LASSO) regression for multivariate analysis, with internal validation by 10-fold cross-validation.
  • This is a logistic regression model that penalizes the absolute magnitude of the regression coefficients according to the value of lambda (λ). With larger penalties, the estimates for weaker factors shrink toward zero, so only the strongest predictors remain in the model. The most predictive covariates were selected at the minimum value (λ min). Subsequently, the variables identified by LASSO regression analysis were entered into logistic regression models, and variables that were consistently statistically significant were used to construct the PCOS diagnostic models.
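For reference, the usual textbook form of the L1-penalized logistic regression objective is given below. The patent text does not write it out, so the notation is an assumption: $y_i \in \{0, 1\}$ is the PCOS outcome for subject $i$, $x_i$ the vector of predictors, $\beta_0$ the intercept, $\beta$ the coefficients, $n$ the number of subjects and $k$ the number of predictors.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]
      + \lambda \sum_{j=1}^{k} \lvert \beta_j \rvert
    \right\},
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + x_i^{\top} \beta)}}
```

Larger values of $\lambda$ drive more coefficients exactly to zero, which is the variable-selection behaviour described above; cross-validation is then used to choose $\lambda$.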
  • the inventors evaluated the performance of the PCOS model using the area under the receiver operating characteristic curve (AUC), sensitivity and specificity.
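The three performance measures can be computed with the standard library alone. The sketch below uses the rank (Mann-Whitney) formulation of the AUC; it is an illustration of the metrics, not the inventors' code, and the function names and the 0/1 label encoding are assumptions.

```python
def auc(labels, scores):
    """Area under the ROC curve: the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as positive and return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)
```

The pairwise loop is quadratic in the number of subjects, which is acceptable for a small illustration; a sort-based ranking would be preferable for a data set the size of the one described here.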
  • Table 1 lists the basic characteristics of the variables collected in the examples. All of these indicators were statistically significant in univariate analysis for the diagnosis of PCOS.
  • the continuous variables were converted into categorical variables.
  • the grouping criteria of the independent variables are mainly based on the data exploration before analysis combined with the clinical experience of the inventors of the present application. The grouping criteria for each independent variable remained the same across the three different models.
  • in Table 1, BMI means body mass index; AMH means anti-Müllerian hormone level; TES means testosterone level; AND means androstenedione level; AFC means antral follicle count.
  • LASSO logistic regression with 10-fold cross-validation yields smaller corrected Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values, indicating that data analysis and fitting with LASSO logistic regression is more stable. Therefore, based on the results of the examples, the inventors of the present application chose to use LASSO logistic regression in the subsequent analysis and system construction.
  • AIC means the corrected Akaike information criterion; BIC means Bayesian information criterion.
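The criteria mentioned above have standard definitions, sketched below. It is an assumption that "corrected AIC" denotes the usual small-sample correction (AICc); the text does not give the formula. Here `log_likelihood` is the maximized log-likelihood, `k` the number of estimated parameters and `n` the sample size.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC: AIC + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("n must exceed k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood
```

For all three, a smaller value indicates a better trade-off between goodness of fit and model complexity.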
  • the inventors of the present application noticed that there are literature reports that the level of AMH is expected to replace AFC. Therefore, in further analysis, in this example, the inventors used LASSO logistic regression with 10-fold cross-validation to establish PCOS diagnostic models with or without AFC, so as to confirm whether the subsequent construction of the analysis system requires the use of AFC data.
  • the comparison results of the AUC data in the training set and the validation set with and without AFC are shown in Table 3 below.
  • antral follicle count is the number of follicles with a diameter less than 8 mm in early Gn-dependent follicular growth. It is well known that the pool of primordial follicles in the ovary is related to the number of growing antral follicles, therefore, in theory, AFC should be able to reflect as accurately as possible the pool of remaining ovarian follicles.
  • obtaining good AFC results requires ultrasonography by skilled transvaginal sonography (TVS) specialists, which is time-consuming and resource-intensive.
  • the inventors' in-depth research verified that the level of AMH proposed in the prior art is expected to replace AFC, thereby providing a preliminary basis for constructing a system and method for convenient detection.
  • for Model 1 (without AFC), the inventors of the present application further investigated the trend of each variable, and the data showed that age and testosterone level did not change with the occurrence of PCOS. Since the contribution of age and testosterone to the model is small, the inventors excluded these two variables from the system construction in the subsequent embodiments.
  • the inventor firstly divided the data of all 11720 subjects into an internal verification group and an external verification data group according to the ratio of 80%:20%.
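An 80%:20% split of this kind can be sketched as a seeded random shuffle. The text does not say how subjects were assigned to the two groups, so random assignment, the seed, and the function name are all assumptions.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of records and return (internal_group, external_group) in an 80:20 ratio."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 4 // 5   # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]
```

With 11,720 records this gives 9,376 in the internal group and 2,344 in the external group.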
  • LASSO logistic regression was combined with 10-fold cross-validation to determine the best model (i.e., Model 3).
  • Raw data and their corresponding predicted data for the probability of subjects suffering from PCOS were calculated.
  • Estimated parameter values and p-values for each variable in Model 3 are shown in Table 5.
  • Table 6 shows the AUC, sensitivity and specificity in the modeling data, internal validation data and external validation data.
  • BMI means body mass index; AND means androstenedione level.
  • Formula 1 can be used to calculate the probability (p) of a subject suffering from PCOS based on the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level:
$$p = \frac{1}{1 + e^{-(i + a \cdot \text{AMH} + b \cdot \text{upper limit of menstrual cycle days} + c \cdot \text{BMI} + d \cdot \text{AND})}} \quad \text{(Formula 1)}$$
  • where p is the calculated probability of the subject suffering from PCOS, and a, b, c, d and i are unitless parameters;
  • in the calculation, AMH, the upper limit of menstrual cycle days, BMI and AND each take the value 0 or 1.
  • i is any value selected from -4.91525 to -4.081495, and i is preferably -4.498372;
  • when the subject's AMH level is less than 2.5 ng/ml, the value of AMH is 0;
  • when the subject's AMH level is 2.5 ng/ml and above and less than 5 ng/ml, the value of AMH is 1, a is any value selected from 0.3883373 to 0.8463509, and a is preferably 0.6173441;
  • when the subject's AMH level is 5 ng/ml and above and less than 7.5 ng/ml, the value of AMH is 1, a is any value selected from 1.2694194 to 1.7629597, and a is preferably 1.5161895;
  • when the subject's AMH level is 7.5 ng/ml and above and less than 10 ng/ml, the value of AMH is 1, a is any value selected from 1.8891674 to 2.4887798, and a is preferably 2.1889736;
  • when the subject's AMH level is greater than or equal to 10 ng/ml, the value of AMH is 1, a is any value selected from 2.1935842 to 2.8082163, and a is preferably 2.5009002;
  • when the upper limit of the subject's menstrual cycle days is less than 35 days, the value of the upper limit of menstrual cycle days is 0;
  • when the upper limit of the subject's menstrual cycle days is 35 days and above and less than 45 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.1669412 to 1.6485894, and b is preferably 1.4077653;
  • when the upper limit of the subject's menstrual cycle days is 45 days and above and less than 60 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.5889245 to 2.0947343, and b is preferably 1.8418294;
  • when the upper limit of the subject's menstrual cycle days is 60 days and above and less than 90 days, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.6497983 to 2.3668561, and b is preferably 2.0083272;
  • when the upper limit of the subject's menstrual cycle days is 90 days and above, the value of the upper limit of menstrual cycle days is 1, b is any value selected from 1.8809757 to 2.5707838, and b is preferably 2.2258797;
  • when the subject's BMI is less than 18.5, the BMI value is 0;
  • when the subject's BMI is 18.5 and above and less than 24, the BMI value is 1, c is any value selected from -0.085964 to 0.6550568, and c is preferably 0.2845466;
  • when the subject's BMI is 24 and above and less than 28, the BMI value is 1, c is any value selected from 0.3957758 to 1.1728099, and c is preferably 0.7842928;
  • when the subject's BMI is 28 and above, the BMI value is 1, c is any value selected from 0.7922476 to 1.6382346, and c is preferably 1.2152411;
  • when the subject's AND level is less than 5 nmol/L, the value of AND is 0; when the subject's AND level is 5 nmol/L and above and less than 10 nmol/L, the value of AND is 1, d is any value selected from 0.269652 to 0.6809945, and d is preferably 0.4753233; when the subject's AND level is 10 nmol/L and above, the value of AND is 1, d is any value selected from 0.7579538 to 1.252042, and d is preferably 1.0049979.
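Formula 1 with the preferred parameter values listed above can be sketched in a few lines. This is an illustrative reading, not the inventors' implementation: the lookup-table layout and all names are hypothetical, and a reference category is represented by a coefficient of 0.0, which is equivalent to the indicator taking the value 0.

```python
import math
from bisect import bisect_right

INTERCEPT = -4.498372  # preferred value of i

# (lower edges of categories 1..n, preferred coefficient for categories 0..n)
AMH = ([2.5, 5.0, 7.5, 10.0], [0.0, 0.6173441, 1.5161895, 2.1889736, 2.5009002])
CYCLE = ([35, 45, 60, 90], [0.0, 1.4077653, 1.8418294, 2.0083272, 2.2258797])
BMI = ([18.5, 24.0, 28.0], [0.0, 0.2845466, 0.7842928, 1.2152411])
AND = ([5.0, 10.0], [0.0, 0.4753233, 1.0049979])


def term(value, table):
    """Return the coefficient of the category that value falls into."""
    edges, coefficients = table
    return coefficients[bisect_right(edges, value)]


def pcos_probability(amh_ng_ml, cycle_upper_days, bmi, and_nmol_l):
    """Probability p from Formula 1, using the preferred parameter values."""
    z = (INTERCEPT
         + term(amh_ng_ml, AMH)
         + term(cycle_upper_days, CYCLE)
         + term(bmi, BMI)
         + term(and_nmol_l, AND))
    return 1.0 / (1.0 + math.exp(-z))
```

For example, `pcos_probability(6.0, 40, 22.0, 4.0)` combines the coefficients for AMH of 5 to 7.5 ng/ml, a cycle upper limit of 35 to 45 days, BMI of 18.5 to 24, and the androstenedione reference group.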
  • the relationship between the predicted probability calculated by the above formula 1 and the actual incidence rate of PCOS is that the actual incidence rate of PCOS increases with the increase of the predicted probability.
  • Table 7 shows the top ten groups of women most likely to be predicted to have PCOS. Detailed information includes the upper limit of menstrual cycle days, AMH, BMI, and androstenedione levels, the number of cases with or without actual PCOS, the predicted probability of PCOS occurrence, and the actual incidence of PCOS.
  • when the calculated probability (p) of the subject suffering from polycystic ovary syndrome is < 10%, the subject's risk of polycystic ovary syndrome is low; when 10% ≤ p < 50%, the risk is medium; and when p ≥ 50%, the subject is at high risk of polycystic ovary syndrome.
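The three risk groups map directly onto two thresholds. A minimal sketch, with a hypothetical function name and p expressed as a fraction between 0 and 1:

```python
def risk_group(p):
    """Map a probability p (0 to 1) to the low / medium / high risk group."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p < 0.10:
        return "low"
    if p < 0.50:
        return "medium"
    return "high"
```

The boundary values follow the stated inequalities: exactly 10% is medium risk and exactly 50% is high risk.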
  • Vagios et al. used AMH and BMI to construct a diagnostic model for predicting PCOS (Vagios, S., James, K.E., Sacha, C.R., et al. A patient-specific model combining antimullerian hormone and body mass index as a predictor of polycystic ovary syndrome and other oligo-anovulation disorders).
  • the inventors of the present application have, through in-depth research, established a mathematical model with 4 parameters (that is, a model including AMH level, upper limit of menstrual cycle days, BMI and androstenedione level) and constructed a system and method based on the model, thereby replacing the prior-art practice of simply using an AMH cut-off value to diagnose PCOS.
  • some researchers in the prior art used mouse models to show that excessive AMH can also lead to hyperandrogenism and ovulation disorders.
  • the AUCs are respectively 0.852, 0.857, and 0.838.
  • the main effect of each predictor in the model constructed by the present invention is: AMH 41.2%, upper limit of menstrual cycle days 35.2%, BMI 4.3% and androstenedione 3.7%.
  • the effect of the upper limit of the number of days of the menstrual cycle is second only to the level of AMH, so it is difficult to achieve good results in the prior art prediction method that only considers AMH.
  • the system constructed by the inventors of the present application after in-depth research reveals that both AMH and BMI play an important role in diagnosing PCOS.
  • the inventors of the present application considered that the upper limit of menstrual cycle days exceeding 35 days indicates chronic anovulation, and the longer the menstrual cycle, the more serious the ovulation disorder.
  • BMI is used to assess the severity of obesity, as obese individuals face an increased risk of long-term adverse metabolic disturbances. Therefore, the parameters of the system constructed in the present invention cover all three aspects of the Rotterdam criteria as well as metabolic abnormalities. The results of the examples show that combining AMH, the upper limit of menstrual cycle days, BMI and androstenedione further improves the accuracy of predicting PCOS.
  • the prediction model of suffering from PCOS established in the present invention may become a potential quantitative tool for diagnosing Asian population suffering from PCOS in the future, and also supports excessive AMH secretion as a potential therapeutic target for PCOS.
  • Vagios et al. used AMH and BMI to construct a diagnostic model for predicting PCOS, which also used logistic regression to do so.
  • the system constructed by the present invention does not emphasize fixed parameters, but focuses on screening prediction parameters from multiple variables to construct the prediction system of the present invention, and thoroughly verifies the prediction accuracy of the constructed system.
  • the study by Vagios et al. used BMI-stratified analysis and focused only on AMH and BMI parameters.
  • the applicant of the present invention used a large sample size and was externally validated, indicating its stability; however, the study by Vagios et al. did not have external validation, so the diagnostic performance in different populations is inconclusive.
  • as shown in Table 4, after adjusting for the upper limit of menstrual cycle days, serum AMH level, AFC, BMI, serum androstenedione level, and serum testosterone level, the contribution of age was very small, only 0.2%. Therefore, although age is a very critical parameter in gynecological and obstetric diagnosis, it is not considered in the system and method finally constructed by the present invention, which is distinctive compared with other obstetrics- and gynecology-related diseases.
  • AMH content is expected to replace AFC (one of the criteria obtained by ultrasonography).
  • the use of ultrasonic testing requires expensive equipment and highly trained personnel, which leads to increased costs, poor accuracy and repeatability.
  • transvaginal ultrasound is unacceptable or invasive.
  • applying simple cut-off values to the diagnosis of PCOS has its drawbacks.
  • the prediction results of the model constructed by the present invention show that the contribution of AMH in Model 1 (without AFC) is 35.1%, while the contribution of the combination of AMH and AFC in Model 2 (with AFC) is 35.5%, which shows that AMH in Model 1 can replace AFC.
  • PCOS diagnostic criteria will be available worldwide in the future.
  • PCOS is closely related to obesity, thin women with PCOS are often difficult to diagnose, and up to 30% of reproductive women with PCOS maintain a normal weight, and these thin PCOS patients are often missed.
  • the actual incidence rate of PCOS in people with BMI less than 18.5 kg/m² is 64/1071.
  • with BMI < 18.5 kg/m² and AMH > 10 ng/mL, the incidence of PCOS increased to 21/49.
  • with BMI < 18.5 kg/m², AMH > 10 ng/mL and menstrual cycle duration > 90 days, the incidence of PCOS increased to 10/13.
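The case counts above are easier to compare as percentages. The short sketch below only converts the fractions quoted in the text; the subgroup labels and the function name are descriptive names chosen here.

```python
def incidence_percent(cases, total):
    """Return cases / total as a percentage."""
    return 100.0 * cases / total


# (cases, total) pairs exactly as quoted in the text
SUBGROUPS = {
    "BMI < 18.5": (64, 1071),
    "BMI < 18.5 and AMH > 10 ng/mL": (21, 49),
    "BMI < 18.5, AMH > 10 ng/mL and cycle > 90 days": (10, 13),
}

for label, (cases, total) in SUBGROUPS.items():
    print(f"{label}: {incidence_percent(cases, total):.1f}%")
```

The three fractions correspond to roughly 6.0%, 42.9% and 76.9%, which makes the rise in incidence with high AMH and long cycles among thin women explicit.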
  • These normal-weight or thin women still face fertility challenges, elevated androgen levels and resulting symptoms (such as acne, hirsutism, hair loss), and increased risk of diabetes and cardiovascular disease.
  • the PCOS diagnostic model established by the inventors may help diagnose these patients and alert them to the need for timely treatment, and is hoped to facilitate their long-term health management.


Abstract

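A system for diagnosing polycystic ovary syndrome, comprising: a data collection module for obtaining the subject's anti-Müllerian hormone (AMH) level, collecting the upper limit of menstrual cycle days volunteered by the subject, collecting the subject's BMI, and obtaining data on the subject's androstenedione (AND) level; and a module for calculating the probability of suffering from polycystic ovary syndrome, which performs calculations on the data obtained in the data collection module to calculate the probability (p) that the subject suffers from polycystic ovary syndrome. Using the system, the probability (p) that the subject suffers from polycystic ovary syndrome can be calculated and then grouped according to the default polycystic ovary syndrome grouping parameters pre-stored in the system, thereby determining the subject's risk of suffering from polycystic ovary syndrome.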

Description

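A system and method for diagnosing polycystic ovary syndrome
Technical Field
The present invention relates to a system and method for assessing the probability that a subject suffers from polycystic ovary syndrome, and to a system and method for assisting the diagnosis of polycystic ovary syndrome. The system and method of the present invention can be used to assess the probability that a subject suffers from polycystic ovary syndrome, thereby assisting in diagnosing whether the subject has polycystic ovary syndrome, and to assess whether the subject's polycystic ovary syndrome has improved after the corresponding treatment.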
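Background Art
Polycystic ovary syndrome (PCOS) has an incidence of 5% to 20% among women of childbearing age and is one of the most common endocrine and metabolic diseases; the long-term health of this population deserves particular attention. Only a small proportion of PCOS patients seek medical care because of infertility, and a considerable proportion never seek care, so many PCOS patients cannot effectively manage their future risk of metabolic disease. In addition, because the pathogenesis of PCOS is unknown, the diagnostic criteria currently in common international use are also highly controversial.
Against this background, there is an urgent need in the art for new diagnostic criteria related to the pathogenesis of PCOS. Moreover, current clinical practice for screening and diagnosing PCOS is not easy for general gynecologists and primary care physicians and often results in missed diagnoses. It is therefore of great significance to develop a PCOS diagnostic system that is convenient, fast and easy to disseminate.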
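Summary of the Invention
PCOS is primarily a hyperandrogenic disease, as has been verified in various androgen-induced rodent models of PCOS. However, how patients come to produce excess androgen remains unknown. Recent studies using mouse models have shown that AMH is involved in regulating the hypothalamic-pituitary-ovarian (H-P-O) axis and may stimulate excess androgen production. Administration of recombinant human AMH to mice on days 16.5, 17.5 and 18.5 of gestation activated AMH receptors in gonadotropin-releasing hormone (GnRH)-secreting neurons and increased the frequency of luteinizing hormone (LH) pulses, leading to elevated serum LH and testosterone levels; on day 19.5 of gestation, estradiol (E2) and progesterone levels in the female mice were reduced. High levels of AMH induced increases in serum LH and testosterone, resulting in oligo-ovulation or anovulation and poor oocyte development in mothers and female offspring. AMH is therefore increasingly regarded as a potential marker for diagnosing this disease.
In the present application, the inventors attempt to establish a system and method that use the AMH level and other indicators to diagnose and predict PCOS, which may help in the clinical screening and diagnosis of PCOS and may further elucidate its etiology.
Specifically, the present invention relates to the following: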
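1. A system for diagnosing polycystic ovary syndrome, comprising:
a data collection module for obtaining the subject's anti-Müllerian hormone (AMH) level, collecting the upper limit of menstrual cycle days volunteered by the subject, collecting the subject's BMI, and obtaining data on the subject's androstenedione (AND) level; and
a module for calculating the probability of suffering from polycystic ovary syndrome, which performs calculations on the above data obtained in the data collection module to calculate the probability (p) that the subject suffers from polycystic ovary syndrome.
2. The system according to item 1, further comprising:
a grouping module in which default polycystic ovary syndrome grouping parameters are pre-stored and which, according to these grouping parameters, groups the calculated probability (p) of suffering from polycystic ovary syndrome, thereby grouping the subject's risk of suffering from polycystic ovary syndrome.
3. The system according to item 1 or 2, wherein,
in the module for calculating the probability of suffering from polycystic ovary syndrome, multi-categorical variables converted from the data on the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level are used to calculate the probability (p) that the subject suffers from polycystic ovary syndrome.
4. The system according to any one of items 1 to 3, wherein,
the anti-Müllerian hormone (AMH) level refers to the anti-Müllerian hormone concentration in venous blood on any day of the female subject's menstrual cycle,
the androstenedione (AND) level refers to the subject's androstenedione concentration measured on any day of the subject's menstrual period.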
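5. The system according to any one of items 1 to 4, wherein,
in the module for calculating the probability of suffering from polycystic ovary syndrome, the anti-Müllerian hormone (AMH) level is converted into a five-category variable,
that is, the AMH level is divided into five groups: the subject's AMH level is less than 2.5 ng/ml; 2.5 ng/ml and above and less than 5 ng/ml; 5 ng/ml and above and less than 7.5 ng/ml; 7.5 ng/ml and above and less than 10 ng/ml; and greater than or equal to 10 ng/ml.
6. The system according to any one of items 1 to 5, wherein,
in the module for calculating the probability of suffering from polycystic ovary syndrome, the upper limit of the subject's menstrual cycle days is converted into a five-category variable,
that is, the upper limit of the subject's menstrual cycle days is divided into five groups: less than 35 days; 35 days and above and less than 45 days; 45 days and above and less than 60 days; 60 days and above and less than 90 days; and 90 days and above.
7. The system according to any one of items 1 to 6, wherein,
in the module for calculating the probability of suffering from polycystic ovary syndrome, the subject's BMI is converted into a four-category variable,
that is, the subject's BMI is divided into four groups: less than 18.5; 18.5 and above and less than 24; 24 and above and less than 28; and 28 and above.
8. The system according to any one of items 1 to 7, wherein,
in the module for calculating the probability of suffering from polycystic ovary syndrome, the subject's androstenedione (AND) level is converted into a three-category variable,
that is, the subject's AND level is divided into three groups: less than 5 nmol/L; 5 nmol/L and above and less than 10 nmol/L; and 10 nmol/L and above.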
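9. The system according to any one of items 1 to 8, wherein,
in the module for calculating the probability of suffering from polycystic ovary syndrome, a formula for calculating the probability (p) of suffering from polycystic ovary syndrome is pre-stored, the formula being obtained by fitting multi-categorical variables converted from data on the anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level of subjects in an existing database.
10. The system according to item 9, wherein,
the formula is the following Formula 1:
$$p = \frac{1}{1 + e^{-(i + a \cdot \text{AMH} + b \cdot \text{upper limit of menstrual cycle days} + c \cdot \text{BMI} + d \cdot \text{AND})}} \quad \text{(Formula 1)}$$
where p is the calculated probability of the subject suffering from polycystic ovary syndrome, and a, b, c, d and i are unitless parameters;
in the module for calculating the probability of suffering from polycystic ovary syndrome, the values of a, b, c and d are obtained based on the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level and substituted into Formula 1 for the calculation,
in the calculation, AMH, the upper limit of menstrual cycle days, BMI and AND each take the value 0 or 1.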
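11. The system according to item 10, wherein,
i is any value selected from -4.91525 to -4.081495, and i is preferably -4.498372;
when the subject's AMH level is less than 2.5 ng/ml, AMH takes the value 0;
when the subject's AMH level is 2.5 ng/ml and above and less than 5 ng/ml, AMH takes the value 1, a is any value selected from 0.3883373 to 0.8463509, and a is preferably 0.6173441;
when the subject's AMH level is 5 ng/ml and above and less than 7.5 ng/ml, AMH takes the value 1, a is any value selected from 1.2694194 to 1.7629597, and a is preferably 1.5161895;
when the subject's AMH level is 7.5 ng/ml and above and less than 10 ng/ml, AMH takes the value 1, a is any value selected from 1.8891674 to 2.4887798, and a is preferably 2.1889736;
when the subject's AMH level is greater than or equal to 10 ng/ml, AMH takes the value 1, a is any value selected from 2.1935842 to 2.8082163, and a is preferably 2.5009002;
when the upper limit of the subject's menstrual cycle days is less than 35 days, the upper limit of menstrual cycle days takes the value 0;
when the upper limit of the subject's menstrual cycle days is 35 days and above and less than 45 days, the upper limit of menstrual cycle days takes the value 1, b is any value selected from 1.1669412 to 1.6485894, and b is preferably 1.4077653;
when the upper limit of the subject's menstrual cycle days is 45 days and above and less than 60 days, the upper limit of menstrual cycle days takes the value 1, b is any value selected from 1.5889245 to 2.0947343, and b is preferably 1.8418294;
when the upper limit of the subject's menstrual cycle days is 60 days and above and less than 90 days, the upper limit of menstrual cycle days takes the value 1, b is any value selected from 1.6497983 to 2.3668561, and b is preferably 2.0083272;
when the upper limit of the subject's menstrual cycle days is 90 days and above, the upper limit of menstrual cycle days takes the value 1, b is any value selected from 1.8809757 to 2.5707838, and b is preferably 2.2258797;
when the subject's BMI is less than 18.5, BMI takes the value 0;
when the subject's BMI is 18.5 and above and less than 24, BMI takes the value 1, c is any value selected from -0.085964 to 0.6550568, and c is preferably 0.2845466;
when the subject's BMI is 24 and above and less than 28, BMI takes the value 1, c is any value selected from 0.3957758 to 1.1728099, and c is preferably 0.7842928;
when the subject's BMI is 28 and above, BMI takes the value 1, c is any value selected from 0.7922476 to 1.6382346, and c is preferably 1.2152411;
when the subject's AND level is less than 5 nmol/L, AND takes the value 0;
when the subject's AND level is 5 nmol/L and above and less than 10 nmol/L, AND takes the value 1, d is any value selected from 0.269652 to 0.6809945, and d is preferably 0.4753233;
when the subject's AND level is 10 nmol/L and above, AND takes the value 1, d is any value selected from 0.7579538 to 1.252042, and d is preferably 1.0049979.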
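12. The system according to any one of items 1 to 11, wherein,
the grouping basis pre-stored in the grouping module is:
when the calculated probability (p) of the subject suffering from polycystic ovary syndrome is < 10%, the subject's risk of suffering from polycystic ovary syndrome is low;
when 10% ≤ the calculated probability (p) of the subject suffering from polycystic ovary syndrome < 50%, the subject's risk of suffering from polycystic ovary syndrome is medium;
when the calculated probability (p) of the subject suffering from polycystic ovary syndrome is ≥ 50%, the subject's risk of suffering from polycystic ovary syndrome is high.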
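13. A method for diagnosing polycystic ovary syndrome, comprising:
a data collection step, which obtains the subject's anti-Müllerian hormone (AMH) level, collects the upper limit of menstrual cycle days volunteered by the subject, collects the subject's BMI, and obtains data on the subject's androstenedione (AND) level; and
a step of calculating the probability of suffering from polycystic ovary syndrome, which performs calculations on the above data obtained in the data collection step to calculate the probability (p) that the subject suffers from polycystic ovary syndrome.
14. The method according to item 13, further comprising:
a grouping step in which default polycystic ovary syndrome grouping parameters are pre-stored and in which, according to these grouping parameters, the calculated probability (p) of suffering from polycystic ovary syndrome is grouped, thereby grouping the subject's risk of suffering from polycystic ovary syndrome.
15. The method according to item 13 or 14, wherein,
in the step of calculating the probability of suffering from polycystic ovary syndrome, multi-categorical variables converted from the data on the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level are used to calculate the probability (p) that the subject suffers from polycystic ovary syndrome.
16. The method according to any one of items 13 to 15, wherein,
the anti-Müllerian hormone (AMH) level refers to the anti-Müllerian hormone concentration in venous blood on any day of the female subject's menstrual cycle,
the androstenedione (AND) level refers to the subject's androstenedione concentration measured on any day of the subject's menstrual period.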
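17. The method according to any one of items 13 to 16, wherein,
in the step of calculating the probability of suffering from polycystic ovary syndrome, the anti-Müllerian hormone (AMH) level is converted into a five-category variable,
that is, the AMH level is divided into five groups: the subject's AMH level is less than 2.5 ng/ml; 2.5 ng/ml and above and less than 5 ng/ml; 5 ng/ml and above and less than 7.5 ng/ml; 7.5 ng/ml and above and less than 10 ng/ml; and greater than or equal to 10 ng/ml.
18. The method according to any one of items 13 to 17, wherein,
in the step of calculating the probability of suffering from polycystic ovary syndrome, the upper limit of the subject's menstrual cycle days is converted into a five-category variable,
that is, the upper limit of the subject's menstrual cycle days is divided into five groups: less than 35 days; 35 days and above and less than 45 days; 45 days and above and less than 60 days; 60 days and above and less than 90 days; and 90 days and above.
19. The method according to any one of items 13 to 18, wherein,
in the step of calculating the probability of suffering from polycystic ovary syndrome, the subject's BMI is converted into a four-category variable,
that is, the subject's BMI is divided into four groups: less than 18.5; 18.5 and above and less than 24; 24 and above and less than 28; and 28 and above.
20. The method according to any one of items 13 to 19, wherein,
in the step of calculating the probability of suffering from polycystic ovary syndrome, the subject's androstenedione (AND) level is converted into a three-category variable,
that is, the subject's AND level is divided into three groups: less than 5 nmol/L; 5 nmol/L and above and less than 10 nmol/L; and 10 nmol/L and above.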
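21. The method according to any one of items 13 to 20, wherein,
in the step of calculating the probability of suffering from polycystic ovary syndrome, a formula for calculating the probability (p) of suffering from polycystic ovary syndrome is pre-stored, the formula being obtained by fitting multi-categorical variables converted from data on the anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level of subjects in an existing database.
22. The method according to item 21, wherein,
the formula is the following Formula 1:
$$p = \frac{1}{1 + e^{-(i + a \cdot \text{AMH} + b \cdot \text{upper limit of menstrual cycle days} + c \cdot \text{BMI} + d \cdot \text{AND})}} \quad \text{(Formula 1)}$$
where p is the calculated probability of the subject suffering from polycystic ovary syndrome, and a, b, c, d and i are unitless parameters;
in the step of calculating the probability of suffering from polycystic ovary syndrome, the values of a, b, c and d are obtained based on the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI, and androstenedione (AND) level and substituted into Formula 1 for the calculation,
in the calculation, AMH, the upper limit of menstrual cycle days, BMI and AND each take the value 0 or 1.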
23.根据项22所述的方法,其中,
i为选自-4.91525~-4.081495中的任意数值,i优选为-4.498372;
当受试者的AMH水平小于2.5ng/ml时,AMH取值为0;
当受试者的AMH水平在2.5ng/ml及以上且小于5ng/ml时,AMH取值为1,a为选自0.3883373~0.8463509中的任意数值,a优选为0.6173441;
当受试者的AMH水平在5ng/ml及以上且小于7.5ng/ml时,AMH取值为1,a为选自1.2694194~1.7629597中的任意数值,a优选为1.5161895;
当受试者的AMH水平在7.5ng/ml及以上且小于10ng/ml时,AMH取值为1,a为选自1.8891674~2.4887798中的任意数值,a优选为2.1889736;
当受试者的AMH水平大于等于10ng/ml时,AMH取值为1,a为选自2.1935842~2.8082163中的任意数值,a优选为2.5009002;
当受试者的月经周期天数上限小于35天时,月经周期天数上限取值为0;
当受试者的月经周期天数上限在35天及以上且小于45天时,月经周期天数上限取值为1,b为选自1.1669412~1.6485894中的任意数值,b优选为1.4077653;
当受试者的月经周期天数上限在45天及以上且小于60天时,月经周期天数上限取值为1,b为选自1.5889245~2.0947343中的任意数值,b优选为1.8418294;
当受试者的月经周期天数上限在60天及以上且小于90天时,月经周期天数上限取值为1,b为选自1.6497983~2.3668561中的任意数值,b优选为2.0083272;
当受试者的月经周期天数上限在90天及以上时,月经周期天数上限取值为1,b为选自1.8809757~2.5707838中的任意数值,b优选为2.2258797;
当受试者的BMI小于18.5时,BMI取值为0;
当受试者的BMI在18.5及以上且小于24时,BMI取值为1,c为选自-0.085964~0.6550568中的任意数值,c优选为0.2845466;
当受试者的BMI在24及以上且小于28时,BMI取值为1,c为选自0.3957758~1.1728099中的任意数值,c优选为0.7842928;
当受试者的BMI在28及以上时,BMI取值为1,c为选自0.7922476~1.6382346中的任意数值,c优选为1.2152411;
当受试者的AND水平小于5nmol/L时,AND取值为0;
当受试者的AND水平在5nmol/L及以上且小于10nmol/L时,AND取值为1,d为选自0.269652~0.6809945中的任意数值,d优选为0.4753233;
当受试者的AND水平在10nmol/L及以上时,AND取值为1,d为选自0.7579538~1.252042中的任意数值,d优选为1.0049979。
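To illustrate how items 22 and 23 fit together, the following minimal Python sketch maps the four measurements to their categories and evaluates Formula 1 with the preferred coefficient values listed above. The function and constant names (`pcos_probability`, `INTERCEPT`, `_term`, and so on) are assumptions introduced for this example and do not appear in the application. The lowest category of each variable is treated as the reference group: its indicator is 0, so it contributes nothing to the sum.

```python
import math
from bisect import bisect_right

# Preferred value of i from item 23.
INTERCEPT = -4.498372

# For each variable: the lower cut-points of the non-reference categories and
# the preferred coefficient of every category (0.0 for the reference group).
AMH_CUTS, AMH_COEF = [2.5, 5.0, 7.5, 10.0], [0.0, 0.6173441, 1.5161895, 2.1889736, 2.5009002]
CYCLE_CUTS, CYCLE_COEF = [35, 45, 60, 90], [0.0, 1.4077653, 1.8418294, 2.0083272, 2.2258797]
BMI_CUTS, BMI_COEF = [18.5, 24.0, 28.0], [0.0, 0.2845466, 0.7842928, 1.2152411]
AND_CUTS, AND_COEF = [5.0, 10.0], [0.0, 0.4753233, 1.0049979]


def _term(value, cuts, coefs):
    """Return the coefficient of the category that contains `value`.

    bisect_right makes every interval closed on the left, matching
    "X or above and below Y" in item 23.
    """
    return coefs[bisect_right(cuts, value)]


def pcos_probability(amh_ng_ml, cycle_upper_days, bmi, and_nmol_l):
    """Evaluate Formula 1 with the preferred coefficients (illustrative only)."""
    logit = (
        INTERCEPT
        + _term(amh_ng_ml, AMH_CUTS, AMH_COEF)
        + _term(cycle_upper_days, CYCLE_CUTS, CYCLE_COEF)
        + _term(bmi, BMI_CUTS, BMI_COEF)
        + _term(and_nmol_l, AND_CUTS, AND_COEF)
    )
    return 1.0 / (1.0 + math.exp(-logit))
```

For a hypothetical subject with AMH 12 ng/ml, a cycle upper limit of 90 days, BMI 17 and AND 3 nmol/L, the sketch returns approximately 0.557.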
24.根据项13~23中任一项所述的方法,其中,
在所述分组步骤中预存的分组依据为:
当计算出的受试者罹患多囊卵巢综合征的概率(p)<10%时,受试者罹患多囊卵巢综合征的风险是低危;
当10%≤计算出的受试者罹患多囊卵巢综合征的概率(p)<50%时,受试者罹患多囊卵巢综合征的风险是中风险;
当计算出的受试者罹患多囊卵巢综合征的概率(p)≥50%时,受试者罹患多囊卵巢综合征的风险是高风险。
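The grouping rule of item 24 can be sketched as a small Python function. The name `risk_group` and the returned labels are assumptions made for this example; only the 10% and 50% thresholds come from the text.

```python
def risk_group(p):
    """Map a probability p (0..1) to the risk group described in item 24."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if p < 0.10:
        return "low"      # p < 10%
    if p < 0.50:
        return "medium"   # 10% <= p < 50%
    return "high"         # p >= 50%
```

Both boundaries belong to the higher group, so a probability of exactly 10% is classed as medium risk and exactly 50% as high risk.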
发明效果
本发明建立了一个具有4个参数的数学模型,即考虑AMH、月经周期天数、BMI和雄烯二酮水平的模型,从而代替了现有技术中简单采用AMH临界值来诊断多囊卵巢综合征的情况。与现有技术相比,本发明具有以下有益效果:首先,本发明构建的系统没有强调固定参数,而是着重于从多个变量中筛选预测参数从而用于构建本发明的预测系统,并深入地验证了其构建系统的预测准确性。其次,本发明的申请人采用的样本量较大,并经过外部验证,表明其稳定性。可见本发明人建立的多囊卵巢综合征诊断模型可能有助于在临床上筛查和诊断多囊卵巢综合征,并且有可能进一步阐明多囊卵巢综合征的病因。利用本发明的系统,可以计算出受试者罹患多囊卵巢综合征的概率(p),并依据系统预存的默认的多囊卵巢综合征分组参数,对该受试者罹患多囊卵巢综合征的概率(p)进行分组,从而判断受试者罹患多囊卵巢综合征的风险。
具体实施方式
在本文中,月经期是指每次月经持续的天数,一般为3~7天。在本文中,雄烯二酮(AND)水平是指受试者月经期中任一天所检测的受试者的雄烯二酮浓度,例如可以是月经期第一天、第二天、第三天、第四天、第五天、第六天或第七天等所检测到的受试者的雄烯二酮浓度。通常雄烯二酮数据在月经期比较稳定。
在本文中,月经周期是指两次月经第1日的时间间隔,受试者月经周期天数上限是由受试者主动提供的,例如受试者基于过往的经验提供的其月经周期通常是30-90天,那么在本发明中获取的月经周期天数上限为90天。
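The rule for deriving the upper limit of menstrual cycle days from a self-reported range can be sketched as follows. The function name `cycle_upper_limit` and the free-text input format are assumptions made for this illustration; the application only states that the largest reported value is used.

```python
import re


def cycle_upper_limit(reported):
    """Return the largest number of days mentioned in a self-reported range.

    Example inputs (hypothetical): "30-90", "28", "35 to 60 days".
    """
    numbers = [int(n) for n in re.findall(r"\d+", reported)]
    if not numbers:
        raise ValueError("no cycle length found in: %r" % reported)
    return max(numbers)
```

With the example from the text, a reported cycle of "30-90" days gives an upper limit of 90.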
抗缪勒氏管激素(AMH)是一种由卵巢小卵泡的颗粒层细胞所分泌的荷尔蒙,胎儿时期的女宝宝从36周开始制造AMH,卵巢内的小卵泡数量越多,AMH的浓度便越高;反之,当卵泡随着年龄及各种因素逐渐消耗, AMH浓度也会随之降低,越接近更年期,AMH便渐趋于0。
BMI(Body Mass Index)是体质指数又称体重指数的简称,是用体重公斤数除以身高米数平方得出的数字,是国际上常用的衡量人体胖瘦程度以及是否健康的一个标准。主要用于统计用途,当我们需要比较及分析一个人的体重对于不同高度的人所带来的健康影响时,BMI值是一个中立而可靠的指标。
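The BMI definition above (weight in kilograms divided by the square of height in metres) corresponds to this minimal sketch; the function name `body_mass_index` is an assumption made for the example.

```python
def body_mass_index(weight_kg, height_m):
    """BMI = weight (kg) / height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)
```

For instance, a weight of 60 kg and a height of 1.60 m give 60 / 2.56 = 23.4375.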
雄烯二酮(AND)是女性血循环中的四种雄性激素之一,雄烯二酮是睾酮的主要前体物质,循环中的雄烯二酮由卵巢和肾上腺的分泌各占一半,雄烯二酮水平和睾酮水平存在动态平衡的关系,循环血中雄烯二酮浓度过高提示高雄激素血症。
连续变量:在统计学中,变量按变量值是否连续可分为连续变量与分类变量两种。在一定区间内可以任意取值的变量叫连续变量,其数值是连续不断的,相邻两个数值可作无限分割,即可取无限个数值。例如,生产零件的规格尺寸,人体测量的身高、体重、胸围等为连续变量,其数值只能用测量或计量的方法取得。反之,其数值只能用自然数或整数单位计算的则为离散变量。例如,企业个数,职工人数,设备台数等,只能按计量单位数计数,这种变量的数值一般用计数方法取得。
分类变量是指地理位置、人口统计等方面的变量,其作用是将调查响应者分群。描述变量是描述某一个客户群与其他客户群的区别。大部分分类变量也就是描述变量。分类变量可以分为无序分类变量和有序分类变量两大类。其中,无序分类变量(unordered categorical variable)是指所分类别或属性之间无程度和顺序的差别。其又可分为①二项分类,如性别(男、女),药物反应(阴性和阳性)等;②多项分类,如血型(O、A、B、AB),职业(工、农、商、学、兵)等。而有序分类变量(ordinal categorical variable)各类别之间有程度的差别。如尿糖化验结果按-、±、+、++、+++分类;疗效按治愈、显效、好转、无效分类。对于有序分类变量,应先按等级顺序分组,清点各组的观察单位个数,编制有序变量(各等级)的频数表,所得资料称为等级资料。
变量类型不是一成不变的,根据研究目的的需要,各类变量之间可以进行转化。例如血红蛋白量(g/L)原属数值变量,若按血红蛋白正常与偏低分为两类时,可按二项分类资料分析;若按重度贫血、中度贫血、轻度贫 血、正常、血红蛋白增高分为五个等级时,可按等级资料分析。有时亦可将分类资料数量化,如可将病人的恶心反应以0、1、2、3表示,则可按数值变量资料(定量资料)分析。
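Logistic regression is a generalized linear regression model commonly used in data mining, automated disease diagnosis, economic forecasting and similar fields, for example to explore the risk factors for a disease and to predict the probability of the disease from those risk factors. Taking gastric cancer as an example, two groups are selected, one with gastric cancer and one without; the two groups will differ in physical signs, lifestyle and so on. The dependent variable is therefore whether the subject has gastric cancer, with the value "yes" or "no", while the independent variables can be numerous, such as age, sex, dietary habits and Helicobacter pylori infection. Independent variables may be continuous or categorical. Logistic regression analysis then yields a weight for each independent variable, giving a rough picture of which factors are risk factors for gastric cancer. The same weights can be used to predict, from the risk factors, the likelihood that a person has the cancer. The dependent variable in logistic regression may be binary or multi-category.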
在本文中使用的数据拟合模型是一个逻辑回归模型,它基于λ的值对回归模型的系数的绝对大小进行惩罚。惩罚越大,对较弱因素的估计就趋近于零,因此只有最强的预测变量保留在模型中。
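Least absolute shrinkage and selection operator regression (usually simply called Lasso regression) is a shrinkage estimation method based on the idea of reducing the variable set (dimension reduction). By constructing a penalty function, it shrinks the coefficients of the variables and forces some regression coefficients to exactly 0, thereby achieving variable selection. It is an algorithm that uses a penalty function to improve the predictive ability of a model; its 1-norm constraint not only addresses high dimensionality and collinearity but also makes the fitted model "sparse", that is, the algorithm performs variable selection automatically during modelling.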
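In its usual textbook form, the L1-penalized (Lasso) logistic regression described here minimizes the negative log-likelihood plus a penalty on the absolute size of the coefficients. The notation below (n subjects, m predictors, outcome y_j, predictor vector x_j, intercept β_0) is an assumption introduced for illustration; the application does not state the exact parameterization used by the software.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}
\left\{
 -\frac{1}{n}\sum_{j=1}^{n}
 \Bigl[\, y_j\,(\beta_0 + x_j^{\top}\beta)
       - \ln\!\bigl(1 + e^{\beta_0 + x_j^{\top}\beta}\bigr) \Bigr]
 \;+\; \lambda \sum_{k=1}^{m} \lvert \beta_k \rvert
\right\}
```

A larger λ pushes more coefficients to exactly zero, which is why only the strongest predictors remain in the model.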
10倍交叉验证(10-fold cross-validation),或称十折交叉验证,是常用的测试方法,用来测试算法准确性。在验证时将数据集分成十份,轮流将其中9份作为训练数据,1份作为测试数据,进行试验。每次试验都会得出相应的正确率(或差错率)。10次的结果的正确率(或差错率)的平均值作为对算法精度的估计,一般还需要进行多次10折交叉验证(例如10次10折交叉验证),再求其均值,作为对算法准确性的估计。十折交叉验证之所以选择将数据集分为10份,是因为通过利用大量数据集、使用不同学习技术进行的大量试验,表明10折是获得最好误差估计的恰当选择,而且也有一些理论根据可以证明这一点。
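The splitting scheme behind 10-fold cross-validation can be sketched in plain Python. This is a minimal illustration of how the folds are formed, not the procedure of any particular statistics package; the function name `k_fold_indices` and the fixed random seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each sample appears in exactly one test fold, and the accuracy (or error rate) measured on the k test folds is averaged to estimate model performance.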
共线性,即同线性或同线型。统计学中,共线性即多重共线性。多重共线性(Multicollinearity)是指线性回归模型中的解释变量之间由于存在精确相关关系或高度相关关系而使模型估计失真或难以估计准确。一般来说,由于经济数据的限制使得模型设计不当,导致设计矩阵中解释变量间存在普遍的相关关系。完全共线性的情况并不多见,一般出现的是在一定程度上的共线性,即近似共线性。
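Overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably". An overfitted model is a statistical model that contains more parameters than can be justified by the data. The essence of overfitting is to have unknowingly extracted some of the residual variation (that is, the noise) as if that variation represented underlying model structure. In other words, the model memorizes a large number of examples instead of learning to notice features. The possibility of overfitting depends not only on the number of parameters and the amount of data but also on how well the model structure conforms to the shape of the data, and on the magnitude of model error compared with the expected level of noise or error in the data. Even when the fitted model does not have an excessive number of parameters, it is to be expected that the fitted relationship will perform less well on a new data set than on the data set used for fitting (a phenomenon sometimes known as shrinkage).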
接收者操作特征曲线(receiver operating characteristic curve,简称ROC曲线),又称为感受性曲线(sensitivity curve)。得此名的原因在于曲线上各点反映着相同的感受性,它们都是对同一信号刺激的反应,只不过是在几种不同的判定标准下所得的结果而已。接受者操作特性曲线就是以虚惊概率为横轴,击中概率为纵轴所组成的坐标图,和被试在特定刺激条件下由于采用不同的判断标准得出的不同结果画出的曲线。
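The area under the ROC curve (AUC) can be computed without drawing the curve, as the fraction of (affected, unaffected) pairs in which the affected subject receives the higher score, with ties counted as one half. The sketch below uses that pairwise definition; the function name `roc_auc` is an assumption, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive (1) and negative (0) cases."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.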
本发明提供一种诊断多囊卵巢综合征的系统,其包括:
数据采集模块,其用于获取受试者的抗缪勒氏管激素(AMH)水平、收集受试者主动提供的月经周期天数上限、收集受试者的BMI、以及获取受试者的雄烯二酮(AND)水平的数据;以及
计算罹患多囊卵巢综合征的概率的模块,其用于将数据采集模块中获取的上述数据信息进行计算,从而计算出受试者罹患多囊卵巢综合征的概率(p)。在计算罹患多囊卵巢综合征的概率的模块中,利用将受试者的抗缪勒氏管激素(AMH)水平、月经周期天数上限、BMI、以及雄烯二酮(AND)水平的数据转换成的多分类变量来计算受试者罹患多囊卵巢综合征的概率(p)。其中,所述抗缪勒氏管激素(AMH)水平是指女性受试者月经周期任何一天的静 脉血中的抗缪勒氏管激素浓度。比如受试者的月经周期为28天,则所述抗缪勒氏管激素(AMH)水平可以是月经周期第1天的静脉血中的抗缪勒氏管激素浓度,可以是月经周期第10天的静脉血中的抗缪勒氏管激素浓度,也可以是月经周期第28天的静脉血中的抗缪勒氏管激素浓度。
所述雄烯二酮(AND)水平是指受试者月经期中任一天所检测的受试者的雄烯二酮浓度。比如受试者的月经周期为25天,则所述雄烯二酮(AND)水平可以是受试者月经期中第1天所检测的受试者的雄烯二酮浓度,可以是受试者月经期中第3天所检测的受试者的雄烯二酮浓度,也可以是受试者月经期中第25天所检测的受试者的雄烯二酮浓度。
在计算罹患多囊卵巢综合征的概率的模块中,本申请的发明人经过深入研究,通过探索自变量与结局变量的分布情况,将所述抗缪勒氏管激素(AMH)水平转换成五分类变量,即将所述抗缪勒氏管激素(AMH)水平分为五组,分别为:受试者的抗缪勒氏管激素(AMH)水平小于2.5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在2.5ng/ml及以上且小于5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在5ng/ml及以上且小于7.5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在7.5ng/ml及以上且小于10ng/ml,以及受试者的抗缪勒氏管激素(AMH)水平大于等于10ng/ml。
在计算罹患多囊卵巢综合征的概率的模块中,本申请的发明人经过深入研究,通过探索自变量与结局变量的分布情况,将所述受试者的月经周期天数上限转换成五分类变量,即将受试者的月经周期天数上限分为五组,分别为受试者的月经周期天数上限小于35天,受试者的月经周期天数上限在35天及以上且小于45天,受试者的月经周期天数上限在45天及以上且小于60天,受试者的月经周期天数上限在60天及以上且小于90天,以及受试者的月经周期天数上限在90天及以上。
在计算罹患多囊卵巢综合征概率的模块中,本申请的发明人经过深入研究,通过探索自变量与结局变量的分布情况,将受试者的BMI转换成四分类变量,即将受试者的BMI分为四组,分别为受试者的BMI小于18.5,受试者的BMI在18.5及以上且小于24,受试者的BMI在24及以上且小于28,以及受试者的BMI在28及以上。
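In the module for calculating the probability of having polycystic ovary syndrome, the inventors of the present application, after in-depth study and by exploring the distributions of the independent variables and the outcome variable, convert the subject's androstenedione (AND) level into a three-category variable, that is, the subject's AND level is divided into three groups: an AND level below 5 nmol/L, an AND level of 5 nmol/L or above and below 10 nmol/L, and an AND level of 10 nmol/L or above.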
通过将上述四个变量变换成不同的多分类变量,利用这样的多分类变量来进行数据分析可以将自变量与结局变量的非线性关系转换为线性关系,更为准确地计算出受试者罹患多囊卵巢综合征的概率,且模型稳定性更好。
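The conversion of a continuous measurement into a multi-category variable can be sketched as a set of 0/1 indicators, one per non-reference category. The helper name `to_indicators` is an assumption for this example; the cut-points passed in are the lower bounds of the categories given in the text.

```python
from bisect import bisect_right


def to_indicators(value, cuts):
    """Return one 0/1 indicator per non-reference category.

    `cuts` holds the lower bounds of the categories above the reference
    group; a value below the first cut yields all zeros.
    """
    category = bisect_right(cuts, value)  # 0 = reference group
    return [1 if category == k else 0 for k in range(1, len(cuts) + 1)]
```

For example, an AMH level of 6.0 ng/ml with cut-points 2.5, 5, 7.5 and 10 yields `[0, 1, 0, 0]`, so only the coefficient of the "5 ng/ml or above and below 7.5 ng/ml" group enters the model.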
在计算罹患多囊卵巢综合征概率的模块中,预先存储有基于现有数据库中受试者的抗缪勒氏管激素(AMH)水平、月经周期天数上限、BMI、以及雄烯二酮(AND)水平的数据转换成的多分类变量拟合而成的用于计算罹患多囊卵巢综合征的概率(p)的公式。并根据分组标准对受试者罹患多囊卵巢综合征的概率(p)进行分组。
在本发明中,现有数据库是指能够获取的正在接受治疗或以前接受治疗满足下述纳入和排除标准的受试者组成的数据库,对于数据库的样本量没有任何约定,当然数据库的样本量越大越好,例如可以是利用100个受试者,200个受试者,300个受试者,优选为400个受试者以上,更优选为500个受试者以上。在一个具体的实施例中,采用的11720个样本组成的现有数据库。
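Data analysed: a total of 21219 ovulation-induction treatment cycles from patients attending the Reproductive Medicine Center of Peking University Third Hospital between January and December 2019. After excluding cycles with incomplete records of menstrual cycle, BMI, testosterone, androstenedione, antral follicle count (AFC) and so on, 11720 cycles were included in the final analysis. In this study, menstrual cycle days means the upper limit of the menstrual cycle duration. For example, if a patient's menstrual cycle is 30-90 days, 90 days is used in the analysis. Because the data in our analysis are de-identified, patient informed consent was not required, in accordance with the Declaration of Helsinki.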
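The module for calculating the probability of having polycystic ovary syndrome uses the following Formula 1 to calculate the probability (p) that the subject has polycystic ovary syndrome:
$$p=\frac{1}{1+e^{-\left(i+a\cdot\mathrm{AMH}+b\cdot\text{(upper limit of menstrual cycle days)}+c\cdot\mathrm{BMI}+d\cdot\mathrm{AND}\right)}}\qquad\text{(Formula 1)}$$
where p is the calculated probability that the subject has polycystic ovary syndrome, and a, b, c, d and i are unitless parameters;
in the module for calculating the probability of having polycystic ovary syndrome, the values of a, b, c and d are obtained from the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI and androstenedione (AND) level and substituted into Formula 1,
in the calculation, each of AMH, upper limit of menstrual cycle days, BMI and AND takes the value 0 or 1.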
进一步地,i为选自-4.91525~-4.081495中的任意数值,i优选为-4.498372;当受试者的AMH水平小于2.5ng/ml时,AMH取值为0;当受试者的AMH水平在2.5ng/ml及以上且小于5ng/ml时,AMH取值为1,a为选自0.3883373~0.8463509中的任意数值,a优选为0.6173441;当受试者的AMH水平在5ng/ml及以上且小于7.5ng/ml时,AMH取值为1,a为选自1.2694194~1.7629597中的任意数值,a优选为1.5161895;当受试者的AMH水平在7.5ng/ml及以上且小于10ng/ml时,AMH取值为1,a为选自1.8891674~2.4887798中的任意数值,a优选为2.1889736;当受试者的AMH水平大于等于10ng/ml时,AMH取值为1,a为选自2.1935842~2.8082163中的任意数值,a优选为2.5009002;当受试者的月经周期天数上限小于35天时,月经周期天数上限取值为0;当受试者的月经周期天数上限在35天及以上且小于45天时,月经周期天数上限取值为1,b为选自1.1669412~1.6485894中的任意数值,b优选为1.4077653;当受试者的月经周期天数上限在45天及以上且小于60天时,月经周期天数上限取值为1,b为选自1.5889245~2.0947343中的任意数值,b优选为1.8418294;当受试者的月经周期天数上限在60天及以上且小于90天时,月经周期天数上限取值为1,b为选自1.6497983~2.3668561中的任意数值,b优选为2.0083272;当受试者的月经周期天数上限在90天及以上时,月经周期天数上限取值为1,b为选自1.8809757~2.5707838中的任意数值,b优选为2.2258797;当受试者的BMI小于18.5时,BMI取值为0;当受试者的BMI在18.5及以上且小于24时,BMI取值为1,c为选自-0.085964~0.6550568中的任意数值,c优选为0.2845466;当受试者的BMI在24及以上且小于28时,BMI取值为1,c为选自0.3957758~1.1728099中的任意数值,c优选为0.7842928;当受试者的BMI在28及以上时,BMI取值为1,c为选自0.7922476~1.6382346中的任意数值,c优选为1.2152411;当受试者的AND水平小于5nmol/L时,AND取值为0;当受试者的AND水平在5nmol/L及以上且小于10nmol/L时,AND取值为1,d为选自0.269652~0.6809945中的任意数值,d优选为0.4753233;当受试者的AND水平在10nmol/L及以上时,AND取值为1,d为选自0.7579538~1.252042中的任意数值,d优选为1.0049979。
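As a worked example of Formula 1 with the preferred parameter values, consider a hypothetical subject (chosen only for illustration) with AMH 12 ng/ml, a menstrual cycle upper limit of 90 days, BMI 17 and AND 3 nmol/L. BMI and AND fall in their lowest groups and take the value 0, so only the AMH and cycle terms contribute:

```latex
\begin{aligned}
i + a + b &= -4.498372 + 2.5009002 + 2.2258797 = 0.2284079,\\[4pt]
p &= \frac{1}{1 + e^{-0.2284079}} \approx \frac{1}{1 + 0.7958} \approx 0.557 .
\end{aligned}
```

The calculated probability is therefore about 55.7%, which is above 50%.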
在本申请的分组模块中预存有默认的多囊卵巢综合征分组参数,在所述分组模块中预存的分组依据为:当计算出的受试者罹患多囊卵巢综合征的概率(p)<10%时,受试者罹患多囊卵巢综合征的风险是低危;当10%≤计算出的受试者罹患多囊卵巢综合征的概率(p)<50%时,受试者罹患多囊卵巢综合征的风险是中风险;当计算出的受试者罹患多囊卵巢综合征的概率(p)≥50%时,受试者罹患多囊卵巢综合征的风险是高风险。
在本申请的另外的一个具体的实施方式中,本申请还涉及一种诊断多囊卵巢综合征的方法,其包括:数据采集步骤,其获取受试者的抗缪勒氏管激素(AMH)水平、收集受试者主动提供的月经周期天数上限、收集受试者的BMI、以及获取受试者的雄烯二酮(AND)水平的数据;以及计算罹患多囊卵巢综合征的概率的步骤,其将数据采集步骤中获取的上述数据信息进行计算,从而计算出受试者罹患多囊卵巢综合征的概率(p)。
如上所述,本申请的方法中所进行的步骤中的具体内容,对于受试者的抗缪勒氏管激素(AMH)水平、受试者主动提供的月经周期天数上限、受试者的BMI和受试者的雄烯二酮(AND)水平的数据的获取,分组以及处理方式均可以参照上述本申请涉及的系统的各模块进行的步骤。
实施例
实验数据的选定
在本实施例中采用了北京大学第三医院的病例数据,本申请的申请人收集在2019年1月至12月之间的21219个进行了促排卵周期的受试者的记录,并且经过筛选从中排除了无月经周期数据的3289个受试者的周期数据,无体重指数(BMI)信息的150个受试者的周期数据,无睾丸激素水平的3180个受试者的周期数据,无雄烯二酮水平的31个受试者的周期数据,以及无窦卵泡计数(AFC)信息的3849个受试者的周期数据。最后选定了11720个受试者的周期数据进行统计分析,并用于在本实施例中构建本发明的系统。
在本实施例中,月经周期天数上限是指月经周期持续时间的上限。例如,如果受试者在治疗期间提供的其过往的月经周期为30-90天,则使用90天作为月经周期天数上限。
对于本实施例进行的分析中不涉及患者身份信息,无需患者知情同意,这符合赫尔辛基声明。
PCOS的临床诊断
根据2003年鹿特丹标准(2003Rotterdam criteria,Group,R.E.A.-S.P.c.w.Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome.Fertil Steril 81,19-25(2004))诊断受试者是否罹患PCOS,该标准要求至少存在以下中的两项:(1)排卵功能障碍(即稀发排卵和/或无排卵);(2)高雄激素血症(血液测试中睾丸激素或雄烯二酮水平高)或雄激素过多的临床表现;(3)超声检查确定的多囊卵巢。同时排除表型相似的雄激素过多疾病(如先天性肾上腺增生,分泌雄激素的肿瘤、库欣综合征、甲状腺功能障碍和高泌乳素血症)之后,最终诊断是否罹患PCOS。
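Clinical hyperandrogenism refers to acne, androgenic alopecia or hirsutism; hyperandrogenemia refers to elevated serum total testosterone or androstenedione. In this example, hirsutism was diagnosed by a modified Ferriman-Gallwey score > 4, or by hair growth involving the upper lip, thighs and lower abdomen with a hair growth score > 2. In the rare cases where an androgen-secreting tumor is suspected (for example, when the subject shows marked virilization or a rapid onset of PCOS-related symptoms), measuring androgen levels is helpful.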
其中,月经周期持续超过35天但少于6个月的月经周期诊断为月经稀发。闭经是指在形成周期性模式后超过6个月内没有月经。超声检查中的多囊卵巢定义为至少一侧卵巢包含12个或更多直径为2-9mm的卵泡或卵巢体积大于10mL。单个卵巢满足以上两个定义之一或全部两个定义即可以被诊断为多囊卵巢。高泌乳素血症是两次血清催乳素(PRL)含量超过25ng/mL来诊断的。
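The diagnostic rules described in this section can be sketched as two small Python predicates. This is only an illustration of the "at least two of three" logic and the ultrasound definition quoted above, not a clinical tool; the function and parameter names are assumptions.

```python
def has_polycystic_ovary(follicle_count_2_9mm, ovarian_volume_ml):
    """An ovary qualifies with >= 12 follicles of 2-9 mm or a volume > 10 mL."""
    return follicle_count_2_9mm >= 12 or ovarian_volume_ml > 10.0


def meets_rotterdam(ovulatory_dysfunction, hyperandrogenism,
                    polycystic_ovary, mimics_excluded):
    """At least two of the three criteria, after excluding similar disorders."""
    criteria_met = sum([ovulatory_dysfunction, hyperandrogenism, polycystic_ovary])
    return mimics_excluded and criteria_met >= 2
```

Here `mimics_excluded` stands for the exclusion of phenotypically similar hyperandrogenic disorders listed above, such as congenital adrenal hyperplasia or hyperprolactinemia.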
窦卵泡计数和内分泌测定
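In this example, on day 2 of the menstrual cycle (that is, day 2 of the menstrual period), the number of antral follicles 2-10 mm in diameter in both ovaries of the subject was counted by transvaginal ultrasound. On the same day, venous blood was collected to measure the concentrations of prolactin (PRL), luteinizing hormone (LH), testosterone, androstenedione and serum estradiol (E2). Blood samples for AMH measurement could be collected on any day of the menstrual cycle. After collection, each blood sample was immediately inverted five times and centrifuged for further endocrine assessment.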
使用Siemens Immulite 2000免疫测定系统(Siemens Healthcare Diagnostics,上海,中国)测试PRL、LH、睾丸激素,雄烯二酮和E 2的血清 水平。PRL、LH、睾丸激素、雄烯二酮和E 2的质量控制由Bio-RAD实验室提供(美国加利福尼亚州赫尔克里士;Lyphochek免疫测定Plus对照,三级,目录号370,批号40370)。使用试剂盒随附的质量控制,使用超灵敏两点ELISA(Ansh Labs LLC;Webster,TX,USA)测量血清AMH浓度。对于三级质控、AMH、PRL和LH的测定变异系数小于6%,E 2、雄烯二酮和睾丸激素的变异系数小于10%。
数据分析
本实施例中的所有分析均使用SAS JMP Pro(版本14.2;SAS Institute,Cary,NC,美国)来进行,并且在分析中如果p<0.05则被认为是具有统计学意义的。正态分布的变量显示为均值和标准差,而非正态分布的变量显示为中位数和四分位数。对于变量选择,将七个或八个变量输入到选择过程中。应用最小绝对收缩和选择算子回归(LASSO)来最小化从同一受试者测量的变量的潜在共线性和变量的过度拟合。
在本实施例中,发明人对多变量分析使用了L1最小化的最小绝对收缩和选择回归(L1-penalized least absolute shrinkage and selection regression),并使用10倍交叉验证进行了内部验证。这是一个逻辑回归模型,它基于λ的值对回归模型的系数的绝对大小进行惩罚。惩罚越大,对较弱因素的估计就趋近于零,因此只有最强的预测变量保留在模型中。预测性最强的协变量由最小值(λmin)选择。随后,将通过LASSO回归分析确定的变量输入到逻辑回归模型中,并将始终具有统计学意义的变量用于构建PCOS诊断模型。
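In this example, the inventors evaluated the performance of the PCOS model using the area under the receiver operating characteristic curve (AUC), together with sensitivity and specificity.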
变量选择
首先如下表1列出了实施例中收集的变量的基本特征。这些指标在诊断PCOS时在单变量分析中均具有重要意义。为了分析变量之间的相关性,将连续变量转换为分类变量。自变量的分组标准主要基于分析前的数据探索并结合本申请发明人的临床经验。在三个不同模型中,每个自变量的分组标准保持不变。
表1变量特征
Figure PCTCN2021097896-appb-000001
其中,表1中BMI表示体重指数;AMH表示抗缪勒氏管激素水平;TES表示睾丸激素水平;AND表示雄烯二酮水平;AFC表示窦卵泡计数。
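To build a better model, logistic regression and LASSO logistic regression were each applied, followed by 10-fold cross-validation; the results of the two regressions with 10-fold cross-validation are shown in Table 2. LASSO logistic regression with 10-fold cross-validation gave a smaller corrected Akaike information criterion (AICc) and a smaller Bayesian information criterion (BIC), indicating that data analysis and fitting with LASSO logistic regression is more stable. Based on these results, the inventors of the present application chose LASSO logistic regression for the following analysis and system construction.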
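For reference, the information criteria compared here have the following standard definitions, where k is the number of estimated parameters, n the number of observations and L̂ the maximized likelihood. How the software counts k for a penalized model is not stated in the application, so the formulas are given only as general background.

```latex
\begin{aligned}
\mathrm{AIC}  &= 2k - 2\ln\hat{L},\\
\mathrm{AICc} &= \mathrm{AIC} + \frac{2k(k+1)}{n-k-1},\\
\mathrm{BIC}  &= k\ln n - 2\ln\hat{L}.
\end{aligned}
```

For all three criteria a smaller value indicates a better trade-off between goodness of fit and model complexity.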
表2两种数学模型构建方法的比较
Figure PCTCN2021097896-appb-000002
其中,AIC表示校正后的Akaike信息准则;BIC表示贝叶斯信息准则
本申请的发明人研究的过程中注意到有文献报道称AMH水平有望替代AFC。因此在进一步分析中,在本实施例中发明人使用具有10倍交叉验证的LASSO逻辑回归分别建立有或没有AFC的PCOS诊断模型,从而确认后 续构建分析系统是否需要使用AFC数据。下表3中显示了在有无AFC的情况下,训练组和验证组中的AUC数据的比较结果。
表3的结果显示模型中包含AFC时并不能改善构建的模型的性能。其中每个变量的贡献如表4所示。模型1(不使用AFC)中AMH的主要影响为35.1%,模型2(使用AFC)中AMH和AFC的主要影响分别为18.3%和17.2%,基于上述分析结果提示可以不再使用AFC来进行建模。
另外,由于窦卵泡计数(AFC)是早期Gn依赖性卵泡生长中直径小于8mm的卵泡数。众所周知,卵巢中的原始卵泡池与正在生长的窦状卵泡的数量有关,因此,从理论上讲,AFC能够尽可能反映出剩余卵巢卵泡池的精确度。然而,要获得良好的AFC结果,需要熟练的经阴道超声(TVS)专家进行超声波检查,这既耗时又耗资源。而且AFC测量中缺乏标准,AFC会随着月经周期、避孕药的使用、以及TVS设备的灵敏度和分辨率等因素而发生变化,所有这些现有的混杂因素会使得对AFC的可靠评估更加困难。
因此,在本实施例中,首先经过发明人的深入研究验证了现有技术中提出的AMH水平有望替代AFC,从而为构建方便检测的系统和方法提供了初步的依据。
表3两个模型的AUC比较结果
Figure PCTCN2021097896-appb-000003
表4在模型1和模型2中每个变量主要影响的比较结果
Figure PCTCN2021097896-appb-000004
Figure PCTCN2021097896-appb-000005
随后,在不使用AFC的模型1中,本申请的发明人进一步考察每个变量的变化趋势,数据显示年龄和睾丸激素水平并未随PCOS的发生而变化。由于年龄和睾丸激素在模型构建中的贡献很小,在后续的实施例系统构建中,发明人排除了这两个变量的使用。
综上,确认在整个模型的构建中,使用受试者的月经周期天数上限、AMH水平、BMI和雄烯二酮水平来作为变量。
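To this end, the inventors first divided the data of all 11720 subjects into an internal validation group and an external validation data group in a ratio of 80%:20%. In the internal validation group, LASSO regression was combined with 10-fold cross-validation to determine the optimal model (Model 3). The observed probability of PCOS and the corresponding predicted values were then calculated.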
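An 80%:20% split of this kind can be sketched as a random partition of the records. The function name `split_80_20` and the fixed seed are assumptions for illustration; the application does not describe how the subjects were assigned to the two groups.

```python
import random


def split_80_20(records, seed=0):
    """Randomly partition records into an 80% part and a 20% part."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]
```

With 11720 records, this yields 9376 records in the first group and 2344 in the second.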
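Table 5 shows the estimated parameter value and p-value of each variable in Model 3. The main effect of each predictor on Model 3 was AMH 41.2%, upper limit of menstrual cycle days 35.2%, BMI 4.3% and androstenedione 3.7%. Table 6 shows the AUC, sensitivity and specificity in the modelling data, the internal validation data and the external validation data.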
表5实施例构建的模型3中每个预测变量对PCOS的影响进行多重分析
Figure PCTCN2021097896-appb-000006
Figure PCTCN2021097896-appb-000007
表中,BMI表示体重指数;AND表示雄烯二酮水平。
表6实施例构建的模型3的表现
Figure PCTCN2021097896-appb-000008
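In summary, based on Model 3 confirmed in this example, a formula for calculating the probability (p) of having polycystic ovary syndrome, namely Formula 1, can be obtained; it calculates the probability (p) that a subject has PCOS from the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI and androstenedione (AND) level.
Formula 1:
$$p=\frac{1}{1+e^{-\left(i+a\cdot\mathrm{AMH}+b\cdot\text{(upper limit of menstrual cycle days)}+c\cdot\mathrm{BMI}+d\cdot\mathrm{AND}\right)}}$$
where p is the calculated probability that the subject has PCOS, and a, b, c, d and i are unitless parameters;
in the module for calculating the probability of having polycystic ovary syndrome, the values of a, b, c and d are obtained from the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI and androstenedione (AND) level and substituted into Formula 1,
in the calculation, each of AMH, upper limit of menstrual cycle days, BMI and AND takes the value 0 or 1.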
i为选自-4.91525~-4.081495中的任意数值,i优选为-4.498372;
当受试者的AMH水平小于2.5ng/ml时,AMH取值为0;
当受试者的AMH水平在2.5ng/ml及以上且小于5ng/ml时,AMH取值为1,a为选自0.3883373~0.8463509中的任意数值,a优选为0.6173441;
当受试者的AMH水平在5ng/ml及以上且小于7.5ng/ml时,AMH取值为1,a为选自1.2694194~1.7629597中的任意数值,a优选为1.5161895;
当受试者的AMH水平在7.5ng/ml及以上且小于10ng/ml时,AMH取值为1,a为选自1.8891674~2.4887798中的任意数值,a优选为2.1889736;
当受试者的AMH水平大于等于10ng/ml时,AMH取值为1,a为选自2.1935842~2.8082163中的任意数值,a优选为2.5009002;
当受试者的月经周期天数上限小于35天时,月经周期天数上限取值为0,
当受试者的月经周期天数上限在35天及以上且小于45天时,月经周期天数上限取值为1,b为选自1.1669412~1.6485894中的任意数值,b优选为1.4077653;
当受试者的月经周期天数上限在45天及以上且小于60天时,月经周期天数上限取值为1,b为选自1.5889245~2.0947343中的任意数值,b优选为1.8418294;
当受试者的月经周期天数上限在60天及以上且小于90天时,月经周期天数上限取值为1,b为选自1.6497983~2.3668561中的任意数值,b优选为2.0083272;
当受试者的月经周期天数上限在90天及以上时,月经周期天数上限取值为1,b为选自1.8809757~2.5707838中的任意数值,b优选为2.2258797;
当受试者的BMI小于18.5时,BMI取值为0;
当受试者的BMI在18.5及以上且小于24时,BMI取值为1,c为选自-0.085964~0.6550568中的任意数值,c优选为0.2845466;
当受试者的BMI在24及以上且小于28时,BMI取值为1,c为选自0.3957758~1.1728099中的任意数值,c优选为0.7842928;
当受试者的BMI在28及以上时,BMI取值为1,c为选自 0.7922476~1.6382346中的任意数值,c优选为1.2152411;
当受试者的AND水平小于5nmol/L时,AND取值为0;
当受试者的AND水平在5nmol/L及以上且小于10nmol/L时,AND取值为1,d为选自0.269652~0.6809945中的任意数值,d优选为0.4753233;当受试者的AND水平在10nmol/L及以上时,AND取值为1,d为选自0.7579538~1.252042中的任意数值,d优选为1.0049979。
利用上述公式一计算出的预测概率与PCOS实际发生率之间的关系为,PCOS的实际发生率随着预测概率的增加而增加。表7显示了最容易预测患有PCOS的前十大女性群体。详细信息包括月经周期天数上限、AMH、BMI和雄烯二酮水平,实际是否PCOS的病例数,预测PCOS的发生概率以及PCOS的实际发生率。当计算出的受试者罹患多囊卵巢综合征的概率(p)<10%时,受试者罹患多囊卵巢综合征的风险是低危;当10%≤计算出的受试者罹患多囊卵巢综合征的概率(p)<50%时,受试者罹患多囊卵巢综合征的风险是中风险;当计算出的受试者罹患多囊卵巢综合征的概率(p)≥50%时,受试者罹患多囊卵巢综合征的风险是高风险。
表7高度预测患有PCOS的十大女性群体
Figure PCTCN2021097896-appb-000009
基于上表7可以看出,利用本申请所构建的系统或方法,可以非常好地对受试者是否罹患PCOS进行预测,在表7中显示的预测PCOS发生概率最高的10类群体中,PCOS的预测概率均与PCOS的实际发生率非常接近,因此,预计在未来的临床诊断中,本申请所构建的模型可以有效地帮助临床医生来对受试者是否罹患PCOS进行辅助诊断。
虽然在本发明之前的研究已经建立了AMH水平与多囊卵巢形态之间的良好相关性。血清AMH已被越来越多地视为诊断PCOS的替代指标。此前的许多研究发现了诊断PCOS的不同AMH临界值。但是,由于在先的研究样本量小,对照不适当以及AMH检测不均一,因此AMH临界值在PCOS诊断中的应用受到了限制。这也是将AMH引入PCOS诊断中引起争议的原因。另外,尽管AMH可以作为PCOS的潜在诊断标志物,但是《2018年国际基于证据的多囊卵巢综合症评估和管理指南》不建议将其作为PCOS诊断的单一测试参数。其他的研究人员还结合AMH和其他参数,例如Vagios等人使用AMH和BMI来构建预测PCOS的诊断模型(Vagios,S.,James,K.E.,Sacha,C.R.,et al.A patient-specific model combining antimullerian hormone and body mass index as a predictor of polycystic ovary syndrome and other oligo-anovulation disorders.Fertil Steril 115,229-237(2021).10.1016/j.fertnstert.2020.07.023)。可见在现有技术中是否应该使用AMH来预测PCOS存在了很大的争议。
如上所述,鉴于现有技术中存在的问题,本申请的发明人经过深入研究建立了一个具有4个参数的数学模型(即包括AMH水平、月经周期天数上限、BMI和雄烯二酮水平的模型)并构建了基于该模型的系统和方法,从而代替了现有技术中简单采用AMH临界值来诊断PCOS的现状。此外,现有技术中有研究者采用小鼠模型显示过量的AMH还会导致雄激素过多和排卵障碍。
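Recent studies in female animal models have revealed key underlying mechanisms of AMH action. AMH receptors are found in the hypothalamus, and administration of recombinant human AMH on gestational days 16.5, 17.5 and 18.5 activated the AMH receptors in GnRH neurons. On gestational day 19.5, GnRH neuronal secretion in the pregnant mice increased and LH pulse frequency rose, leading to elevated serum LH and testosterone levels and reduced E2 and progesterone levels. The AMH-induced rise in serum LH and testosterone led to oligo-ovulation or anovulation and to impaired oocyte development in the dams and their female offspring. These phenotypes were suppressed by administering a GnRH antagonist to the dams or the offspring. These results indicate that AMH regulates follicular development through the H-P-O axis, and that excess AMH promotes the onset of PCOS.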
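In the modelling data set, the internal validation data set and the external validation data set used in the example, the AUC was 0.852, 0.857 and 0.838, respectively. In the model constructed by the present invention, the main effect of each predictor was AMH 41.2%, upper limit of menstrual cycle days 35.2%, BMI 4.3% and androstenedione 3.7%. These results also show that the contribution of the upper limit of menstrual cycle days is second only to that of the AMH level, so prior-art prediction methods that consider AMH alone can hardly achieve good results. The system constructed by the inventors of the present application after in-depth study reveals that AMH and BMI both play an important role in diagnosing PCOS.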
本申请的发明人考虑了月经周期天数上限超过35天表明慢性无排卵,月经周期越长,排卵障碍越严重。BMI用于评估肥胖的严重程度,因为肥胖者面临长期不良代谢紊乱的风险增加。因此,本发明构建的系统的参数可以涵盖鹿特丹标准的所有三个方面以及代谢异常。从实施例的结果可以看出考虑了AMH、月经周期天数上限,同时还进一步结合了BMI和雄烯二酮,更进一步提高了预测PCOS的准确性。
可见,基于本申请发明人深入地研究,在本发明中建立的罹患PCOS的预测模型可能会成为将来诊断亚洲人群罹患PCOS的潜在定量工具,并且也支持过量的AMH分泌作为PCOS的潜在治疗目标。
此外,虽然如上所述,在本申请之前,Vagios等人使用AMH和BMI来构建预测PCOS的诊断模型,其也使用了逻辑回归来进行。但本发明人构建的系统和Vagios等人构建的模型仍然存在很大的差异。首先,本发明构建的系统没有强调固定参数,而是着重于从多个变量中筛选预测参数从而用于构建本发明的预测系统,并深入地验证了其构建系统的预测准确性。Vagios等人的研究使用BMI分层分析,仅关注AMH和BMI参数。其次,本发明的申请人采用的样本量较大,并经过外部验证,表明其稳定性;但是Vagios等人的研究没有外部验证,因此在不同人群中的诊断性能尚无定论。
至于年龄在预测PCOS中的作用,本发明的结果在表4中的显示,调整月经周期天数上限、血清AMH水平、AFC、BMI、血清雄烯二酮水平和血清睾丸激素水平时,年龄的贡献很小,只有0.2%,因此对于年龄这个在妇 科或妇产科诊断领域中非常关键的参数,本发明最终构建的系统和方法中却不再考虑年龄,也进一步说明在预测PCOS与其它妇科或妇产科相关疾病相比的独特之处。
预计AMH含量的测量将替代AFC(超声检查获得的标准之一)。使用超声检测需要昂贵的设备和训练有素的人员,这会导致成本增加,准确性和可重复性差。在某些女性中,经阴道超声是不可接受的或侵入性的。此外,将简单的临界值应用于PCOS的诊断也有其缺点。本发明构建的模型的预测结果表明,AMH对模型1(不含AFC)的贡献为35.1%,而AMH和AFC组合在模型2(含AFC)中的贡献为35.5%,这表明AMH可以替代模型1中的AFC。未来将在全球范围内提供PCOS诊断标准。
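Because PCOS is closely associated with obesity, lean women with PCOS are often difficult to diagnose: up to 30% of women of reproductive age with PCOS maintain a normal weight, and these lean PCOS patients are frequently missed. In the inventors' data, the actual incidence of PCOS among subjects with a BMI below 18.5 kg/m² was 64/1071. When this criterion was combined with AMH > 10 ng/mL, the incidence of PCOS rose to 21/49. With BMI < 18.5 kg/m², AMH > 10 ng/mL and a menstrual cycle duration > 90 days, the incidence rose to 10/13. These normal-weight or lean women still face fertility challenges, elevated androgen levels and the resulting symptoms (such as acne, hirsutism and hair loss), and an increased risk of diabetes and cardiovascular disease.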
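The three incidence figures quoted above are easier to compare as percentages. The short sketch below performs that conversion; the dictionary keys and the function name `incidence_percent` are assumptions for illustration, while the case counts are taken from the text.

```python
def incidence_percent(cases, total):
    """Observed incidence as a percentage, rounded to one decimal place."""
    return round(100.0 * cases / total, 1)


# (cases, total) pairs as reported for the three nested subgroups.
subgroups = {
    "BMI < 18.5": (64, 1071),
    "BMI < 18.5 and AMH > 10": (21, 49),
    "BMI < 18.5, AMH > 10 and cycle > 90 days": (10, 13),
}

percentages = {name: incidence_percent(c, t) for name, (c, t) in subgroups.items()}
```

The observed incidence rises from roughly 6% to roughly 43% and then roughly 77% as the AMH and cycle-length conditions are added.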
基于本实施例中构建的模型在训练组、内部验证组和外部验证组的良好的预测效果,可见本发明人建立的PCOS诊断模型可能有助于诊断这些患者,并提示她们需要及时治疗,并希望能促进他们的长期健康管理。
上述的具体实施方案仅仅是示意性的、指导性的,而不是限制性的。本领域的普通技术人员在本说明书的启示下和在不脱离本发明权利要求所保护的范围的情况下,还可以做出很多种的形式,这些均属于本发明保护之列。

Claims (24)

  1. 一种诊断多囊卵巢综合征的系统,其包括:
    数据采集模块,其用于获取受试者的抗缪勒氏管激素(AMH)水平、收集受试者主动提供的月经周期天数上限、收集受试者的BMI、以及获取受试者的雄烯二酮(AND)水平的数据;以及
    计算罹患多囊卵巢综合征的概率的模块,其用于将数据采集模块中获取的上述数据信息进行计算,从而计算出受试者罹患多囊卵巢综合征的概率(p)。
  2. 根据权利要求1所述的系统,其还包括:
    分组模块,在所述分组模块中预存有默认的多囊卵巢综合征分组参数,并且依据该分组参数,对所述计算得到的罹患多囊卵巢综合征的概率(p)进行分组,从而对受试者罹患多囊卵巢综合征的风险进行分组。
  3. 根据权利要求1或2所述的系统,其中,
    在计算罹患多囊卵巢综合征的概率的模块中,利用将受试者的抗缪勒氏管激素(AMH)水平、月经周期天数上限、BMI、以及雄烯二酮(AND)水平的数据转换成的多分类变量来计算受试者罹患多囊卵巢综合征的概率(p)。
  4. 根据权利要求1~3中任一项所述的系统,其中,
    所述抗缪勒氏管激素(AMH)水平是指女性受试者月经周期任何一天的静脉血中的抗缪勒氏管激素浓度,
    所述雄烯二酮(AND)水平是指受试者月经期中任一天所检测的受试者的雄烯二酮浓度。
  5. 根据权利要求1~4中任一项所述的系统,其中,
    在计算罹患多囊卵巢综合征的概率的模块中,将所述抗缪勒氏管激素(AMH)水平转换成五分类变量,
    即将所述抗缪勒氏管激素(AMH)水平分为五组,分别为:受试者的抗缪勒氏管激素(AMH)水平小于2.5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在2.5ng/ml及以上且小于5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在5ng/ml及以上且小于7.5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在7.5 ng/ml及以上且小于10ng/ml,以及受试者的抗缪勒氏管激素(AMH)水平大于等于10ng/ml。
  6. 根据权利要求1~5中任一项所述的系统,其中,
    在计算罹患多囊卵巢综合征的概率的模块中,将所述受试者的月经周期天数上限转换成五分类变量,
    即将受试者的月经周期天数上限分为五组,分别为受试者的月经周期天数上限小于35天,受试者的月经周期天数上限在35天及以上且小于45天,受试者的月经周期天数上限在45天及以上且小于60天,受试者的月经周期天数上限在60天及以上且小于90天,以及受试者的月经周期天数上限在90天及以上。
  7. 根据权利要求1~6中任一项所述的系统,其中,
    在计算罹患多囊卵巢综合征概率的模块中,将受试者的BMI转换成四分类变量,
    即将受试者的BMI分为四组,分别为受试者的BMI小于18.5,受试者的BMI在18.5及以上且小于24,受试者的BMI在24及以上且小于28,以及受试者的BMI在28及以上。
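  8. The system according to any one of claims 1 to 7, wherein,
    in the module for calculating the probability of having polycystic ovary syndrome, the subject's androstenedione (AND) level is converted into a three-category variable,
    that is, the subject's androstenedione (AND) level is divided into three groups: an AND level below 5 nmol/L, an AND level of 5 nmol/L or above and below 10 nmol/L, and an AND level of 10 nmol/L or above.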
  9. 根据权利要求1~8中任一项所述的系统,其中,
    在计算罹患多囊卵巢综合征概率的模块中,预先存储有基于现有数据库中受试者的抗缪勒氏管激素(AMH)水平、月经周期天数上限、BMI、以及雄烯二酮(AND)水平的数据转换成的多分类变量拟合而成的用于计算罹患多囊卵巢综合征的概率(p)的公式。
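  10. The system according to claim 9, wherein,
    the formula is the following Formula 1:
    $$p=\frac{1}{1+e^{-\left(i+a\cdot\mathrm{AMH}+b\cdot\text{(upper limit of menstrual cycle days)}+c\cdot\mathrm{BMI}+d\cdot\mathrm{AND}\right)}}\qquad\text{(Formula 1)}$$
    where p is the calculated probability that the subject has polycystic ovary syndrome, and a, b, c, d and i are unitless parameters;
    in the module for calculating the probability of having polycystic ovary syndrome, the values of a, b, c and d are obtained from the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI and androstenedione (AND) level and substituted into Formula 1,
    in the calculation, each of AMH, upper limit of menstrual cycle days, BMI and AND takes the value 0 or 1.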
  11. 根据权利要求10所述的系统,其中,
    i为选自-4.91525~-4.081495中的任意数值,i优选为-4.498372;
    当受试者的AMH水平小于2.5ng/ml时,AMH取值为0;
    当受试者的AMH水平在2.5ng/ml及以上且小于5ng/ml时,AMH取值为1,a为选自0.3883373~0.8463509中的任意数值,a优选为0.6173441;
    当受试者的AMH水平在5ng/ml及以上且小于7.5ng/ml时,AMH取值为1,a为选自1.2694194~1.7629597中的任意数值,a优选为1.5161895;
    当受试者的AMH水平在7.5ng/ml及以上且小于10ng/ml时,AMH取值为1,a为选自1.8891674~2.4887798中的任意数值,a优选为2.1889736;
    当受试者的AMH水平大于等于10ng/ml时,AMH取值为1,a为选自2.1935842~2.8082163中的任意数值,a优选为2.5009002;
    当受试者的月经周期天数上限小于35天时,月经周期天数上限取值为0;
    当受试者的月经周期天数上限在35天及以上且小于45天时,月经周期天数上限取值为1,b为选自1.1669412~1.6485894中的任意数值,b优选为1.4077653;
    当受试者的月经周期天数上限在45天及以上且小于60天时,月经周期天数上限取值为1,b为选自1.5889245~2.0947343中的任意数值,b优选为1.8418294;
    当受试者的月经周期天数上限在60天及以上且小于90天时,月经周期天数上限取值为1,b为选自1.6497983~2.3668561中的任意数值,b优选为2.0083272;
    当受试者的月经周期天数上限在90天及以上时,月经周期天数上限取值为1,b为选自1.8809757~2.5707838中的任意数值,b优选为2.2258797;
    当受试者的BMI小于18.5时,BMI取值为0;
    当受试者的BMI在18.5及以上且小于24时,BMI取值为1,c为选自-0.085964~0.6550568中的任意数值,c优选为0.2845466;
    当受试者的BMI在24及以上且小于28时,BMI取值为1,c为选自0.3957758~1.1728099中的任意数值,c优选为0.7842928;
    当受试者的BMI在28及以上时,BMI取值为1,c为选自0.7922476~1.6382346中的任意数值,c优选为1.2152411;
    当受试者的AND水平小于5nmol/L时,AND取值为0;
    当受试者的AND水平在5nmol/L及以上且小于10nmol/L时,AND取值为1,d为选自0.269652~0.6809945中的任意数值,d优选为0.4753233;
    当受试者的AND水平在10nmol/L及以上时,AND取值为1,d为选自0.7579538~1.252042中的任意数值,d优选为1.0049979。
  12. 根据权利要求1~11中任一项所述的系统,其中,
    在所述分组模块中预存的分组依据为:
    当计算出的受试者罹患多囊卵巢综合征的概率(p)<10%时,受试者罹患多囊卵巢综合征的风险是低危;
    当10%≤计算出的受试者罹患多囊卵巢综合征的概率(p)<50%时,受试者罹患多囊卵巢综合征的风险是中风险;
    当计算出的受试者罹患多囊卵巢综合征的概率(p)≥50%时,受试者罹患多囊卵巢综合征的风险是高风险。
  13. 一种诊断多囊卵巢综合征的方法,其包括:
    数据采集步骤,其获取受试者的抗缪勒氏管激素(AMH)水平、收集受试者主动提供的月经周期天数上限、收集受试者的BMI、以及获取受试者的雄烯二酮(AND)水平的数据;以及
    计算罹患多囊卵巢综合征的概率的步骤,其将数据采集步骤中获取的上述数据信息进行计算,从而计算出受试者罹患多囊卵巢综合征的概率(p)。
  14. 根据权利要求13所述的方法,其还包括:
    分组步骤,在所述分组步骤中预存有默认的多囊卵巢综合征分组参数,并且依据该分组参数,对所述计算得到的罹患多囊卵巢综合征的概率(p)进行分组,从而对受试者罹患多囊卵巢综合征的风险进行分组。
  15. 根据权利要求13或14所述的方法,其中,
    在计算罹患多囊卵巢综合征的概率的步骤中,利用将受试者的抗缪勒氏管激素(AMH)水平、月经周期天数上限、BMI、以及雄烯二酮(AND)水平的数据转换成的多分类变量来计算受试者罹患多囊卵巢综合征的概率(p)。
  16. 根据权利要求13~15中任一项所述的方法,其中,
    所述抗缪勒氏管激素(AMH)水平是指女性受试者月经周期任何一天的静脉血中的抗缪勒氏管激素浓度,
    所述雄烯二酮(AND)水平是指受试者月经期中任一天所检测的受试者的雄烯二酮浓度。
  17. 根据权利要求13~16中任一项所述的方法,其中,
    在计算罹患多囊卵巢综合征的概率的步骤中,将所述抗缪勒氏管激素(AMH)水平转换成五分类变量,
    即将所述抗缪勒氏管激素(AMH)水平分为五组,分别为:受试者的抗缪勒氏管激素(AMH)水平小于2.5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在2.5ng/ml及以上且小于5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在5ng/ml及以上且小于7.5ng/ml,受试者的抗缪勒氏管激素(AMH)水平在7.5ng/ml及以上且小于10ng/ml,以及受试者的抗缪勒氏管激素(AMH)水平大于等于10ng/ml。
  18. 根据权利要求13~17中任一项所述的方法,其中,
    在计算罹患多囊卵巢综合征的概率的步骤中,将所述受试者的月经周期天数上限转换成五分类变量,
    即将受试者的月经周期天数上限分为五组,分别为受试者的月经周期天数上限小于35天,受试者的月经周期天数上限在35天及以上且小于45天,受试者的月经周期天数上限在45天及以上且小于60天,受试者的月经周期天数上限在60天及以上且小于90天,以及受试者的月经周期天数上限在90天及以上。
  19. 根据权利要求13~18中任一项所述的方法,其中,
    在计算罹患多囊卵巢综合征概率的步骤中,将受试者的BMI转换成四分类变量,
    即将受试者的BMI分为四组,分别为受试者的BMI小于18.5,受试者 的BMI在18.5及以上且小于24,受试者的BMI在24及以上且小于28,以及受试者的BMI在28及以上。
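  20. The method according to any one of claims 13 to 19, wherein,
    in the step of calculating the probability of having polycystic ovary syndrome, the subject's androstenedione (AND) level is converted into a three-category variable,
    that is, the subject's androstenedione (AND) level is divided into three groups: an AND level below 5 nmol/L, an AND level of 5 nmol/L or above and below 10 nmol/L, and an AND level of 10 nmol/L or above.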
  21. 根据权利要求13~20中任一项所述的方法,其中,
    在计算罹患多囊卵巢综合征概率的步骤中,预先存储有基于现有数据库中受试者的抗缪勒氏管激素(AMH)水平、月经周期天数上限、BMI、以及雄烯二酮(AND)水平的数据转换成的多分类变量拟合而成的用于计算罹患多囊卵巢综合征的概率(p)的公式。
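  22. The method according to claim 21, wherein,
    the formula is the following Formula 1:
    $$p=\frac{1}{1+e^{-\left(i+a\cdot\mathrm{AMH}+b\cdot\text{(upper limit of menstrual cycle days)}+c\cdot\mathrm{BMI}+d\cdot\mathrm{AND}\right)}}\qquad\text{(Formula 1)}$$
    where p is the calculated probability that the subject has polycystic ovary syndrome, and a, b, c, d and i are unitless parameters;
    in the step of calculating the probability of having polycystic ovary syndrome, the values of a, b, c and d are obtained from the subject's anti-Müllerian hormone (AMH) level, upper limit of menstrual cycle days, BMI and androstenedione (AND) level and substituted into Formula 1,
    in the calculation, each of AMH, upper limit of menstrual cycle days, BMI and AND takes the value 0 or 1.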
  23. 根据权利要求22所述的方法,其中,
    i为选自-4.91525~-4.081495中的任意数值,i优选为-4.498372;
    当受试者的AMH水平小于2.5ng/ml时,AMH取值为0;
    当受试者的AMH水平在2.5ng/ml及以上且小于5ng/ml时,AMH取值为1,a为选自0.3883373~0.8463509中的任意数值,a优选为0.6173441;
    当受试者的AMH水平在5ng/ml及以上且小于7.5ng/ml时,AMH取值为1,a为选自1.2694194~1.7629597中的任意数值,a优选为1.5161895;
    当受试者的AMH水平在7.5ng/ml及以上且小于10ng/ml时,AMH取 值为1,a为选自1.8891674~2.4887798中的任意数值,a优选为2.1889736;
    当受试者的AMH水平大于等于10ng/ml时,AMH取值为1,a为选自2.1935842~2.8082163中的任意数值,a优选为2.5009002;
    当受试者的月经周期天数上限小于35天时,月经周期天数上限取值为0;
    当受试者的月经周期天数上限在35天及以上且小于45天时,月经周期天数上限取值为1,b为选自1.1669412~1.6485894中的任意数值,b优选为1.4077653;
    当受试者的月经周期天数上限在45天及以上且小于60天时,月经周期天数上限取值为1,b为选自1.5889245~2.0947343中的任意数值,b优选为1.8418294;
    当受试者的月经周期天数上限在60天及以上且小于90天时,月经周期天数上限取值为1,b为选自1.6497983~2.3668561中的任意数值,b优选为2.0083272;
    当受试者的月经周期天数上限在90天及以上时,月经周期天数上限取值为1,b为选自1.8809757~2.5707838中的任意数值,b优选为2.2258797;
    当受试者的BMI小于18.5时,BMI取值为0;
    当受试者的BMI在18.5及以上且小于24时,BMI取值为1,c为选自-0.085964~0.6550568中的任意数值,c优选为0.2845466;
    当受试者的BMI在24及以上且小于28时,BMI取值为1,c为选自0.3957758~1.1728099中的任意数值,c优选为0.7842928;
    当受试者的BMI在28及以上时,BMI取值为1,c为选自0.7922476~1.6382346中的任意数值,c优选为1.2152411;
    当受试者的AND水平小于5nmol/L时,AND取值为0;
    当受试者的AND水平在5nmol/L及以上且小于10nmol/L时,AND取值为1,d为选自0.269652~0.6809945中的任意数值,d优选为0.4753233;
    当受试者的AND水平在10nmol/L及以上时,AND取值为1,d为选自0.7579538~1.252042中的任意数值,d优选为1.0049979。
  24. 根据权利要求13~23中任一项所述的方法,其中,
    在所述分组步骤中预存的分组依据为:
    当计算出的受试者罹患多囊卵巢综合征的概率(p)<10%时,受试者罹患多囊卵巢综合征的风险是低危;
    当10%≤计算出的受试者罹患多囊卵巢综合征的概率(p)<50%时,受试者罹患多囊卵巢综合征的风险是中风险;
    当计算出的受试者罹患多囊卵巢综合征的概率(p)≥50%时,受试者罹患多囊卵巢综合征的风险是高风险。
PCT/CN2021/097896 2021-05-25 2021-06-02 一种诊断多囊卵巢综合征的系统和方法 WO2022246882A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110574591.3A CN113035354B (zh) 2021-05-25 2021-05-25 一种诊断多囊卵巢综合征的系统和方法
CN202110574591.3 2021-05-25

Publications (1)

Publication Number Publication Date
WO2022246882A1 true WO2022246882A1 (zh) 2022-12-01

Family

ID=76455870

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/097896 WO2022246882A1 (zh) 2021-05-25 2021-06-02 一种诊断多囊卵巢综合征的系统和方法

Country Status (2)

Country Link
CN (1) CN113035354B (zh)
WO (1) WO2022246882A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115620900B (zh) * 2022-12-13 2023-05-30 北京大学第三医院(北京大学第三临床医学院) 一种筛查多囊卵巢综合征的系统和方法
CN116543905A (zh) * 2023-05-09 2023-08-04 北京大学第三医院(北京大学第三临床医学院) 预测卵巢多囊样改变(pcom)的系统和方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170107573A1 (en) * 2015-10-19 2017-04-20 Celmatix Inc. Methods and systems for assessing infertility as a result of declining ovarian reserve and function
CN109602394A (zh) * 2018-12-12 2019-04-12 北京大学第三医院 评估受试者卵巢储备功能的系统
CN110570952A (zh) * 2018-06-05 2019-12-13 北京大学第三医院 预测拮抗剂方案下受试者卵巢低反应概率的系统及指导促性腺激素起始用药剂量选择的系统
CN111524604A (zh) * 2020-04-07 2020-08-11 北京大学第三医院(北京大学第三临床医学院) 评估受试者卵巢储备功能的系统
CN111785389A (zh) * 2020-07-10 2020-10-16 北京大学第三医院(北京大学第三临床医学院) 预测受试者出现卵巢储备新变化年限的系统和方法

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105224827B (zh) * 2008-07-01 2019-12-24 小利兰·斯坦福大学托管委员会 用于发展ivf治疗协议的分析数据的计算机系统和存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PAZILIYA, YASHENG ET AL.: "A Nomogram Model for Predicting Polycystic Ovary Syndrome", JOURNAL OF XINJIANG MEDICAL UNIVERSITY, vol. 43, no. 08, 31 August 2020 (2020-08-31), CN , pages 1113 - 1117, 1121, XP009541270, ISSN: 1009-5551 *

Also Published As

Publication number Publication date
CN113035354B (zh) 2022-07-12
CN113035354A (zh) 2021-06-25


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21942453

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21942453

Country of ref document: EP

Kind code of ref document: A1