CN117070620A - Marker and method for identifying central precocious girls in vitro - Google Patents
- Publication number
- CN117070620A (application CN202310948982.6A)
- Authority
- CN
- China
- Prior art keywords
- mirna
- girl
- marker
- central
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/10—Ploidy or copy number detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/178—Oligonucleotides characterized by their use miRNA, siRNA or ncRNA
Abstract
The application provides a detection marker and a method for identifying girls with central precocious puberty in vitro. The expression levels of the miRNA markers hsa-miR-584-5p, hsa-miR-625-3p and hsa-miR-7-5p in the serum of girls with central precocious puberty are found to be significantly changed, so these miRNAs can serve as molecular biomarkers for detecting central precocious puberty in girls. Circulating miRNA in a blood sample of a subject is detected by second-generation sequencing; the expression level of each miRNA in the test sample is obtained through data analysis; a feature subset is obtained through random forest modeling based on the expression levels of the miRNA markers in the test sample; and the accuracy of each miRNA combination is evaluated by calculating its AUC value, so as to judge whether the subject has central precocious puberty.
Description
Technical Field
The application relates to the field of biotechnology, and in particular to a detection marker and a method for identifying girls with central precocious puberty in vitro.
Background
Central precocious puberty (CPP), also known as true or complete precocious puberty, results from premature activation and maturation of the hypothalamic-pituitary-gonadal (HPG) axis. In China the incidence of precocious puberty in children is rising year by year while the age of onset is falling; the incidence in girls is 10 times that in boys, and precocious puberty is currently the second most common pediatric endocrine disease after obesity. In girls, central precocious puberty is clinically diagnosed by the GnRH stimulation test, with the clinical diagnostic criteria drawn up by the endocrinology, genetics and metabolism group of the Chinese Medical Association. Its greatest harm to children is premature closure of the epiphyses, which leads to short stature, with adult height even below 150 cm, and the risk in adulthood of breast cancer, endometrial cancer, obesity, type 2 diabetes mellitus, cardiovascular disease and the like is markedly increased.
At present, central precocious puberty cannot be effectively diagnosed from growth rate, height, body mass index and the like, or from bone age examination, hypothalamic-pituitary magnetic resonance imaging, uterine and ovarian ultrasound, endocrine hormone testing and the like. The clinically accepted diagnostic standard for central precocious puberty is the gonadotropin-releasing hormone (GnRH) stimulation test. In this test gonadorelin is administered intravenously and venous blood is drawn before the injection and at 30, 60 and 90 minutes after it; central precocious puberty is diagnosed when, for example, the LH peak after gonadorelin injection is >5.0 IU/L and the peak LH/FSH ratio is >0.6. Some patients injected with gonadorelin develop systemic or local allergic reactions, so the GnRH stimulation test is not suitable for outpatient use and requires hospitalization. The test also requires repeated venous blood draws, which patients often resist. These factors limit the use of the GnRH stimulation test in the differential diagnosis of central precocious puberty.
MicroRNAs (miRNAs) are a class of short non-coding RNAs about 19-25 nucleotides in length. They degrade the mRNA of a target gene or inhibit its translation through complete or incomplete pairing with the 3' UTR of the target mRNA. Previous studies have shown that miRNAs are involved in a variety of regulatory pathways, including development, viral defense, hematopoiesis, organogenesis, cell proliferation and cell death. In recent years, a large body of research has shown that changes in miRNA abundance are closely related to the occurrence and development of various diseases. Circulating miRNAs in blood can show markedly different expression profiles depending on an individual's physiological and pathological state, and can therefore be used to distinguish normal from disease states. At present, circulating miRNAs have been used to assist differential diagnosis in cancer, neurodegenerative diseases, cardiovascular and cerebrovascular diseases, aging and other areas, but not yet in central precocious puberty.
Therefore, it is necessary to develop in vitro detection markers for central precocious puberty that have clinical value, together with corresponding detection methods and reagents, so that the affected population can be detected quickly and conveniently and early clinical intervention is made easier.
Disclosure of Invention
The application aims to provide a detection marker and a method for identifying girls with central precocious puberty in vitro that are patient-friendly, require simpler sampling than the GnRH stimulation test, and are more accurate than other clinical detection means.
In a first aspect of the present application, there is provided a detection marker for identifying girls with central precocious puberty in vitro, said marker being a free miRNA marker from human serum, wherein said miRNA marker comprises one or more of hsa-miR-584-5p, hsa-miR-625-3p and hsa-miR-7-5p.
Further, the miRNA marker is mature miRNA in serum.
Furthermore, the expression level of the miRNA marker in blood is a relative expression level. The miRNA marker differs with statistical significance between girls with central precocious puberty and healthy girls, so it can distinguish girls with central precocious puberty from healthy girls and also from girls with simple premature thelarche.
Furthermore, free miRNA samples from the serum of girls with central precocious puberty and of normal healthy girls form a training set. The samples undergo second-generation sequencing and data analysis, and, starting from the miRNAs that are significantly differentially expressed between the two groups of serum samples, statistically significant miRNAs are screened out as markers by recursive feature elimination with cross-validation (RFECV).
The application also provides a method for identifying a detection marker for girls with central precocious puberty in vitro, comprising the following steps:
(a) Free miRNA samples from the serum of girls with central precocious puberty and of normal healthy girls form a training set; after library preparation, second-generation sequencing and data analysis of the samples, the expression level (RPM) of each miRNA in the samples is determined by comparing read positions with the positions of the miRNAs in the human reference genome;
(b) Taking the RPM expression levels of the miRNAs as independent variables, feature screening is carried out using a random forest, and the feature subset with the greatest predictive power for model performance is selected;
(c) Combinations of the miRNAs in the obtained feature subset are enumerated, an ROC curve is drawn and the AUC value calculated for each combination, and the combination with the highest AUC is taken as the marker.
Further, in step (a), obtaining the RPM expression level of each miRNA in the sample specifically comprises the following steps:
(a1) Obtaining the raw (off-machine) data after library preparation and second-generation sequencing of the sample, and performing quality control and preprocessing on the raw data with quality control tools to obtain valid data from which low-quality sequences and sequencing adapters have been removed;
(a2) Aligning the sequences of the valid data to the human reference genome sequence and comparing their positions with miRNA position information in the human reference genome, the miRNA position information being taken from the miRBase database; when the 5' end of a sequence coincides with the 5' end position of a miRNA, the sequence is recorded as a sequencing read of that miRNA;
(a3) Determining the RPM expression level of each miRNA in the test sample, wherein RPM (reads per million) is the unit of expression level: the RPM of a miRNA is the total number of sequencing reads of that miRNA, expressed in parts per million of the total number of sequencing reads of the sample that can be aligned to the human reference genome.
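The RPM definition in step (a3) can be illustrated with a minimal Python sketch. The function name `compute_rpm` and the read counts below are hypothetical and chosen only for illustration; they are not data from the application.

```python
def compute_rpm(mirna_read_counts, total_aligned_reads):
    """Return reads-per-million (RPM) for each miRNA.

    mirna_read_counts: mapping of miRNA name -> number of reads assigned to it.
    total_aligned_reads: number of reads in the sample that aligned to the
        human reference genome (the denominator described in step (a3)).
    """
    if total_aligned_reads <= 0:
        raise ValueError("total_aligned_reads must be positive")
    return {
        name: count * 1_000_000 / total_aligned_reads
        for name, count in mirna_read_counts.items()
    }


# Hypothetical counts, for illustration only.
counts = {"hsa-miR-584-5p": 150, "hsa-miR-625-3p": 50, "hsa-miR-7-5p": 300}
rpm = compute_rpm(counts, total_aligned_reads=2_000_000)
print(rpm)
```

Because the denominator is the number of genome-aligned reads rather than the number of raw reads, samples with different alignment rates remain comparable under this definition.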
Further, in step (b), selecting the feature subset with the greatest predictive power for model performance specifically comprises the following steps:
(b1) Applying Min-Max normalization to the RPM expression value of each miRNA: the minimum value $X_{\min}$ and maximum value $X_{\max}$ of each feature are calculated first, and then, for each feature $j$ and sample $i$, the scaled value $X_{\text{scaled}}$ is calculated using formula S1:

$$X_{\text{scaled},ij} = \frac{X_{ij} - X_{\min,j}}{X_{\max,j} - X_{\min,j}} \qquad \text{(S1)}$$
(b2) Labeling the two classes, i.e., {"central precocity": 0, "NC": 1}, and training on the central precocious puberty and normal healthy serum samples;
(b3) Converging step by step with the RFECV algorithm, retaining the feature subsets with higher accuracy, so that cyclic iteration finally converges on a small number of miRNAs. The finally retained miRNAs must satisfy the following: (1) the miRNA combination shows the highest feature value; (2) the number of miRNAs is small enough that enumerating their combinations is computationally feasible.
In some embodiments, the convergence speed is controlled by adjusting the min_features_to_select parameter in order to observe the effect of different numbers of features; the feature subset with the highest accuracy is selected, which determines the number of feature variables.
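The scaling and labeling in steps (b1) and (b2) can be sketched in plain Python as follows. This sketch assumes the usual Min-Max form, (x - min) / (max - min), applied per feature; the helper `min_max_scale`, the handling of constant features, and the small RPM matrix are illustrative assumptions, not part of the application. The RFECV step itself is not shown; a parameter named `min_features_to_select` is exposed, for example, by scikit-learn's `RFECV` class, although the application does not name the library it used.

```python
def min_max_scale(matrix):
    """Scale each column (feature) of a samples-by-features matrix to [0, 1]."""
    n_features = len(matrix[0])
    mins = [min(row[j] for row in matrix) for j in range(n_features)]
    maxs = [max(row[j] for row in matrix) for j in range(n_features)]
    scaled = []
    for row in matrix:
        scaled.append([
            # Assumption: a constant feature has no spread, so map it to 0.0.
            0.0 if maxs[j] == mins[j]
            else (row[j] - mins[j]) / (maxs[j] - mins[j])
            for j in range(n_features)
        ])
    return scaled


# Class labels as given in step (b2).
LABELS = {"central precocity": 0, "NC": 1}

# Hypothetical RPM values: rows are samples, columns are miRNAs.
rpm_matrix = [[10.0, 200.0], [20.0, 100.0], [30.0, 150.0]]
groups = ["central precocity", "central precocity", "NC"]

X = min_max_scale(rpm_matrix)
y = [LABELS[g] for g in groups]
print(X, y)
```

Scaling each miRNA to the same range keeps highly abundant miRNAs from dominating scale-sensitive steps; `X` and `y` are the inputs a feature-elimination routine would then consume.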
Further, in step (c), obtaining the combination with the highest AUC specifically comprises the following steps:
(c1) Enumerating combinations of the feature subset obtained in step (b3), drawing an ROC curve for each combination, and calculating its AUC value;
(c2) Taking the miRNA combination with the largest AUC value as the marker.
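Steps (c1) and (c2) amount to an exhaustive search over miRNA combinations ranked by AUC. The sketch below shows one way this could look in plain Python. The application does not state how a combination of miRNAs is turned into a single score, so `mean_score` (the mean of the scaled values) is a stand-in assumption, and the miRNA names and data are hypothetical. The AUC is computed with the rank-based (Mann-Whitney) identity, treating label 1 as the positive class.

```python
from itertools import combinations


def auc(scores, labels):
    """AUC via the Mann-Whitney identity; tied pairs count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_score(values):
    """Stand-in combination score (an assumption, not the patented model)."""
    return sum(values) / len(values)


def best_combination(X, y, names, score_fn=mean_score):
    """Try every non-empty combination of features; keep the highest AUC."""
    best_auc, best_names = 0.0, ()
    for k in range(1, len(names) + 1):
        for idx in combinations(range(len(names)), k):
            scores = [score_fn([row[j] for j in idx]) for row in X]
            value = auc(scores, y)
            if value > best_auc:  # strict: the first (smallest) best is kept
                best_auc, best_names = value, tuple(names[j] for j in idx)
    return best_auc, best_names


# Hypothetical scaled expression values and labels.
names = ["miR-A", "miR-B", "miR-C"]
X = [[0.1, 0.9, 0.5], [0.2, 0.4, 0.5], [0.8, 0.3, 0.5], [0.9, 0.8, 0.5]]
y = [0, 0, 1, 1]
best_auc, best_names = best_combination(X, y, names)
print(best_auc, best_names)
```

The number of combinations grows as 2^n - 1 for n retained miRNAs, which is why step (b3) requires the retained set to be small enough for enumeration to be feasible.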
Compared with the prior art, the application has the following beneficial effects:
(1) Peripheral blood samples are easy to obtain, clinically practical and minimally invasive, which makes the test more acceptable to subjects and gives it broad application prospects; serum miRNA is relatively stable and abundant, so extraction, library construction and sequencing are comparatively easy and require only conventional experimental techniques and readily purchased reagents;
(2) Compared with existing means of detecting central precocious puberty, the miRNA marker disclosed by the application identifies central precocious puberty with higher detection power, and the experimental cost of second-generation sequencing is within an acceptable range;
(3) Using the miRNA marker and its RPM expression level in the test sample, the application can judge by a simple calculation whether the individual who provided the test sample has central precocious puberty; the data analysis method is not complex and can be quickly mastered by a person of ordinary skill.
Drawings
The foregoing and other features of the present disclosure will be more fully described when considered in conjunction with the following drawings. It is appreciated that these drawings depict only several embodiments of the present disclosure and are therefore not to be considered limiting of its scope. The present disclosure will be described more specifically and in detail by using the accompanying drawings.
FIG. 1 shows ROC curves of the miRNA combinations ranked by AUC: the three-miRNA combination hsa-miR-584-5p, hsa-miR-625-3p and hsa-miR-7-5p has the highest AUC value, 0.9885; the combination hsa-miR-625-3p, hsa-miR-874-3p, hsa-miR-181c-3p and hsa-miR-1290 has the second-highest AUC value, 0.9722; and the three-miRNA combination hsa-miR-584-5p, hsa-miR-652-3p and hsa-miR-181c-3p has an AUC value of 0.9685.
Detailed Description
The following examples are provided to aid understanding of the application and should not be construed as limiting its scope in any way.
Experimental procedures for which specific conditions are not given in the examples below follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the manufacturer's recommendations. Percentages and parts are by weight unless otherwise indicated. Unless otherwise specified, the materials used in the examples are all commercially available products.
Example 1: obtaining training set samples
From July 2022 to February 2023 the applicant collected peripheral venous blood from 28 girls with central precocious puberty, 10 mL of peripheral blood per case, with a mean age of 8.6 years and an age range of 8.1-9.1 years. Over the same period the applicant collected peripheral venous blood from 23 normal healthy girls (that is, healthy controls free of disease; the same applies below), 10 mL of peripheral blood per case, with a mean age of 8.2 years and an age range of 7.8-8.6 years. The two groups serve as the training set; their ages do not differ with statistical significance and all subjects are female, so the principle of sex and age matching is satisfied. Each peripheral blood sample underwent sequencing library preparation and second-generation sequencing to obtain raw (off-machine) data.
Example 2: obtaining sample miRNA expression profiles by sequencing libraries
Each training set sample was subjected to library preparation and second generation sequencing using the following reagents and procedures:
(1) Collecting 10 mL of peripheral blood in a dry blood collection tube, letting it stand at 4 °C for more than half an hour, centrifuging at 400 g and 4 °C for 10 minutes, taking the supernatant as the serum sample, and storing it in a -80 °C freezer;
(2) Extracting 50-200 ng of serum free RNA from the above serum sample using the Qiagen miRNeasy Serum/Plasma Kit (cat. no. 217184), diluting it to a total volume of 4 µL with ultrapure water (DNase- and RNase-free; the same applies below), and placing it in a 200 µL thin-walled PCR tube;
(3) Adding 1 µL of adapter RA3 (10 µM) to the solution obtained in step (2), mixing well, reacting at 70 °C for 2 minutes, and immediately cooling on ice, wherein the sequence of RA3 is 5'-TGGAATTCTCGGGTGCCAAGG-3';
(4) Adding 2 µL of HML (Ligation Buffer) (Illumina, cat. no. 15013206), 1 µL of RNase Inhibitor (Illumina, cat. no. 15003548) and 1 µL of T4 RNA Ligase 2, Deletion Mutant (Epicentre, cat. no. LR2D11310K) to the solution obtained in step (3), mixing well, and incubating at 28 °C for 1 hour;
(5) Adding 1 µL of STP (Stop Solution) (Illumina, cat. no. 15016304) to the solution obtained in step (4), mixing well, and incubating at 28 °C for 15 minutes;
(6) Taking a new PCR tube, adding 1.1 µL of adapter RA5 (10 µM; base sequence 5'-GUUCAGAGUUCUACAGUCCGACGAUC-3'), incubating at 70 °C for 2 minutes, and immediately cooling on ice after the reaction;
(7) Adding 1.1 µL of 10 mM ATP (Illumina, cat. no. 15007432) and 1.1 µL of T4 RNA ligase (Illumina, cat. no. 1000587) to the solution obtained in step (6) and mixing well;
(8) Taking 3 µL of the solution obtained in step (7), adding it to the solution obtained in step (5), mixing well, and reacting at 28 °C for 1 hour;
(9) Adding 1 µL of RNA RT Primer (10 µM) to the solution obtained in step (8), mixing well, reacting at 70 °C for 2 minutes, and immediately cooling on ice after the reaction, in preparation for the reverse transcription reaction that produces the first DNA strand, wherein the sequence of the reverse transcription primer (RT Primer) is 5'-CCTTGGCACCCGAGAATTCCA-3';
(10) Adding 2 µL of 5X First Strand Buffer (Thermo, cat. no. 1889832), 0.5 µL of dNTP Mix (12.5 mM; Illumina, cat. no. 11318102), 1 µL of 100 mM DTT (Thermo, cat. no. 1850670), 1 µL of RNase Inhibitor and 1 µL of SuperScript II Reverse Transcriptase (Thermo, cat. no. 2008170) to the solution obtained in step (9) and incubating at 50 °C for 1 hour;
(11) Adding 25 µL of PML (PCR Mix) (Illumina, cat. no. 15022681), 2 µL of Primer1 (10 µM) and 2 µL of Primer2 (10 µM) to the solution obtained in step (10), mixing well, and performing PCR with the following program: pre-denaturation at 98 °C for 30 s; 18 cycles of denaturation at 98 °C for 10 s, annealing at 60 °C for 30 s and extension at 72 °C for 15 s; final extension at 72 °C for 10 min; hold at 4 °C. The sequence of Primer1 is 5'-CAAGCAGAAGACGGCATACGAGATGTCGTGATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA-3', the sequence of Primer2 is 5'-AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGA-3', and the 8 bases "GTCGTGAT" in Primer1 are the index sequence;
(12) Subjecting the PCR product obtained in step (11) to 6% polyacrylamide gel electrophoresis at 120 V for 1 h, staining with GelRed dye solution for 5 minutes, observing and photographing under an ultraviolet lamp, excising and recovering the bands between 149 and 169 bp, checking the fragment length range with an Agilent 2100 Bioanalyzer and quantifying the concentration with an Invitrogen Qubit, and then sequencing on an Illumina NovaSeq 6000 platform with a read length of 75 bp in single-end mode, thereby obtaining the off-machine (raw sequencing) data.
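The relationship between the oligonucleotides listed in steps (6), (9) and (11) can be checked with a few lines of Python. This is only an illustrative consistency check on the sequences as printed above; the variable names are chosen here for illustration and are not part of the protocol.

```python
# Sequences copied from steps (6), (9) and (11) of the library preparation.
RA5_RNA = "GUUCAGAGUUCUACAGUCCGACGAUC"
RT_PRIMER = "CCTTGGCACCCGAGAATTCCA"
PRIMER1 = "CAAGCAGAAGACGGCATACGAGATGTCGTGATGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA"
PRIMER2 = "AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGA"
INDEX = "GTCGTGAT"

# DNA form of the RNA adapter (U -> T) so it can be compared with Primer2.
ra5_dna = RA5_RNA.replace("U", "T")

# 0-based position of the 8-base index inside Primer1.
index_start = PRIMER1.find(INDEX)

# Primer1 carries the RT primer sequence at its 3' end.
rt_at_end = PRIMER1.endswith(RT_PRIMER)

# Length of the longest 3' stretch of Primer2 that matches the 5' start of RA5.
overlap = max(
    (k for k in range(1, len(ra5_dna) + 1) if PRIMER2.endswith(ra5_dna[:k])),
    default=0,
)

print(index_start, rt_at_end, overlap)  # 24 True 21
```

The check shows that Primer1 ends with the RT Primer sequence and that the last 21 bases of Primer2 match the first 21 bases of adapter RA5, which is what lets the two PCR primers amplify adapter-ligated, reverse-transcribed fragments.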
The off-machine data of the training-set samples were analyzed by the following steps to obtain the expression level (RPM) of each miRNA in each sample:
(1) Performing data quality control and preprocessing (with default parameters) on the off-machine data of each sample using FastQC, Cutadapt and Trimmomatic to obtain valid data from which low-quality sequences and sequencing adapters have been removed;
(2) Removing the RA5 base sequence from the 5' end of each sequence in the valid data, and then aligning the resulting sequences to the human reference genome with the sequence alignment software Bowtie (allowing at most 1 base mismatch) to obtain their positions in the human reference genome;
(3) Comparing the positions of the aligned sequences with the positions of miRNAs in the human reference genome to determine the expression level (RPM) of each miRNA in the sample. RPM values were obtained for 466 miRNAs in total. The miRNA position information was taken from the miRBase database (http://www.mirbase.org/); when the 5' end of a sequence coincides with the 5' end of a miRNA, the sequence is recorded as a sequencing read of that miRNA. The expression level RPM (reads per million) of each miRNA is the number of sequencing reads of that miRNA per million sequencing reads of the sample that can be aligned to the reference genome.
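The read assignment and RPM calculation described in step (3) can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function names, the `(chromosome, strand, 5' position)` key and the toy miRNA names and coordinates are all assumptions made for the example.

```python
from collections import Counter


def assign_reads(alignments, mirna_five_prime):
    """Count reads whose 5' end coincides with an annotated miRNA 5' end.

    alignments: iterable of (chrom, strand, five_prime_pos), one per aligned read.
    mirna_five_prime: dict mapping (chrom, strand, five_prime_pos) -> miRNA name.
    """
    counts = Counter()
    for key in alignments:
        name = mirna_five_prime.get(key)
        if name is not None:
            counts[name] += 1
    return counts


def rpm(counts, total_aligned_reads):
    """Reads per million: reads of a miRNA per million genome-aligned reads."""
    if total_aligned_reads <= 0:
        raise ValueError("total_aligned_reads must be positive")
    return {name: n * 1_000_000 / total_aligned_reads for name, n in counts.items()}


# Toy annotation and alignments (hypothetical coordinates and names).
annotation = {("chr1", "+", 100): "miR-A", ("chr2", "-", 500): "miR-B"}
alignments = (
    [("chr1", "+", 100)] * 3   # three reads starting at the miR-A 5' end
    + [("chr2", "-", 500)]     # one read starting at the miR-B 5' end
    + [("chr3", "+", 42)] * 4  # aligned reads that match no annotated miRNA
)

counts = assign_reads(alignments, annotation)
values = rpm(counts, total_aligned_reads=len(alignments))
print(values)  # {'miR-A': 375000.0, 'miR-B': 125000.0}
```

Reads that align to the genome but match no miRNA 5' end still count toward the denominator, which is why the two toy values do not sum to one million.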
Example 3: Obtaining miRNA markers for identifying central precocious puberty
The RPM expression value of each miRNA was normalized according to formula S1 using the MinMaxScaler tool in the scikit-learn library. The central precocious puberty group was labeled 0 and the NC group was labeled 1. Features were then eliminated stepwise using the RFECV algorithm, with the convergence controlled so that a subset of 9 features was retained; the code was as follows:
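The original listing is not available in this text. The following is only an illustrative sketch of the procedure described, not the applicants' code. It assumes a random forest as the RFECV estimator (the feature-screening model named in claim 5), and it uses synthetic data in place of the RPM matrix; the elimination step, number of folds, number of trees and random seeds are hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the RPM matrix (rows: samples, columns: miRNAs).
rng = np.random.default_rng(0)
n_samples, n_mirnas = 40, 30
X = rng.gamma(shape=2.0, scale=50.0, size=(n_samples, n_mirnas))
y = np.array([0] * 20 + [1] * 20)  # 0: central precocity, 1: NC
X[y == 0, :3] *= 3.0               # make three toy features informative

# Min-Max scaling of every column to the range [0, 1].
X_scaled = MinMaxScaler().fit_transform(X)

# Recursive feature elimination with cross-validation; elimination
# never goes below min_features_to_select features.
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    step=3,                      # features dropped per iteration (assumed)
    min_features_to_select=9,
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
    scoring="accuracy",
)
selector.fit(X_scaled, y)

selected = np.flatnonzero(selector.support_)
print("features retained:", selector.n_features_)
print("column indices:", selected.tolist())
```

Refitting with different `min_features_to_select` values and comparing the cross-validated accuracy is one way to produce a table of accuracy against the number of retained features.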
The accuracy during the convergence process was as follows:
399 miRNAs: 0.931034482758620793; 253 miRNAs: 0.7931034482758621; 174 miRNAs: 0.7931034482758621; 91 miRNAs: 0.8275862068965517; 57 miRNAs: 0.7931034482758621; 39 miRNAs: 0.8620689655172413; 36 miRNAs: 0.7931034482758621; 26 miRNAs: 0.9255172413793104; 18 miRNAs: 0.9310344827586207; 12 miRNAs: 0.896551724137931; 9 miRNAs: 0.9310344827586207.
It can be seen that retaining 9 features achieves the same accuracy as retaining 18 features, but enumerating the combinations of 9 features requires less work than for 18 and remains within what a computer can handle.
The 9 retained miRNAs are hsa-miR-584-5p, hsa-miR-625-3p, hsa-miR-652-3p, hsa-miR-7-5p, hsa-miR-874-3p, hsa-miR-181c-3p, hsa-miR-1290, hsa-miR-454-3p and hsa-let-7i-3p. All non-empty combinations of the 9 miRNAs were enumerated, giving 511 combinations (2^9 - 1), and for each combination the ROC curve was plotted and the AUC value calculated. Ranked by AUC, the combination of the three miRNAs hsa-miR-584-5p, hsa-miR-625-3p and hsa-miR-7-5p had the highest AUC value, 0.9885; the combination of the four miRNAs hsa-miR-625-3p, hsa-miR-874-3p, hsa-miR-181c-3p and hsa-miR-1290 had the next highest, 0.9722; and the combination of the three miRNAs hsa-miR-584-5p, hsa-miR-652-3p and hsa-miR-181c-3p had an AUC value of 0.9685, as shown in FIG. 1.
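The count of 511 combinations and the AUC calculation can be illustrated with the sketch below. It is an assumption-laden example: the text does not state which model turns a miRNA combination into a per-sample score, so the AUC function here simply takes labels and scores as given, and the toy labels and scores are invented for illustration.

```python
from itertools import combinations

MIRNAS = [
    "hsa-miR-584-5p", "hsa-miR-625-3p", "hsa-miR-652-3p",
    "hsa-miR-7-5p", "hsa-miR-874-3p", "hsa-miR-181c-3p",
    "hsa-miR-1290", "hsa-miR-454-3p", "hsa-let-7i-3p",
]


def nonempty_subsets(items):
    """Yield every combination of 1..len(items) items, ignoring order."""
    for k in range(1, len(items) + 1):
        yield from combinations(items, k)


def auc(labels, scores):
    """Rank-based AUC: chance that a label-1 sample outscores a label-0 one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


subsets = list(nonempty_subsets(MIRNAS))
print(len(subsets))  # 511

# Toy scores for four samples (hypothetical values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 0.75
```

For comparison, 18 features would give 2^18 - 1 = 262143 combinations, which is why the smaller feature subset keeps the exhaustive search cheap.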
Therefore, the miRNA marker distinguishes patients with central precocious puberty from healthy individuals well and can serve as an auxiliary diagnostic basis for identifying central precocious puberty.
While the application has been disclosed in terms of various aspects and embodiments, other aspects and embodiments will be apparent to those skilled in the art in view of this disclosure, and many changes and modifications can be made without departing from the spirit of the application. The various aspects and embodiments of the present application are disclosed for illustrative purposes only and are not intended to limit the application, the true scope of which is set forth in the following claims.
Claims (9)
1. A detection marker for in vitro identification of girls with central precocious puberty, wherein the marker is a free miRNA marker from human serum, and wherein the miRNA marker comprises:
one or more of hsa-miR-584-5p, hsa-miR-625-3p and hsa-miR-7-5p.
2. The detection marker for in vitro identification of girls with central precocious puberty according to claim 1, wherein the miRNA marker is a mature miRNA in serum.
3. The detection marker for in vitro identification of girls with central precocious puberty according to claim 1, wherein the expression level of the miRNA marker in blood is a relative expression level, the miRNA marker differs with statistical significance between girls with central precocious puberty and healthy girls, and the miRNA marker can distinguish girls with central precocious puberty from healthy girls and also from girls with simple premature breast development.
4. The detection marker for in vitro identification of girls with central precocious puberty according to claim 1, wherein free miRNA samples from the serum of girls with central precocious puberty and of normal healthy girls form a training set, the samples are subjected to second-generation sequencing and data analysis, and, starting from the miRNAs that are significantly differentially expressed between the serum samples of girls with central precocious puberty and those of normal healthy girls, miRNAs with statistical significance are screened out as the marker using recursive feature elimination with cross-validation.
5. A method for obtaining the detection marker for in vitro identification of girls with central precocious puberty according to any one of claims 1 to 4, comprising the following steps:
(a) Forming a training set from free miRNA samples of the serum of girls with central precocious puberty and of normal healthy girls, and, after library preparation, second-generation sequencing and data analysis of the samples, determining the expression level RPM of each miRNA in the samples by comparison with the positions of miRNAs in the human reference genome;
(b) Taking the expression level RPM of the miRNAs as independent variables, performing feature screening with a random forest, and selecting the feature subset with the greatest predictive power for the performance of the prediction model;
(c) Enumerating combinations of the features in the obtained feature subset, plotting the ROC curve and calculating the AUC value for each combination, and taking the combination with the highest AUC as the marker.
6. The method according to claim 5, wherein obtaining the expression level RPM of each miRNA in the sample in step (a) specifically comprises the following steps:
(a1) Obtaining the off-machine data after library preparation and second-generation sequencing of the sample, and performing data quality control and preprocessing on the off-machine data with quality control tools to obtain valid data from which low-quality sequences and sequencing adapters have been removed;
(a2) Aligning the sequences of the valid data to the human reference genome sequence and comparing their positions with the miRNA position information in the human reference genome, wherein the miRNA position information is taken from the miRBase database, and when the 5' end of a sequence coincides with the 5' end position of a miRNA, the sequence is recorded as a sequencing read of that miRNA;
(a3) Determining the expression level RPM of each miRNA in the test sample, wherein RPM (reads per million) is the unit of expression level, and the expression level RPM of a miRNA is the number of sequencing reads of that miRNA per million sequencing reads of the sample that can be aligned to the human reference genome.
7. The method according to claim 5, wherein selecting the feature subset with the greatest predictive power for the performance of the prediction model in step (b) specifically comprises the following steps:
(b1) Performing Min-Max standardization on the RPM expression value of each miRNA: first calculating the minimum value $X_{min}$ and the maximum value $X_{max}$ of each feature, and then, for each feature j and sample i, calculating the scaled value $X_{scaled}$ using formula S1, wherein formula S1 is: $X_{scaled} = \frac{X_{ij} - X_{min}}{X_{max} - X_{min}}$, where $X_{ij}$ is the RPM value of feature j in sample i and $X_{min}$ and $X_{max}$ are the minimum and maximum of feature j;
(b2) Labeling the two classes, i.e. {"central precocity": 0, "NC": 1}, and training on the central precocious puberty and normal healthy serum samples;
(b3) Eliminating features stepwise with the RFECV algorithm and retaining the feature subset with higher accuracy, so that through cyclic iteration the features finally converge to a number of miRNAs, wherein the finally converged miRNAs must satisfy the following: (1) the miRNA combination shows the highest feature value; (2) the number of miRNAs is such that enumerating their combinations is within what a computer can handle.
8. The method according to claim 5, wherein the convergence rate is controlled by modifying the min_features_to_select parameter to obtain the results for different numbers of features, and the feature subset with the highest accuracy is selected to determine the number of feature variables.
9. The method according to claim 5, wherein obtaining the combination with the highest AUC in step (c) specifically comprises the following steps:
(c1) Enumerating combinations of the features in the feature subset obtained in step (b3), plotting the ROC curve for each combination, and calculating the AUC value;
(c2) Taking the miRNA combination with the maximum AUC value as the marker.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310948982.6A CN117070620A (en) | 2023-07-31 | 2023-07-31 | Marker and method for identifying central precocious girls in vitro |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117070620A (en) | 2023-11-17 |
Family
ID=88716196
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310948982.6A (Pending) CN117070620A (en) | Marker and method for identifying central precocious girls in vitro | 2023-07-31 | 2023-07-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||