CN112037855A - Method for predicting alcohol consumption based on gene screening - Google Patents

Info

Publication number
CN112037855A
Authority
CN
China
Prior art keywords
drinking
gene
user
capacity
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010747988.3A
Other languages
Chinese (zh)
Other versions
CN112037855B (en)
Inventor
朱慧彬
何荣军
何皓璠
赵锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Yinton Medical Laboratory Co ltd
Original Assignee
Suzhou Yinton Medical Laboratory Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Yinton Medical Laboratory Co ltd
Priority to CN202010747988.3A (patent CN112037855B)
Publication of CN112037855A
Priority to PCT/CN2021/109453 (patent WO2022022665A1)
Application granted
Publication of CN112037855B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/30 Detection of binding sites or motifs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Public Health (AREA)
  • Molecular Biology (AREA)
  • Bioethics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting drinking capacity based on gene screening, comprising the following steps: S1, acquiring the relation between the drinking capacity and the drinking amount of each sample, dividing the samples into a first preset number of drinking segments according to drinking amount, and establishing a first database from that relation and the drinking segments; S2, acquiring gene data of the samples and formatting the gene data; S3, constructing a drinking capacity prediction model from the formatted gene data of the samples and the first database; and S4, predicting the drinking capacity of a user from the user's gene data using the drinking capacity prediction model. Beneficial effects: the method provides a standard for judging drinking capacity, quantifies individual drinking capacity, gives more intuitive and valuable drinking capacity evaluation and drinking advice according to the user's physical condition, and improves the user experience.

Description

Method for predicting alcohol consumption based on gene screening
Technical Field
The invention relates to the technical field of biological genes, in particular to a method for predicting drinking capacity based on gene screening.
Background
After entering the human body through the mouth, esophagus, stomach, intestines and other organs, alcohol passes directly through biological membranes into the bloodstream and is rapidly transported to tissues and organs throughout the body, where it is metabolized. Two enzymes carry out alcohol metabolism in the human body: alcohol dehydrogenase catalyzes the oxidation of ethanol to acetaldehyde, and acetaldehyde dehydrogenase converts acetaldehyde to acetic acid. Differences in drinking capacity between individuals are determined mainly by the activities of these two enzymes, and because enzyme activity is determined by genes, a person's drinking capacity is determined by genes as well.
Alcoholic drinks are an important beverage in many people's lives, have given rise to a variety of drinking cultures, and are indispensable on certain occasions. However, research shows that people are not well suited to drinking and that excessive drinking does great harm to the body. Because drinking capacity differs greatly between people, a correct understanding of one's own alcohol metabolism ability is very important when setting a healthy drinking standard.
Similar products currently on the market only detect a user's alcohol metabolism ability; they do not quantify drinking capacity, give users little guidance, and cannot offer targeted drinking advice according to the user's physical condition.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the art described above. Therefore, the invention aims to provide a drinking capacity prediction method based on gene screening, which provides a drinking capacity judgment standard, quantifies the individual drinking capacity, gives more intuitive and valuable drinking capacity evaluation and drinking advice according to the physical condition of a user, and improves the user experience.
In order to achieve the above object, an embodiment of the present invention provides a method for predicting alcohol consumption based on gene screening, including:
S1, acquiring the relation between the drinking capacity and the drinking amount of each sample, dividing the samples into a first preset number of drinking segments according to drinking amount, and establishing a first database from that relation and the drinking segments;
S2, acquiring gene data of the samples and formatting the gene data;
S3, constructing a drinking capacity prediction model from the formatted gene data of the samples and the first database;
and S4, predicting the drinking capacity of a user from the user's gene data using the drinking capacity prediction model.
In the method provided by the invention, the relation between the drinking capacity and the drinking amount of the samples is obtained by questionnaire survey and analyzed; the drinking amounts are divided into a first preset number of drinking segments and a first database is established, so that each drinking segment corresponds to a specific, quantified drinking amount and more valuable drinking advice can be given. The gene data of the samples are formatted, and a drinking capacity prediction model is constructed from the formatted gene data and the first database. The gene data of a user who wants to know their drinking capacity are then obtained and input into the model to predict that user's drinking capacity. This quantifies individual drinking capacity, gives more intuitive and valuable drinking capacity evaluation and drinking advice according to the user's physical condition, and improves the user experience.
According to some embodiments of the invention, acquiring and formatting the gene data of the sample comprises:
S21, collecting saliva from the sample;
S22, extracting DNA from the saliva of the sample and performing gene sequencing on the extracted DNA;
S23, processing the sequenced gene data to obtain, for each sample, the genotype at each gene locus related to drinking capacity;
and S24, formatting each gene locus as a number according to its genotype.
According to some embodiments of the invention, gene locus screening is performed on the formatted gene data of the sample, comprising:
S241, for each candidate division of the first database, calculating the purity improvement value or uncertainty reduction value of the resulting data subsets relative to the data set before division;
S242, selecting the gene locus N, and its characteristic value n, that give the maximum purity improvement value or the maximum uncertainty reduction value, using gene locus N as a node, and dividing the first database into two sub data sets according to the grouping by characteristic value n;
S243, calculating in turn the purity improvement value or uncertainty reduction value for the characteristic values of each gene locus in the two sub data sets; selecting the gene locus M, and its characteristic value m, that give the maximum purity improvement value or the maximum uncertainty reduction value, using gene locus M as a child node, and splitting the sub data set again according to the grouping by characteristic value m;
and S244, stopping splitting when the purity of a divided sub data set is greater than a preset purity threshold or its uncertainty value is smaller than a preset uncertainty threshold, and finally obtaining the gene loci related to drinking capacity and the relationship between those gene loci and the drinking segments.
According to some embodiments of the invention, a machine learning model is selected to construct the alcohol consumption prediction model.
According to some embodiments of the invention, predicting the drinking capacity of the user from the user's gene data using the drinking capacity prediction model comprises:
S41, acquiring the user's basic information and the user's saliva;
S42, extracting DNA from the saliva of the user and performing gene sequencing on the extracted DNA;
S43, processing the sequenced gene data to obtain the genotypes at the designated loci, and formatting the genotype data of the designated loci rs1229984 and rs671;
S44, inputting the formatted genotype data into the drinking capacity prediction model;
and S45, sending the first prediction result output by the drinking capacity prediction model to the user terminal according to the user's basic information.
According to some embodiments of the invention, the gene data of the sample are obtained by extracting DNA from collected blood.
According to some embodiments of the invention, the method further comprises:
S71, acquiring second information that influences the drinking capacity of the user, the second information comprising: disease history, type of drinking, alcohol content of the drink, and drinking frequency;
S72, acquiring the first prediction result of drinking capacity output by the drinking capacity prediction model from the user's gene data;
S73, calculating a second prediction result of drinking capacity from the first prediction result and the second information according to a preset algorithm;
and S74, sending the second prediction result to the user terminal according to the user's basic information.
According to some embodiments of the invention, the preset algorithm comprises:
calculating the amount of ethanol in the first prediction of alcohol consumption:
V1=A×c
wherein A is the drinking capacity (ml) output by the drinking capacity prediction model from the user's gene data, and c is the alcohol concentration (% vol) preset in the drinking capacity prediction model;
calculating the ethanol amount in the second prediction of alcohol consumption:
V2=V1×d×t×f
wherein d is a correlation coefficient of the disease history and the drinking capacity of the user; t is a correlation coefficient between the type of drinking and the drinking amount; f is a correlation coefficient of the drinking frequency and the drinking amount;
alcohol consumption of the second prediction result:
Figure BDA0002609020080000041
wherein c_u is the alcohol content (% vol) input by the user.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a method for predicting alcohol consumption based on genetic screening according to an embodiment of the present invention;
FIG. 2 is a flow chart of the processing of genetic data of a sample according to one embodiment of the present invention;
FIG. 3 is a flowchart of the screening for alcohol consumption-related gene loci according to one embodiment of the present invention;
FIG. 4 is a flow diagram of prediction of user alcohol consumption according to one embodiment of the present invention;
FIG. 5 is a flow chart of prediction of user alcohol consumption according to yet another embodiment of the present invention;
FIG. 6 is a schematic diagram of a decision tree of alcohol consumption-related genes and alcohol consumption segments according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
A method for predicting alcohol consumption based on genetic screening according to an embodiment of the present invention will be described with reference to FIGS. 1 to 6.
The embodiment of the invention provides a method for predicting drinking capacity based on gene screening, which comprises the following steps:
S1, acquiring the relation between the drinking capacity and the drinking amount of each sample, dividing the samples into a first preset number of drinking segments according to drinking amount, and establishing a first database from that relation and the drinking segments;
S2, acquiring gene data of the samples and formatting the gene data;
S3, constructing a drinking capacity prediction model from the formatted gene data of the samples and the first database;
and S4, predicting the drinking capacity of a user from the user's gene data using the drinking capacity prediction model.
In the method provided by the invention, the relation between the drinking capacity and the drinking amount of the samples is obtained by questionnaire survey and analyzed; the drinking amounts are divided into a first preset number of drinking segments and a first database is established, so that each drinking segment corresponds to a specific, quantified drinking amount and more valuable drinking advice can be given. The gene data of the samples are formatted, and a drinking capacity prediction model is constructed from the formatted gene data and the first database. The gene data of a user who wants to know their drinking capacity are then obtained and input into the model to predict that user's drinking capacity. This quantifies individual drinking capacity, gives more intuitive and valuable drinking capacity evaluation and drinking advice according to the user's physical condition, and improves the user experience.
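As a minimal sketch of step S1, the binning of questionnaire answers into drinking segments might look like the following. The cut points, the sample IDs and the function names are illustrative assumptions; the text does not state the thresholds that separate the segments.

```python
from bisect import bisect_right

# Hypothetical cut points between segments, in liang of liquor.
# They are placeholders for illustration, not values from the patent.
CUT_POINTS = [0.5, 2.0, 4.0, 6.0]  # 4 cut points give 5 bins


def assign_segment(amount, cut_points):
    """Return the 0-based index of the bin that `amount` falls into."""
    return bisect_right(cut_points, amount)


def build_first_database(survey, cut_points):
    """Map each sample ID to (reported drinking amount, drinking segment)."""
    return {
        sample_id: (amount, assign_segment(amount, cut_points))
        for sample_id, amount in survey.items()
    }


# Hypothetical questionnaire answers: sample ID -> reported drinking amount.
survey = {"s1": 0.0, "s2": 1.0, "s3": 4.0, "s4": 9.5}
first_db = build_first_database(survey, CUT_POINTS)
```

Here the bin index stands in for the segment number; the patent's own segment numbering is discontinuous, so a real mapping from bin to segment name would need the published tables.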
According to some embodiments of the invention, acquiring and formatting the gene data of the sample comprises:
S21, collecting saliva from the sample;
S22, extracting DNA from the saliva of the sample and performing gene sequencing on the extracted DNA;
S23, processing the sequenced gene data to obtain, for each sample, the genotype at each gene locus related to drinking capacity;
and S24, formatting each gene locus as a number according to its genotype.
The working principle and the beneficial effects of the technical scheme are as follows: the gene data of a sample are obtained by DNA extraction, gene sequencing and genotyping of the sample's saliva. The gene sequencing method includes at least one of chip sequencing, second-generation sequencing, third-generation sequencing, PCR sequencing and panel sequencing. This yields, for each sample, the genotype at each gene locus related to drinking capacity. So that the influence of each gene locus on drinking capacity can be calculated effectively, each gene locus is formatted as a number according to its genotype: illustratively, wild type is 0, heterozygous mutant is 1, and homozygous mutant is 2. At the rs1229984 gene locus, TT is wild type and is formatted as 0, CT is heterozygous mutant and is formatted as 1, and CC is homozygous mutant and is formatted as 2. At the rs671 gene locus, GG is wild type and is formatted as 0, AG is heterozygous mutant and is formatted as 1, and AA is homozygous mutant and is formatted as 2.
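The numeric formatting described above can be sketched as counting mutant alleles at each locus. The wild-type and mutant alleles follow the mapping given in the text; the function name `encode_genotype` and the `ALLELES` table are assumptions introduced for illustration.

```python
# (wild-type allele, mutant allele) per locus, following the text:
# rs1229984: TT = 0, CT = 1, CC = 2; rs671: GG = 0, AG = 1, AA = 2.
ALLELES = {"rs1229984": ("T", "C"), "rs671": ("G", "A")}


def encode_genotype(locus, genotype):
    """Format a two-letter genotype as 0, 1 or 2 (number of mutant alleles)."""
    wild, mutant = ALLELES[locus]
    calls = genotype.upper()
    if len(calls) != 2 or any(allele not in (wild, mutant) for allele in calls):
        raise ValueError(f"unexpected genotype {genotype!r} at {locus}")
    return calls.count(mutant)


# Allele order does not matter: "AG" and "GA" both format to 1.
encoded = [encode_genotype("rs1229984", "CT"), encode_genotype("rs671", "GA")]
```

Counting mutant alleles keeps the encoding independent of the order in which a sequencing pipeline reports the two alleles.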
In one embodiment, gene locus screening is performed on the formatted gene data of the sample, comprising:
S241, for each candidate division of the first database, calculating the purity improvement value or uncertainty reduction value of the resulting data subsets relative to the data set before division;
S242, selecting the gene locus N, and its characteristic value n, that give the maximum purity improvement value or the maximum uncertainty reduction value, using gene locus N as a node, and dividing the first database into two sub data sets according to the grouping by characteristic value n;
S243, calculating in turn the purity improvement value or uncertainty reduction value for the characteristic values of each gene locus in the two sub data sets; selecting the gene locus M, and its characteristic value m, that give the maximum purity improvement value or the maximum uncertainty reduction value, using gene locus M as a child node, and splitting the sub data set again according to the grouping by characteristic value m;
and S244, stopping splitting when the purity of a divided sub data set is greater than a preset purity threshold or its uncertainty value is smaller than a preset uncertainty threshold, and finally obtaining the gene loci related to drinking capacity and the relationship between those gene loci and the drinking segments.
The working principle and the beneficial effects of the technical scheme are as follows: the purity and uncertainty of the data set before and after division are measured by calculating at least one of information gain, information gain ratio and the Gini coefficient. When the Gini coefficient is used, a larger Gini coefficient means higher uncertainty and lower sample purity, i.e. the target samples make up a smaller proportion of the data set; a smaller Gini coefficient means lower uncertainty and higher sample purity, i.e. the target samples make up a larger proportion of the data set. For example, a Gini coefficient of 0 means that all samples in the data set belong to the same class. When the Gini coefficient of a divided sub data set is smaller than a preset value, the purity of that sub data set is greater than the preset purity threshold or its uncertainty value is smaller than the preset uncertainty threshold, splitting stops, and the gene loci related to drinking capacity and the relationship between those gene loci and the drinking segments are finally obtained.
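The Gini-based selection of a splitting locus can be sketched as follows. This is a simplified illustration using equality splits on the formatted genotype value; the function names, the toy rows and the toy segment labels are assumptions, not data from the patent.

```python
from collections import Counter


def gini(labels):
    """Gini coefficient of a list of class labels (0 means a single class)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Uncertainty reduction obtained by dividing `parent` into two subsets."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


def best_split(rows, labels):
    """Return (decrease, locus, value) for the best 'locus == value' division."""
    best = None
    for locus in rows[0]:
        for value in sorted({row[locus] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[locus] == value]
            right = [y for row, y in zip(rows, labels) if row[locus] != value]
            if not left or not right:
                continue  # this candidate does not actually divide the data
            decrease = gini_decrease(labels, left, right)
            if best is None or decrease > best[0]:
                best = (decrease, locus, value)
    return best


# Toy data: formatted genotypes and made-up segment labels.
rows = [
    {"rs671": 0, "rs1229984": 0},
    {"rs671": 0, "rs1229984": 2},
    {"rs671": 2, "rs1229984": 0},
    {"rs671": 2, "rs1229984": 2},
]
labels = [8, 8, 0, 0]
split = best_split(rows, labels)
```

In this toy data the labels depend only on rs671, so dividing on rs671 removes all uncertainty while dividing on rs1229984 removes none.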
In one embodiment, as shown in FIG. 6, it is first determined whether the rs671 gene locus of each sample is GG, i.e. whether its formatted value is 0, and the first database is divided accordingly into a first data set, in which the rs671 result is GG, and a second data set, in which the rs671 result is AA or AG. The Gini coefficient is then calculated for the characteristic values of each gene locus in the first and second data sets, and the gene locus A and characteristic value a with the minimum Gini coefficient are selected; gene locus A is used as a child node, and the data set is split again according to the grouping by characteristic value a. For example, in the first data set it is determined whether the rs1229984 gene locus of the sample is CC or CT; when the result is False, the rs1229984 gene locus is TT, so the samples in this group have rs671 GG and rs1229984 TT, and, as shown in Table 1, their drinking segment is segment 8. Splitting stops when every group contains samples of a single class, i.e. the Gini coefficient is 0, and the gene loci related to drinking capacity and the relationship between those gene loci and the drinking segments are finally obtained. Assigning the genes related to drinking capacity to drinking segments by genotype makes the result easy to remember, accurately reflects the correspondence between genotype and drinking capacity, is clear at a glance, and improves the user experience.
According to some embodiments of the invention, the gene data of the sample are obtained by extracting DNA from collected blood.
According to some embodiments of the present invention, the gene loci related to drinking capacity include the rs1229984 gene locus and the rs671 gene locus. The rs1229984 gene locus is located on the ADH1B gene: when the result is TT, alcohol dehydrogenase activity is strong and ethanol is metabolized quickly; when the result is CT, alcohol dehydrogenase activity and the rate of ethanol metabolism are moderate; when the result is CC, alcohol dehydrogenase activity is weak and ethanol is metabolized slowly. The rs671 gene locus is located on the ALDH2 gene: when the result is GG, acetaldehyde dehydrogenase activity is strong and acetaldehyde is metabolized quickly; when the result is GA or AA, acetaldehyde dehydrogenase activity is weak and acetaldehyde is metabolized slowly.
According to some embodiments of the invention, a machine learning model is selected to construct the alcohol consumption prediction model. The machine learning model includes: at least one of linear classification, linear regression, Support Vector Machine (SVM), decision tree, naive Bayes, random forest, and neural network model.
Specifically, a decision tree classification model is selected to construct a drinking capacity prediction model;
the algorithm comprises the following steps:
programming in Python and calling the DecisionTreeClassifier module of scikit-learn (sklearn) to perform data mining and construct the drinking capacity prediction model;
main parameter settings of the DecisionTreeClassifier module:
criterion='gini': the Gini coefficient is used as the measure of node split quality;
splitter='best': the best split point is searched for among all features;
max_depth=None: sets the maximum depth of the decision tree; None means the depth is not restricted, and the tree grows until the samples on each leaf node belong to the same class;
min_samples_split=2: an internal node must contain at least 2 samples before it can be split;
min_samples_leaf=1: a leaf node must contain at least 1 sample;
finally, the relation between the rs1229984 and rs671 gene loci and the drinking capacity is obtained; when the first preset number is 9, the relation is shown in Table 1.
Table 1
Figure BDA0002609020080000091
In one embodiment, the first preset number is 7, so there are 7 drinking segments, and the relation between the rs1229984 and rs671 gene loci and the drinking capacity is shown in Table 2.
Table 2
Figure BDA0002609020080000092
The working principle and the beneficial effects of the technical scheme are as follows: drinking segment 0 covers three cases: 1. the rs1229984 gene locus is CC and the rs671 gene locus is AA; 2. the rs1229984 gene locus is TT and the rs671 gene locus is AA; 3. the rs1229984 gene locus is CT and the rs671 gene locus is AA. The drinking segments are numbered discontinuously (for example, segments 3 and 6 are absent) so that each segment number matches the specific drinking capacity; for example, when the drinking segment is 9, the user's drinking capacity is more than 9 liang (两) of liquor.
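The decision tree configuration listed above can be sketched with scikit-learn as follows. The training rows enumerate the nine genotype combinations in formatted form, and the segment labels in `y` are placeholders invented for illustration; they do not reproduce Table 1 or Table 2, and `random_state=0` is an added assumption to make the run repeatable.

```python
from sklearn.tree import DecisionTreeClassifier

# Features: [rs1229984 code, rs671 code], where 0 = wild type,
# 1 = heterozygous mutant, 2 = homozygous mutant.
X = [[a, b] for a in (0, 1, 2) for b in (0, 1, 2)]

# Placeholder drinking segments, one per row of X (illustrative only).
y = [8, 4, 0, 7, 2, 0, 7, 1, 0]

model = DecisionTreeClassifier(
    criterion="gini",      # Gini coefficient measures split quality
    splitter="best",       # search all features for the best split point
    max_depth=None,        # grow until every leaf holds a single class
    min_samples_split=2,   # a node needs at least 2 samples to be split
    min_samples_leaf=1,    # a leaf needs at least 1 sample
    random_state=0,        # assumption: fixed seed for repeatability
)
model.fit(X, y)

# Predict the segment for a user with rs1229984 = CC (2) and rs671 = AG (1).
predicted_segment = int(model.predict([[2, 1]])[0])
```

Because the depth is unrestricted and every row of this toy data is distinct, the fitted tree reproduces the placeholder labels exactly; with real survey data, samples sharing a genotype can disagree and the leaves would then report the majority segment.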
According to some embodiments of the invention, predicting the drinking capacity of the user from the user's gene data using the drinking capacity prediction model comprises:
S41, acquiring the user's basic information and the user's saliva;
S42, extracting DNA from the saliva of the user and performing gene sequencing on the extracted DNA;
S43, processing the sequenced gene data to obtain the genotypes at the designated loci, and formatting the genotype data of the designated loci rs1229984 and rs671;
S44, inputting the formatted genotype data into the drinking capacity prediction model;
and S45, sending the first prediction result output by the drinking capacity prediction model to the user terminal according to the user's basic information.
The working principle and the beneficial effects of the technical scheme are as follows: DNA is extracted from the user's saliva and sequenced on a chip, the user's rs1229984 and rs671 gene loci are identified, and it is determined whether the genes related to the user's drinking capacity are mutated and in what state. The user's gene data are input into the drinking capacity prediction model, which outputs the corresponding prediction result. For example, if the user's rs1229984 gene locus is CC and rs671 gene locus is AG, then, as shown in Table 2, the predicted drinking capacity is segment 1: the user can drink about 1 liang (两) of liquor (taking 50% vol white spirit as an example), and only a small amount of drinking is recommended. The prediction result is sent to the user terminal according to the basic information provided by the user, which makes the result convenient to view and improves the user experience. The user's basic information includes gender, age, name and contact information.
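A minimal sketch of steps S43 to S45, using only the genotype-to-segment pairs that the prose states explicitly, might look like this. The lookup table stands in for the trained model, and the names `KNOWN_SEGMENTS`, `predict_segment` and `build_result`, as well as the user record, are hypothetical.

```python
# Segments stated in the prose; the full tables are not reproduced in the
# text, so every other genotype combination is left out on purpose.
KNOWN_SEGMENTS = {
    ("CC", "AG"): 1,
    ("CC", "GG"): 7,
    ("CC", "AA"): 0,
    ("TT", "AA"): 0,
    ("CT", "AA"): 0,
}


def normalize(genotype):
    """Upper-case and sort the two alleles so 'GA' and 'AG' compare equal."""
    return "".join(sorted(genotype.upper()))


def predict_segment(rs1229984, rs671):
    """Return the drinking segment, or None when the text gives no value."""
    return KNOWN_SEGMENTS.get((normalize(rs1229984), normalize(rs671)))


def build_result(user, segment):
    """Assemble the message to send to the user's terminal."""
    if segment is None:
        text = "no prediction available"
    else:
        text = f"predicted drinking segment: {segment}"
    return {"to": user["contact"], "name": user["name"], "text": text}


# Hypothetical user record with the basic information named in the text.
user = {"name": "Example User", "gender": "F", "age": 30, "contact": "user@example.com"}
result = build_result(user, predict_segment("CC", "GA"))
```

Returning `None` for combinations the text does not list avoids guessing at table entries that are not available here.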
According to some embodiments of the invention, the method further comprises:
S71, acquiring second information that influences the drinking capacity of the user, the second information comprising: disease history, type of drinking, alcohol content of the drink, and drinking frequency;
S72, acquiring the first prediction result of drinking capacity output by the drinking capacity prediction model from the user's gene data;
S73, calculating a second prediction result of drinking capacity from the first prediction result and the second information according to a preset algorithm;
and S74, sending the second prediction result to the user terminal according to the user's basic information.
The working principle and the beneficial effects of the technical scheme are as follows: the drinking capacity prediction is corrected by combining the first prediction result, output by the drinking capacity prediction model from the user's gene data, with second information about actual circumstances that influence the user's drinking capacity. The second information includes disease history, type of drinking, alcohol content of the drink, and drinking frequency. For example, if the user's gene data are rs1229984 CC and rs671 GG, then, as shown in Table 2, the predicted drinking capacity is segment 7, i.e. the user can drink more than 7 liang (两) of liquor (taking 50% vol white spirit as an example); but if the user has recently had a stomach illness, the user should not drink, because drinking can easily cause gastric perforation and seriously damage health. Similarly, the type of drinking, the alcohol content and the drinking frequency may influence the prediction of the user's drinking capacity. A second prediction result is calculated according to a preset algorithm and sent to the user terminal according to the user's basic information, so that the prediction reflects the user's actual condition and is more accurate.
According to some embodiments of the invention, the preset algorithm comprises:
calculating the amount of ethanol in the first prediction of alcohol consumption:
V1=A×c
wherein A is the drinking capacity (ml) output by the drinking capacity prediction model based on the gene data of the user; c is the preset alcohol concentration (% vol) in the drinking capacity prediction model;
calculating the ethanol amount in the second prediction of alcohol consumption:
V2=V1×d×t×f
wherein d is a correlation coefficient of the disease history and the drinking capacity of the user; t is a correlation coefficient between the type of drinking and the drinking amount; f is a correlation coefficient of the drinking frequency and the drinking amount;
alcohol consumption of the second prediction result:
A2=V2/cu
wherein A2 is the drinking capacity (ml) of the second prediction result; cu is the alcohol degree (% vol) input by the user.
The working principle and the beneficial effects of the technical scheme are as follows: when the user is a patient with stomach illness, liver disease, or cardiovascular or cerebrovascular disease, is pregnant, or is taking cold medicine, hypnotics, or tranquilizers, the correlation coefficient d between the disease history and the drinking capacity of the user is 0, that is, the user must not drink; for other users, the value of d is between 0 and 1. The correlation coefficient t between the type of drinking and the drinking amount is shown in Table III, and the correlation coefficient f between the drinking frequency and the drinking amount is shown in Table IV. The preset algorithm corrects the first prediction result of the drinking capacity to obtain the second prediction result, so that the prediction reflects the actual situation of the user and is more accurate, a more suitable drinking suggestion is given to the user, and the user experience is improved.
Table III
Type of drinking | Correlation coefficient t
White spirit | 1
Beer | 1.5
Wine | 1.8
Table IV
Frequency of drinking | Correlation coefficient f
Daily | 0.3
Once every 3 days | 0.6
Once every 7 days | 0.8
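The preset algorithm and the two coefficient tables can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the scheme: the function name, the dictionary keys, and the choice to pass the alcohol degrees c and cu as fractions (0.50 for 50% vol) are hypothetical, and the final step, dividing the corrected ethanol amount V2 by the user's alcohol degree cu, is inferred from the definitions of V2 and cu because the formula image is not reproduced in the text.

```python
# Illustrative sketch only; the names below are not part of the original scheme.

# Table III: correlation coefficient t by type of drinking (keys are labels chosen here).
TYPE_COEFFICIENT_T = {"white spirit": 1.0, "beer": 1.5, "wine": 1.8}

# Table IV: correlation coefficient f by drinking frequency (keys are labels chosen here).
FREQUENCY_COEFFICIENT_F = {"daily": 0.3, "every 3 days": 0.6, "every 7 days": 0.8}


def second_prediction_ml(a_ml, c, d, drink_type, frequency, c_u):
    """Return the corrected drinking capacity in ml.

    a_ml       -- A, drinking capacity (ml) from the gene-based model
    c          -- preset alcohol concentration of the model, as a fraction
    d          -- disease-history coefficient in [0, 1]; 0 means no drinking
    drink_type -- key into Table III
    frequency  -- key into Table IV
    c_u        -- alcohol degree entered by the user, as a fraction
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie between 0 and 1")
    if not 0.0 < c <= 1.0 or not 0.0 < c_u <= 1.0:
        raise ValueError("alcohol degrees must be fractions in (0, 1]")
    t = TYPE_COEFFICIENT_T[drink_type]
    f = FREQUENCY_COEFFICIENT_F[frequency]
    v1 = a_ml * c        # V1 = A x c, ethanol in the first prediction
    v2 = v1 * d * t * f  # V2 = V1 x d x t x f, corrected ethanol amount
    return v2 / c_u      # assumed final step: convert ethanol back to drink volume


if __name__ == "__main__":
    # Hypothetical inputs: A = 350 ml at 50% vol, no disease history.
    print(second_prediction_ml(350.0, 0.50, 1.0, "white spirit", "every 7 days", 0.50))
    # A contraindication (d = 0) forces the corrected capacity to zero.
    print(second_prediction_ml(350.0, 0.50, 0.0, "beer", "daily", 0.05))
```

With the hypothetical inputs of the first call, the ethanol amount is 350 × 0.50 = 175 ml, the frequency coefficient 0.8 reduces it to 140 ml, and dividing by the 50% vol degree entered by the user gives about 280 ml of white spirit. Because d multiplies the whole product, any condition for which d is 0 yields a corrected capacity of 0 regardless of the other coefficients.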
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A method for predicting alcohol consumption based on gene screening, comprising:
S1, acquiring the relation between the drinking capacity and the drinking capacity of the sample, dividing the sample into a first preset number of drinking section positions according to the drinking capacity, and establishing a first database according to the relation between the drinking capacity and the drinking capacity of the sample and the drinking section positions;
S2, acquiring gene data of the sample and formatting the gene data;
S3, constructing a drinking capacity prediction model according to the gene data of the formatted sample and the first database;
and S4, predicting the drinking capacity of the user based on the drinking capacity prediction model according to the gene data of the user.
2. The method of predicting alcohol consumption based on genetic screening as set forth in claim 1, wherein the obtaining and formatting genetic data of the sample comprises:
S21, collecting saliva of the sample;
S22, extracting DNA from the saliva of the sample, and performing gene sequencing on the extracted DNA;
S23, processing the gene data after gene sequencing to obtain the genotype of the gene locus related to the drinking capacity of each sample;
and S24, formatting the gene locus into numbers according to the genotype.
3. The method for predicting alcohol consumption based on genetic screening according to claim 2, further comprising performing gene locus screening on the gene data of the formatted sample, wherein the gene locus screening comprises:
S241, respectively calculating the purity improvement value or uncertainty reduction value of each data subset obtained after the first database is divided, relative to the data set before division;
S242, selecting a gene locus N with a maximum purity improvement value or a maximum uncertainty reduction value and a characteristic value N of the gene locus N, wherein the gene locus N is used as a node, and the first database is divided into two sub data sets according to the grouping of the characteristic value N of the gene locus N;
S243, sequentially calculating the purity improvement value or uncertainty reduction value of the characteristic value of each gene locus in the two sub data sets; selecting a gene locus M with a maximum purity improvement value or a maximum uncertainty reduction value and a characteristic value M of the gene locus M, wherein the gene locus M is used as a child node, and the sub data set is split again according to the grouping of the characteristic value M of the gene locus M;
and S244, stopping splitting when the purity of the divided sub data set is determined to be greater than a preset purity threshold or the uncertainty value is determined to be smaller than a preset uncertainty threshold, and finally obtaining the gene locus related to the drinking capacity and the relationship between the gene locus and the drinking section position.
4. The method of claim 1, wherein a machine learning model is used to construct the model for predicting the amount of alcohol consumed.
5. The method for predicting alcohol consumption based on genetic screening according to claim 1, wherein the predicting alcohol consumption of the user based on the alcohol consumption prediction model according to the genetic data of the user comprises:
S41, acquiring basic information of the user and saliva of the user;
S42, extracting DNA from the saliva of the user, and performing gene sequencing on the extracted DNA;
S43, processing the gene data after gene sequencing to obtain the genotype of the designated sites, and formatting the genotype data of the designated sites rs1229984 and rs671;
S44, inputting the formatted genotype data into the drinking capacity prediction model;
and S45, issuing the first prediction result output by the alcohol consumption prediction model to the user terminal according to the user basic information.
6. The method for predicting alcohol consumption based on genetic screening according to claim 1, wherein acquiring the gene data of the sample comprises collecting blood and extracting DNA from the collected blood.
7. The method for predicting alcohol consumption based on genetic screening according to claim 5, further comprising:
S71, acquiring second information influencing the drinking capacity of the user, wherein the second information comprises: disease history, type of drinking, alcohol degree, and drinking frequency;
S72, acquiring a first prediction result of the drinking capacity output by the drinking capacity prediction model based on the gene data of the user;
S73, calculating a second prediction result of the drinking capacity from the first prediction result and the second information according to a preset algorithm;
and S74, issuing the second prediction result to the user terminal according to the user basic information.
8. The method for predicting alcohol consumption based on genetic screening according to claim 7, wherein the preset algorithm comprises:
calculating the amount of ethanol in the first prediction of alcohol consumption:
V1=A×c
wherein A is the drinking capacity (ml) output by the drinking capacity prediction model based on the gene data of the user; c is the preset alcohol concentration (% vol) in the drinking capacity prediction model;
calculating the ethanol amount in the second prediction of alcohol consumption:
V2=V1×d×t×f
wherein d is a correlation coefficient of the disease history and the drinking capacity of the user; t is a correlation coefficient between the type of drinking and the drinking amount; f is a correlation coefficient of the drinking frequency and the drinking amount;
alcohol consumption of the second prediction result:
A2=V2/cu
wherein A2 is the drinking capacity (ml) of the second prediction result; cu is the alcohol degree (% vol) input by the user.
CN202010747988.3A 2020-07-30 2020-07-30 Drinking volume prediction method based on gene screening Active CN112037855B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010747988.3A CN112037855B (en) 2020-07-30 2020-07-30 Drinking volume prediction method based on gene screening
PCT/CN2021/109453 WO2022022665A1 (en) 2020-07-30 2021-07-30 Method for predicting alcohol consumption amount on the basis of gene screening

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010747988.3A CN112037855B (en) 2020-07-30 2020-07-30 Drinking volume prediction method based on gene screening

Publications (2)

Publication Number Publication Date
CN112037855A true CN112037855A (en) 2020-12-04
CN112037855B CN112037855B (en) 2023-09-12

Family

ID=73583544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010747988.3A Active CN112037855B (en) 2020-07-30 2020-07-30 Drinking volume prediction method based on gene screening

Country Status (2)

Country Link
CN (1) CN112037855B (en)
WO (1) WO2022022665A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022022665A1 (en) * 2020-07-30 2022-02-03 苏州因顿医学检验实验室有限公司 Method for predicting alcohol consumption amount on the basis of gene screening
WO2022022667A1 (en) * 2020-07-30 2022-02-03 苏州因顿医学检验实验室有限公司 Gene screening-based alcohol tolerance prediction system
CN114908146A (en) * 2022-05-31 2022-08-16 因顿健康科技(苏州)有限公司 Method for rapidly detecting and judging alcohol content by gene

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060062859A1 (en) * 2004-08-05 2006-03-23 Kenneth Blum Composition and method to optimize and customize nutritional supplement formulations by measuring genetic and metabolomic contributing factors to disease diagnosis, stratification, prognosis, metabolism, and therapeutic outcomes
CN107058594A (en) * 2017-06-19 2017-08-18 中山大学附属第三医院 A kind of method, primer and kit for detecting gout tumor susceptibility gene SNP genotype
CN110004221A (en) * 2019-04-16 2019-07-12 北京和合医学诊断技术股份有限公司 The multi-PCR detection method of detection 3 alcohol metabolism genes, 4 SNP sites can be synchronized
CN111312396A (en) * 2020-02-21 2020-06-19 光瀚健康咨询管理(上海)有限公司 Individual wine capacity evaluation method and system based on wine capacity related factors

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE9401975D0 (en) * 1994-06-07 1994-06-07 Pharmacia Ab The use of sialic acid determinations
CN102876782A (en) * 2012-09-13 2013-01-16 周宏灏 Kit for detecting acetaldehyde dehydrogenase 2 (ALDH2) gene polymorphism by pyro-sequencing method and method
CN107644087A (en) * 2017-09-25 2018-01-30 云健康基因科技(上海)有限公司 Coffee beverage recommends method, commending system and computer-readable recording medium
CN112037855B (en) * 2020-07-30 2023-09-12 苏州因顿医学检验实验室有限公司 Drinking volume prediction method based on gene screening

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
宋晓超 等: "SIRT1和APOC3基因单核苷酸多态性与非酒精性脂肪性肝病的相关性", 卫生研究, pages 114 - 19 *

Also Published As

Publication number Publication date
CN112037855B (en) 2023-09-12
WO2022022665A1 (en) 2022-02-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant