CN111863266B - Dangerous factor screening method for sporadic colorectal adenoma based on directional weighted association rule model - Google Patents

Info

Publication number
CN111863266B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010057865.7A
Other languages
Chinese (zh)
Other versions
CN111863266A (en)
Inventor
余盖青
高俊波
程陈
费若岚
王长静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a risk factor screening method for sporadic colorectal adenoma based on a directional weighted association rule model, and belongs to the field of data mining. The data are first preprocessed. Feature extraction is then carried out with the random forest mean-decrease-in-impurity feature selection method, using information gain to determine the optimal split nodes, to obtain a preferred feature set. Next, the preferred feature set is input into the directional weighted association rule model to generate strong association rules. Finally, the risk factors contained in the strong association rules are incorporated into the risk factor set and discussed with experts. Compared with the prior art, the method proposes a directional weighted association rule model for screening the risk factors of colorectal adenoma, confirms the significance of lifestyle and dietary habit factors in the etiology of colorectal adenoma, discovers high risk factors not identified in previous research, and provides a method worth referencing for identifying the risk factors of colorectal adenoma.

Description

Dangerous factor screening method for sporadic colorectal adenoma based on directional weighted association rule model
Technical Field
The invention relates to medical data analysis, in particular to a risk factor screening method for sporadic colorectal adenoma based on a directional weighted association rule model.
Background
Sporadic colorectal adenomas (CRAs) are benign glandular tumors of the colon and rectum and a precancerous lesion of colorectal cancer. Early detection and timely treatment can effectively reduce the probability of canceration and are important for prolonging patient survival. Research shows that CRA is closely related to lifestyle and dietary habits, and that 66% to 78% of colorectal adenomas can be avoided through healthy living habits. However, some important risk factors are ignored or have not yet been found, so patients cannot be effectively guided toward a healthy life, and this situation needs to be improved.
In recent years, more and more researchers have come to appreciate the importance of dietary habits in the etiology of colorectal adenoma and have devoted themselves to studying its risk factors. However, the methods used to analyze risk factors are too uniform. Traditional methods are reasonably effective for single-factor analysis but are not perfect, and they easily miss risk factors that occur with low probability but are nevertheless important. To overcome this problem, we propose a directional weighted association rule model, an efficient association rule mining model built by combining a probability-based calculation of weighted support with a fixed consequent. The risk factors for colorectal adenoma are analyzed by generating rule patterns of colorectal adenoma onset.
Disclosure of Invention
The invention aims to solve the technical problems described in the background by providing a risk factor screening method for sporadic colorectal adenoma based on a directional weighted association rule model. The technical scheme adopted by the invention is as follows:
A risk factor screening method for sporadic colorectal adenoma based on a directional weighted association rule model comprises the following specific steps:
S1, preprocessing the data;
S2, selecting features by the random forest mean-decrease-in-impurity method;
S3, analyzing with the directional weighted association rule model;
S4, incorporating the risk factors contained in the strong association rules generated in step S3 into a risk factor set and discussing them with experts.
As a still further technical scheme of the invention, S1, data preprocessing, comprises the following steps:
S101, deleting irrelevant data;
S102, deleting redundant information, deleting feature columns in which more than 50% of the values are missing, and deleting dirty data with obvious anomalies;
S103, data conversion.
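Steps S102 and S103 can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layout, the helper names `drop_sparse_columns` and `to_items`, and the `column_value` item encoding are assumptions chosen for illustration (the encoding mirrors the item name 'bq_1' used later in the embodiment). Removal of anomalous records is omitted because the text does not define the anomaly criteria.

```python
from typing import Dict, List, Optional, Set

Row = Dict[str, Optional[str]]


def drop_sparse_columns(rows: List[Row], max_missing: float = 0.5) -> List[Row]:
    """S102 (partial): drop every column whose share of missing values exceeds max_missing."""
    if not rows:
        return []
    keep = []
    for col in rows[0].keys():
        missing = sum(1 for r in rows if r.get(col) is None)
        if missing / len(rows) <= max_missing:
            keep.append(col)
    return [{c: r.get(c) for c in keep} for r in rows]


def to_items(row: Row) -> Set[str]:
    """S103 (assumed form): encode each non-missing cell as a 'column_value' item."""
    return {f"{col}_{val}" for col, val in row.items() if val is not None}


# Hypothetical toy records; 'bq' stands for the pathology column.
records = [
    {"smoking": "1", "rare_field": None, "bq": "1"},
    {"smoking": "0", "rare_field": None, "bq": "0"},
    {"smoking": "1", "rare_field": "x", "bq": "0"},
]
cleaned = drop_sparse_columns(records)      # 'rare_field' is missing in 2 of 3 rows
transactions = [to_items(r) for r in cleaned]
```

Each cleaned record becomes a set of items, which is the transaction form that an association rule miner consumes.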
As a still further technical scheme of the invention, S2, feature selection by the random forest mean-decrease-in-impurity method, comprises the following steps:
S201, calculating the information entropy H1 of the original data;
S202, selecting a feature, partitioning the data by the values of that feature, calculating the information entropy of each partition, and summing these entropies in proportion to partition size to obtain the information entropy H2 of this partitioning;
S203, calculating the information gain: info_gain = H1 - H2;
S204, calculating the information gain of every feature according to S202 and S203, and retaining the feature attributes with larger gains;
S205, according to the feature index corresponding to the maximum information gain, putting the features ranked up to that index into a set as the preferred feature set.
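Steps S201 to S204 amount to ranking features by information gain. The sketch below implements only that calculation on a toy table; it is not a random forest, and the function names and data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (base 2) of a label sequence (S201)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """info_gain = H1 - H2, with H2 the size-weighted entropy of the partitions (S202, S203)."""
    h1 = entropy(labels)
    groups: Dict[Hashable, list] = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    h2 = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return h1 - h2


def rank_features(table: Dict[str, Sequence[Hashable]],
                  labels: Sequence[Hashable]) -> List[str]:
    """Feature names sorted by decreasing information gain (S204)."""
    return sorted(table, key=lambda name: info_gain(table[name], labels), reverse=True)


# Hypothetical toy data: one feature separates the classes, the other does not.
labels = [1, 1, 0, 0]
table = {"f_perfect": ["a", "a", "b", "b"], "f_useless": ["a", "b", "a", "b"]}
ranking = rank_features(table, labels)
```

Keeping the leading entries of `ranking` corresponds to retaining the feature attributes with larger gains.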
As a still further technical scheme of the invention, S3, analyzing with the directional weighted association rule model, comprises the following definitions and steps:
Definition: Let $I = \{i_1, i_2, \ldots, i_m\}$ be the set of item attributes. Let D denote a set of transactions T, where each T is a set of item attributes with $T \subseteq I$, and each transaction T has a unique identifier, denoted TID. Let X be a set of items in I; if $X \subseteq T$, transaction T is said to contain X.
Definition: The weight of item attribute $i_j$ is a value related to that attribute, denoted $w(i_j)$. Let $P(i_j)$ be the probability that $i_j$ occurs in transaction set D; $w(i_j)$ is the reciprocal of $P(i_j)$, that is, $w(i_j) = 1 / P(i_j)$. The weight of a patient transaction is the weight of one record in the patient data set, denoted $w(T_k)$, and is the average of the weights of all item attributes in $T_k$, where $T_k$ is the k-th record in transaction set D.
Formula (1): $w(T_k) = \frac{1}{|T_k|} \sum_{i_j \in T_k} w(i_j)$
Formula (2): the weighted support of association rule A -> B is denoted wsp(A, B).
Formula (3): the confidence of association rule A -> B is denoted conf(A, B).
Formula (4): the lift of association rule A -> B is denoted lift(A, B); lift(A, B) > 1 indicates that A and B are positively correlated, lift(A, B) < 1 indicates that A and B are negatively correlated, and lift(A, B) = 1 indicates that A and B are uncorrelated.
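Under the definitions above, formulas (2) to (4) can be written in the conventional weighted form sketched below. These exact expressions are an assumption, not text taken from the patent: in particular, normalizing by the total transaction weight is only a common convention, and wsp(A) and wsp(B) here denote the weighted support of a single item set.

```latex
% Assumed forms of formulas (2)-(4); not confirmed by the source text.
\begin{aligned}
\mathrm{wsp}(A, B)  &= \frac{\sum_{T_k \in D,\; A \cup B \subseteq T_k} w(T_k)}{\sum_{T_k \in D} w(T_k)} \\
\mathrm{conf}(A, B) &= \frac{\mathrm{wsp}(A, B)}{\mathrm{wsp}(A)} \\
\mathrm{lift}(A, B) &= \frac{\mathrm{conf}(A, B)}{\mathrm{wsp}(B)}
\end{aligned}
```

With these forms, a rule whose confidence equals the weighted support of its consequent has a lift of 1, which matches the interpretation of lift given in the text.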
S301, scanning database D to obtain the probability of each item attribute $i_j$, and calculating the transaction weight $w(T_k)$ (see formula (1));
S302, scanning database D, setting the pathology result as the consequent after_item of the association rules, and putting all other features into a set Q; setting a minimum support threshold min_sup, a minimum confidence threshold min_conf, and a maximum number of iterations max_rule_length;
S303, initializing the frequent 1-item sets: every item in Q is joined with the consequent after_item, and the items whose weighted support is greater than min_sup are put into L0 (see formula (2) for the weighted support calculation);
S304, generating frequent (k+1)-item sets from the frequent k-item sets. The core method is recursion based on frequent-set theory: the frequent 1-item sets L1 are generated first, then the frequent 2-item sets L2, and so on, and the algorithm stops once Lr has been generated, where r is the maximum rule length. In the k-th iteration, the process first generates the set Ck of candidate k-item sets, each of which is produced by a self-join of Lk-1. The item sets in Ck are candidates for the frequent item sets, and the final frequent item sets Lk must be a subset of Ck. Lk is generated from Ck as follows: calculate the weighted support sup1 of each item set in Ck and the weighted support sup2 of that item set with after_item removed, and put the item sets whose weighted support sup1 is greater than min_sup into L(k+1);
S305, let L = [L2, …, Lr]. For each frequent item set in L, with (L - after_item) as antecedent and after_item as consequent, calculate the weighted-support ratio conf (see formula (3)) and the lift (see formula (4)); if conf is greater than min_conf, output the corresponding strong association rule.
Compared with the prior art, the invention has the following advantages and positive effects:
1. The invention provides a risk factor screening method for sporadic colorectal adenoma based on a directional weighted association rule model. It improves how support is calculated and how the consequent is generated, which helps to reduce invalid computation, increase the number of effective rules generated, and improve the mining result.
2. The invention constructs a preferred feature set, which helps to improve the accuracy of the analysis results and shorten the computation.
3. For lifestyle and dietary habit data, the invention analyzes the high risk factors of colorectal adenoma by mining the associations between these data and the occurrence of colorectal adenoma, and provides a method worth referencing for screening the risk factors of colorectal adenoma.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a feature selection flow chart of the present invention;
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Embodiment one. Referring to FIG. 1, a risk factor screening method for sporadic colorectal adenoma based on a directional weighted association rule model comprises the following specific steps:
S1, preprocessing the colorectal adenoma data: deleting irrelevant data, deleting redundant information, deleting feature columns in which more than 50% of the values are missing, and deleting dirty data with obvious anomalies. A total of 234 records were included in the standard dataset, 62 of which were diagnosed with colorectal adenoma.
S2, referring to FIG. 2, feature selection is performed with the random forest mean-decrease-in-impurity method.
(1) Calculating the information entropy of the original data gives the initial information entropy H1 = 0.8341351937.
(2) The information entropy H2 is calculated, taking partitioning by features 7 and 24 as examples:
H2 (partitioned by feature 7) = 0.8283984298779227;
H2 (partitioned by feature 24) = 0.7903757392936914.
(3) The information gain info_gain is calculated, again taking features 7 and 24 as examples:
info_gain (partitioned by feature 7) = H1 - H2 (partitioned by feature 7) = 0.8341351937 - 0.8283984298 = 0.0057367638;
info_gain (partitioned by feature 24) = H1 - H2 (partitioned by feature 24) = 0.8341351937 - 0.7903757392 = 0.0437594544.
(4) The feature attributes with larger gains are retained, and the feature index corresponding to the optimal information gain is found to be 24. The preferred feature set is the top 24 features in the feature importance ranking.
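The figures in this example can be checked with a few lines of Python. Assuming H1 is the base-2 Shannon entropy of the 62 positive records among 234, the computation reproduces the stated initial entropy and both gains; the H2 values are copied from the text, and the variable names are illustrative.

```python
import math


def binary_entropy(positives: int, total: int) -> float:
    """Base-2 Shannon entropy of a two-class split."""
    p = positives / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# 62 of the 234 records were diagnosed with colorectal adenoma.
h1 = binary_entropy(62, 234)

# H2 values as stated in the embodiment, keyed by feature index.
h2 = {7: 0.8283984298779227, 24: 0.7903757392936914}

gains = {feature: h1 - value for feature, value in h2.items()}
best = max(gains, key=gains.get)  # feature index with the largest information gain
```

Of the two features shown, feature 24 has the larger gain, in agreement with the index selected in the text.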
S3, analyzing with the directional weighted association rule model.
The experimental parameters were selected through repeated experiments: the maximum number of mined items is 5, the consequent is 'bq_1' (pathology is 1, that is, the subject has colorectal adenoma), the minimum weighted support is 0.3, and the minimum confidence is 0.5. The preferred feature set is input into the directional weighted association rule model, which generates 44 rule patterns of colorectal adenoma onset.
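The mining procedure with a fixed consequent can be sketched as follows. This is an illustrative reading, not the patented implementation: the weighted support is assumed to be the weight of the transactions containing an item set divided by the total transaction weight, confidence and lift are assumed to be the usual ratios of weighted supports, `max_rule_length` is assumed to count the consequent, and all names and toy data are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def item_weights(transactions: List[Set[str]]) -> Dict[str, float]:
    """w(i) = 1 / P(i), with P(i) the share of transactions that contain item i."""
    counts: Dict[str, int] = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    return {item: len(transactions) / c for item, c in counts.items()}


def transaction_weights(transactions: List[Set[str]], w: Dict[str, float]) -> List[float]:
    """Formula (1): mean item weight per transaction (transactions must be non-empty)."""
    return [sum(w[i] for i in t) / len(t) for t in transactions]


def wsp(itemset: FrozenSet[str], transactions: List[Set[str]], tw: List[float]) -> float:
    """Assumed weighted support: weight of covering transactions over total weight."""
    return sum(x for t, x in zip(transactions, tw) if itemset <= t) / sum(tw)


def mine_rules(transactions: List[Set[str]], after_item: str, min_sup: float,
               min_conf: float, max_rule_length: int
               ) -> List[Tuple[FrozenSet[str], float, float]]:
    """Return (antecedent, conf, lift) for every rule antecedent -> after_item."""
    tw = transaction_weights(transactions, item_weights(transactions))
    consequent = frozenset([after_item])
    q = sorted({i for t in transactions for i in t} - consequent)
    # Join every item in Q with the fixed consequent and keep the frequent ones.
    level = [frozenset([i]) for i in q
             if wsp(frozenset([i]) | consequent, transactions, tw) > min_sup]
    rules, size = [], 1
    while level and size + 1 <= max_rule_length:
        for antecedent in level:
            conf = (wsp(antecedent | consequent, transactions, tw)
                    / wsp(antecedent, transactions, tw))   # assumed formula (3)
            lift = conf / wsp(consequent, transactions, tw)  # assumed formula (4)
            if conf > min_conf:
                rules.append((antecedent, conf, lift))
        # Self-join the current level to build candidates one item longer.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == size + 1}
        level = [c for c in candidates if wsp(c | consequent, transactions, tw) > min_sup]
        size += 1
    return rules


# Hypothetical toy transactions; 'bq_1' marks a record with colorectal adenoma.
data = [{"smoke", "alcohol", "bq_1"}, {"smoke", "bq_1"}, {"smoke", "tea"}, {"tea"}]
rules = mine_rules(data, "bq_1", min_sup=0.3, min_conf=0.5, max_rule_length=3)
```

Because the consequent is fixed, candidates are only ever extended on the antecedent side, so no rules with other consequents are generated or evaluated.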
S4, incorporating the risk factors contained in the strong association rules generated in step S3 into a risk factor set and discussing them with experts. The 44 rules contain 7 important features, including some traditional risk factors and some non-traditional ones, which demonstrates the effectiveness and correctness of the method.
The above embodiment is only an example using part of the data of the present invention and is not intended to limit the present invention; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of the present invention.

Claims (1)

1. A risk factor screening method for sporadic colorectal adenoma based on a directional weighted association rule model, comprising the following specific steps:
S1, preprocessing the data;
S2, selecting features by the random forest mean-decrease-in-impurity method to obtain a preferred feature set;
S3, analyzing with the directional weighted association rule model;
S4, incorporating the risk factors contained in the strong association rules generated in S3 into a risk factor set, and discussing them with experts;
the data in step S1 include the following data field items:
the data column fields comprise 79 risk factor features related to lifestyle and dietary habits, preliminarily screened by experts, in five aspects: basic information, disease states, lifestyle habits, eating habits, and colonoscopy results;
wherein, the basic information includes: name, gender, age, race, telephone number, education level, local residents, height, weight, occupation, marital status, household income;
(1) The disease states include: (a) past medical history: history of diabetes, hypertension, coronary heart disease, chronic liver disease, chronic kidney disease, chronic bronchitis, cerebrovascular disease, hyperlipidemia, fatty liver, cholecystectomy, intestinal surgery, gastric surgery, esophageal surgery, and other diseases or surgery; (b) current medical history: abdominal pain, abdominal distension, diarrhea, constipation, bloody stool, mucous stool, and other symptoms; (c) antibiotic use;
(2) Lifestyle habits include: smoking, staying up late, exercising, and going out;
(3) Eating habits include: (a) frequency and cooking method of seafood: fresh sea fish, fresh-frozen fish fillets, salted sea fish and dried fish, spicy sea fish and dried fish, fresh sea shrimp/crab/shellfish/snails, fresh-frozen shrimp/crab/shellfish/snails, salted sea shrimp/crab/shellfish/snails, drunk sea shrimp/crab/shellfish/snails, sea plants, salted sea plants; (b) frequency and cooking method of livestock meat: freshly slaughtered pork/beef/mutton/chicken/duck, freshly slaughtered animal viscera, cured processed meat products, spicy processed meat products; (c) frequency and cooking method of freshwater products: fresh freshwater fish, salted freshwater fish, spicy freshwater fish, fresh freshwater shrimp/crab/shellfish/snails, salted freshwater shrimp/crab/shellfish/snails, drunk freshwater shrimp/crab/shellfish/snails; (d) eggs, milk, and dairy products: ordinary milk, low-fat/skimmed milk, yogurt, milk powder, hen eggs/duck eggs/quail eggs, and cured processed eggs; (e) snacks: processed carbohydrates, processed meat, processed preserved fruit; (f) vegetables, melons, and fruits and their cooking methods: fresh vegetables, pickled vegetables, mushrooms, melons, and fresh fruits; (g) drinking water and beverages: tap water, mineral water, purified water, carbonated beverages, and fruit juice beverages; (h) alcoholic beverages: low-alcohol Chinese liquor, high-alcohol Chinese liquor, red wine, yellow wine, beer, fruit wine, alcoholic beverages, and mixed drinking of various wines;
(4) The colonoscopy results include: examination results, examination sites, and pathology results; the pathology result is used to determine whether the subject is a colorectal adenoma patient;
the data preprocessing in step S1 comprises the following steps:
S101, deleting irrelevant data;
S102, deleting redundant information, deleting feature columns in which more than 50% of the values are missing, and deleting dirty data with obvious anomalies;
S103, data conversion;
the feature selection in step S2 by the random forest mean-decrease-in-impurity method comprises the following steps:
S201, calculating the information entropy H1 of the original data;
S202, selecting a feature, partitioning the data by the values of that feature, calculating the information entropy of each partition, and summing these entropies in proportion to partition size to obtain the information entropy H2 of this partitioning;
S203, calculating the information gain: info_gain = H1 - H2;
S204, calculating the information gain of every feature according to S202 and S203, and retaining the feature attributes with larger gains;
S205, according to the feature index corresponding to the maximum information gain, putting the features ranked up to that index into a set as the preferred feature set;
step S3 comprises the following definitions and steps:
definition: let $I = \{i_1, i_2, \ldots, i_m\}$ be the set of item attributes; let D denote a set of transactions T, where each T is a set of item attributes with $T \subseteq I$, and each transaction T has a unique identifier, denoted TID; let X be a set of items in I; if $X \subseteq T$, transaction T is said to contain X;
the weight of item attribute $i_j$ is a value related to that attribute, denoted $w(i_j)$; with $P(i_j)$ the probability that $i_j$ occurs in transaction set D, $w(i_j)$ is the reciprocal of $P(i_j)$; the weight of a patient transaction is the weight of one record in the patient data set, denoted $w(T_k)$, and is the average of the weights of all item attributes in $T_k$, where $T_k$ is the k-th record in transaction set D;
formula (1): $w(T_k) = \frac{1}{|T_k|} \sum_{i_j \in T_k} w(i_j)$;
formula (2): the weighted support of association rule A -> B is denoted wsp(A, B);
formula (3): the confidence of association rule A -> B is denoted conf(A, B);
formula (4): the lift of association rule A -> B is denoted lift(A, B); lift(A, B) > 1 indicates that A and B are positively correlated, lift(A, B) < 1 indicates that A and B are negatively correlated, and lift(A, B) = 1 indicates that A and B are uncorrelated;
S301, scanning data table D to obtain the probability of each item attribute $i_j$, and calculating the weight $w(T_k)$ by formula (1);
S302, scanning data table D, setting the pathology result as the consequent after_item of the association rules, and putting all other features into a set Q; setting a minimum support threshold min_sup, a minimum confidence threshold min_conf, and a maximum number of iterations max_rule_length;
S303, initializing the frequent 1-item sets: every item in Q is joined with the consequent after_item, and the items whose weighted support is greater than min_sup are put into L0, the weighted support being calculated by formula (2);
S304, generating frequent (k+1)-item sets from the frequent k-item sets: the frequent 1-item sets L1 are generated first, then the frequent 2-item sets L2, and so on, and the algorithm stops once Lr has been generated, where r is the maximum rule length; Lk is generated from Ck as follows: calculate the weighted support sup1 of each item set in Ck and the weighted support sup2 of that item set with after_item removed, and put the item sets whose weighted support sup1 is greater than min_sup into L(k+1);
S305, let L = [L2, …, Lr]; for each frequent item set in L, with (L - after_item) as antecedent and after_item as consequent, calculate the weighted-support ratio conf by formula (3) and the lift by formula (4); if conf is greater than min_conf, output the corresponding strong association rule.
CN202010057865.7A 2020-01-16 2020-01-16 Dangerous factor screening method for sporadic colorectal adenoma based on directional weighted association rule model Active CN111863266B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010057865.7A CN111863266B (en) 2020-01-16 2020-01-16 Dangerous factor screening method for sporadic colorectal adenoma based on directional weighted association rule model

Publications (2)

Publication Number Publication Date
CN111863266A CN111863266A (en) 2020-10-30
CN111863266B true CN111863266B (en) 2023-09-19

Family

ID=72984863

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010057865.7A Active CN111863266B (en) 2020-01-16 2020-01-16 Dangerous factor screening method for sporadic colorectal adenoma based on directional weighted association rule model

Country Status (1)

Country Link
CN (1) CN111863266B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant