WO2022041718A1 - Tea type discrimination method and system - Google Patents

Tea type discrimination method and system

Info

Publication number
WO2022041718A1
Authority
WO
WIPO (PCT)
Prior art keywords
tea
compounds
samples
sample
model
Prior art date
Application number
PCT/CN2021/083770
Other languages
English (en)
French (fr)
Inventor
王一君
宛晓春
阚志鹏
胡丽珍
宁井铭
李大祥
Original Assignee
安徽农业大学
Application filed by 安徽农业大学
Publication of WO2022041718A1
Priority to US18/078,188 (US20230109241A1)

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/86Signal analysis
    • G01N30/8624Detection of slopes or peaks; baseline correction
    • G01N30/8631Peaks
    • G01N30/8634Peak quality criteria
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/86Signal analysis
    • G01N30/8675Evaluation, i.e. decoding of the signal into analytical information
    • G01N30/8682Group type analysis, e.g. of components having structural properties in common
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/04Preparation or injection of sample to be analysed
    • G01N30/06Preparation
    • G01N30/14Preparation by elimination of some components
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/62Detectors specially adapted therefor
    • G01N30/72Mass spectrometers
    • G01N30/7233Mass spectrometers interfaced to liquid or supercritical fluid chromatograph
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/86Signal analysis
    • G01N30/8603Signal analysis with integration or differentiation
    • G01N30/861Differentiation
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/86Signal analysis
    • G01N30/8693Models, e.g. prediction of retention times, method development and validation
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/86Signal analysis
    • G01N30/8696Details of Software
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/88Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • G01N30/88Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86
    • G01N2030/8809Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86 analysis specially adapted for the sample
    • G01N2030/8813Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86 analysis specially adapted for the sample biological materials

Definitions

  • The invention relates to a tea type discrimination method and system, and belongs to the technical field of detection.
  • The tea tree is a perennial evergreen plant grown for its leaves, belonging to the genus Camellia of the family Theaceae. Tea is made from fresh tea leaves by different processing methods and is rich in functional chemical components such as polyphenols, amino acids and alkaloids.
  • Traditionally, tea is divided into six major types (green tea, yellow tea, dark tea, white tea, oolong tea and black tea) according to processing technique, quality characteristics and appearance.
  • At present, tea types are discriminated mainly by sensory evaluation, which combines dry and wet evaluation: the shape, color, tenderness and cleanliness of the dry tea are assessed, as well as the aroma, taste, liquor color and infused leaves after brewing.
  • Traditional sensory evaluation must be carried out by professionals, places high demands on the evaluators' skill and work experience, and is easily affected by the evaluators' physiological condition, the environment and other factors; it is too subjective to classify tea accurately.
  • With the development of science and technology and the emergence of various instrumental analysis and detection methods, the identification, quality control and quality safety of tea have become more reliable.
  • The technical problem actually solved by the present invention is to provide a tea type discrimination method and system that can discriminate multiple samples at the same time with a discrimination accuracy of up to 100%.
  • To solve the above problem, the present invention provides a tea type discrimination method and system that uses the relative abundance of 20 compounds present in tea leaves to discriminate tea types. This overcomes the problems of sensory discrimination, classifies tea more objectively and scientifically, and improves the reliability and accuracy of the results; the feasibility and accuracy of using the 20-compound combination for tea type discrimination were verified with three algorithms.
  • The present invention provides a tea type discrimination method.
  • The method uses the ion intensities of 20 characteristic compounds as evaluation indices to establish a discriminant function for identifying tea types, wherein the mass-to-charge ratios (m/z) of the 20 compounds are: 116.0648~116.0764, 267.1206~267.1474, 268.0906~268.1174, 280.1252~280.1532, 289.0561~289.085, 307.0657~307.0964, 308.0757~308.1065, 309.0814~309.1123, 364.0819~364.1183, 381.0604~381.0985, 425.0658~425.1083, 485.0833~485.1318, 518.2984~518.3503, 537.2765~537.3302, 554.1509~554.2063, 579.1207~579.1786, 607.2611~607.3218, 677.3378~677.4055, 744.2234~744.2978 and 869.1124~869.1993.
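  • In one embodiment, the method specifically includes:
  • (1) pretreating the tea samples, where pretreatment includes grinding into powder, centrifuging and collecting the supernatant, and adding an internal standard;
  • (2) detecting the tea samples to obtain a data matrix containing the peak area, retention time and mass-to-charge ratio of each peak in each sample; dividing the ion response intensity of each compound peak by that of the internal standard compound (internal standard normalization); and screening variables by orthogonal partial least squares discriminant analysis (OPLS-DA) and stepwise discriminant analysis to obtain the 20 characteristic compounds;
  • (3) establishing tea type discrimination models based on three mathematical methods from the internal-standard-normalized ion response intensities of the 20 compounds obtained from the tea samples;
  • (4) substituting the internal-standard-normalized ion response intensities of the 20 compounds in the tea sample to be tested into the established model to obtain the type of that sample.
  • Before the internal-standard-normalized ion response intensities of the tea sample to be judged are substituted into the established discrimination models, the method further includes: collecting tea samples of each type, pretreating and detecting the samples, and processing and analyzing the resulting data to obtain the internal-standard-normalized ion response intensities of the 20 compounds falling within the set mass-to-charge ratio ranges.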
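  • The 20 compounds are obtained as follows. Orthogonal partial least squares discriminant analysis is performed on the data of the different tea types, and variables with a VIP (Variable Importance in Projection) value greater than 1.5 are selected as candidates; to simplify the variable set further, the candidates are screened by stepwise discriminant analysis, finally yielding 16 compounds. In practice, green tea and yellow bud tea were found to be easily confused. To avoid a potential drop in the discrimination rate as the number of samples grows in practical use, additional characteristic variables that help separate green tea from yellow bud tea were introduced: using the fold change (FC) parameter, compound variables with FC > 2 or FC < 0.5 between green tea and yellow bud tea and with high ion response intensity were selected, yielding 4 compounds. The compounds from the two rounds of screening were combined to give the final 20 compounds used to discriminate the tea types.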
  • Establishing the tea type discrimination models further includes: taking the internal-standard-normalized ion response intensities of the 20 compounds within the set mass-to-charge ratio ranges, obtained from the collected tea samples of different types, as the sample data set; and randomly dividing the sample data set into a training set, used to build the discrimination model, and a validation set, used to validate it, wherein for each tea type the ratio of the number of sample (training) tea samples to the number of tea samples to be discriminated is not less than 3:1.
  • According to the established discrimination models, the tea samples to be judged (validation set) are detected to obtain the internal-standard-normalized ion response intensities of the 20 compounds within the set mass-to-charge ratio ranges, and these data are substituted into the constructed model to obtain the classification result of the tea samples to be judged.
  • The tea types include one or more of green tea, yellow tea, dark tea, white tea, black tea and oolong tea.
  • The present invention also provides a tea classification system, the system comprising:
  • a sampling module, used to acquire, by LC-MS, the tea mass spectrometry data corresponding to the tea to be tested;
  • a classification module, used to establish a discriminant function with the ion intensities of the 20 characteristic compounds as evaluation indices and to classify the acquired tea mass spectrometry data, thereby obtaining the classification result of the tea to be tested; the mass-to-charge ratios of the 20 compounds are: 116.0648~116.0764, 267.1206~267.1474, 268.0906~268.1174, 280.1252~280.1532, 289.0561~289.085, 307.0657~307.0964, 308.0757~308.1065, 309.0814~309.1123, 364.0819~364.1183, 381.0604~381.0985, 425.0658~425.1083, 485.0833~485.1318, 518.2984~518.3503, 537.2765~537.3302, 554.1509~554.2063, 579.1207~579.1786, 607.2611~607.3218, 677.3378~677.4055, 744.2234~744.2978 and 869.1124~869.1993.
  • In one embodiment, the system also includes a model building module for establishing the tea classification model, which specifically includes:
  • a modeling data acquisition sub-module, used to acquire the mass spectrometry data of sample tea leaves of different types and to take the resulting data as the sample tea mass spectrometry data set;
  • a modeling processing sub-module, used to randomly divide the sample tea mass spectrometry data into a training set and a validation set and to model the training set with the random forest method, the support vector machine method or the Fisher discriminant method, thereby establishing the tea type discrimination model;
  • a validation sub-module, used to validate the random forest model with the validation set.
  • The present invention also provides an automatic tea sorting device, which contains the above tea classification system.
  • The invention uses the relative abundances of 20 compounds present in tea leaves to discriminate tea types. This overcomes the problems of sensory discrimination and classifies tea more objectively and scientifically, with a correct recognition rate of 100%, improving the reliability and accuracy of the results; the feasibility and accuracy of using the 20-compound combination for tea type discrimination were verified with three algorithms.
  • FIG. 1 shows the primary (MS1) and secondary (MS2) mass spectra of compound C1 in Table 1.
  • FIG. 2 shows the primary (MS1) and secondary (MS2) mass spectra of compound C2 in Table 1.
  • FIG. 3 shows the primary (MS1) and secondary (MS2) mass spectra of compound C3 in Table 1.
  • FIG. 4 shows the primary (MS1) and secondary (MS2) mass spectra of compound C4 in Table 1.
  • FIG. 5 shows the primary (MS1) and secondary (MS2) mass spectra of compound C5 in Table 1.
  • FIG. 6 shows the primary (MS1) and secondary (MS2) mass spectra of compound C6 in Table 1.
  • FIG. 7 shows the primary (MS1) and secondary (MS2) mass spectra of compound C7 in Table 1.
  • FIG. 8 shows the primary (MS1) and secondary (MS2) mass spectra of compound C8 in Table 1.
  • FIG. 9 shows the primary (MS1) and secondary (MS2) mass spectra of compound C9 in Table 1.
  • FIG. 10 shows the primary (MS1) and secondary (MS2) mass spectra of compound C10 in Table 1.
  • FIG. 11 shows the primary (MS1) and secondary (MS2) mass spectra of compound C11 in Table 1.
  • FIG. 12 shows the primary (MS1) and secondary (MS2) mass spectra of compound C12 in Table 1.
  • FIG. 13 shows the primary (MS1) and secondary (MS2) mass spectra of compound C13 in Table 1.
  • FIG. 14 shows the primary (MS1) and secondary (MS2) mass spectra of compound C14 in Table 1.
  • FIG. 15 shows the primary (MS1) and secondary (MS2) mass spectra of compound C15 in Table 1.
  • FIG. 16 shows the primary (MS1) and secondary (MS2) mass spectra of compound C16 in Table 1.
  • FIG. 17 shows the primary (MS1) and secondary (MS2) mass spectra of compound C17 in Table 1.
  • FIG. 18 shows the primary (MS1) and secondary (MS2) mass spectra of compound C18 in Table 1.
  • FIG. 19 shows the primary (MS1) and secondary (MS2) mass spectra of compound C19 in Table 1.
  • FIG. 20 shows the primary (MS1) and secondary (MS2) mass spectra of compound C20 in Table 1.
  • FIG. 21 is a visualization of the training set sample data for the tea type discrimination model based on the random forest algorithm according to Embodiment 1 of the present invention.
  • FIG. 22 is a visualization of the training set sample data for the tea type discrimination model based on the support vector machine algorithm according to Embodiment 2 of the present invention.
  • FIG. 23 is a visualization of the training set sample data for the tea type discrimination model based on the Fisher algorithm according to Embodiment 3 of the present invention; BT is black tea, DT is dark tea, GT is green tea, OT is oolong tea, WT is white tea, and YT is yellow tea.
  • Embodiment 1: a tea type discrimination method
  • This embodiment proposes a tea type discrimination method based on chemical components, comprising five steps that cover collection and pretreatment of the tea samples to be discriminated, selection of the detection platform and setting of the platform environment (LC-MS detection), data acquisition and data preprocessing, and tea type discrimination.
  • Step 1, sample collection and pretreatment: a total of 126 samples of the six tea types were collected on the market, freeze-dried and ground into powder. For each tea sample, 50 mg was accurately weighed and 800 μL of 70% methanol was added; the mixture was vortexed and sonicated for 20 min, then centrifuged at 12,000 g and 4 °C for 15 min. The supernatant and the internal standard (DL-4-chlorophenylalanine in methanol) were pipetted into a sample vial for detection.
  • Step 2, sample detection: the tea samples treated in step 1 were detected by LC-MS. Chromatographic conditions: Hypersil Gold column; mobile phase of 0.1% formic acid in water (phase A) and acetonitrile containing 0.1% formic acid (phase B); gradient elution (0-2 min, 5-40% B; 2-7 min, 40-80% B; 7-11 min, 80-95% B; 11-15 min, 95% B); column temperature 35 °C; flow rate 0.3 mL/min; injection volume 4 μL.
  • Mass spectrometry conditions: a Q Exactive high-resolution hybrid mass spectrometry system was used; the capillary temperature was 350 °C, the capillary voltage was 3.8 kV, and the acquisition mass range was m/z 50-1000.
  • Step 3, data acquisition and data preprocessing: Compounds Discoverer 2.0 software was used for peak identification and peak integration on the collected data; retention time correction, peak alignment, background subtraction and deconvolution were then performed on each data group to obtain the peak area, retention time and mass-to-charge ratio of each peak in each sample. To eliminate errors between samples from different batches, all extracted peaks were normalized by dividing the peak area of each compound peak by the peak area of the internal standard compound (internal standard normalization).
  • Orthogonal partial least squares discriminant analysis (OPLS-DA) was then performed on the normalized data of the different tea types, and the 20 compounds were obtained by the screening described above (16 compounds from VIP > 1.5 followed by stepwise discriminant analysis, plus 4 compounds selected by fold change between green tea and yellow bud tea). Details of each compound are given in Table 1, and their primary and secondary mass spectra are shown in FIGS. 1-20.
  • Step 4, random assignment of samples to the training and validation sets: the 126 tea samples were randomly allocated so that, for each type, the ratio of the number of training samples to the number of samples to be discriminated (validation set) was not less than 3:1; 20 samples to be discriminated formed the validation set and 106 samples formed the training set used to build the model.
  • Step 5, model building and prediction of unknown samples: a tea type discrimination model based on the random forest algorithm was built from the data of the 106 training samples. When the training set data were substituted back into the model, the types of all tea samples were judged correctly, a correct recognition rate of 100%; the results are shown in Table 2. Likewise, the data of the 20 tea samples to be discriminated were substituted into the model to obtain its confusion matrix; the types of all samples to be discriminated were judged correctly, a correct recognition rate of 100%.
  • Embodiment 2: a tea type discrimination method
  • The tea samples in Embodiment 2 are the same as those in Embodiment 1, and steps 1 to 4 are carried out in the same way.
  • Step 5, model building and prediction of unknown samples: a tea type discrimination model based on the support vector machine was built from the data of the 106 training samples. When the training set data were substituted back into the model, the types of all tea samples were judged correctly, a correct recognition rate of 100%. The data of the tea samples to be discriminated were then substituted into the model to obtain its confusion matrix; the types of all samples to be discriminated were judged correctly, a correct recognition rate of 100%.
  • Embodiment 3: a tea type discrimination method
  • The tea samples in Embodiment 3 are the same as those in Embodiment 1, and steps 1 to 4 are carried out in the same way.
  • Step 5, model building and prediction of unknown samples: a tea type discrimination model based on the Fisher algorithm was built. When the training set data were substituted back into the model, the types of all tea samples were judged correctly, a correct recognition rate of 100%. The confusion matrix of the Fisher-function-based model was then obtained for the samples to be discriminated; the types of all samples to be discriminated were judged correctly, a correct recognition rate of 100%.
  • This embodiment (Embodiment 4) provides a tea classification system, the system comprising:
  • a sampling module, used to acquire, by LC-MS, the tea mass spectrometry data corresponding to the tea to be tested;
  • a classification module, used to establish a discriminant function with the ion intensities of the 20 characteristic compounds as evaluation indices and to classify the acquired tea mass spectrometry data, thereby obtaining the classification result of the tea to be tested; the mass-to-charge ratios of the 20 compounds are: 116.0648~116.0764, 267.1206~267.1474, 268.0906~268.1174, 280.1252~280.1532, 289.0561~289.085, 307.0657~307.0964, 308.0757~308.1065, 309.0814~309.1123, 364.0819~364.1183, 381.0604~381.0985, 425.0658~425.1083, 485.0833~485.1318, 518.2984~518.3503, 537.2765~537.3302, 554.1509~554.2063, 579.1207~579.1786, 607.2611~607.3218, 677.3378~677.4055, 744.2234~744.2978 and 869.1124~869.1993.
  • The system also includes a model building module for establishing the tea classification model, which specifically includes:
  • a modeling data acquisition sub-module, used to acquire the mass spectrometry data of sample tea leaves of different types and to take the resulting data as the sample tea mass spectrometry data set;
  • a modeling processing sub-module, used to randomly divide the sample tea mass spectrometry data into a training set and a validation set and to model the training set with the random forest method, the support vector machine method or the Fisher discriminant method, thereby establishing the tea type discrimination model;
  • a validation sub-module, used to validate the random forest model with the validation set.
  • This embodiment provides an automatic tea sorting device; the device includes the tea classification system described in Embodiment 4, and tea leaves are sorted according to that system.
  • The discrimination results of the random forest model are shown in Table 5. On the training set, all 106 tea samples were correctly discriminated, so the accuracy of the model was 100%. Among the 20 test set samples, the black tea, oolong tea, white tea and yellow tea samples were all correctly discriminated; one of the black tea samples was misjudged as white tea, and one green tea sample was judged as yellow tea. The model's discrimination accuracy on the test set was 90%.
  • The discrimination results of the support vector machine model are shown in Table 6. On the training set, only 2 yellow tea samples were misjudged as green tea; all other tea samples were correctly discriminated, and the accuracy of the model was 98.11%. Among the 20 test set samples, the black tea, green tea, oolong tea and white tea samples were all correctly discriminated; one of the black tea samples was misjudged as white tea, and both yellow tea samples were misjudged as green tea. The model's discrimination accuracy on the test set was 85%.
  • The classification results of Fisher's linear discriminant model are shown in Table 7. On the training set, two of the black tea samples were judged to be black tea and white tea respectively, two oolong tea samples were misclassified as green tea, and three yellow tea samples were judged as green tea; all other tea samples were correctly discriminated, and the accuracy of the model was 93.4%. Among the 20 test set samples, the black tea, dark tea, green tea and white tea samples were all correctly discriminated; one oolong tea sample was misclassified as green tea and one yellow tea sample was misclassified as oolong tea. The model's discrimination accuracy on the test set was 90%.
  • the accuracy rate is not as high as that of using 20 compound variable combinations. It can better represent the characteristic compounds of the six major tea types, and plays an important role in the discrimination of each type of tea.
  • the correct rate of discrimination is 99.06%; from the test set samples, one green tea sample was misjudged as oolong tea, all other teas were correctly discriminated, and the test set samples were discriminated correctly rate of 95%.
  • the 20 compounds before adding showed better discriminative effect and better discrimination accuracy.
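The accuracy percentages quoted in the comparison results above follow directly from the number of misclassified samples. The short sketch below recomputes them from the error counts stated in the text for the 106-sample training set and the 20-sample test set; the function name `accuracy_percent` is illustrative only and does not come from the patent.

```python
def accuracy_percent(n_samples, n_misclassified):
    """Percentage of correctly classified samples, rounded to two decimals."""
    return round(100.0 * (n_samples - n_misclassified) / n_samples, 2)


# (training accuracy, test accuracy) from the error counts quoted in the text
random_forest = (accuracy_percent(106, 0), accuracy_percent(20, 2))   # (100.0, 90.0)
svm = (accuracy_percent(106, 2), accuracy_percent(20, 3))             # (98.11, 85.0)
fisher = (accuracy_percent(106, 7), accuracy_percent(20, 2))          # (93.4, 90.0)
```

For example, the Fisher model's seven training errors (2 + 2 + 3) give 99/106, which rounds to the 93.4% reported.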

Landscapes

  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Pathology (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Optimization (AREA)
  • Algebra (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Library & Information Science (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

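A tea type discrimination method and system, belonging to the technical field of detection. The method uses the ion intensities of 20 compounds as evaluation indices to establish a discriminant function for identifying tea types. Discriminating tea types by the relative abundance of 20 compounds present in tea leaves overcomes the problems of sensory discrimination, classifies tea more objectively and scientifically, and improves the reliability and accuracy of the results. Three algorithms were used to verify the feasibility and accuracy of using the 20-compound combination for tea type discrimination.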

Description

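Tea type discrimination method and system
Technical Field
The present invention relates to a tea type discrimination method and system, and belongs to the technical field of detection.
Background Art
The tea tree is a perennial evergreen plant grown for its leaves, belonging to the genus Camellia of the family Theaceae. Tea is made from fresh tea leaves by different processing methods and is rich in functional chemical components such as polyphenols, amino acids and alkaloids. Traditionally, tea is divided into six major types (green tea, yellow tea, dark tea, white tea, oolong tea and black tea) according to processing technique, quality characteristics and appearance.
At present, tea types are discriminated mainly by sensory evaluation, which combines dry and wet evaluation: the shape, color, tenderness and cleanliness of the dry tea are assessed, as well as the aroma, taste, liquor color and infused leaves after brewing. Traditional sensory evaluation must be carried out by professionals, places high demands on the evaluators' skill and work experience, and is easily affected by the evaluators' physiological condition, the environment and other factors; it is too subjective to classify tea accurately. With the development of science and technology and the emergence of various instrumental analysis and detection methods, the identification, quality control and quality safety of tea have become more reliable. To overcome the shortcomings of sensory evaluation, attempts have been made to classify tea using near-infrared spectroscopy, electronic noses and electronic tongues, and liquid and gas chromatography and their coupling with mass spectrometry. These methods, however, suffer from complex sample pretreatment, long analysis times and cumbersome procedures. Moreover, methods and techniques able to discriminate all six major tea types at once are lacking, and a high discrimination accuracy has not been achieved for all types simultaneously.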
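Summary of the Invention
[Technical Problem]
The technical problem actually solved by the present invention is to provide a tea type discrimination method and system that can discriminate multiple samples at the same time with a discrimination accuracy of up to 100%.
[Technical Solution]
To solve the above problem, the present invention provides a tea type discrimination method and system that uses the relative abundance of 20 compounds present in tea leaves to discriminate tea types. This overcomes the problems of sensory discrimination, classifies tea more objectively and scientifically, and improves the reliability and accuracy of the results; the feasibility and accuracy of using the 20-compound combination for tea type discrimination were verified with three algorithms.
The present invention provides a tea type discrimination method that uses the ion intensities of 20 characteristic compounds as evaluation indices to establish a discriminant function for identifying tea types, wherein the mass-to-charge ratios (m/z) of the 20 compounds are: 116.0648~116.0764, 267.1206~267.1474, 268.0906~268.1174, 280.1252~280.1532, 289.0561~289.085, 307.0657~307.0964, 308.0757~308.1065, 309.0814~309.1123, 364.0819~364.1183, 381.0604~381.0985, 425.0658~425.1083, 485.0833~485.1318, 518.2984~518.3503, 537.2765~537.3302, 554.1509~554.2063, 579.1207~579.1786, 607.2611~607.3218, 677.3378~677.4055, 744.2234~744.2978 and 869.1124~869.1993.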
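As an illustration of how the 20 mass-to-charge windows could be turned into a feature vector, the following sketch looks up which window a detected peak falls into and accumulates its normalized intensity there. This is only a sketch under stated assumptions: the window numbering (by ascending m/z) need not match the C1 to C20 labels of Table 1, retention time is ignored, summing several peaks that land in one window is an assumption, and the names `window_index` and `feature_vector` are hypothetical.

```python
from bisect import bisect_right

# The 20 m/z windows (lower, upper) listed in the text, sorted by lower bound.
MZ_WINDOWS = [
    (116.0648, 116.0764), (267.1206, 267.1474), (268.0906, 268.1174),
    (280.1252, 280.1532), (289.0561, 289.0850), (307.0657, 307.0964),
    (308.0757, 308.1065), (309.0814, 309.1123), (364.0819, 364.1183),
    (381.0604, 381.0985), (425.0658, 425.1083), (485.0833, 485.1318),
    (518.2984, 518.3503), (537.2765, 537.3302), (554.1509, 554.2063),
    (579.1207, 579.1786), (607.2611, 607.3218), (677.3378, 677.4055),
    (744.2234, 744.2978), (869.1124, 869.1993),
]
_LOWER_BOUNDS = [lower for lower, _ in MZ_WINDOWS]


def window_index(mz):
    """Return the 0-based index of the window containing mz, or None."""
    i = bisect_right(_LOWER_BOUNDS, mz) - 1
    if i >= 0 and mz <= MZ_WINDOWS[i][1]:
        return i
    return None


def feature_vector(peaks):
    """Build a 20-element feature vector for one sample.

    peaks: iterable of (mz, normalized_intensity) pairs.
    Peaks outside every window are ignored; peaks sharing a window are
    summed (an assumption, the text does not specify this case).
    """
    features = [0.0] * len(MZ_WINDOWS)
    for mz, intensity in peaks:
        i = window_index(mz)
        if i is not None:
            features[i] += intensity
    return features
```

Because the windows do not overlap, a binary search over the lower bounds is enough to find the only candidate window for a given m/z value.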
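In one embodiment of the present invention, the method specifically includes:
(1) pretreating the tea samples, where pretreatment includes grinding into powder, centrifuging and collecting the supernatant, and adding an internal standard;
(2) detecting the tea samples to obtain a data matrix containing the peak area, retention time and mass-to-charge ratio of each peak in each sample; dividing the ion response intensity of each compound peak by the ion response intensity of the internal standard compound (internal standard normalization); and screening variables by orthogonal partial least squares discriminant analysis (OPLS-DA) and stepwise discriminant analysis to obtain the 20 characteristic compounds;
(3) establishing tea type discrimination models based on three mathematical methods from the internal-standard-normalized ion response intensities of the 20 compounds obtained from the tea samples;
(4) substituting the internal-standard-normalized ion response intensities of the 20 compounds in the tea sample to be tested into the established tea type discrimination model to obtain the type of the tea sample to be tested.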
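In one embodiment of the present invention, before the internal-standard-normalized ion response intensities of the tea sample to be judged are substituted into the established discrimination models, the method further includes: collecting tea samples of each type, pretreating and detecting the samples, and processing and analyzing the resulting data to obtain the internal-standard-normalized ion response intensities of the 20 compounds falling within the set mass-to-charge ratio ranges.
In one embodiment of the present invention, the 20 compounds are obtained as follows. Orthogonal partial least squares discriminant analysis is performed on the data of the different tea types, and variables with a VIP (Variable Importance in Projection) value greater than 1.5 are selected as candidates; to simplify the variable set further, the candidates are screened by stepwise discriminant analysis, finally yielding 16 compounds. In practice, green tea and yellow bud tea were found to be easily confused. To avoid a potential drop in the discrimination rate as the number of samples grows in practical use, additional characteristic variables that help separate green tea from yellow bud tea were introduced: using the fold change (FC) parameter, compound variables with FC > 2 or FC < 0.5 between green tea and yellow bud tea and with high ion response intensity were selected, yielding 4 compounds. The compounds from the two rounds of screening were combined to give the final 20 compounds used to discriminate the tea types.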
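The two threshold-based screens in this paragraph can be sketched as simple filters. The sketch assumes the VIP scores have already been computed by an OPLS-DA tool (that computation and the stepwise discriminant analysis are not shown), defines fold change as the mean green tea intensity divided by the mean yellow bud tea intensity (the direction is an assumption), and uses a hypothetical `min_intensity` cutoff to stand in for "high ion response intensity", which the text does not quantify.

```python
from statistics import mean


def vip_candidates(vip_scores, threshold=1.5):
    """Keep variables whose VIP score exceeds the threshold (1.5 in the text)."""
    return [name for name, vip in vip_scores.items() if vip > threshold]


def fold_change_candidates(green, yellow, min_intensity, up=2.0, down=0.5):
    """Select variables with FC > up or FC < down between two tea types.

    green, yellow: dict mapping variable name -> list of normalized
    intensities for green tea and yellow bud tea samples.
    min_intensity: hypothetical cutoff for "high intensity" (assumption).
    """
    selected = []
    for name in green:
        g, y = mean(green[name]), mean(yellow[name])
        if y == 0 or max(g, y) < min_intensity:
            continue  # undefined ratio or signal too weak to be useful
        fc = g / y
        if fc > up or fc < down:
            selected.append(name)
    return selected
```

A usage example with made-up numbers: a variable averaging 5.0 in green tea and 2.0 in yellow bud tea has FC = 2.5 and is kept, while one averaging 1.0 and 1.1 is dropped.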
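In one embodiment of the present invention, the three mathematical methods used to establish the tea discrimination models are the random forest method (Random Forest), the support vector machine method (Support Vector Machine) and the Fisher discriminant method.
In one embodiment of the present invention, establishing the tea type discrimination models further includes: taking the internal-standard-normalized ion response intensities of the 20 compounds within the set mass-to-charge ratio ranges, obtained from the collected tea samples of different types, as the sample data set; and randomly dividing the sample data set into a training set, used to build the discrimination model, and a validation set, used to validate it, wherein for each tea type the ratio of the number of sample (training) tea samples to the number of tea samples to be discriminated is not less than 3:1.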
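One way to honor the per-type 3:1 constraint is a stratified random split that caps the validation share of each tea type. The sketch below is a minimal illustration using only the Python standard library; the function name `stratified_split`, the `seed` parameter and the choice to take the largest validation set allowed by the ratio are assumptions, not details given in the text.

```python
import random
from collections import defaultdict


def stratified_split(labels, min_ratio=3.0, seed=0):
    """Randomly split sample indices into training and validation sets.

    For a class with n samples, floor(n / (min_ratio + 1)) samples go to
    validation, so training:validation never falls below min_ratio:1.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, valid = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_valid = int(len(indices) // (min_ratio + 1))
        valid.extend(indices[:n_valid])
        train.extend(indices[n_valid:])
    return sorted(train), sorted(valid)
```

With hypothetical class sizes of 21 green tea and 8 yellow tea samples, this yields 5 and 2 validation samples respectively, leaving 16 and 6 for training.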
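In one embodiment of the present invention, according to the established discrimination models, the tea samples to be judged (validation set) are detected to obtain the internal-standard-normalized ion response intensities of the 20 compounds within the set mass-to-charge ratio ranges, and these data are substituted into the constructed model to obtain the classification result of the tea samples to be judged.
In one embodiment of the present invention, the tea types include one or more of green tea, yellow tea, dark tea, white tea, black tea and oolong tea.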
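The present invention also provides a tea classification system, the system comprising:
a sampling module, used to acquire, by LC-MS, the tea mass spectrometry data corresponding to the tea to be tested;
a classification module, used to establish a discriminant function with the ion intensities of the 20 characteristic compounds as evaluation indices and to classify the acquired tea mass spectrometry data, thereby obtaining the classification result of the tea to be tested; the mass-to-charge ratios of the 20 compounds are: 116.0648~116.0764, 267.1206~267.1474, 268.0906~268.1174, 280.1252~280.1532, 289.0561~289.085, 307.0657~307.0964, 308.0757~308.1065, 309.0814~309.1123, 364.0819~364.1183, 381.0604~381.0985, 425.0658~425.1083, 485.0833~485.1318, 518.2984~518.3503, 537.2765~537.3302, 554.1509~554.2063, 579.1207~579.1786, 607.2611~607.3218, 677.3378~677.4055, 744.2234~744.2978 and 869.1124~869.1993.
In one embodiment of the present invention, the system also includes a model building module for establishing the tea classification model, which specifically includes:
a modeling data acquisition sub-module, used to acquire the mass spectrometry data of sample tea leaves of different types and to take the resulting data as the sample tea mass spectrometry data set;
a modeling processing sub-module, used to randomly divide the sample tea mass spectrometry data into a training set and a validation set and to model the training set with the random forest method, the support vector machine method or the Fisher discriminant method, thereby establishing the tea type discrimination model;
a validation sub-module, used to validate the random forest model with the validation set.
The present invention also provides an automatic tea sorting device, which contains the above tea classification system.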
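Beneficial effects of the present invention:
The present invention uses the relative abundance of 20 compounds present in tea to discriminate the tea type. It overcomes the problems of sensory evaluation and classifies tea more objectively and scientifically, with a correct recognition rate of 100% in Examples 1-3, which improves the reliability and accuracy of the result. Three algorithms were used to verify the feasibility and accuracy of using the identified combination of 20 compounds for tea type discrimination.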
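Brief Description of the Drawings
Figure 1 shows the primary (MS1) and secondary (MS2) mass spectra of compound C1 in Table 1.
Figure 2 shows the primary (MS1) and secondary (MS2) mass spectra of compound C2 in Table 1.
Figure 3 shows the primary (MS1) and secondary (MS2) mass spectra of compound C3 in Table 1.
Figure 4 shows the primary (MS1) and secondary (MS2) mass spectra of compound C4 in Table 1.
Figure 5 shows the primary (MS1) and secondary (MS2) mass spectra of compound C5 in Table 1.
Figure 6 shows the primary (MS1) and secondary (MS2) mass spectra of compound C6 in Table 1.
Figure 7 shows the primary (MS1) and secondary (MS2) mass spectra of compound C7 in Table 1.
Figure 8 shows the primary (MS1) and secondary (MS2) mass spectra of compound C8 in Table 1.
Figure 9 shows the primary (MS1) and secondary (MS2) mass spectra of compound C9 in Table 1.
Figure 10 shows the primary (MS1) and secondary (MS2) mass spectra of compound C10 in Table 1.
Figure 11 shows the primary (MS1) and secondary (MS2) mass spectra of compound C11 in Table 1.
Figure 12 shows the primary (MS1) and secondary (MS2) mass spectra of compound C12 in Table 1.
Figure 13 shows the primary (MS1) and secondary (MS2) mass spectra of compound C13 in Table 1.
Figure 14 shows the primary (MS1) and secondary (MS2) mass spectra of compound C14 in Table 1.
Figure 15 shows the primary (MS1) and secondary (MS2) mass spectra of compound C15 in Table 1.
Figure 16 shows the primary (MS1) and secondary (MS2) mass spectra of compound C16 in Table 1.
Figure 17 shows the primary (MS1) and secondary (MS2) mass spectra of compound C17 in Table 1.
Figure 18 shows the primary (MS1) and secondary (MS2) mass spectra of compound C18 in Table 1.
Figure 19 shows the primary (MS1) and secondary (MS2) mass spectra of compound C19 in Table 1.
Figure 20 shows the primary (MS1) and secondary (MS2) mass spectra of compound C20 in Table 1.
Figure 21 is a visualization of the training-set sample data for the random-forest-based tea type discrimination model of Example 1 of the present invention;
Figure 22 is a visualization of the training-set sample data for the support-vector-machine-based tea type discrimination model of Example 2 of the present invention;
Figure 23 is a visualization of the training-set sample data for the Fisher-based tea type discrimination model of Example 3 of the present invention; where BT is black tea, DT is dark tea, GT is green tea, OT is oolong tea, WT is white tea, and YT is yellow tea.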
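Detailed Description
Preferred examples of the present invention are described below. It should be understood that the examples are intended to better explain the present invention and not to limit it.
Example 1: A tea type discrimination method
This example proposes a tea type discrimination method based on chemical composition. It comprises five steps: collection and pretreatment of the tea samples, LC-MS detection (selecting the detection platform and setting its conditions), data acquisition and data preprocessing, random assignment of samples to a training set and a validation set, and model building with tea type discrimination.
Step 1, sample collection and pretreatment: a total of 126 samples covering six tea types were collected from the market, freeze-dried and ground into powder. For each tea sample, 50 mg was accurately weighed and 800 μL of 70% methanol was added; the mixture was vortexed, sonicated for 20 min, and centrifuged at 12000 g and 4 °C for 15 min. The supernatant and the internal standard (DL-4-chlorophenylalanine in methanol) were transferred to an autosampler vial for analysis.
Step 2, sample detection: the tea samples prepared in Step 1 were analyzed by LC-MS. Chromatographic conditions: Hypersil Gold column; mobile phase of 0.1% formic acid in water (phase A) and acetonitrile containing 0.1% formic acid (phase B); gradient elution (0-2 min, 5-40% B; 2-7 min, 40-80% B; 7-11 min, 80-95% B; 11-15 min, 95% B); column temperature 35 °C; flow rate 0.3 mL/min; injection volume 4 μL. Mass spectrometry conditions: Q Exactive high-resolution hybrid mass spectrometer; capillary temperature 350 °C; capillary voltage 3.8 kV; acquisition mass range m/z 50~1000.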
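The gradient programme in Step 2 can be written down as a small lookup, which is convenient when checking a method file against the text. The sketch below assumes linear ramps between the stated endpoints; the text gives only the start and end composition of each segment, and the function name is hypothetical.

```python
# (time in min, % phase B) breakpoints taken from the gradient in Step 2.
GRADIENT = [(0.0, 5.0), (2.0, 40.0), (7.0, 80.0), (11.0, 95.0), (15.0, 95.0)]


def percent_b(t_min: float) -> float:
    """Percentage of phase B at time t_min, assuming linear ramps within each segment."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-15 min gradient programme")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole programme")


if __name__ == "__main__":
    for t in (0.0, 1.0, 4.5, 9.0, 13.0):
        print(t, percent_b(t))
```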
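Step 3, data acquisition and data preprocessing: peak detection and peak integration were performed on the acquired data with Compound Discoverer 2.0 software. Retention time correction, peak alignment, background subtraction and deconvolution were then applied to each data set to obtain the peak area, retention time and m/z of each peak group for each sample. To eliminate errors between sample batches, all extracted peaks were normalized: the peak area of each compound peak was divided by the peak area of the internal standard compound (internal standard normalization).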
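Internal standard normalization as described in Step 3 is a per-sample division. A minimal sketch, assuming each sample is held as a dictionary of peak areas with the internal standard stored under a hypothetical key `"IS"`; the peak areas are made-up numbers:

```python
def normalize_to_internal_standard(peak_areas, is_key="IS"):
    """Divide every compound peak area by the internal standard peak area of the same sample."""
    is_area = peak_areas[is_key]
    if is_area <= 0:
        raise ValueError("internal standard peak is missing or non-positive")
    return {
        name: area / is_area
        for name, area in peak_areas.items()
        if name != is_key
    }


if __name__ == "__main__":
    # Hypothetical peak areas for one sample.
    sample = {"IS": 2.0e6, "C1": 1.0e6, "C2": 5.0e5}
    print(normalize_to_internal_standard(sample))
```

Because the division is done within each sample, injection-to-injection and batch-to-batch variation in overall response cancels to the extent that it affects the internal standard and the analytes equally.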
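OPLS-DA was performed on the normalized data of the different tea types, and variables with a VIP (Variable Importance in Projection) value greater than 1.5 were taken as candidate variables. To reduce the number of variables further, the candidates were screened by stepwise discriminant analysis, which yielded 16 compounds. In practice, green tea and yellow bud tea were found to be easily confused. To avoid a potential drop in the discrimination rate as the number of samples grows in real applications, additional characteristic variables that help separate green tea from yellow bud tea were considered: using the fold change (FC) parameter, compound variables with FC > 2 or FC < 0.5 between green tea and yellow bud tea and with high intensity were selected, which yielded 4 compounds. Combining the compounds from the two screening rounds gave the 20 compounds used to discriminate the tea types; details of each compound are given in Table 1. The MS1 and MS2 spectra of the 20 compounds, obtained by analyzing the tea samples on the detection platform, are shown in Figures 1-20.
Step 4, random assignment of samples to the training set and the validation set: the 126 tea samples were randomly assigned so that, for each tea type, the ratio of training samples to validation samples (samples to be identified) was not less than 3:1. The validation set comprised 20 samples to be identified and the training set comprised 106 samples for model building.
Step 5, model building and prediction of unknown samples: a random-forest-based tea type discrimination model was built from the data of the 106 training samples. The training data were then substituted back into the model (resubstitution test); the type of every tea sample was identified correctly, a correct recognition rate of 100% (see Table 2). Likewise, the data of the 20 samples to be identified were substituted into the model to obtain its confusion matrix for those samples; every sample was identified correctly, a correct recognition rate of 100%.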
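Table 1. Information on the 20 compounds and the internal standard (DL-4-chlorophenylalanine)
Figure PCTCN2021083770-appb-000001
Table 2. Discrimination results of the random forest model
Figure PCTCN2021083770-appb-000002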
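Example 2: A tea type discrimination method
The tea samples in Example 2 are the same as those in Example 1, and Steps 1-4 were carried out in the same way. Step 5, model building and prediction of unknown samples: a support-vector-machine-based tea type discrimination model was built from the data of the 106 training samples. The training data were substituted back into the model (resubstitution test); the type of every tea sample was identified correctly, a correct recognition rate of 100% (see Table 3). Likewise, the data of the 20 samples to be identified were substituted into the model to obtain its confusion matrix for those samples; every sample was identified correctly, a correct recognition rate of 100%.
Table 3. Discrimination results of the support vector machine model
Figure PCTCN2021083770-appb-000003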
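Example 3: A tea type discrimination method
The tea samples in Example 3 are the same as those in Example 1, and Steps 1-4 were carried out in the same way. Step 5, model building and prediction of unknown samples: a Fisher-based tea type discrimination model was built from the data of the 106 training samples. The training data were substituted back into the model (resubstitution test); the type of every tea sample was identified correctly, a correct recognition rate of 100% (see Table 4). Likewise, the data of the 20 samples to be identified were substituted into the model to obtain the confusion matrix of the Fisher-function-based model for those samples; every sample was identified correctly, a correct recognition rate of 100%.
Table 4. Classification results of the Fisher linear discriminant model
Figure PCTCN2021083770-appb-000004
Figure PCTCN2021083770-appb-000005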
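To make the idea behind a Fisher linear discriminant concrete, here is a deliberately reduced sketch with two classes and two features, written in plain Python. It is not the model of Example 3, which covers six tea types and 20 variables and whose software and exact Fisher variant are not stated. The projection direction is w = Sw⁻¹(m_a − m_b), with Sw the pooled within-class scatter matrix, and the decision threshold is the midpoint of the projected class means. All data values are hypothetical.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def mean_vec(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, m):
    """2x2 within-class scatter matrix: sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_fisher(class_a, class_b):
    """Return (w, threshold) with w = Sw^-1 (m_a - m_b); class A projects above the threshold."""
    ma, mb = mean_vec(class_a), mean_vec(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    threshold = 0.5 * (dot(w, ma) + dot(w, mb))
    return w, threshold


def predict(w, threshold, x, label_a="GT", label_b="YT"):
    return label_a if dot(w, x) > threshold else label_b


if __name__ == "__main__":
    # Hypothetical normalized intensities of two compounds for two tea types.
    green = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (1.0, 2.1)]
    yellow = [(2.0, 1.0), (2.2, 0.9), (1.8, 1.2), (2.1, 1.0)]
    w, thr = fit_fisher(green, yellow)
    # Resubstitution test: classify the training samples with the fitted rule.
    print([predict(w, thr, x) for x in green + yellow])
```

The last line mirrors the resubstitution test used in the examples: the training samples themselves are fed back through the fitted rule. A multi-class version replaces the single direction with several discriminant functions, which is beyond this sketch.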
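Example 4
This example provides a tea classification system, the system comprising:
a sampling module, configured to acquire, by LC-MS, the mass spectrometry data of the tea to be tested;
a classification module, configured to use the ion intensities of the 20 characteristic compounds as evaluation indices and establish a discriminant function to classify the acquired mass spectrometry data, thereby obtaining the classification result for the tea to be tested; the m/z ranges of the 20 compounds are: 116.0648~116.0764, 267.1206~267.1474, 268.0906~268.1174, 280.1252~280.1532, 289.0561~289.085, 307.0657~307.0964, 308.0757~308.1065, 309.0814~309.1123, 364.0819~364.1183, 381.0604~381.0985, 425.0658~425.1083, 485.0833~485.1318, 518.2984~518.3503, 537.2765~537.3302, 554.1509~554.2063, 579.1207~579.1786, 607.2611~607.3218, 677.3378~677.4055, 744.2234~744.2978 and 869.1124~869.1993.
Further, the system also comprises a model building module for building the tea classification model, the model building module specifically comprising:
a modeling data acquisition submodule, configured to acquire the mass spectrometry data of sample teas of different types and to take the data set formed from the acquired data as the sample tea mass spectrometry data set;
a modeling submodule, configured to randomly divide the sample tea mass spectrometry data into a training set and a validation set and to model the training set by the random forest method, the support vector machine method or the Fisher discriminant method, thereby building the tea type discrimination model;
a validation submodule, configured to validate the random forest model using the validation set.
Example 5
This example provides an automatic tea sorting device that comprises the tea classification system of Example 4 and sorts tea according to the output of that system.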
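Comparative Example 1:
The 16 compounds from patents CN201810521854 and CN201810521734 (277.0692, 112.0753, 116.0702, 104.0703, 181.1217, 679.4144, 333.2024, 132.1014, 381.078, 267.1328, 335.218, 175.1071, 496.3372, 291.0854, 433.1109 and 535.2675) were used to build the three models with the 106 tea samples of Examples 1-3, and the 20 remaining tea samples were used to test the discrimination accuracy of the models. So that the predictive ability of the two variable combinations on the 126 tea samples could be compared, the same training and test samples were used.
The discrimination results of the random forest model are shown in Table 5. On the training set, all 106 tea samples were identified correctly, a model accuracy of 100%. Among the 20 test samples, the dark tea, oolong tea, white tea and yellow tea samples were all identified correctly; one black tea sample was misidentified as white tea and one green tea sample was misidentified as yellow tea, giving a test-set accuracy of 90%.
Table 5. Discrimination results of the random forest model
Figure PCTCN2021083770-appb-000006
The discrimination results of the support vector machine model are shown in Table 6. On the training set, only 2 yellow tea samples were misidentified as green tea and all other samples were identified correctly, giving an overall model accuracy of 98.11%. Among the 20 test samples, the dark tea, green tea, oolong tea and white tea samples were all identified correctly; one black tea sample was misidentified as white tea and both yellow tea samples were misidentified as green tea, giving a test-set accuracy of 85%.
Table 6. Discrimination results of the support vector machine model
Figure PCTCN2021083770-appb-000007
Figure PCTCN2021083770-appb-000008
The classification results of the Fisher linear discriminant model are shown in Table 7. On the training set, two black tea samples were misidentified, one as dark tea and one as white tea; two oolong tea samples were misidentified as green tea; and 3 yellow tea samples were misidentified as green tea. All other samples were identified correctly, giving an overall model accuracy of 93.4%. Among the 20 test samples, the black tea, dark tea, green tea and white tea samples were all identified correctly; one oolong tea sample was misidentified as green tea and one yellow tea sample was misidentified as oolong tea, giving a test-set accuracy of 90%.
Table 7. Classification results of the Fisher linear discriminant model
Figure PCTCN2021083770-appb-000009
Across the three models, discrimination of the 126 tea samples with these 16 compounds was less accurate than with the 20-compound combination. By comparison, the 20-compound combination better represents the characteristic compounds of the six major tea types and plays an important role in discriminating them.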
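The accuracies quoted in Comparative Example 1 follow directly from the misclassification counts given in the text, as the short check below shows. The dictionary layout and function name are illustrative only; the counts are those stated above (106 training samples and 20 test samples per model).

```python
def accuracy_percent(n_total, n_wrong):
    """Percentage of correctly identified samples, rounded to two decimals."""
    return round(100.0 * (n_total - n_wrong) / n_total, 2)


# (model, data set) -> (number of samples, number misidentified), as stated in the text.
COUNTS = {
    ("random forest", "train"): (106, 0),
    ("random forest", "test"): (20, 2),
    ("SVM", "train"): (106, 2),
    ("SVM", "test"): (20, 3),
    ("Fisher", "train"): (106, 7),
    ("Fisher", "test"): (20, 2),
}

if __name__ == "__main__":
    for key, (n_total, n_wrong) in COUNTS.items():
        print(key, accuracy_percent(n_total, n_wrong))
```

With only 20 test samples, each misidentified sample changes the test accuracy by 5 percentage points, which is worth keeping in mind when comparing the 85%, 90% and 100% figures.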
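Comparative Example 2:
To further demonstrate the discrimination performance of the 20 compounds for different tea types, comparisons were set up in which some compounds were omitted, added or replaced, to show that not every arbitrary combination of compounds achieves a similar result. The Fisher method is used as the example: different discrimination models were built and their discrimination performance was compared.
1. Omitting some compounds
Four compounds (C17-C20) were randomly removed from the 20 compounds. The remaining 16 compounds were used to rebuild the Fisher linear discriminant model with the 106 tea samples of Examples 1-3, and the 20 remaining tea samples were used to test the discrimination accuracy of the model; the results are shown in Table 8. On the training set, one oolong tea sample and one white tea sample were misidentified as green tea and all other samples were identified correctly, a training-set accuracy of 98.11%. In the test set, one black tea sample was misidentified as dark tea, a test-set accuracy of 95%. Compared with the Fisher linear discriminant model built after removing these compounds, the original 20 compounds give better discrimination and a higher accuracy.
Table 8. Classification results of the Fisher linear discriminant model
Figure PCTCN2021083770-appb-000010
2. Adding some compounds
Additional compounds were added to the 20-compound combination, namely the following 4 major secondary metabolites of tea: theanine (MZ: 175.10772), caffeine (MZ: 195.08765), epicatechin (MZ: 291.08631) and epigallocatechin gallate (MZ: 459.09219). As before, the 106 tea samples of Examples 1-3 were used to build the Fisher linear discriminant model and the 20 remaining tea samples were used to test its discrimination accuracy, so that the results are comparable. A Fisher linear discriminant model was built with the resulting 24 compounds; the results are shown in Table 9. On the training set, only one oolong tea sample was misidentified as green tea and all other samples were identified correctly, a training-set accuracy of 99.06%. In the test set, one green tea sample was misidentified as oolong tea and all other samples were identified correctly, a test-set accuracy of 95%. Compared with the Fisher linear discriminant model built after adding these compounds, the original 20 compounds give better discrimination and a higher accuracy.
Table 9. Classification results of the Fisher linear discriminant model
Figure PCTCN2021083770-appb-000011
Figure PCTCN2021083770-appb-000012
3. Replacing some compounds
Four compounds randomly selected from the 20 compounds (C17-C20) were replaced with the major secondary metabolites of tea: theanine (MZ: 175.10772), caffeine (MZ: 195.08765), epicatechin (MZ: 291.08631) and epigallocatechin gallate (MZ: 459.09219). As before, the 106 tea samples of Examples 1-3 were used to build the Fisher linear discriminant model and the 20 remaining tea samples were used to test its discrimination accuracy, so that the results are comparable. A Fisher linear discriminant model was built with the 20 compounds obtained after replacement; the results are shown in Table 10. On the training set, one oolong tea sample and one white tea sample were misidentified as green tea and all other samples were identified correctly, a training-set accuracy of 98.11%. In the test set, only one oolong tea sample was misidentified as green tea, a test-set accuracy of 95%. Compared with the Fisher linear discriminant model built after replacing these compounds, the original 20 compounds give better discrimination and a higher accuracy.
Table 10. Classification results of the Fisher linear discriminant model
Figure PCTCN2021083770-appb-000013
In summary, by removing, adding or replacing some compounds as comparisons, and based on the results of the Fisher linear discriminant models on the training and test samples, it was found that not every arbitrary combination of compounds achieves the discrimination performance of the selected 20-compound combination; this combination plays an important role in discriminating the different tea types.
Although the present invention has been disclosed above by way of preferred examples, they are not intended to limit it. Anyone familiar with this technology may make various changes and modifications without departing from the spirit and scope of the present invention; the scope of protection of the present invention shall therefore be as defined by the claims.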

Claims (11)

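  1. A tea type discrimination method, characterized in that the method uses the ion intensities of 20 compounds as evaluation indices and establishes a discriminant function to identify the tea type; wherein the m/z ranges of the 20 compounds are: 116.0648~116.0764, 267.1206~267.1474, 268.0906~268.1174, 280.1252~280.1532, 289.0561~289.085, 307.0657~307.0964, 308.0757~308.1065, 309.0814~309.1123, 364.0819~364.1183, 381.0604~381.0985, 425.0658~425.1083, 485.0833~485.1318, 518.2984~518.3503, 537.2765~537.3302, 554.1509~554.2063, 579.1207~579.1786, 607.2611~607.3218, 677.3378~677.4055, 744.2234~744.2978 and 869.1124~869.1993.
  2. The tea type discrimination method according to claim 1, characterized in that the method specifically comprises:
    (1) pretreating the tea samples, where pretreatment includes grinding into powder, centrifuging and collecting the supernatant, and adding an internal standard;
    (2) analyzing the tea samples to obtain a data matrix containing the peak area, retention time and m/z of each peak group for each sample; dividing the ion response intensity of each compound peak by the ion response intensity of the internal standard compound to perform internal standard normalization; and screening variables by orthogonal partial least squares discriminant analysis (OPLS-DA) and stepwise discriminant analysis to obtain the 20 characteristic compounds;
    (3) building a tea type discrimination model from the internal-standard-normalized ion response intensities of the 20 compounds obtained from the tea samples;
    (4) substituting the internal-standard-normalized ion response intensities of the 20 compounds in a tea sample to be tested into the established tea type discrimination model to obtain the type of that sample.
  3. The tea type discrimination method according to claim 1 or 2, characterized in that, before the internal-standard-normalized ion response intensities of the tea sample to be identified are substituted into the established tea type discrimination models, the method further comprises: collecting tea samples of each type, pretreating and analyzing them, and processing and analyzing the resulting data to obtain the internal-standard-normalized ion response intensities of the 20 compounds within the specified m/z ranges.
  4. The tea type discrimination method according to any one of claims 1-3, characterized in that the 20 compounds are obtained by performing orthogonal partial least squares discriminant analysis on the data of the different tea types, taking variables with a Variable Importance in Projection value greater than 1.5 as candidate variables, and further screening the candidates by stepwise discriminant analysis to obtain 16 compounds; further, using the fold change (FC) parameter, compound variables with FC > 2 or FC < 0.5 between green tea and yellow bud tea and with a high ion response intensity are selected to obtain 4 compounds; and the compounds from the two screening rounds are combined to give the 20 compounds used to discriminate the tea types.
  5. The tea type discrimination method according to any one of claims 1-4, characterized in that the mathematical methods for building the tea discrimination model include: the random forest method, the support vector machine method or the Fisher discriminant method.
  6. The tea type discrimination method according to any one of claims 1-5, characterized in that building the tea type discrimination models further comprises: taking the internal-standard-normalized ion response intensities of the 20 compounds within the specified m/z ranges, obtained from the collected samples of the different tea types, as the sample data set; and randomly dividing the sample data set into a training set and a validation set, the training set being used to build the tea discrimination model and the validation set being used to validate it, wherein, for each tea type, the ratio of the number of training samples to the number of samples to be identified is not less than 3:1.
  7. The tea type discrimination method according to any one of claims 1-6, characterized in that the tea samples to be identified are analyzed to obtain the internal-standard-normalized ion response intensities of the 20 compounds within the specified m/z ranges, and these data are substituted into the established tea type discrimination models to obtain the classification result for the samples to be identified.
  8. The tea type discrimination method according to any one of claims 1-7, characterized in that the tea types include one or more of green tea, yellow tea, dark tea, white tea, black tea and oolong tea.
  9. A tea classification system, characterized in that the system comprises:
    a sampling module, configured to acquire, by LC-MS, the mass spectrometry data of the tea to be tested;
    a classification module, configured to use the ion intensities of 20 characteristic compounds as evaluation indices and establish a discriminant function to classify the acquired mass spectrometry data, thereby obtaining the classification result for the tea to be tested; the m/z ranges of the 20 compounds are: 116.0648~116.0764, 267.1206~267.1474, 268.0906~268.1174, 280.1252~280.1532, 289.0561~289.085, 307.0657~307.0964, 308.0757~308.1065, 309.0814~309.1123, 364.0819~364.1183, 381.0604~381.0985, 425.0658~425.1083, 485.0833~485.1318, 518.2984~518.3503, 537.2765~537.3302, 554.1509~554.2063, 579.1207~579.1786, 607.2611~607.3218, 677.3378~677.4055, 744.2234~744.2978 and 869.1124~869.1993.
  10. The tea classification system according to claim 9, characterized in that the system further comprises a model building module for building the tea classification model, the model building module specifically comprising:
    a modeling data acquisition submodule, configured to acquire the mass spectrometry data of sample teas of different types and to take the data set formed from the acquired data as the sample tea mass spectrometry data set;
    a modeling submodule, configured to randomly divide the sample tea mass spectrometry data into a training set and a validation set and to model the training set by the random forest method, the support vector machine method or the Fisher discriminant method, thereby building the tea type discrimination model;
    a validation submodule, configured to validate the random forest model using the validation set.
  11. An automatic tea sorting device, characterized in that the device comprises the tea classification system according to claim 9 or 10.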
PCT/CN2021/083770 2020-08-24 2021-03-30 一种茶类判别方法及系统 WO2022041718A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/078,188 US20230109241A1 (en) 2020-08-24 2022-12-09 Method and System for Differentiation of Tea Type

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010855084.2A CN112014516B (zh) 2020-08-24 2020-08-24 一种茶类判别方法及系统
CN202010855084.2 2020-08-24

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/078,188 Continuation US20230109241A1 (en) 2020-08-24 2022-12-09 Method and System for Differentiation of Tea Type

Publications (1)

Publication Number Publication Date
WO2022041718A1 true WO2022041718A1 (zh) 2022-03-03

Family

ID=73505646

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/083770 WO2022041718A1 (zh) 2020-08-24 2021-03-30 一种茶类判别方法及系统

Country Status (3)

Country Link
US (1) US20230109241A1 (zh)
CN (1) CN112014516B (zh)
WO (1) WO2022041718A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112014516B (zh) * 2020-08-24 2021-06-25 安徽农业大学 一种茶类判别方法及系统
CN113720929A (zh) * 2021-07-12 2021-11-30 中国农业科学院茶叶研究所 一种白化与黄化品种成品茶的判别方法
CN113884593B (zh) * 2021-09-28 2023-05-05 安徽农业大学 一种判别六安瓜片茶叶等级的方法
CN114019100B (zh) * 2021-10-29 2024-03-26 中国农业科学院茶叶研究所 基于多源信息融合技术的滇红工夫茶汤综合品质客观量化评价方法
CN114924002B (zh) * 2022-05-12 2024-02-13 中国检验检疫科学研究院 一种结合代谢组学技术和机器学习鉴别nfc和fc橙汁的方法

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110051983A (ko) * 2009-11-11 2011-05-18 건국대학교 산학협력단 보이차의 제조타입과 후발효 기간을 구별하는 방법 및 그 구별용 조성물
CN102692486A (zh) * 2012-06-13 2012-09-26 福建农林大学 一种基于茶叶生化成分的多茶类判别方法
CN104914190A (zh) * 2015-06-23 2015-09-16 福建省农业科学院农业工程技术研究所 一种茶叶种类鉴别和21种特征成分含量测定的方法
CN106885851A (zh) * 2017-01-22 2017-06-23 中国农业科学院茶叶研究所 一种基于手性定量分析技术的红茶产地判别方法
CN107132267A (zh) * 2017-06-21 2017-09-05 佛山科学技术学院 一种基于随机森林的茶叶分类方法及系统
CN108717078A (zh) * 2018-05-28 2018-10-30 安徽农业大学 一种基于化学成分的茶类判别方法
CN108760870A (zh) * 2018-05-28 2018-11-06 安徽农业大学 基于化学成分的茶类判别方法
CN112014516A (zh) * 2020-08-24 2020-12-01 安徽农业大学 一种茶类判别方法及系统

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1763541A (zh) * 2005-09-27 2006-04-26 浙江大学 一种花茶品质鉴定方法
CN106770862A (zh) * 2017-01-17 2017-05-31 江苏大学 一种茶叶分类方法
WO2018227384A1 (zh) * 2017-06-13 2018-12-20 浙江海正甦力康生物科技有限公司 一种鉴定茶叶品质的方法

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114878730A (zh) * 2022-06-16 2022-08-09 陕西科技大学 一种集固相微萃取和原位质谱的羊乳掺假检测装置及方法
CN114878730B (zh) * 2022-06-16 2023-08-15 陕西科技大学 一种集固相微萃取和原位质谱的羊乳掺假检测装置及方法
CN115389657A (zh) * 2022-08-16 2022-11-25 华南农业大学 一种皂角的鉴别模型及其建立方法应用

Also Published As

Publication number Publication date
US20230109241A1 (en) 2023-04-06
CN112014516A (zh) 2020-12-01
CN112014516B (zh) 2021-06-25

Similar Documents

Publication Publication Date Title
WO2022041718A1 (zh) 一种茶类判别方法及系统
CA3029488A1 (en) Apparatus and method for classifying a tobacco sample into one of a predefined set of taste categories
CN102012365B (zh) 一种基于红外光谱的茶叶发酵度识别方法
JP2009014700A (ja) 緑茶の品質予測方法
CN111879846B (zh) 一种利用元素分析-稳定同位素质谱鉴别燕窝真伪的方法及应用
CN110470781B (zh) 鉴别复原乳和超高温灭菌乳的方法
CN111855757B (zh) 一种基于电子鼻的六堡茶陈香香味识别方法
CN112986430B (zh) 一种娟珊牛奶粉和荷斯坦牛奶粉的差异标志物筛选方法及其应用
CN113125590A (zh) 一种基于快速气相电子鼻技术的滇红工夫茶汤香气品质客观评价方法
CN113125588B (zh) 一种代谢组学分析技术判别鸭屎香单丛茶时空分类的应用
Gröger et al. Application of comprehensive two‐dimensional gas chromatography mass spectrometry and different types of data analysis for the investigation of cigarette particulate matter
CN108205042B (zh) 一种安化黑茶识别方法
CN109358022A (zh) 一种快速判别烟用爆珠类型的方法
CN110687215A (zh) 一种应用代谢组学技术预测茶叶年份的方法
CN115950979B (zh) 一种用于复杂基质烟草提取物产地溯源的方法
CN115792022B (zh) 一种基于感官效应的烟草中滋味物质模型及其构建方法和应用
CN108760870B (zh) 基于化学成分的茶类判别方法
CN110736718B (zh) 一种烤烟烟丝的产地及等级识别方法
CN104407019A (zh) 一种基于dfa和simca模型的烟用包装纸品质判别方法
CN112684029A (zh) 一种基于烟叶差异代谢物含量快速检测烟叶成熟度的方法及装置
CN113125589B (zh) 一种代谢组学分析技术鉴定鸭屎香单丛茶的应用
CN108717078A (zh) 一种基于化学成分的茶类判别方法
CN111257436A (zh) 一种利用质谱技术识别道地药材特异性标志物和差异性标志物及判断道地药材的方法
CN111426778B (zh) 高分辨质谱技术结合模式识别的橄榄油等级快速鉴定方法
CN114019100B (zh) 基于多源信息融合技术的滇红工夫茶汤综合品质客观量化评价方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21859567

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21859567

Country of ref document: EP

Kind code of ref document: A1