CN110378373B - Tea variety classification method for fuzzy non-relevant linear discriminant analysis - Google Patents

Publication number
CN110378373B
Authority
CN
China
Prior art keywords
tea
sample
matrix
fuzzy
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910505655.7A
Other languages
Chinese (zh)
Other versions
CN110378373A (en)
Inventor
武小红
周晶
武斌
孙俊
陈勇
傅海军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yizhang Huyi Agricultural Development Co.,Ltd.
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-06-12
Filing date
2019-06-12
Publication date
2024-03-12
Application filed by Jiangsu University
Priority to CN201910505655.7A
Publication of CN110378373A
Application granted
Publication of CN110378373B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G06F18/21322 Rendering the within-class scatter matrix non-singular
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention discloses a tea variety classification method based on fuzzy non-relevant linear discriminant analysis. First, near infrared diffuse reflection spectra of tea samples of several varieties are acquired with an Antaris II Fourier transform near infrared spectrum analyzer. Next, the collected spectra are preprocessed with the Savitzky-Golay first derivative. Then, fuzzy non-relevant linear discriminant analysis is applied to the preprocessed spectra to reduce their dimension and extract classification discriminant information. Finally, the tea varieties are classified by Gath-Geva fuzzy clustering. The method is a fuzzy extension of non-relevant linear discriminant analysis: it addresses both the undersampling problem of linear discriminant analysis and the "hard" feature extraction of non-relevant linear discriminant analysis. It is green and pollution-free, needs few test samples, has a low identification cost, discriminates quickly and achieves high classification accuracy.

Description

Tea variety classification method for fuzzy non-relevant linear discriminant analysis
Technical Field
The invention relates to the fields of pattern recognition and artificial intelligence, and in particular to a tea variety classification method based on fuzzy non-relevant linear discriminant analysis.
Background
Tea is a green, health-promoting drink and, together with coffee and cocoa, is one of the three major beverages of the world. With social progress and the rapid development of the modern food industry, tea products are favored by consumers. Tea is rich in caffeine, catechins, amino acids and trace elements, and is traditionally credited with calming the mind, improving eyesight, promoting salivation, quenching thirst, clearing heat, relieving summer heat, aiding digestion, relieving hangover, promoting urination and detoxifying. At present, unit prices differ greatly between tea varieties, and because some varieties have a short storage period, the price of a single variety also fluctuates strongly with the season. The tea market therefore leaves considerable room for excessive profit, and unscrupulous merchants frequently pass off low-quality tea as high-quality tea. To regulate the tea market and protect consumers' interests, a simple, rapid, accurate and non-destructive method for identifying tea varieties is needed.
Near infrared spectroscopy is rapid, non-destructive and pollution-free, requires no sample pretreatment and has a low analysis cost; in recent years it has been applied in many fields, especially food research. The near infrared spectrum is electromagnetic radiation with wavelengths in the range 780-2526 nm. It reflects the overtone and combination vibrations of molecular groups and supports both quantitative and qualitative analysis of characteristic components. Current near infrared research on tea covers two aspects: quantitative analysis and measurement of tea components, and qualitative classification and discrimination of tea grade, variety, place of origin and the like. However, because near infrared spectra are high-dimensional, overlapping and redundant, a suitable feature extraction algorithm should be used to extract the useful information from the spectra before analysis in order to obtain better model performance.
Currently, when near infrared spectroscopy is used to detect and classify foods, the most popular feature extraction method is linear discriminant analysis. Linear discriminant analysis is a supervised (labeled) dimension reduction technique that finds the optimal transformation vectors by maximizing the ratio of the between-class distance to the within-class distance, thereby achieving the best class separation. In practice, however, the sample dimension is often larger than the number of samples, which causes the undersampling problem. Non-relevant (uncorrelated) linear discriminant analysis is an extension of linear discriminant analysis that reduces the redundancy of the transformed space and also solves the undersampling problem. In essence, however, non-relevant linear discriminant analysis is still a "hard" feature extraction algorithm, and the extracted features cannot fully reflect the original structural information of the samples. The invention introduces fuzzy set theory into non-relevant linear discriminant analysis and provides a tea near infrared spectrum classification method based on fuzzy non-relevant linear discriminant analysis to discriminate tea varieties.
Disclosure of Invention
To address the undersampling problem of linear discriminant analysis and the "hard" feature extraction of non-relevant linear discriminant analysis, the invention provides a feature extraction method, fuzzy non-relevant linear discriminant analysis, which combines fuzzy set theory with non-relevant linear discriminant analysis and is used to classify tea near infrared spectra. When extracting the classification discriminant information of tea varieties, the method addresses both problems. It is also green and pollution-free, needs few test samples, has a low identification cost, discriminates quickly and achieves high classification accuracy.
The technical scheme of the tea variety classification method based on fuzzy non-relevant linear discriminant analysis comprises the following steps:
step one, acquiring near infrared diffuse reflection spectrum data of the tea samples;
step two, preprocessing the near infrared diffuse reflection spectra of the tea samples;
step three, extracting discriminant information from the tea near infrared spectra by fuzzy non-relevant linear discriminant analysis;
step four, classifying the tea varieties by Gath-Geva fuzzy clustering.
In step one, the near infrared diffuse reflection spectrum data of the tea samples are collected in the integrating sphere diffuse reflection mode of an Antaris II Fourier transform near infrared spectrum analyzer. During acquisition, factors such as temperature and humidity are kept as stable as possible. The resulting spectrum of each tea sample is 1557-dimensional data covering the wavenumber range 10000 cm$^{-1}$ to 4000 cm$^{-1}$;
In step two, the collected near infrared diffuse reflection spectrum data of the tea samples are preprocessed with the Savitzky-Golay first derivative, and the preprocessed tea sample data are divided into a training sample set and a test sample set;
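The Savitzky-Golay first derivative used in step two can be sketched with NumPy alone. This is a minimal illustration, not the inventors' implementation: the function name `savgol_first_derivative`, the window length of 11 points, the polynomial order of 2 and the edge padding are all assumptions, since the text does not state the filter settings.

```python
import numpy as np

def savgol_first_derivative(spectra, window=11, polyorder=2, delta=1.0):
    """Savitzky-Golay first derivative along the last axis (NumPy only).

    window, polyorder and delta are illustrative assumptions.
    """
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Least-squares polynomial fit over the window: row 1 of the
    # pseudo-inverse holds the weights that return the fitted slope
    # at the window centre.
    vander = np.vander(offsets, polyorder + 1, increasing=True)
    coeffs = np.linalg.pinv(vander)[1] / delta
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    # Repeat the edge values so the output keeps the input length
    # (the first and last `half` points are therefore less reliable).
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    # np.correlate computes out[k] = sum_j coeffs[j] * row[k + j].
    return np.stack([np.correlate(row, coeffs, mode="valid") for row in padded])
```

Each row of the input is treated as one spectrum, so a 260 x 1557 matrix of spectra would be returned as a 260 x 1557 matrix of derivative spectra.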
In step three, the fuzzy non-relevant linear discriminant analysis feature extraction method is applied to the tea near infrared diffuse reflection spectrum data preprocessed in step two to reduce their dimension and extract classification discriminant information. Before doing so, the number of classes $c$, the weight index $\eta$, the cluster centers $V$ and the fuzzy membership matrix $U$ must be initialized. The cluster center value $v_j$ is the mean of each class of training samples, and the element $u_{ij}$ of the fuzzy membership matrix $U$ is calculated as follows:
where $x_i$ is the $i$-th tea near infrared diffuse reflection spectrum training sample and $v_k$ is the class center of the $k$-th class.
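The membership formula itself is not reproduced in this text. A common choice that uses exactly the symbols named here ($x_i$, $v_j$, $v_k$, $\eta$, $c$) is the fuzzy c-means membership $u_{ij} = 1 / \sum_{k=1}^{c} (\|x_i - v_j\|^2 / \|x_i - v_k\|^2)^{1/(\eta-1)}$. The sketch below assumes that form; the function name `initial_membership` is hypothetical.

```python
import numpy as np

def initial_membership(X, centers, eta=1.5):
    """FCM-style membership u_ij of sample i in class j (assumed formula)."""
    # Squared Euclidean distances, shape (n_samples, n_classes).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)          # guard against division by zero
    w = d2 ** (-1.0 / (eta - 1.0))      # closer centers get larger weight
    # Normalising each row is algebraically the same as the sum over k above.
    return w / w.sum(axis=1, keepdims=True)
```

Under this assumption each row of the result sums to 1, and a sample lying close to one class center receives a membership near 1 for that class.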
The dimension reduction and extraction of classification discriminant information in step three proceed as follows:
(1) Given a labeled training sample matrix, where $p_1$ is the sample dimension and $n$ is the number of samples, $S_{ft}$, $S_{fb}$ and $S_{fw}$ are defined as the fuzzy total scatter matrix, the fuzzy between-class scatter matrix and the fuzzy within-class scatter matrix of the training sample set, respectively:
where $c$ is the number of classes, $\eta$ is the weight index, $x_i$ is the $i$-th tea near infrared diffuse reflection spectrum training sample, $\bar{x}$ is the overall sample mean of the training sample set, $u_{ij}$ is the fuzzy membership of sample $x_i$ in class $j$, and $v_j$ is the sample mean of the class-$j$ samples in the sample set ($j = 1, 2, 3, 4$).
(2) Construct the matrices $H_{ft}$, $H_{fb}$ and $H_{fw}$ such that $S_{ft} = H_{ft}H_{ft}^T$, $S_{fb} = H_{fb}H_{fb}^T$ and $S_{fw} = H_{fw}H_{fw}^T$.
(3) Compute the singular value decomposition of $H_{ft}$, $H_{ft} = G\Sigma S^T$, where $G = [G_1\ G_2]$ with $G_1$ formed by the first $t$ columns of $G$, $\Sigma_t$ is the $t \times t$ diagonal block of $\Sigma$ containing the nonzero singular values, $p_1$ is the sample dimension and $t = \mathrm{rank}(H_{ft})$;
(4) Let $B = \Sigma_t^{-1} G_1^T H_{fb}$, where $\Sigma_t^{-1}$ is the inverse of $\Sigma_t$ and $G_1^T$ is the transpose of $G_1$, and compute the singular value decomposition of $B$, $B = PAO^T$.
(5) Let $Y = G_1 \Sigma_t^{-1} P$, where $Y_q$ is the matrix consisting of the first $q$ columns of $Y$ and $q = \mathrm{rank}(H_{fb})$;
(6) Finally, the feature projection matrix of fuzzy non-relevant linear discriminant analysis is obtained as $W = Y_q$. The $i$-th ($i = 1, 2, \ldots, n$) training sample $x_i$ in the training sample set of step two is transformed to $x'_i = x_i W$, where $n$ is the number of training samples; the $k$-th ($k = 1, 2, \ldots, n_1$) test sample $y_k$ in the test set of step two is transformed to $z_k = y_k W$, where $n_1$ is the number of test samples.
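Steps (2) to (6) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the scatter definitions are not reproduced in this text, so the column forms of $H_{ft}$ and $H_{fb}$ used below are assumed, and $B = \Sigma_t^{-1} G_1^T H_{fb}$ and $Y = G_1 \Sigma_t^{-1} P$ follow the standard uncorrelated LDA construction. The function name `fuzzy_ulda` and the rank tolerance `tol` are hypothetical. Samples are stored as rows, matching $x'_i = x_i W$.

```python
import numpy as np

def fuzzy_ulda(X, U, V, eta=1.5, tol=1e-10):
    """Return the projection matrix W (p x q) for row-vector samples X (n x p).

    U is the n x c fuzzy membership matrix and V the c x p class means.
    The factor matrices are assumed forms chosen so that S = H @ H.T.
    """
    c = U.shape[1]
    Um = U ** eta                                   # u_ij^eta
    xbar = X.mean(axis=0)                           # overall sample mean
    # H_ft: p x (c*n), columns u_ij^(eta/2) * (x_i - xbar)   (assumed form)
    Hft = np.hstack([(np.sqrt(Um[:, [j]]) * (X - xbar)).T for j in range(c)])
    # H_fb: p x c, columns sqrt(sum_i u_ij^eta) * (v_j - xbar) (assumed form)
    Hfb = (np.sqrt(Um.sum(axis=0))[:, None] * (V - xbar)).T
    # Step (3): SVD of H_ft, keep the t = rank(H_ft) leading components.
    G, s, _ = np.linalg.svd(Hft, full_matrices=False)
    t = int((s > tol * s[0]).sum())
    G1, st = G[:, :t], s[:t]
    # Step (4): B = Sigma_t^{-1} G1^T H_fb and its SVD.
    B = (G1.T @ Hfb) / st[:, None]
    P, _, _ = np.linalg.svd(B, full_matrices=False)
    # Step (5): Y = G1 Sigma_t^{-1} P, keep the first q = rank(H_fb) columns.
    sb = np.linalg.svd(Hfb, compute_uv=False)
    q = int((sb > tol * sb[0]).sum())
    Y = (G1 / st[None, :]) @ P
    # Step (6): W = Y_q.
    return Y[:, :q]
```

With this construction the projected features are uncorrelated with respect to the fuzzy total scatter, that is, $W^T S_{ft} W$ is the identity matrix. Training and test spectra would then be projected as `X_train @ W` and `X_test @ W`.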
Step four classifies the tea varieties by Gath-Geva fuzzy clustering. The specific process is as follows:
(1) Initialization: set the number of tea varieties $c$ ($+\infty > c \geq 2$), the initial weight index $m_0$ ($+\infty > m_0 > 1$), the maximum number of iterations $r_{max}$, the error upper limit $\varepsilon$, the number of training samples $n$ and the number of test samples $n_1$. The mean of each class of samples in the sample set formed by the training samples $x'_i$ of step three is taken as the initial class center $\gamma_i^{(0)}$, and the initial fuzzy membership is calculated as follows:
where $\gamma_i^{(0)}$ is the initial class center of class $i$ ($i = 1, 2, \ldots, c$) and $z_k$ is the $k$-th ($k = 1, 2, \ldots, n_1$) test sample of step three.
(2) Calculate the membership value $\mu_{ik}^{(r)}$ at the $r$-th ($r = 1, 2, \ldots, r_{max}$) iteration:
The membership value $\mu_{ik}^{(r)}$ is the membership of the $k$-th sample in class $i$ at the $r$-th iteration; $D_{ik}$ is the distance norm from sample $z_k$ to the class center $\gamma_i^{(r-1)}$; $z_k$ is the $k$-th test sample; $\gamma_i^{(r-1)}$ is the class center of class $i$ computed at iteration $r-1$; $S_{fi}$ is the fuzzy covariance matrix; $n_1$ is the number of test samples; $\mu_{ik}^{(r-1)}$ is the fuzzy membership value computed at iteration $r-1$; all fuzzy memberships together form the fuzzy membership matrix; $m_r$ is the weight index at the $r$-th iteration, with $m_r = m_0 - r\Delta m$ and $\Delta m = (m_0 - 1)/r_{max}$.
(3) Calculate the learning rate $\alpha_{ik,r}$ at the $r$-th iteration:
(4) Calculate the class center $\gamma_i^{(r)}$ ($i = 1, 2, \ldots, c$) at the $r$-th iteration:
where $\gamma_i^{(r)}$ is the class center of class $i$ at the $r$-th iteration and $\gamma_i^{(r-1)}$ is the class center of class $i$ at iteration $r-1$;
(5) When $\max_i \|\gamma_i^{(r)} - \gamma_i^{(r-1)}\| < \varepsilon$ or $r = r_{max} - 1$, the iteration ends; otherwise return to step (2) and continue iterating. After the iteration has converged, the variety of test sample $z_k$ is determined from the final fuzzy membership $\mu_{ik}^{(r)}$.
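The loop in steps (1) to (5) can be sketched as below. Only the weight-index schedule $m_r = m_0 - r\Delta m$ and the stopping rule come from the text. The formulas for the initial membership, $D_{ik}$, $S_{fi}$, $\alpha_{ik,r}$ and the center update are not reproduced here, so this sketch substitutes assumed forms: a fuzzy c-means style initial membership, the classical Gath-Geva exponential distance with a fuzzy covariance and prior, a learning rate $\alpha_{ik,r} = (\mu_{ik}^{(r)})^{m_r}$, and a learning-rate-weighted center update. The function name `gath_geva` and the `ridge` regulariser are hypothetical.

```python
import numpy as np

def gath_geva(Z, centers, m0=2.0, r_max=50, eps=1e-5, ridge=1e-6):
    """Cluster the rows of Z (n1 x q); returns (memberships c x n1, centers)."""
    Z = np.asarray(Z, dtype=float)
    gamma = np.array(centers, dtype=float)
    c, q = gamma.shape

    def memberships(log_d2, m):
        # mu_ik proportional to D_ik^(-2/(m-1)), normalised over classes;
        # computed in log space to avoid overflow as m approaches 1.
        logw = -log_d2 / (m - 1.0)
        logw = logw - logw.max(axis=0, keepdims=True)
        w = np.exp(logw)
        return w / w.sum(axis=0, keepdims=True)

    # Step (1): initial memberships from squared Euclidean distances (assumed).
    d2 = ((Z[None, :, :] - gamma[:, None, :]) ** 2).sum(axis=2)
    mu = memberships(np.log(np.maximum(d2, 1e-300)), m0)
    for r in range(1, r_max):                        # last pass is r = r_max - 1
        m_r = m0 - r * (m0 - 1.0) / r_max            # m_r = m0 - r * delta_m
        mu_m = mu ** m_r
        prior = mu_m.sum(axis=1) / mu_m.sum()        # assumed class prior
        log_d2 = np.empty_like(mu)
        for i in range(c):
            diff = Z - gamma[i]
            # Fuzzy covariance of class i (assumed form), kept invertible.
            F = (mu_m[i][:, None] * diff).T @ diff / mu_m[i].sum()
            F = F + ridge * np.eye(q)
            maha = np.einsum("kq,kq->k", diff @ np.linalg.inv(F), diff)
            # log of D_ik^2 = sqrt(det F) / prior * exp(maha / 2)
            log_d2[i] = 0.5 * np.linalg.slogdet(F)[1] - np.log(prior[i]) + 0.5 * maha
        mu = memberships(log_d2, m_r)                # step (2)
        alpha = mu ** m_r                            # step (3), assumed form
        weight = alpha.sum(axis=1)[:, None]
        new_gamma = gamma + (alpha @ Z - weight * gamma) / weight   # step (4)
        shift = np.linalg.norm(new_gamma - gamma, axis=1).max()
        gamma = new_gamma
        if shift < eps:                              # step (5)
            break
    return mu, gamma
```

Because the initial centers are the class means of the projected training samples, cluster $i$ keeps its association with tea variety $i$, so the variety of each test sample can be read off as the row with the largest final membership.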
The invention has the beneficial effects that:
The tea variety classification method based on fuzzy non-relevant linear discriminant analysis addresses the undersampling problem of linear discriminant analysis and the "hard" feature extraction of non-relevant linear discriminant analysis. It is green and pollution-free, needs few test samples, has a low identification cost, discriminates quickly and achieves high classification accuracy. It can be used for dimension reduction, feature extraction and discrimination of tea near infrared spectrum data, and also for feature extraction and analysis of near infrared spectrum data of other foods.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a near infrared diffuse reflectance spectrum of 260 tea samples;
FIG. 3 is a near infrared diffuse reflectance spectrum of tea after Savitzky-Golay first derivative pretreatment;
FIG. 4 is an initial fuzzy membership graph for fuzzy non-relevant linear discriminant analysis;
FIG. 5 is a diagram of test sample data obtained by extracting classification discrimination information from the preprocessed near infrared diffuse reflectance spectrum data of tea through fuzzy non-relevant linear discrimination analysis;
FIG. 6 is an initial fuzzy membership graph of Gath-Geva fuzzy clustering;
FIG. 7 is a final fuzzy membership graph of Gath-Geva fuzzy clustering.
Detailed Description
The invention is further described below with reference to the drawings and examples.
As shown in fig. 1, the specific implementation flow of the present invention is as follows:
Step one, acquiring near infrared diffuse reflection spectrum data of the tea samples: four Anhui brand teas, Yuexi Cuilan, Liuan Guapian, Maofeng and Huangshan Maofeng, were collected, 65 samples of each, for a total of 260 tea samples. All tea samples were ground and then sieved through a 40-mesh screen. During acquisition of the spectra, external conditions such as temperature and humidity were kept as stable and constant as possible. Spectrum acquisition comprises the following steps: first, the Antaris II Fourier transform near infrared spectrum analyzer is switched on and preheated for 1 hour; second, the wavenumber range, scanning interval and number of scans are set to 10000 cm$^{-1}$ to 4000 cm$^{-1}$, 3.857 cm$^{-1}$ and 32, respectively; third, the spectra of the tea samples are acquired in the integrating sphere diffuse reflection mode of the analyzer, giving 1557-dimensional high-dimensional data. Each tea sample is measured 3 times, and the average of the 3 measurements is stored on a computer as experimental data for the subsequent model building. Fig. 2 shows the near infrared diffuse reflection spectra of the 260 tea samples.
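The scan range is given in wavenumbers, while the Background quotes the near infrared range in nanometres. The two are related by $\lambda[\mathrm{nm}] = 10^7 / \tilde{\nu}[\mathrm{cm}^{-1}]$. The short sketch below converts a hypothetical wavenumber axis; the evenly spaced 1557-point axis is an assumption made for illustration, since the instrument's exact sampling grid is not given in the text.

```python
import numpy as np

def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert wavenumber in cm^-1 to wavelength in nm (lambda = 1e7 / nu)."""
    return 1e7 / np.asarray(wavenumber_cm1, dtype=float)

# Hypothetical axis: 1557 evenly spaced points over the stated scan range.
axis_cm1 = np.linspace(10000.0, 4000.0, 1557)
axis_nm = wavenumber_to_wavelength_nm(axis_cm1)
```

The scan range 10000 cm$^{-1}$ to 4000 cm$^{-1}$ therefore corresponds to 1000 nm to 2500 nm, and 1557 evenly spaced points give a spacing of about 3.856 cm$^{-1}$, close to the stated scanning interval.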
Step two, preprocessing the near infrared diffuse reflection spectra of the tea samples: the collected spectra are preprocessed with the Savitzky-Golay first derivative; the preprocessed spectra are shown in Fig. 3. The preprocessed tea sample data are then randomly divided into a training set and a test set: 22 samples are drawn at random from each variety, so that 88 samples form the training sample set, and the remaining 43 samples of each variety (172 in total) form the test sample set.
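The per-variety random split described above can be sketched as follows. The function name `split_per_variety` and the fixed random seed are illustrative assumptions; the text does not say how the random draw was made.

```python
import numpy as np

def split_per_variety(labels, n_train=22, seed=0):
    """Return (train_idx, test_idx) with n_train random samples per class."""
    rng = np.random.default_rng(seed)
    train = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        # Draw n_train distinct samples of this variety for training.
        train.extend(rng.choice(idx, size=n_train, replace=False))
    train = np.sort(np.array(train))
    # Everything not drawn for training goes to the test set.
    test = np.setdiff1d(np.arange(len(labels)), train)
    return train, test

labels = np.repeat(np.arange(4), 65)      # 4 varieties x 65 samples
train_idx, test_idx = split_per_variety(labels)
```

With 4 varieties of 65 samples each, this yields 88 training indices and 172 test indices, matching the counts used in the embodiment.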
Step three, extracting discriminant information from the tea near infrared spectra by fuzzy non-relevant linear discriminant analysis: the fuzzy non-relevant linear discriminant analysis feature extraction method is applied to the preprocessed tea near infrared diffuse reflection spectrum data of step two to obtain the classification discriminant information of the tea varieties, giving training samples and test samples that contain the discriminant information.
Before step three is carried out, the number of classes $c = 4$ and the weight index $\eta = 1.5$ are set, the mean of each class of training samples is taken as the cluster center value $v_j$, and the fuzzy membership value $u_{ij}$ is calculated as follows:
where $x_i$ is the $i$-th tea near infrared diffuse reflection spectrum training sample and $v_k$ is the class center of class $k$ ($k = 1, 2, 3, 4$).
The calculated fuzzy membership values $u_{ij}$ are shown in Fig. 4.
The detailed process of extracting discriminant information from the tea near infrared spectra by fuzzy non-relevant linear discriminant analysis is as follows:
(1) Given a labeled training sample matrix with sample dimension $p_1 = 1557$ and number of training samples $n = 88$, $S_{ft}$, $S_{fb}$ and $S_{fw}$ are defined as the fuzzy total scatter matrix, the fuzzy between-class scatter matrix and the fuzzy within-class scatter matrix of the training sample set, respectively:
where the number of classes $c = 4$, the weight index $\eta = 1.5$, $x_i$ is the $i$-th tea near infrared diffuse reflection spectrum training sample, $\bar{x}$ is the overall sample mean of the training sample set, $u_{ij}$ is the fuzzy membership of sample $x_i$ in class $j$, $v_j$ is the sample mean of the class-$j$ samples in the sample set ($j = 1, 2, 3, 4$), and an intermediate variable is also introduced.
(2) Construct the matrices $H_{ft}$, $H_{fb}$ and $H_{fw}$ such that $S_{ft} = H_{ft}H_{ft}^T$, $S_{fb} = H_{fb}H_{fb}^T$ and $S_{fw} = H_{fw}H_{fw}^T$.
(3) Compute the singular value decomposition of $H_{ft}$, $H_{ft} = G\Sigma S^T$, where $G = [G_1\ G_2]$ with $G_1$ formed by the first $t$ columns of $G$, $\Sigma_t$ is the $t \times t$ diagonal block of $\Sigma$ containing the nonzero singular values, $S$ is an orthogonal matrix of order $l \times l$, $p_1 = 1557$, $l = 352$ and $t = 87$.
(4) Let $B = \Sigma_t^{-1} G_1^T H_{fb}$, where $\Sigma_t^{-1}$ is the inverse of $\Sigma_t$ and $G_1^T$ is the transpose of $G_1$, and compute the singular value decomposition of $B$, $B = PAO^T$. Here $A$ is a matrix of order $t \times r$, with $t = \mathrm{rank}(H_{ft}) = 87$ and $r = \mathrm{rank}(B)$: its leading $r \times r$ block is a diagonal matrix whose diagonal elements are the singular values of $B$, and all remaining elements are 0. $O$ is an orthogonal matrix of order $r \times r$.
(5) Let $Y = G_1 \Sigma_t^{-1} P$, where $Y_q$ is the matrix consisting of the first $q$ columns of $Y$ and $q = 3$.
(6) Finally, the feature projection matrix of fuzzy non-relevant linear discriminant analysis is obtained as $W = Y_q$. The $i$-th ($i = 1, 2, \ldots, n$) training sample $x_i$ in the training sample set of step two is transformed to $x'_i = x_i W$, where $n$ is the number of training samples; the $k$-th ($k = 1, 2, \ldots, n_1$) test sample $y_k$ in the test set of step two is transformed to $z_k = y_k W$, where $n_1$ is the number of test samples. The data distribution of the test samples $z_k$ is shown in Fig. 5.
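The sizes quoted in this embodiment can be cross-checked with a little arithmetic. The reading below is an interpretation rather than something stated in the text: it assumes that $H_{ft}$ has one column per (sample, class) pair, that the 88 training spectra are in general position, and that the four equally sized classes have distinct class means.

```latex
\begin{aligned}
l &= c \cdot n = 4 \times 88 = 352, \\
t &= \operatorname{rank}(H_{ft}) = n - 1 = 88 - 1 = 87, \\
q &= \operatorname{rank}(H_{fb}) = c - 1 = 4 - 1 = 3 .
\end{aligned}
```

Under these assumptions, subtracting the overall mean removes one degree of freedom from the 88 samples, and the four class means of equally sized classes average to the overall mean, which removes one degree of freedom from the class centers. Each 1557-dimensional spectrum is thus projected to a 3-dimensional feature vector.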
Step four, classifying the tea varieties by Gath-Geva fuzzy clustering. The specific process is as follows:
(1) Initialization: set the number of tea varieties $c = 4$ ($+\infty > c \geq 2$), the initial weight index $m_0 = 2.0$ ($+\infty > m_0 > 1$), the maximum number of iterations $r_{max}$, the error upper limit $\varepsilon = 0.00001$, the number of training samples $n = 88$ and the number of test samples $n_1 = 172$. The mean of each class of samples in the sample set formed by the training samples $x'_i$ of step three is taken as the initial class center $\gamma_i^{(0)}$, and the initial fuzzy membership $\mu_{ik}^{(0)}$ is calculated as follows:
where $\gamma_i^{(0)}$ is the initial class center of class $i$ ($i = 1, 2, \ldots, c$) and $z_k$ is the $k$-th ($k = 1, 2, \ldots, n_1$) test sample of step three.
The calculated initial fuzzy membership $\mu_{ik}^{(0)}$ is shown in Fig. 6.
(2) Calculate the membership value $\mu_{ik}^{(r)}$ at the $r$-th ($r = 1, 2, \ldots, r_{max}$) iteration:
The membership value $\mu_{ik}^{(r)}$ is the membership of the $k$-th sample in class $i$ at the $r$-th iteration; $D_{ik}$ is the distance norm from sample $z_k$ to the class center $\gamma_i^{(r-1)}$; $z_k$ is the $k$-th test sample; $\gamma_i^{(r-1)}$ is the class center of class $i$ computed at iteration $r-1$; $S_{fi}$ is the fuzzy covariance matrix; $n_1$ is the number of test samples; $\mu_{ik}^{(r-1)}$ is the fuzzy membership value computed at iteration $r-1$; all fuzzy memberships together form the fuzzy membership matrix; $m_r$ is the weight index at the $r$-th iteration, with $m_r = m_0 - r\Delta m$ and $\Delta m = (m_0 - 1)/r_{max}$.
(3) Calculate the learning rate $\alpha_{ik,r}$ at the $r$-th iteration:
(4) Calculate the class center $\gamma_i^{(r)}$ ($i = 1, 2, \ldots, c$) at the $r$-th iteration:
where $\gamma_i^{(r)}$ is the class center of class $i$ at the $r$-th iteration and $\gamma_i^{(r-1)}$ is the class center of class $i$ at iteration $r-1$;
(5) When $\max_i \|\gamma_i^{(r)} - \gamma_i^{(r-1)}\| < \varepsilon$ or $r = r_{max} - 1$, the iteration ends; otherwise return to step (2) and continue iterating. After the iteration has converged, the variety of test sample $z_k$ is determined from the final fuzzy membership $\mu_{ik}^{(r)}$.
Experimental results: final fuzzy membership μ after termination of r=2 iterations ik (2) As shown in fig. 7, the classification accuracy of the tea samples in the discrimination test set can reach 100% according to the fuzzy membership.
The detailed descriptions above merely illustrate practical embodiments of the present invention and are not intended to limit its scope of protection; all equivalent embodiments or modifications that do not depart from the spirit of the present invention shall fall within its scope of protection.

Claims (6)

1. A tea variety classification method based on fuzzy non-relevant linear discriminant analysis, characterized by comprising the following steps:
step 1, acquiring near infrared diffuse reflection spectrum data of tea samples;
step 2, preprocessing the near infrared diffuse reflection spectra of the tea samples;
step 3, extracting discriminant information from the tea near infrared spectra by a fuzzy non-relevant linear discriminant analysis method;
step 3 is implemented by performing dimension reduction and extraction of classification discriminant information on the tea near infrared diffuse reflection spectrum data preprocessed in step 2, with the following specific steps:
3.1, given a labeled training sample matrix, where $p_1$ is the sample dimension and $n$ is the number of samples, $S_{ft}$, $S_{fb}$ and $S_{fw}$ are defined as the fuzzy total scatter matrix, the fuzzy between-class scatter matrix and the fuzzy within-class scatter matrix of the training sample set, respectively:
where $c$ is the number of classes, $\eta$ is the weight index, $x_i$ is the $i$-th tea near infrared diffuse reflection spectrum training sample, $\bar{x}$ is the overall sample mean of the training sample set, $u_{ij}$ is the fuzzy membership of sample $x_i$ in class $j$, and $v_j$ is the sample mean of the class-$j$ samples in the sample set, $j = 1, 2, 3, 4$;
3.2, constructing the matrices $H_{ft}$, $H_{fb}$ and $H_{fw}$ such that $S_{ft} = H_{ft}H_{ft}^T$, $S_{fb} = H_{fb}H_{fb}^T$ and $S_{fw} = H_{fw}H_{fw}^T$;
3.3, computing the singular value decomposition of $H_{ft}$, $H_{ft} = G\Sigma S^T$, where $G = [G_1\ G_2]$ with $G_1$ formed by the first $t$ columns of $G$, $\Sigma_t$ is the $t \times t$ diagonal block of $\Sigma$ containing the nonzero singular values, $p_1$ is the sample dimension and $t = \mathrm{rank}(H_{ft})$;
3.4, letting $B = \Sigma_t^{-1} G_1^T H_{fb}$, where $\Sigma_t^{-1}$ is the inverse of $\Sigma_t$ and $G_1^T$ is the transpose of $G_1$, and computing the singular value decomposition of $B$, $B = PAO^T$, where $t = \mathrm{rank}(H_{ft})$;
3.5, letting $Y = G_1 \Sigma_t^{-1} P$, where $Y_q$ is the matrix consisting of the first $q$ columns of $Y$ and $q = \mathrm{rank}(H_{fb})$;
3.6, finally obtaining the feature projection matrix of fuzzy non-relevant linear discriminant analysis, $W = Y_q$; the $i$-th training sample $x_i$ in the training sample set of step 2 is transformed to $x'_i = x_i W$, where $n$ is the number of training samples; the $k$-th test sample $y_k$ in the test set of step 2 is transformed to $z_k = y_k W$, where $n_1$ is the number of test samples; here $i = 1, 2, \ldots, n$ and $k = 1, 2, \ldots, n_1$;
step 4, classifying the tea varieties by a Gath-Geva fuzzy clustering method.
2. The tea variety classification method based on fuzzy non-relevant linear discriminant analysis according to claim 1, wherein step 1 is implemented by collecting the near infrared diffuse reflection spectrum data of the tea samples in the integrating sphere diffuse reflection mode of an Antaris II Fourier transform near infrared spectrum analyzer; specifically:
first, the Antaris II Fourier transform near infrared spectrum analyzer is switched on and preheated for 1 hour;
second, the wavenumber range, scanning interval and number of scans of the spectrum scan are set to 10000 cm$^{-1}$ to 4000 cm$^{-1}$, 3.857 cm$^{-1}$ and 32, respectively;
third, the near infrared diffuse reflection spectrum data of the tea samples are acquired in the integrating sphere diffuse reflection mode of the Antaris II Fourier transform near infrared spectrum analyzer, and the acquired tea spectrum data are 1557-dimensional high-dimensional data.
3. The tea variety classification method based on fuzzy non-relevant linear discriminant analysis according to claim 2, wherein temperature and humidity are kept as stable as possible during acquisition.
4. The tea variety classification method based on fuzzy non-relevant linear discriminant analysis according to claim 1, wherein step 2 is implemented as follows: the collected near infrared diffuse reflection spectrum data of the tea samples are preprocessed with the Savitzky-Golay first derivative, and the preprocessed tea sample data are divided into a training sample set and a test sample set.
5. The tea variety classification method based on fuzzy non-relevant linear discriminant analysis according to claim 1, further comprising: initializing the number of classes $c$, the weight index $\eta$, the cluster centers $V$ and the fuzzy membership matrix $U$; wherein the cluster center value $v_j$ is the mean of each class of training samples, and the element $u_{ij}$ of the fuzzy membership matrix $U$ is calculated as follows:
where $x_i$ is the $i$-th tea near infrared diffuse reflection spectrum training sample and $v_k$ is the class center of the $k$-th class.
6. The method for classifying tea varieties according to claim 1, wherein the implementation of the step 4 comprises the steps of:
4.1, initializing: setting the number of tea varieties as c and the initial weight index m 0 Maximum number of iterations r max The upper error limit value epsilon, the training sample number n and the test sample number n 1 With training samples x 'in step three' i The mean value of each class of samples in the composed sample set is taken as the initial class center gamma i (0) The initial fuzzy membership is calculated as follows:
γ i (0) z is the initial class center of class i k Is the kth test sample in step three; wherein, C is more than or equal to 2, and m is more than or equal to 2 0 >1,i=1,2,…,c,k=1,2,…,n 1
4.2, calculating the membership value mu at the r-th iteration ik (r) ;r=1,2,……,r max
Membership value mu ik (r) Representing the membership value of the kth sample to the ith class in the nth iterative calculation, D ik For sample z k To the class center gamma i (r-1) Distance norm of (2), and->z k For the kth test sample, γ i (r-1) Is the class center value of the i class calculated by the r-1 th iteration; s is S fi Is a fuzzy covariance matrix, and +.>n 1 To test the number of samples, mu ik (r-1) Is the fuzzy membership value of the r-1 th iterative computation; all fuzzy membership forms a fuzzy membership matrix +.>m r Weight index at the r-th iteration, m r =m 0 -rm;Δm=(m 0 -1)/r max
4.3. Calculate the learning rate $\alpha_{ik,r}$ at the r-th iteration:
4.4. Calculate the class center $\gamma_i^{(r)}$ at the r-th iteration, where $i = 1, 2, \ldots, c$:
where $\gamma_i^{(r)}$ is the class center of the i-th class at the r-th iteration and $\gamma_i^{(r-1)}$ is the class center of the i-th class at the (r-1)-th iteration;
4.5. When $\max_i \|\gamma_i^{(r)} - \gamma_i^{(r-1)}\| < \varepsilon$ or $r = r_{max} - 1$, end the iteration; otherwise return to step 4.2 and continue the iterative calculation. After the iteration has converged, determine the tea variety to which test sample $z_k$ belongs according to the final fuzzy membership $\mu_{ik}^{(r)}$.
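The control flow of steps 4.2 to 4.5 can be sketched as follows. Only the weight index schedule, the stopping rule, and the final membership-based decision are taken from the claim text; the update formulas for the membership, the learning rate, and the class centers are not reproduced here, so they are left to a caller-supplied hook. The names `weight_index`, `should_stop`, `classify`, `iterate`, and `update_step` are hypothetical, and assigning a sample to the class with the largest membership is an assumed reading of "according to the final fuzzy membership".

```python
import math


def weight_index(r, m0, r_max):
    """m_r = m_0 - r * delta_m, with delta_m = (m_0 - 1) / r_max."""
    return m0 - r * (m0 - 1.0) / r_max


def should_stop(new_centers, old_centers, r, r_max, eps):
    """Step 4.5: stop when the largest center shift is below eps or r == r_max - 1."""
    shift = max(math.dist(a, b) for a, b in zip(new_centers, old_centers))
    return shift < eps or r == r_max - 1


def classify(memberships_of_sample):
    """Index of the class with the largest final membership (assumed decision rule)."""
    return max(range(len(memberships_of_sample)), key=memberships_of_sample.__getitem__)


def iterate(centers, update_step, m0, r_max, eps):
    """Run steps 4.2-4.5; update_step(centers, m_r) is a hypothetical hook that
    must return (memberships, new_centers) for one iteration. Needs r_max >= 2."""
    memberships = None
    for r in range(1, r_max):
        memberships, new_centers = update_step(centers, weight_index(r, m0, r_max))
        done = should_stop(new_centers, centers, r, r_max, eps)
        centers = new_centers
        if done:
            break
    return memberships, centers
```

One observation that follows from the schedule: $m_r$ decreases linearly from $m_0$ and would reach exactly 1 at $r = r_{max}$, so stopping at $r = r_{max} - 1$ keeps the weight index strictly above 1 in every iteration that is actually run.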
CN201910505655.7A 2019-06-12 2019-06-12 Tea variety classification method for fuzzy non-relevant linear discriminant analysis Active CN110378373B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910505655.7A CN110378373B (en) 2019-06-12 2019-06-12 Tea variety classification method for fuzzy non-relevant linear discriminant analysis

Publications (2)

Publication Number Publication Date
CN110378373A CN110378373A (en) 2019-10-25
CN110378373B true CN110378373B (en) 2024-03-12

Family

ID=68250185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910505655.7A Active CN110378373B (en) 2019-06-12 2019-06-12 Tea variety classification method for fuzzy non-relevant linear discriminant analysis

Country Status (1)

Country Link
CN (1) CN110378373B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111595803A (en) * 2020-05-09 2020-08-28 Chuzhou Vocational and Technical College (滁州职业技术学院) Apple near infrared spectrum classification method based on exponential distance measure fuzzy clustering
CN112801174A (en) * 2021-01-25 2021-05-14 Jiangsu University Tea variety classification method for fuzzy linear machine learning
CN112801172A (en) * 2021-01-25 2021-05-14 Jiangsu University Chinese cabbage pesticide residue qualitative analysis method based on fuzzy pattern recognition

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685098A (en) * 2018-11-12 2019-04-26 Jiangsu University Tea variety classification method for fuzzy inter-cluster separation and clustering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
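Classification of tea Fourier-transform infrared spectra by fuzzy uncorrelated discriminant C-means clustering; Wu Xiaohong et al.; Spectroscopy and Spectral Analysis (《光谱学与光谱分析》); 20180630; Vol. 38, No. 6; pp. 1719-1723 *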

Also Published As

Publication number Publication date
CN110378373A (en) 2019-10-25

Similar Documents

Publication Publication Date Title
CN107677647B (en) Method for identifying origin of traditional Chinese medicinal materials based on principal component analysis and BP neural network
CN110378373B (en) Tea variety classification method for fuzzy non-relevant linear discriminant analysis
CN101819141B (en) Maize variety identification method based on near infrared spectrum and information processing
CN103048273B (en) Fruit near infrared spectrum sorting method based on fuzzy clustering
CN110378374B (en) Tea near infrared spectrum classification method for extracting fuzzy identification information
CN109685098B (en) Tea variety classification method for fuzzy inter-cluster separation and clustering
CN106408012A (en) Tea infrared spectrum classification method of fuzzy discrimination clustering
CN107192686B (en) Method for identifying possible fuzzy clustering tea varieties by fuzzy covariance matrix
CN105181650A (en) Method for quickly identifying tea varieties through near-infrared spectroscopy technology
CN104374739A (en) Identification method for authenticity of varieties of seeds on basis of near-infrared quantitative analysis
CN108764288A (en) A kind of GK differentiates the local tea variety sorting technique of cluster
CN103278467A (en) Rapid nondestructive high-accuracy method with for identifying abundance degree of nitrogen element in plant leaf
CN107271394A (en) A kind of fuzzy Kohonen differentiates the tealeaves infrared spectrum sorting technique of clustering network
CN108872128B (en) Tea infrared spectrum classification method based on fuzzy non-correlated C-means clustering
CN109685099B (en) Apple variety distinguishing method based on spectrum band optimization fuzzy clustering
CN108491894B (en) Tea leaf classification method capable of fuzzy identification of C-means clustering
CN110414549B (en) Tea near infrared spectrum classification method for fuzzy orthogonal linear discriminant analysis
CN109886296A (en) A kind of authentication information extracts the local tea variety classification method of formula noise cluster
CN106570520A (en) Infrared spectroscopy tea quality identification method mixed with GK clustering
CN109001181A (en) A kind of edible oil type method for quick identification of Raman spectrum canonical correlation analysis fusion
CN111595804A (en) Fuzzy clustering tea near infrared spectrum classification method
CN111881738B (en) Near infrared spectrum classification method for tea leaves through nuclear fuzzy orthogonal discriminant analysis
CN110108661B (en) Tea near infrared spectrum classification method based on fuzzy maximum entropy clustering
CN112801173B (en) Lettuce near infrared spectrum classification method based on QR fuzzy discriminant analysis
CN102999765B (en) The pork storage time decision method of adaptive boosting method and irrelevant discriminatory analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240719

Address after: 423000 Industrial Undertaking Park, Economic Development Zone, Yizhang County, Chenzhou City, Hunan Province

Patentee after: Yizhang Huyi Agricultural Development Co.,Ltd.

Country or region after: China

Address before: Zhenjiang City, Jiangsu Province, 212013 Jingkou District Road No. 301

Patentee before: JIANGSU University

Country or region before: China
