CN106096649A

CN106096649A - Method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis

Info

Publication number: CN106096649A
Application number: CN201610404407.XA
Authority: CN
Inventors: 支瑞聪; 张德政
Original assignee: University of Science and Technology Beijing USTB
Current assignee: University of Science and Technology Beijing USTB
Priority date: 2016-06-08
Filing date: 2016-06-08
Publication date: 2016-11-09
Anticipated expiration: 2036-06-08
Also published as: CN106096649B

Abstract

The present invention provides a method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis. The method includes: detecting tea samples with an electronic tongue to obtain sensor response time-series signals; analyzing and eliminating abnormal samples from the response time-series signals using the principal component residual and Mahalanobis distance methods; optimizing the parameters of the kernel linear discriminant analysis method, the parameters being selected according to the correct recognition rate of Longjing tea quality grades; performing nonlinear feature extraction on the sensor response signals with the kernel linear discriminant analysis method to obtain the taste features of the tea samples; and inputting the taste features of the tea samples into a classifier to determine the tea quality grade. The method removes outliers from the tea samples, and the kernel linear discriminant analysis method with optimized parameters better characterizes the nonlinear features of tea samples of different grades and increases the signal difference between samples in the high-dimensional feature space after nonlinear mapping.

Description

Differential Feature Extraction Method of Taste Sensing Signals Based on Kernel Linear Discriminant Analysis

Technical field

The invention relates to the technical field of tea detection, and in particular to a method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis.

Background art

Tea quality testing is a difficult task because tea contains many components whose effects on tea quality differ greatly. West Lake Longjing tea is a typical representative of Chinese green tea. For their own profit, some vendors fry other green teas into a flat shape and pass them off as Longjing tea, or pass off Longjing from other parts of Zhejiang as West Lake Longjing, which disrupts the Longjing tea market and harms consumers. Scientific testing and evaluation of the quality of West Lake Longjing tea is therefore of great significance.

Sensory evaluation has long been an important method for assessing tea quality, but it requires extensive knowledge of tea science and evaluation experience, and the sensitivity of a professional taster's sensory organs is easily affected by external factors. Many analytical instruments have therefore been used to analyze the chemical constituents of tea, such as high-performance liquid chromatography and gas chromatography-mass spectrometry. However, traditional linear feature extraction methods cannot effectively explore the inherent regularities present in nonlinear data.

Summary of the invention

The technical problem to be solved by the present invention is to provide a method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis that can effectively extract the differential features of tea.

To solve the above technical problem, an embodiment of the present invention provides a method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis, the method comprising:

detecting tea samples with an electronic tongue to obtain sensor response time-series signals;

analyzing and eliminating abnormal samples from the response time-series signals using the principal component residual and Mahalanobis distance methods;

optimizing the parameters of the kernel linear discriminant analysis method, the parameters being selected according to the correct recognition rate of Longjing tea quality grades;

performing nonlinear feature extraction on the sensor response signals with the kernel linear discriminant analysis method to obtain the taste features of the tea samples;

inputting the taste features of the tea samples into a classifier to determine the tea quality grade.

Preferably, the sensor response time-series signals include at least one of: a ZA sensor response time-series signal, a BB sensor response time-series signal, a JE sensor response time-series signal, a GA sensor response time-series signal, an HA sensor response time-series signal, a JB sensor response time-series signal, a CA sensor response time-series signal, and an Ag/AgCl reference electrode response time-series signal.

Preferably, detecting the tea samples with the electronic tongue includes:

placing the samples and the cleaning solutions in sequence on the autosampler of the electronic tongue;

acquiring each sample repeatedly, each acquisition following the sequence "tea infusion sample → cleaning solution 1 → cleaning solution 2".

Preferably, analyzing and eliminating abnormal samples from the response time-series signals using the principal component residual and Mahalanobis distance methods includes:

centering the data set X = [x₁, x₂, …, x_N] ∈ R^{m×N};

computing the covariance matrix C of the centered data;

computing the eigenvalues and eigenvectors of the covariance matrix: Cv = λv;

sorting the eigenvalues λ_i of the covariance matrix in descending order, with the corresponding eigenvectors arranged in the same order;

projecting the data samples onto the eigenvectors obtained from Cv = λv;

computing the estimate of each sample from its projection, the principal component residual being the difference between the true value of the sample and its estimate;

where the mean vector is the mean of the data set and v is the eigenvector corresponding to an eigenvalue;

the Mahalanobis distance between sample points is: d_ij = [(x_i − x_j)^T [Cov(X)]^{-1} (x_i − x_j)]^{1/2};

based on the principal component residual and the Mahalanobis distance between a sample point and the mean of the samples of the same class, sample points that lie far from the overall distribution of their class are judged to be abnormal samples and eliminated.
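The outlier screening steps above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: samples are stored as rows (the transpose of the X ∈ R^{m×N} layout used in the text), the covariance is normalized by 1/N, the residual is reported as a Euclidean norm, and the function name `pca_residual_and_mahalanobis` is hypothetical. The source gives no numerical rejection threshold, so the sketch returns the two measures and leaves the decision to the reader.

```python
import numpy as np


def pca_residual_and_mahalanobis(X, n_components):
    """Return (residual, mahalanobis) per sample for one tea grade.

    X has shape (N, m): N samples as rows, m sensor features (assumed layout).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                                  # center the data
    C = Xc.T @ Xc / X.shape[0]                     # covariance, 1/N assumed
    eigvals, eigvecs = np.linalg.eigh(C)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # re-sort in descending order
    V = eigvecs[:, order[:n_components]]           # leading eigenvectors, (m, p)
    scores = Xc @ V                                # projection onto eigenvectors
    X_hat = scores @ V.T + mean                    # sample estimate
    residual = np.linalg.norm(X - X_hat, axis=1)   # principal component residual
    C_inv = np.linalg.pinv(C)                      # pseudo-inverse for stability
    d2 = np.einsum("ni,ij,nj->n", Xc, C_inv, Xc)   # squared distance to class mean
    mahalanobis = np.sqrt(np.maximum(d2, 0.0))
    return residual, mahalanobis
```

Samples whose residual or Mahalanobis distance lies far from the bulk of their grade would then be candidates for removal, for example by inspecting a scatter plot of the two measures.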

Preferably, optimizing the parameters of the kernel linear discriminant analysis method and selecting them according to the correct recognition rate of tea quality grades includes:

using the Gaussian kernel function as the nonlinear transformation function of the kernel linear discriminant analysis method, and optimizing the parameter σ² in the Gaussian kernel function k(x, y) = exp(−||x − y||² / 2σ²);

selecting the parameter value according to the correct recognition rate of the tea quality grade determination.

Preferably, the Gaussian kernel function is:

$k(x, y) = \exp\left(-\frac{\|x - y\|^{2}}{2\sigma^{2}}\right)$
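As a minimal sketch of how this kernel could be evaluated for whole sample sets at once, the following NumPy function computes the matrix of kernel values between two sets of row vectors. The function name `gaussian_kernel_matrix` and the rows-as-samples layout are assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel_matrix(A, B, sigma2):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma2)).

    A has shape (n_a, m), B has shape (n_b, m); the result has shape (n_a, n_b).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    sq = np.maximum(sq, 0.0)  # clip tiny negative values caused by round-off
    return np.exp(-sq / (2.0 * sigma2))
```

For two points at Euclidean distance 5 and σ² = 50, the kernel value is exp(−25/100) = exp(−0.25).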

Preferably, performing nonlinear feature extraction on the sensor response signals with the kernel linear discriminant analysis method to obtain the taste features of the tea samples includes:

mapping the input data to a high-dimensional feature space through a nonlinear transformation Φ, the transformed data points being Φ(x₁), Φ(x₂), …, Φ(x_N);

in the high-dimensional feature space, converting the problem of maximizing the Fisher criterion function into the problem of solving for the eigenvalues and eigenvectors of a characteristic equation;

performing nonlinear feature extraction on the sensor response signals to obtain the taste features of the tea samples.

Preferably, mapping the input data to a high-dimensional feature space through a nonlinear transformation Φ, the transformed data points being Φ(x₁), Φ(x₂), …, Φ(x_N), includes:

the between-class scatter matrix $S_b^{\Phi}$ and the within-class scatter matrix $S_w^{\Phi}$ of the training samples in the high-dimensional feature space are:

$S_b^{\Phi} = \sum_{i=1}^{L} P_i (m_{\Phi,i} - m_{\Phi})(m_{\Phi,i} - m_{\Phi})^{T}$

$S_w^{\Phi} = \sum_{i=1}^{L} \sum_{x_k \in c_i} (\Phi(x_k) - m_{\Phi,i})(\Phi(x_k) - m_{\Phi,i})^{T}$

where m_Φ and m_{Φ,i} denote, respectively, the mean of all training samples and the mean of the training samples of class i in the high-dimensional feature space;

the Fisher criterion function in the high-dimensional feature space is:

$J_f(W) = \left| \frac{W^{T} S_b^{\Phi} W}{W^{T} S_w^{\Phi} W} \right|$

Converting, in the high-dimensional feature space, the problem of maximizing the Fisher criterion function into the problem of solving for the eigenvalues and eigenvectors of a characteristic equation includes:

defining the N×N kernel matrix K = [K_ij], so that the above expression becomes

$KBK\alpha = \lambda KWK\alpha$

where K_ij = k(x_i, x_j) = Φ(x_i)^T Φ(x_j), B = GCG^T,

$C = \mathrm{diag}(n_1, n_2, \ldots, n_L) \in R^{L \times L}, \quad G = \mathrm{diag}\left(\frac{1}{n_1} 1_{n_1 \times 1}, \ldots, \frac{1}{n_L} 1_{n_L \times 1}\right),$

$W = \mathrm{diag}\left(I_{n_1} - \frac{1}{n_1} 1_{n_1 \times n_1}, \ldots, I_{n_L} - \frac{1}{n_L} 1_{n_L \times n_L}\right) \in R^{N \times N}$

Preferably, performing nonlinear feature extraction on the sensor response signals to obtain the taste features of the tea samples includes:

computing the kernel matrix K = [K_ij] of the training sample set from the chosen kernel function and the optimized kernel parameter, where K_ij = k(x_i, x_j) = Φ(x_i)^T Φ(x_j);

converting the maximization of the Fisher criterion function into a generalized eigenvalue problem, solving KBKα = λKWKα for the eigenvalues and the corresponding eigenvectors α = [α₁, α₂, …, α_N]^T, and sorting them in descending order of eigenvalue;

projecting the training sample Φ(x_i) onto the k-th eigenvector to obtain the nonlinear feature of the sample:

$\Phi(x_i)^{T} v^{k} = \Phi(x_i)^{T} \sum_{j=1}^{N} \alpha_j^{k} \Phi(x_j) = \sum_{j=1}^{N} \alpha_j^{k} \Phi(x_i)^{T} \Phi(x_j) = \sum_{j=1}^{N} \alpha_j^{k} K_{ij}$

computing the kernel matrix K = [K_ij] of the training sample set from the chosen kernel function and the optimized kernel parameter, where K_ij = k(x_i, x_j) = Φ(x_i)^T Φ(x_j);

solving KBKα = λKWKα for the eigenvalues and the corresponding eigenvectors α = [α₁, α₂, …, α_N]^T, and sorting them in descending order of eigenvalue;

computing the kernel matrix K′ between the test samples and the training set samples, and projecting the test samples onto the eigenvectors.
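The training and test projections described above can be sketched with NumPy and SciPy as follows. This is an illustrative sketch, not the patented implementation. It makes three assumptions that the source does not state: the kernel matrix is double-centered so that the data have zero mean in feature space, a small ridge term `reg` is added to KWK so the generalized eigenproblem is well posed, and samples are stored as rows. The names `klda_fit` and `klda_transform` are hypothetical.

```python
import numpy as np
from scipy.linalg import eigh


def gaussian_kernel(A, B, sigma2):
    """Gaussian kernel between the rows of A (n_a, m) and B (n_b, m)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma2))


def klda_fit(X, y, sigma2, n_features, reg=1e-6):
    """Solve K B K a = lambda K W K a and keep the leading eigenvectors."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    order = np.argsort(y, kind="stable")      # group samples by class
    X, y = X[order], y[order]
    N = X.shape[0]
    K = gaussian_kernel(X, X, sigma2)
    col_mean, all_mean = K.mean(axis=0), K.mean()
    # Assumption: center the mapped data in feature space (double-center K).
    Kc = K - col_mean[None, :] - col_mean[:, None] + all_mean
    B = np.zeros((N, N))
    W = np.zeros((N, N))
    start = 0
    for n in np.unique(y, return_counts=True)[1]:
        block = slice(start, start + n)
        B[block, block] = 1.0 / n              # diagonal block of G C G^T
        W[block, block] = np.eye(n) - 1.0 / n  # I_n - (1/n) 1_{n x n}
        start += n
    M = Kc @ B @ Kc
    Nw = Kc @ W @ Kc + reg * np.eye(N)         # assumption: ridge regularization
    eigvals, alphas = eigh((M + M.T) / 2.0, (Nw + Nw.T) / 2.0)
    top = np.argsort(eigvals)[::-1][:n_features]  # descending eigenvalue order
    return {"X": X, "alphas": alphas[:, top], "sigma2": sigma2,
            "col_mean": col_mean, "all_mean": all_mean}


def klda_transform(model, X_new):
    """Project samples using the kernel matrix K' against the training set."""
    X_new = np.asarray(X_new, dtype=float)
    K_new = gaussian_kernel(X_new, model["X"], model["sigma2"])
    # Apply the same feature-space centering that was used for training.
    K_new_c = (K_new - model["col_mean"][None, :]
               - K_new.mean(axis=1, keepdims=True) + model["all_mean"])
    return K_new_c @ model["alphas"]
```

Calling `klda_transform(model, X_train)` gives the training features Σ_j α_j^k K_ij, and calling it on held-out samples gives the test projection through K′.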

Preferably, inputting the taste features of the tea samples into the classifier to determine the tea quality grade includes:

for a test sample and each training sample x_i, computing the similarity between the test sample and the training sample:

$d(\tilde{x}_i, x_i) = \sqrt{\sum_{k=1}^{d} (x'_{ik} - x_{ik})^{2}}$

if the training sample x_i closest to the test sample belongs to class k, the test sample is assigned to class k.
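The decision rule above amounts to a nearest-neighbor classifier on the extracted features using Euclidean distance. A minimal sketch follows, assuming features are stored as rows; the function name `nearest_neighbor_predict` is hypothetical.

```python
import numpy as np


def nearest_neighbor_predict(train_feats, train_labels, test_feats):
    """Assign each test sample the label of its closest training sample.

    train_feats: (n_train, d), test_feats: (n_test, d), Euclidean distance.
    """
    train_feats = np.asarray(train_feats, dtype=float)
    test_feats = np.asarray(test_feats, dtype=float)
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dists = np.sqrt(np.sum(diffs**2, axis=2))   # (n_test, n_train) distances
    nearest = np.argmin(dists, axis=1)          # index of closest training sample
    return np.asarray(train_labels)[nearest]
```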

The beneficial effects of the above technical solution of the present invention are as follows:

In the above solution, outliers can be removed from the tea samples, and the kernel linear discriminant analysis method with optimized parameters better characterizes the nonlinear features of tea samples of different grades and increases the signal difference between samples in the high-dimensional feature space after nonlinear mapping.

Brief description of the drawings

Figure 1 is a flow chart of the method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis according to an embodiment of the present invention;

Figure 2 is the electronic tongue response profile of a tea sample according to an embodiment of the present invention;

Figures 3a-3d are distribution plots of principal component residual versus Mahalanobis distance according to an embodiment of the present invention;

Figure 4 shows the relationship between the correct recognition rate of tea samples and the choice of the parameter σ² according to an embodiment of the present invention;

Figures 5a-5d compare how well KLDA and the linear dimensionality reduction methods separate the tea samples according to an embodiment of the present invention;

Figures 6a and 6b compare the correct recognition rate curves of KLDA and the linear dimensionality reduction methods as a function of the reduced dimension according to an embodiment of the present invention.

Detailed description

To make the technical problem to be solved, the technical solution, and the advantages of the present invention clearer, a detailed description is given below with reference to the drawings and specific embodiments.

As shown in Figure 1, a method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis according to an embodiment of the present invention includes:

Step 101: detecting tea samples with an electronic tongue to obtain sensor response time-series signals;

Step 102: analyzing and eliminating abnormal samples from the response time-series signals using the principal component residual and Mahalanobis distance methods;

Step 103: optimizing the parameters of the kernel linear discriminant analysis method, the parameters being selected according to the correct recognition rate of Longjing tea quality grades;

Step 104: performing nonlinear feature extraction on the sensor response signals with the kernel linear discriminant analysis method to obtain the taste features of the tea samples;

Step 105: inputting the taste features of the tea samples into a classifier to determine the tea quality grade.

The method of this embodiment can remove outliers from the tea samples, and the kernel linear discriminant analysis method with optimized parameters better characterizes the nonlinear features of tea samples of different grades and increases the signal difference between samples in the high-dimensional feature space after nonlinear mapping.

The present invention may use the ASTREE electronic tongue system from Alpha MOS (France), which is designed specifically for taste analysis, to test the Longjing tea samples.

Before data collection, the electronic tongue system may go through self-check, activation, training, calibration, and diagnosis steps to ensure that the collected data are reliable and stable. After collection, seven electronic tongue response fingerprints are obtained for each tea sample, as shown in Figure 2. The horizontal axis is the measurement time and the vertical axis is the measured response voltage. The points on each curve show how the potential difference changes over time as the taste substances of the tea infusion pass through the sensor channel. Each measurement lasts 120 s and the electronic tongue records one set of data every 0.5 s, so each sample yields seven time-series signals (the electronic tongue sensor response signals shown in Figure 2). The data obtained for each sample test are therefore a 7×240 matrix. The stable sensor response value at 120 s can be selected as the feature point for building the subsequent tea quality model.
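A minimal sketch of how the 120 s feature point could be taken from one 7×240 measurement is shown below. It assumes that the 240 readings are stored in time order with the last column corresponding to t = 120 s; the function name `steady_state_features` is hypothetical.

```python
import numpy as np


def steady_state_features(response):
    """Return the final reading of each sensor from a (7, 240) response matrix.

    Assumption: columns are in time order at 0.5 s spacing, so the last
    column is the reading at 120 s.
    """
    response = np.asarray(response, dtype=float)
    if response.ndim != 2:
        raise ValueError("expected a 2-D array of shape (sensors, time steps)")
    return response[:, -1]
```

Stacking these 7-element vectors for all samples gives the feature matrix used by the later outlier screening and KLDA steps.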

The data set X = [x₁, x₂, …, x_N] ∈ R^{m×N} is centered.

Based on the principal component residual and the Mahalanobis distance between a sample point and the mean of the samples of the same class, sample points that lie far from the overall distribution of their class are judged to be abnormal samples and eliminated.

Principal component analysis and the Mahalanobis distance calculation are carried out separately on the premium-grade, special-grade, first-grade, and second-grade sample sets, as shown in Figures 3a-3d. The points marked in the figures are the abnormal sample points. The tea sample data remaining after the abnormal points are removed are used for subsequent data processing.

Parameter selection is an important factor affecting the discrimination performance of the algorithm. Suitable parameters strengthen the algorithm, whereas unsuitable parameters greatly weaken it or even render it ineffective. For the kernel linear discriminant analysis (KLDA) algorithm, the construction of the kernel function is the core of the algorithm. The high-dimensional mapping has no explicit form and must be computed by means of a kernel function: every inner product (Φ(x)·Φ(y)) is replaced by the kernel function k(x, y). The choice of kernel function determines the transformation Φ and the feature space F.

The present invention may use the most widely used Gaussian radial basis kernel function. Its parameter σ² must be optimized, and the best value is determined through a set of experiments. Six specific values are used, σ² = 0.5, 5, 50, 500, 5000, and 50000, and for each value the feature dimension is increased in fixed steps, which gives the change in correct recognition rate as the feature dimension grows. The correct recognition rate as a function of feature dimension under the different parameter values is shown in Figure 4.

As Figure 4 shows, the correct recognition rate of the samples is relatively low when σ² = 0.5, 5, or 500 and highest when σ² = 50. For σ² = 5000 and 50000 the correct recognition rates behave similarly. The present invention therefore selects 50 as the value of the parameter σ² in the KLDA algorithm.
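The selection procedure described here is a grid search over candidate σ² values scored by correct recognition rate. A minimal sketch follows; the function `select_sigma2` and its `recognition_rate` callback are hypothetical, and the callback stands in for whatever train-and-test routine produces the recognition rate for a given σ².

```python
from typing import Callable, Dict, Iterable, Tuple


def select_sigma2(
    candidates: Iterable[float],
    recognition_rate: Callable[[float], float],
) -> Tuple[float, Dict[float, float]]:
    """Return the candidate with the highest recognition rate and all scores.

    recognition_rate(sigma2) is assumed to train and evaluate the KLDA
    pipeline for one sigma2 value and return a rate in [0, 1].
    """
    scores = {float(s): float(recognition_rate(s)) for s in candidates}
    if not scores:
        raise ValueError("no candidate values supplied")
    best = max(scores, key=scores.get)  # first maximum wins on ties
    return best, scores
```

With the six candidates from the text, a call would look like `select_sigma2([0.5, 5, 50, 500, 5000, 50000], rate_fn)`, where `rate_fn` is supplied by the reader.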

$KBK\alpha = \lambda KWK\alpha$

The present invention first compares the ability of the kernel linear discriminant analysis method and traditional linear dimensionality reduction methods to distinguish tea samples. KLDA, PCA, LDA, and LPP are each used to reduce the dimensionality of the electronic tongue data, with the feature dimension set to 2. The distribution of the tea sample points after dimensionality reduction is shown in Figures 5a-5d.

As Figures 5a-5d show, for the linear dimensionality reduction methods, whether unsupervised (PCA) or supervised (LDA, LPP), tea samples of different grades overlap heavily in the two-dimensional reduced space. Although LDA and LPP are supervised methods, the samples still overlap heavily in two dimensions after reduction. The experimental results show that the KLDA algorithm achieves the best sample separation: sample points of the same class are clustered together, and sample points of different classes are correctly separated. The algorithm uses the class information of the tea samples to optimize the discriminant function, and the kernel-based feature extraction can mine the nonlinear features of the tea sample data, so that samples that cannot be correctly classified in the original data space are correctly classified in the high-dimensional space after mapping.

The present invention also analyzes and compares the results of classifying Longjing tea of different grades with the KLDA algorithm. After outlier removal, 212 samples in total were used for tea grade discrimination. To verify the adaptability and generalization of the KLDA method and to broaden the scope of the evaluation, 20 (case 1) or 30 (case 2) samples of each grade were randomly selected for training in the experiment, and the remaining samples were used for testing. Figures 6a-6b compare the correct recognition rates of the KLDA, PCA, LDA, and LPP algorithms for tea samples of different grades under the two experimental conditions, as a function of the reduced dimension. The correct recognition rate of PCA is lower overall than that of LDA and LPP. LDA and LPP both use supervised computation; the figures show that when the feature dimension is low, LPP achieves a higher correct recognition rate than LDA, and the gap between the two narrows as the feature dimension increases. Overall, the correct recognition rate of the KLDA algorithm is higher than that of the linear dimensionality reduction methods. The trend of the curves shows that the highest correct recognition rate usually does not occur at the largest feature dimension: the recognition rate curve rises at first and then usually declines or levels off. This suggests that a small number of features can effectively represent the original sample information, and that mapping high-dimensional samples into a low-dimensional space can extract the useful information in the samples while removing noise.

The above is a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art may make further improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims

1. A method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis, characterized in that the method comprises:

detecting tea samples with an electronic tongue to obtain sensor response time-series signals;

analyzing and eliminating abnormal samples from the response time-series signals using the principal component residual and Mahalanobis distance methods;

optimizing the parameters of the kernel linear discriminant analysis method, the parameters being selected according to the correct recognition rate of Longjing tea quality grades;

performing nonlinear feature extraction on the sensor response signals with the kernel linear discriminant analysis method to obtain the taste features of the tea samples;

inputting the taste features of the tea samples into a classifier to determine the tea quality grade.

2. The method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis according to claim 1, characterized in that the sensor response time-series signals include at least one of: a ZA sensor response time-series signal, a BB sensor response time-series signal, a JE sensor response time-series signal, a GA sensor response time-series signal, an HA sensor response time-series signal, a JB sensor response time-series signal, a CA sensor response time-series signal, and an Ag/AgCl reference electrode response time-series signal.

3. The method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis according to claim 1 or 2, characterized in that detecting the tea samples with the electronic tongue comprises:

placing the samples and the cleaning solutions in sequence on the autosampler of the electronic tongue;

acquiring each sample repeatedly, each acquisition following the sequence "tea infusion sample → cleaning solution 1 → cleaning solution 2".

4. The method for extracting differential features of taste sensing signals based on kernel linear discriminant analysis according to claim 1, characterized in that analyzing and eliminating abnormal samples from the response time-series signals using the principal component residual and Mahalanobis distance methods comprises:

centering the data set X = [x₁, x₂, …, x_N] ∈ R^{m×N};

computing the covariance matrix C of the centered data;

computing the eigenvalues and eigenvectors of the covariance matrix: Cv = λv;

sorting the eigenvalues λ_i of the covariance matrix in descending order, with the corresponding eigenvectors arranged in the same order;

projecting the data samples onto the eigenvectors obtained from Cv = λv;

computing the estimate of each sample from its projection, the principal component residual being the difference between the true value of the sample and its estimate;

where the mean vector is the mean of the data set and v is the eigenvector corresponding to an eigenvalue;

the Mahalanobis distance between sample points is: d_ij = [(x_i − x_j)^T [Cov(X)]^{-1} (x_i − x_j)]^{1/2};

based on the principal component residual and the Mahalanobis distance between a sample point and the mean of the samples of the same class, sample points that lie far from the overall distribution of their class are judged to be abnormal samples and eliminated.

5. according to any one of claim 1 or 4, the method for extracting the difference feature of taste sensory signals based on nuclear linear discriminant analysis, is characterized in that, the parameters of the nuclear linear discriminant analysis method are optimized, and the tea quality grade The correct recognition rate is based on the parameters of the kernel linear discriminant analysis method, including:

The Gaussian kernel function is used as the nonlinear transformation function of the kernel linear discriminant analysis method, and the parameter $\sigma^2$ in the Gaussian kernel function $k(x,y)=\exp(-\|x-y\|^2/2\sigma^2)$ is optimized;

The parameter value is selected according to the correct recognition rate of the tea quality grade.
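The selection rule of claim 5 amounts to a search over candidate values of σ². The sketch below assumes a caller-supplied function `recognition_rate` (a hypothetical name) that runs the whole pipeline for a given σ² and returns the correct recognition rate; the claim does not specify how that rate is estimated or which candidate values are tried.

```python
def select_sigma2(candidates, recognition_rate):
    """Return the candidate sigma^2 with the highest recognition rate, plus all scores.

    `recognition_rate` is a hypothetical callback: sigma2 -> rate in [0, 1].
    """
    scores = {sigma2: recognition_rate(sigma2) for sigma2 in candidates}
    best = max(scores, key=scores.get)  # on ties, the earliest candidate wins
    return best, scores
```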

6. The method for extracting difference features of taste sensory signals based on kernel linear discriminant analysis according to claim 5, characterized in that the Gaussian kernel function is:

$$k(x,y)=\exp\left(-\frac{\|x-y\|^{2}}{2\sigma^{2}}\right)$$
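The Gaussian kernel above can be evaluated for whole sample sets at once. A minimal sketch, assuming samples are stored one per row; the function name is illustrative only.

```python
import numpy as np

def gaussian_kernel_matrix(A, B, sigma2):
    """A: (n_a, m), B: (n_b, m). Returns the (n_a, n_b) matrix of k(a, b)."""
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    sq = np.maximum(sq, 0.0)  # clip tiny negatives caused by rounding
    return np.exp(-sq / (2.0 * sigma2))
```

Calling `gaussian_kernel_matrix(X, X, sigma2)` on the training set gives a symmetric matrix with ones on the diagonal.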

7. The method for extracting difference features of taste sensory signals based on kernel linear discriminant analysis according to claim 5, characterized in that performing nonlinear feature extraction on the sensor response signals with the kernel linear discriminant analysis method to obtain the taste features of the tea samples includes:

Through a nonlinear transformation Φ, map the input data into a high-dimensional feature space; the data points after the nonlinear transformation are $\Phi(x_1),\Phi(x_2),\ldots,\Phi(x_N)$;

In the high-dimensional feature space, the problem of maximizing the Fisher criterion function is transformed into the problem of solving the eigenvalue and eigenvector of the characteristic equation;

The nonlinear feature extraction is performed on the sensor response signal to obtain the taste characteristics of the tea samples.

8. The method for extracting difference features of taste sensory signals based on kernel linear discriminant analysis according to claim 7, characterized in that mapping the input data into a high-dimensional feature space through the nonlinear transformation Φ, the data points after the nonlinear transformation being $\Phi(x_1),\Phi(x_2),\ldots,\Phi(x_N)$, includes:

The between-class scatter matrix $S_b^{\Phi}$ and the within-class scatter matrix $S_w^{\Phi}$ of the training samples in the high-dimensional feature space are:

$$S_b^{\Phi}=\sum_{i=1}^{L}P_i\,(m_{\Phi,i}-m_{\Phi})(m_{\Phi,i}-m_{\Phi})^{T}$$

$$S_w^{\Phi}=\sum_{i=1}^{L}\sum_{x_k\in c_i}\left(\Phi(x_k)-m_{\Phi,i}\right)\left(\Phi(x_k)-m_{\Phi,i}\right)^{T}$$

where $m_{\Phi}$ and $m_{\Phi,i}$ denote, respectively, the mean of all training samples and the mean of the training samples of the i-th class in the high-dimensional feature space;

The Fisher criterion function in the high-dimensional feature space is:

$$J_f(W)=\left|\frac{W^{T}S_b^{\Phi}W}{W^{T}S_w^{\Phi}W}\right|$$

In the high-dimensional feature space, the problem of maximizing the Fisher criterion function is transformed into the problem of solving the eigenvalue and eigenvector of the characteristic equation, including:

Define the N×N kernel matrix $K=[K_{ij}]$; the above formula then becomes

$$KBK\alpha=\lambda KWK\alpha$$

where $K_{ij}=k(x_i,x_j)=\Phi(x_i)^{T}\Phi(x_j)$, $B=GCG^{T}$,

$$C=\mathrm{diag}(n_1,n_2,\ldots,n_L)\in\mathbb{R}^{L\times L},$$

$$W=\mathrm{diag}\left(I_{n_1}-\frac{1}{n_1}\mathbf{1}_{n_1\times n_1},\ \ldots,\ I_{n_L}-\frac{1}{n_L}\mathbf{1}_{n_L\times n_L}\right)\in\mathbb{R}^{N\times N}$$
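The matrices C and W defined above depend only on the class sizes $n_1,\ldots,n_L$. The sketch below builds them under the assumption that the training samples are ordered class by class, so that W is block-diagonal; the function names are illustrative. The matrix G is not defined in the text available here, so B is not constructed.

```python
import numpy as np

def class_count_matrix(class_sizes):
    """C = diag(n_1, ..., n_L)."""
    return np.diag(np.asarray(class_sizes, dtype=float))

def within_class_centering_matrix(class_sizes):
    """W = diag(I_n - (1/n) * ones(n, n)) over the classes, of size N x N."""
    N = int(sum(class_sizes))
    W = np.zeros((N, N))
    start = 0
    for n in class_sizes:
        # Each block subtracts the class mean from the samples of that class.
        W[start:start + n, start:start + n] = np.eye(n) - np.ones((n, n)) / n
        start += n
    return W
```

Each block of W is a centering projection, so W is symmetric and satisfies W·W = W.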

9. The method for extracting difference features of taste sensory signals based on kernel linear discriminant analysis according to claim 8, characterized in that performing nonlinear feature extraction on the sensor response signals to obtain the taste features of the tea samples includes:

Calculate the kernel matrix $K=[K_{ij}]$ of the training sample set according to the chosen kernel function and the optimized kernel parameter, where $K_{ij}=k(x_i,x_j)=\Phi(x_i)^{T}\Phi(x_j)$;

Transform the maximization of the Fisher criterion function into a generalized eigenvalue problem, solve for the eigenvalues of $KBK\alpha=\lambda KWK\alpha$ and the corresponding eigenvectors $\alpha=[\alpha_1,\alpha_2,\ldots,\alpha_N]^{T}$, and sort them in descending order of eigenvalue;

Project the training sample $\Phi(x_i)$ onto the k-th eigenvector to obtain the nonlinear feature of the sample:

$$\Phi(x_i)^{T}v^{k}=\Phi(x_i)^{T}\sum_{j=1}^{N}\alpha_j^{k}\Phi(x_j)=\sum_{j=1}^{N}\alpha_j^{k}\,\Phi(x_i)^{T}\Phi(x_j)=\sum_{j=1}^{N}\alpha_j^{k}K_{ij}$$

Calculate the kernel matrix K′ between the training-set samples and the test samples, and project the test samples onto the eigenvectors.
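The eigen-decomposition and projection steps of claim 9 can be sketched as follows. This is an illustration under assumptions rather than the claimed procedure itself: B and W are taken as given symmetric matrices, the small ridge term `eps` added to KWK is not part of the claim and is introduced only because KWK is singular, and the function names are hypothetical.

```python
import numpy as np

def solve_kernel_lda(K, B, W, n_features, eps=1e-6):
    """Solve K B K a = lambda (K W K + eps I) a; return the leading eigenpairs."""
    A = K @ B @ K
    M = K @ W @ K + eps * np.eye(K.shape[0])   # ridge term is an assumption
    M = (M + M.T) / 2.0                        # enforce symmetry numerically
    L = np.linalg.cholesky(M)                  # M = L L^T
    # Reduce to an ordinary symmetric problem: L^{-1} A L^{-T} y = lambda y
    A_t = np.linalg.solve(L, np.linalg.solve(L, A).T)
    A_t = (A_t + A_t.T) / 2.0
    eigvals, Y = np.linalg.eigh(A_t)           # ascending order
    order = np.argsort(eigvals)[::-1][:n_features]
    alpha = np.linalg.solve(L.T, Y[:, order])  # back-transform: a = L^{-T} y
    return eigvals[order], alpha

def project(K_rows, alpha):
    """Row i gives sum_j alpha_j^k K_ij for each retained eigenvector k."""
    return K_rows @ alpha
```

Training features are `project(K, alpha)`. For test samples, `K_rows` is the kernel matrix K′ with one row per test sample and one column per training sample.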

10. The method for extracting difference features of taste sensory signals based on kernel linear discriminant analysis according to claim 1, characterized in that inputting the taste features of the tea samples into the classifier to judge the tea quality grade includes:

For a sample to be tested $\tilde{x}_i$ and the training samples $x_i$, calculate the similarity between the test sample and the training samples:

$$d(\tilde{x}_i,x_i)=\sqrt{\sum_{k=1}^{d}\left(x'_{ik}-x_{ik}\right)^{2}}$$

If the training sample $x_i$ most similar to the test sample belongs to class k, the test sample is assigned to class k.
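The decision rule of claim 10 reads as a nearest-neighbour classifier on the extracted features with Euclidean distance. A minimal sketch under that reading; the function name and the array layout (one row of d features per training sample) are assumptions.

```python
import numpy as np

def nearest_neighbour_grade(test_feature, train_features, train_labels):
    """Return the label of the training sample closest to the test sample."""
    diffs = np.asarray(train_features) - np.asarray(test_feature)
    dists = np.sqrt((diffs ** 2).sum(axis=1))   # Euclidean distance over the d features
    return train_labels[int(np.argmin(dists))]  # smallest distance = most similar
```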