CN113984946B - Crayfish freshness detection method based on gas phase electronic nose and machine learning - Google Patents

Crayfish freshness detection method based on gas phase electronic nose and machine learning

Info

Publication number
CN113984946B
Authority
CN
China
Prior art keywords
chromatogram
crayfish
peak height
electronic nose
phase electronic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111228666.9A
Other languages
Chinese (zh)
Other versions
CN113984946A (en
Inventor
许艳顺
汤楚涵
颜孙洁
夏文水
余达威
姜启兴
杨方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University
Priority to CN202111228666.9A
Publication of CN113984946A
Application granted
Publication of CN113984946B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8675 Evaluation, i.e. decoding of the signal into analytical information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Library & Information Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Biochemistry (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a crayfish freshness detection method based on a gas-phase electronic nose and machine learning. The method comprises: placing a crayfish sample in a beaker, sealing the beaker with a double layer of plastic wrap, and letting it stand to form a headspace; preheating an ultra-fast gas-phase electronic nose instrument and inserting the sampling needle into the beaker to take a sample and obtain a chromatogram; normalizing the chromatogram peak heights using their maximum and minimum values; preprocessing the baseline data of the peak heights and removing the label noise of the chromatograms with confident learning; extracting features from the chromatogram with a sequence model to obtain chromatogram trend features that reflect odor changes at different freshness levels; extracting, from the chromatogram trend features and through a multilayer perceptron, the volatile compound content features corresponding to each retention time, and concatenating the chromatogram trend features with the volatile compound content features; and classifying the concatenated features with a feedforward neural network. The method can accurately obtain the odor information of crayfish of different freshness and achieves accurate classification of crayfish freshness.

Description

A crayfish freshness detection method based on a gas-phase electronic nose and machine learning

Technical field

The invention relates to the technical field of crayfish freshness detection, and in particular to a crayfish freshness detection method based on a gas-phase electronic nose and machine learning.

Background

Crayfish (Procambarus clarkii) is one of China's important economic freshwater aquatic products. Crayfish are popular with consumers for their tender meat, delicious taste, and rich nutrition. In recent years, China's crayfish industry has developed rapidly, with farming area and output growing quickly; in 2020, the total output of farmed crayfish in China reached 2.3937 million tons. However, because the farming environment is complex, crayfish usually carry many microorganisms on the body surface and inside the body. As a result, their freshness declines to varying degrees during live storage, transportation, and processing, and they may even spoil after death, which creates potential safety risks for processed crayfish products.

An electronic nose is a device that imitates the biological olfactory system and classifies and identifies samples by recognizing the volatile compounds in them. It requires no sample pretreatment and no solvent, has a wide range of applications, short detection time, and high sensitivity, and gives relatively comprehensive and objective results. Electronic noses can be divided into sensor-type electronic noses, mass-spectrometry electronic noses, and ultra-fast gas-phase electronic noses. Sensor-type electronic noses are the most commonly used in food, medicine, traditional Chinese medicine, and other fields, but they have limitations: they are time-consuming, their sensors are redundant, and they are strongly affected by the external environment. The Heracles II ultra-fast gas-phase electronic nose is a new type of odor analysis instrument equipped with two chromatographic columns of different polarity. It replaces the sensor signals of a traditional sensor-type electronic nose with gas-chromatographic peaks, which yields more compound signals and allows volatile compounds of different polarity to be separated precisely. It offers high sensitivity, short detection time, and a wide range of applications, and has played an important role in the classification and identification of milk, liquor (baijiu), mutton, and fruit. However, the odor differences among live crayfish stored for different times and at different freshness levels are small, so it is difficult to judge crayfish freshness accurately at the data processing stage with simple dimensionality reduction methods such as principal component analysis.

Summary of the invention

The purpose of this section is to outline some aspects of embodiments of the invention and to briefly describe some preferred embodiments. Some simplifications or omissions may be made in this section, in the abstract, and in the title of the invention to avoid obscuring their purpose; such simplifications or omissions shall not be used to limit the scope of the invention.

The present invention is proposed in view of the problems in the prior art described above.

Accordingly, the present invention provides a crayfish freshness detection method based on a gas-phase electronic nose and machine learning, which can accurately obtain the odor information of crayfish of different freshness and accurately judge crayfish freshness.

To solve the above technical problem, the present invention provides the following technical solution: placing a crayfish sample in a beaker, sealing the beaker with a double layer of plastic wrap, and letting it stand to form a headspace; preheating an ultra-fast gas-phase electronic nose instrument and inserting the sampling needle into the beaker to take a sample and obtain a chromatogram; normalizing the chromatogram peak heights using their maximum and minimum values; preprocessing the baseline data of the peak heights and removing label noise from the crayfish samples with a confident learning strategy; extracting features from the chromatogram with a sequence model to obtain chromatogram trend features that reflect odor changes at different freshness levels; extracting, from the chromatogram trend features and through a multilayer perceptron, the volatile compound content features corresponding to each retention time, and concatenating the chromatogram trend features with the volatile compound content features; and classifying the concatenated features with a feedforward neural network.

As a preferred scheme of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the present invention, the normalization preprocessing comprises:

$$h_{scale} = \frac{h - h_{min}}{h_{max} - h_{min}}$$

where $h_{scale}$ is the normalized chromatogram peak height, $h$ is the chromatogram peak height, $h_{min}$ is the minimum chromatogram peak height, and $h_{max}$ is the maximum chromatogram peak height.
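The min-max scaling of peak heights described here can be sketched in a few lines of Python. This is only an illustration, assuming the usual form (h - h_min) / (h_max - h_min); the function name `min_max_normalize` and the fallback for a chromatogram whose peaks are all equal are assumptions, not part of the patent.

```python
def min_max_normalize(peak_heights):
    """Scale peak heights to [0, 1] with (h - h_min) / (h_max - h_min)."""
    h_min = min(peak_heights)
    h_max = max(peak_heights)
    span = h_max - h_min
    if span == 0:
        # All peaks are equal, so the formula is undefined.
        # Returning zeros is a hypothetical fallback chosen for this sketch.
        return [0.0 for _ in peak_heights]
    return [(h - h_min) / span for h in peak_heights]


# Illustrative peak heights (made-up numbers, not measured data).
scaled = min_max_normalize([120.0, 480.0, 300.0, 1020.0])
```

After scaling, the smallest peak maps to 0 and the largest to 1, so chromatograms recorded at different signal intensities become comparable.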

As a preferred scheme of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the present invention, preprocessing the baseline data of the peak height comprises: computing the peak height empirical distribution; for the value range $R = \{h \mid 0 < h < +\infty\}$ of the peak height $h$ and any given positive constant $s$, there exists a partition $S = \{S_1, S_2, \ldots, S_r\}$ satisfying:

$$S_i = \{h \mid (i-1) \times s \le h \le i \times s,\ \sup(R) \le r \times s\}, \quad i = 1, 2, \ldots, r;$$

defining the event $A_i = \{h \mid h \in S_i\}$ that the peak height $h$ falls in the $i$-th data segment, computing the probability of each event from the empirical distribution, and computing the estimated baseline value:

[Formula images BDA0003315159880000022 to BDA0003315159880000025 (event probability and estimated baseline value) are not reproduced in this text.]

where $S_r$ is the $r$-th data segment of the partition; $m$ is the index of the interval corresponding to the event $A_i$ with the highest probability; $S_m$ is the segment corresponding to that event; $n$ is the total number of peak heights; and the event probability uses the empirical distribution evaluated for the $i$-th and the $(i-1)$-th segments.

As a preferred scheme of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the present invention, the peak height empirical distribution $\hat{F}_n(k)$ is obtained as follows: the chromatogram peak heights $h_1, h_2, \ldots, h_n$ are regarded as independent and identically distributed real random variables with cumulative distribution function $F(k)$, which gives the peak height empirical distribution:

$$\hat{F}_n(k) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{h_i \le k}$$

where $\mathbf{1}_{h_i \le k}$ is the indicator function of $\{h_i \mid h_i \le k\}$.
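An empirical distribution of this kind counts the fraction of peak heights that do not exceed a value k. A minimal Python sketch follows; the helper name `make_ecdf` and the sample values are assumptions made for illustration.

```python
from bisect import bisect_right


def make_ecdf(peak_heights):
    """Return a function k -> fraction of peak heights h_i with h_i <= k."""
    if not peak_heights:
        raise ValueError("at least one peak height is required")
    ordered = sorted(peak_heights)
    n = len(ordered)

    def ecdf(k):
        # bisect_right gives the number of sorted values that are <= k.
        return bisect_right(ordered, k) / n

    return ecdf


# Illustrative normalized peak heights (made-up numbers).
ecdf = make_ecdf([0.2, 0.1, 0.4, 0.1])
```

Sorting once and using a binary search keeps each evaluation cheap, which matters when the distribution is queried at every interval boundary.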

As a preferred scheme of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the present invention, removing the label noise of the chromatograms comprises: defining the initially annotated day label, which may contain errors, as $\tilde{y}$ and the true label as $y^*$, with $N$ the total number of samples and $M$ the number of classes; dividing the $N$ samples evenly into $a$ folds, taking one fold as the test set and the remaining $a-1$ folds as the training set, computing the estimated probabilities $p = \{p_j,\ j = 0, 1, \ldots, M\}$ of the test-set samples, and repeating $a$ times to obtain out-of-fold predictions for all samples; computing the average probability $t_j$ under each annotated class $j$ and using it as the confidence threshold:

[Formula image BDA0003315159880000035 (confidence threshold $t_j$) is not reproduced in this text.]

computing the count matrix:

[Formula images BDA0003315159880000036 to BDA0003315159880000038 (count matrix) are not reproduced in this text.]

calibrating the count matrix:

[Formula image BDA0003315159880000039 (calibrated count matrix) is not reproduced in this text.]

estimating the joint distribution of the initial label $\tilde{y}$ and the true label $y^*$:

[Formula images BDA00033151598800000311 and BDA00033151598800000312 (joint distribution) are not reproduced in this text.]

for the off-diagonal entries of the count matrix, selecting the prescribed number of samples for filtering, ranking them by maximum margin, and filtering the largest-margin samples of each class (the sample counts and the margin expression appear only in formula images BDA00033151598800000313 to BDA00033151598800000316);

where the formula images BDA00033151598800000317 to BDA00033151598800000322 define the probability that sample $x$ belongs to the $j$-th class, the number of samples carrying a given initial label, the label $l$ that satisfies the threshold condition, and the calibrated value of the count matrix.

As a preferred scheme of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the present invention, the chromatogram trend features are obtained as follows: the sequence model first obtains a coarse trend feature $X$ through multiple convolutions and then extracts the trend feature $SLSTM(X)$ of $X$ with an LSTM network; $SLSTM(X)$ is the chromatogram trend feature:

[Formula image BDA0003315159880000043 (definition of $SLSTM(X)$) is not reproduced in this text.]

where $LSTM_1$ and $LSTM_2$ are LSTM networks.

As a preferred scheme of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the present invention, the trend feature $X$ is a sequence of length 65, and each position $t$ contains the 64 numerical features $x_t$ of the corresponding time period.

As a preferred scheme of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the present invention, the volatile compound content features are given by:

$$layer_i(X) = \mathrm{ReLU}(X W_i)$$

[Formula images BDA0003315159880000041 and BDA0003315159880000042 are not reproduced in this text.]

where $layer_i$ is the $i$-th network layer; $W_i$ is the parameter of the $i$-th layer; $x$ is the design matrix of the position features; and $layer_o$ is the $o$-th network layer.
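A single perceptron layer of the form ReLU(X W) and the later concatenation ("splicing") of two feature vectors can be illustrated without any deep learning library. The sketch below is an assumption-laden toy: the helper names `relu_layer` and `concatenate_features`, the tiny weight matrix, and the input values are invented for illustration, and the way the patent stacks its layers is not recoverable from this text.

```python
def relu_layer(X, W):
    """Compute ReLU(X W) for X (rows x d_in) and W (d_in x d_out) as nested lists."""
    d_in, d_out = len(W), len(W[0])
    output = []
    for row in X:
        if len(row) != d_in:
            raise ValueError("each input row must have as many entries as W has rows")
        # Matrix product for this row, followed by the element-wise ReLU.
        output.append([
            max(0.0, sum(row[k] * W[k][j] for k in range(d_in)))
            for j in range(d_out)
        ])
    return output


def concatenate_features(trend_features, content_features):
    """Join two feature vectors end to end (hypothetical helper)."""
    return list(trend_features) + list(content_features)


# Toy example: one input row with two features and a 2 x 2 weight matrix.
content = relu_layer([[1.0, 2.0]], [[1.0, -1.0], [0.5, -1.0]])[0]
fused = concatenate_features([0.1, 0.2], content)
```

The concatenated vector is what a downstream feedforward classifier would receive as input.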

Beneficial effects of the invention: the invention uses an ultra-fast gas-phase electronic nose to capture the odor changes of live crayfish of different freshness and of crayfish at different times after death, so that the odor information of crayfish of different freshness can be obtained more intuitively and accurately; confident learning is used to remove label noise and improve prediction accuracy; LSTM and MLP are used to extract, respectively, the trend features and the relative volatile compound content features of the ultra-fast gas-phase electronic nose chromatographic data, and after the extracted features are concatenated, a feedforward neural network classifies crayfish freshness with good stability and accuracy.

Description of drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:

Fig. 1 is a chromatogram for the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the first embodiment of the present invention;

Fig. 2 is a schematic diagram of the PCA of the peak height data for the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the second embodiment of the present invention;

Fig. 3 is a confusion matrix for the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the second embodiment of the present invention.

Detailed description

To make the above objects, features, and advantages of the present invention easier to understand, specific embodiments of the invention are described in detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the present invention.

Many specific details are set forth in the following description to provide a full understanding of the present invention, but the invention can also be implemented in ways other than those described here, and those skilled in the art can make similar generalizations without departing from the spirit of the invention. The present invention is therefore not limited to the specific embodiments disclosed below.

Furthermore, "one embodiment" or "an embodiment" as used herein refers to a particular feature, structure, or characteristic that may be included in at least one implementation of the present invention. The phrase "in one embodiment" appearing in different places in this specification does not always refer to the same embodiment, nor to a separate or alternative embodiment that is mutually exclusive with other embodiments.

The present invention is described in detail with reference to schematic diagrams. When the embodiments are described in detail, cross-sectional views showing device structures may, for ease of illustration, be partially enlarged without following the general scale; the schematic diagrams are only examples and shall not limit the scope of protection of the present invention. In addition, the three-dimensional dimensions of length, width, and depth should be included in actual production.

In the description of the present invention, it should be noted that orientations or positional relationships indicated by terms such as "upper", "lower", "inner", and "outer" are based on those shown in the drawings. They are used only to facilitate and simplify the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore shall not be construed as limiting the present invention. In addition, the terms "first", "second", and "third" are used for descriptive purposes only and shall not be construed as indicating or implying relative importance.

Unless otherwise expressly specified and limited, the terms "installed", "connected", and "coupled" shall be understood broadly: for example, a connection may be fixed, detachable, or integral; it may be mechanical, electrical, or direct; it may be indirect through an intermediate medium; or it may be internal communication between two elements. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific situation.

Example 1

Referring to Fig. 1, the first embodiment of the present invention provides a crayfish freshness detection method based on a gas-phase electronic nose and machine learning, comprising:

S1: Place the crayfish sample in a beaker, seal the beaker with a double layer of plastic wrap, and let it stand to form a headspace.

S2: Preheat the ultra-fast gas-phase electronic nose instrument, and insert the sampling needle into the beaker to take a sample and obtain a chromatogram.

S3: Normalize the chromatogram peak heights using their maximum and minimum values.

Normalization preprocessing:

$$h_{scale} = \frac{h - h_{min}}{h_{max} - h_{min}}$$

where $h_{scale}$ is the normalized chromatogram peak height, $h$ is the chromatogram peak height, $h_{min}$ is the minimum chromatogram peak height, and $h_{max}$ is the maximum chromatogram peak height.

S4: Preprocess the baseline data of the peak heights, and remove label noise from the crayfish samples with a confident learning strategy.

(1) Preprocess the baseline data of the peak heights

① Compute the peak height empirical distribution $\hat{F}_n(k)$.

The chromatogram peak heights $h_1, h_2, \ldots, h_n$ are regarded as independent and identically distributed real random variables with cumulative distribution function $F(k)$, which gives the peak height empirical distribution:

$$\hat{F}_n(k) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{h_i \le k}$$

where $\mathbf{1}_{h_i \le k}$ is the indicator function of $\{h_i \mid h_i \le k\}$.

② For the value range $R = \{h \mid 0 < h < +\infty\}$ of the peak height $h$ and any given positive constant $s$, there exists a partition $S = \{S_1, S_2, \ldots, S_r\}$ satisfying:

$$S_i = \{h \mid (i-1) \times s \le h \le i \times s,\ \sup(R) \le r \times s\}, \quad i = 1, 2, \ldots, r;$$

where $S_r$ is the $r$-th data segment of the partition.

③ Define the event $A_i = \{h \mid h \in S_i\}$ that the peak height $h$ falls in the $i$-th data segment, compute the probability of each event from the empirical distribution, and compute the estimated baseline value:

[Formula images BDA0003315159880000065 to BDA0003315159880000068 (event probability and estimated baseline value) are not reproduced in this text.]

where $m$ is the index of the interval corresponding to the event $A_i$ with the highest probability; $S_m$ is the segment corresponding to that event; $n$ is the total number of peak heights; and the event probability uses the empirical distribution evaluated for the $i$-th and the $(i-1)$-th segments.
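One plausible reading of steps ② and ③ is sketched below: the peak heights are binned into intervals of width s, the interval with the highest empirical probability is taken as S_m, and a baseline is derived from the heights in that interval. The formula for the baseline value itself is not recoverable from this text, so the final averaging step, the half-open interval convention, and the function name `estimate_baseline` are all assumptions.

```python
import math


def estimate_baseline(peak_heights, s):
    """Return (m, P(A_m), baseline) for intervals S_i = ((i-1)*s, i*s]."""
    if s <= 0 or not peak_heights:
        raise ValueError("s must be positive and peak_heights non-empty")
    n = len(peak_heights)
    segments = {}
    for h in peak_heights:
        # Index of the interval that contains h (heights are assumed positive).
        i = max(1, math.ceil(h / s))
        segments.setdefault(i, []).append(h)
    # P(A_i) is the fraction of heights in S_i, i.e. the increase of the
    # empirical distribution across the interval. Ties go to the interval
    # that was encountered first.
    m = max(segments, key=lambda i: len(segments[i]) / n)
    members = segments[m]
    # Assumption: use the mean height inside S_m as the baseline estimate.
    return m, len(members) / n, sum(members) / len(members)


# Illustrative normalized heights (made-up numbers): most values sit near 0.12.
m, prob, baseline = estimate_baseline([0.11, 0.12, 0.13, 0.55, 0.91], 0.1)
```

The idea behind picking the most populated interval is that most points of a chromatogram lie on the baseline rather than on a peak, so the densest height range is a natural baseline candidate.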

(2) Removing the label noise of the crayfish samples comprises:

① Define the initially annotated day label, which may contain errors, as $\tilde{y}$ and the true label as $y^*$; the total number of samples is $N$ and the number of classes is $M$.

② Divide the $N$ samples evenly into $a$ folds, take one fold as the test set and the remaining $a-1$ folds as the training set, and compute the estimated probabilities $p = \{p_j,\ j = 0, 1, \ldots, M\}$ of the test-set samples; repeat $a$ times to obtain out-of-fold predictions for all samples;

where the probability that sample $x$ belongs to the $j$-th class is given in formula image BDA0003315159880000072 (not reproduced in this text).

③ Calculate the average probability $t_j$ for each labelled category $j$ and use it as the confidence threshold:

$$t_j = \frac{1}{|X_{\tilde{y}=j}|} \sum_{x \in X_{\tilde{y}=j}} \hat{p}(\tilde{y} = j; x)$$

where $|X_{\tilde{y}=j}|$ is the number of samples whose initial label is $\tilde{y} = j$.

④ Calculate the count matrix $C_{\tilde{y}, y^*}$:

$$C_{\tilde{y}, y^*}[i][j] = |\hat{X}_{\tilde{y}=i, y^*=j}|$$

$$\hat{X}_{\tilde{y}=i, y^*=j} = \{x \in X_{\tilde{y}=i} : \hat{p}(\tilde{y}=j; x) \ge t_j,\ j = \arg\max_{l:\ \hat{p}(\tilde{y}=l; x) \ge t_l} \hat{p}(\tilde{y}=l; x)\}$$

where $l$ denotes a label satisfying $\hat{p}(\tilde{y}=l; x) \ge t_l$.

⑤ Calibrate the count matrix:

$$\tilde{C}_{\tilde{y}=i, y^*=j} = \frac{C_{\tilde{y}=i, y^*=j}}{\sum_{j'} C_{\tilde{y}=i, y^*=j'}} \cdot |X_{\tilde{y}=i}|$$

where $\tilde{C}_{\tilde{y}=i, y^*=j}$ is the calibrated value of the count matrix entry $C_{\tilde{y}=i, y^*=j}$ and $|X_{\tilde{y}=i}|$ is the number of samples whose initial label is $i$.

⑥ Estimate the joint distribution $\hat{Q}_{\tilde{y}, y^*}$ of the initial label $\tilde{y}$ and the true label $y^*$:

$$\hat{Q}_{\tilde{y}=i, y^*=j} = \frac{\tilde{C}_{\tilde{y}=i, y^*=j}}{\sum_{i', j'} \tilde{C}_{\tilde{y}=i', y^*=j'}}$$

⑦ For each off-diagonal entry of the count matrix $C_{\tilde{y}, y^*}$, select $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples for filtering, sort them by the margin $\hat{p}(\tilde{y}=j; x) - \hat{p}(\tilde{y}=i; x)$, and filter out the $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples with the largest margin in each category.
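Steps ③ to ⑥ can be sketched as follows, assuming the out-of-fold probabilities from step ② are already available. The function name `confident_joint` and the two-class toy data are hypothetical, and the sketch follows the general confident learning recipe rather than any implementation disclosed in the patent; it also assumes every class has at least one labelled sample.

```python
def confident_joint(labels, probs, num_classes):
    """Return thresholds t_j, the count matrix C, and the estimated joint Q."""
    # t_j: mean predicted probability of class j over samples labelled j.
    thresholds = []
    for j in range(num_classes):
        members = [p[j] for y, p in zip(labels, probs) if y == j]
        thresholds.append(sum(members) / len(members))

    # C[i][j]: samples labelled i whose most probable above-threshold class is j.
    counts = [[0] * num_classes for _ in range(num_classes)]
    for y, p in zip(labels, probs):
        candidates = [l for l in range(num_classes) if p[l] >= thresholds[l]]
        if candidates:  # samples with no confident class are not counted
            j = max(candidates, key=lambda l: p[l])
            counts[y][j] += 1

    # Calibrate each row so that it sums to the number of samples labelled i.
    label_counts = [labels.count(i) for i in range(num_classes)]
    calibrated = []
    for i in range(num_classes):
        row_sum = sum(counts[i])
        calibrated.append(
            [c / row_sum * label_counts[i] if row_sum else 0.0 for c in counts[i]]
        )

    # Normalise the calibrated counts into a joint distribution.
    total = sum(sum(row) for row in calibrated)
    joint = [[c / total for c in row] for row in calibrated]
    return thresholds, counts, joint

# Toy data: samples 2 and 5 look mislabelled.
labels = [0, 0, 0, 1, 1, 1]
probs = [[0.9, 0.1], [0.8, 0.2], [0.25, 0.75],
         [0.2, 0.8], [0.1, 0.9], [0.7, 0.3]]
thresholds, counts, joint = confident_joint(labels, probs, num_classes=2)
print(counts)
```

The off-diagonal entries of `counts` are the candidates for filtering in step ⑦; ranking them by margin is left out to keep the sketch short.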

Crayfish freshness labels can be wrong because of differences in origin, transport time, and individual variation, so manual labels can serve only as a prior estimate of freshness. Preferably, this embodiment uses a confident learning strategy to remove incorrect manual labels, that is, samples whose results are clearly erroneous, which improves prediction accuracy.

S5: Use a sequence model to extract features from the chromatogram and obtain the chromatogram trend features that describe how the odor changes with freshness.

The sequence model first obtains a coarse trend feature $X$ through several convolutions ($X$ is a sequence of length 65 in which each position $t$ holds the 64 numerical features $x_t$ of the corresponding time segment), and then extracts the trend feature SLSTM(X) of $X$, that is, the chromatogram trend feature, with LSTM networks:

Figure BDA0003315159880000081

where $LSTM_1$ and $LSTM_2$ are LSTM networks.

Note that an LSTM is a recurrent neural network that can learn long-term dependencies and extract deep trend features from the odor information.

The LSTM network processes the sequence in time order. The feature $x_t$ at each position is fed into the input gate and the forget gate to obtain the control vectors $i_t$ and $f_t$:

$$i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi})$$

$$f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf})$$

Each LSTM network contains a memory vector $c$ that is passed from one position to the next. The LSTM network uses the control vector $f_t$ from the forget gate to decide which information to forget:

Figure BDA0003315159880000082

This gives the new memory after useless information has been discarded (Figure BDA0003315159880000083):

Figure BDA0003315159880000084

The LSTM network also contains a hidden feature $h$ that carries sequence information. In the input gate, the LSTM network uses the hidden feature $h_{t-1}$ of the previous time step to re-calibrate the deep representation $g_t$ extracted by the input gate, and so captures the interaction between features at different positions:

$$g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg})$$

where $g_t$ contains the odor quantity information at time $t$ and the overall information (trend and quantity) before time $t$.

The input gate uses the vector $i_t$ to decide which information the LSTM network needs to remember (Figure BDA0003315159880000085):

Figure BDA0003315159880000086

In summary, at time $t$ the updated memory vector $c_t$ of the LSTM network is:

Figure BDA0003315159880000087

After updating the memory vector, the LSTM network obtains $h_t$, which incorporates the information of the current time step, through the output gate. As with the input gate, the LSTM network uses the hidden feature of the previous time step to help calculate the output gate control vector $o_t$:

$$o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho})$$

Finally, the output gate control vector decides which information is kept in the hidden vector, and $h_t$ is updated:

$$h_t = o_t \odot \tanh(c_t)$$
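The gate equations above can be traced with a single LSTM time step written in plain Python. This is a sketch under stated assumptions: the memory update uses the standard LSTM form $c_t = f_t \odot c_{t-1} + i_t \odot g_t$ (the patent gives this update only as an image), and the helper names `affine` and `lstm_step` and the all-zero toy parameters are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W_x, b_x, W_h, b_h, x, h):
    """Compute W_x x + b_x + W_h h + b_h for one gate, row by row."""
    return [
        sum(w * v for w, v in zip(W_x[r], x)) + b_x[r]
        + sum(w * v for w, v in zip(W_h[r], h)) + b_h[r]
        for r in range(len(b_x))
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One time step; params maps 'i', 'f', 'g', 'o' to (W_x, b_x, W_h, b_h)."""
    i_t = [sigmoid(z) for z in affine(*params["i"], x, h_prev)]    # input gate
    f_t = [sigmoid(z) for z in affine(*params["f"], x, h_prev)]    # forget gate
    g_t = [math.tanh(z) for z in affine(*params["g"], x, h_prev)]  # candidate
    o_t = [sigmoid(z) for z in affine(*params["o"], x, h_prev)]    # output gate
    # Standard memory update (assumed): keep part of the old memory, add new.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy case: input size 2, hidden size 1, all weights and biases zero, so every
# sigmoid gate equals 0.5 and the candidate g_t equals 0.
params = {gate: ([[0.0, 0.0]], [0.0], [[0.0]], [0.0]) for gate in "ifgo"}
h_t, c_t = lstm_step([1.0, 2.0], [0.0], [2.0], params)
print(h_t, c_t)
```

In the toy case the forget gate halves the old memory and nothing new is written, which makes the roles of $f_t$ and $i_t$ easy to see before real weights are involved.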

S6: Based on the chromatogram trend features, extract the volatile compound content features corresponding to each retention time with a multilayer perceptron, and concatenate the chromatogram trend features with the volatile compound content features.

This embodiment uses a multilayer perceptron (MLP) to extract the volatile compound content features corresponding to each retention time:

$$layer_i(X) = ReLU(X W_i)$$

Figure BDA0003315159880000091

Figure BDA0003315159880000092

where $layer_i$ is the $i$-th network layer, $W_i$ is the parameter matrix of the $i$-th layer, $X$ is the design matrix of the position features, and $layer_o$ is the $o$-th network layer.
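As an illustration of $layer_i(X) = ReLU(XW_i)$, the sketch below stacks two such layers and concatenates the result with a trend feature vector. How the layers are composed and how the two feature sets are joined are assumptions here, since the patent gives those formulas only as images; the function names and toy matrices are hypothetical.

```python
def relu_layer(X, W):
    """layer(X) = ReLU(X W) for X of shape (rows, d_in) and W of shape (d_in, d_out)."""
    d_out = len(W[0])
    return [
        [max(0.0, sum(x * W[k][j] for k, x in enumerate(row))) for j in range(d_out)]
        for row in X
    ]

def mlp(X, weights):
    """Apply the layers one after another (assumed composition)."""
    for W in weights:
        X = relu_layer(X, W)
    return X

X = [[1.0, 2.0]]                            # one sample, two input features
W1 = [[1.0, 0.0, -1.0], [0.5, 1.0, 0.0]]    # 2 -> 3
W2 = [[1.0], [1.0], [1.0]]                  # 3 -> 1
content = mlp(X, [W1, W2])                  # content features from the MLP

trend = [[0.2, 0.4]]                        # stand-in for the LSTM trend features
fused = [c + t for c, t in zip(content, trend)]  # per-sample concatenation
print(fused)
```

The negative pre-activation in the first layer is clipped to zero by ReLU, and the fused vector simply places the content features in front of the trend features.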

S7: Classify the concatenated features with a feedforward neural network.

Example 2

To verify the technical effect of this method, this example compares it with principal component analysis (PCA), LDA, RF, and SVM, and uses the experimental results to demonstrate the effect the method actually achieves.

Live crayfish samples: the day of collection after harvest was recorded as day 0; the samples were placed in a 4°C refrigerator and stored for 1, 2, 3, 4, and 5 days. Dead crayfish samples were stored at 4°C for 6 h and 12 h and at room temperature (25°C) for 3 h and 24 h; the group stored at 25°C for 24 h was the spoilage group.

The dead crayfish samples were processed as follows:

(1) Select crayfish of similar size and discard dead or damaged individuals; pack 5 crayfish per bag and suffocate them under a vacuum of 0.1 MPa.

(2) At a fixed time each day, take 20 crayfish from the 4°C refrigerator and leave them at room temperature for 1 h so that they return to room temperature; place each crayfish in its own 500 mL beaker, seal the beaker with a double layer of plastic wrap, and let the headspace equilibrate for 30 min.

(3) Preheat the instrument for 30 min, insert the sampling needle 5 cm into the beaker, and draw 5000 μL; analyze the crayfish from the different storage times with the Heracles II ultra-fast gas-phase electronic nose, using the instrument parameters listed in Table 1.

Table 1: Analysis parameters.

No. | Parameter | Condition
1 | Injection volume | 5000 μL
2 | Inlet temperature | 200°C
3 | Injection duration | 45 s
4 | Trap initial temperature | 40°C
5 | Trap split rate | 10 mL/min
6 | Trapping duration | 50 s
7 | Trap final temperature | 240°C
8 | Initial column temperature | 50°C
9 | Column temperature program | 1°C/s to 80°C, then 2°C/s to 250°C, hold 60 s
10 | Acquisition time | 177 s
11 | Detector temperature | 260°C
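As a quick consistency check on Table 1, the duration of the column temperature program can be compared with the acquisition time. The sketch assumes that row 9 describes two linear ramps starting from the 50°C initial column temperature followed by a 60 s hold; the function name `program_duration` is hypothetical.

```python
def program_duration(start, ramps, hold):
    """Total seconds for a list of (rate in °C/s, target in °C) ramps plus a final hold."""
    total, temp = 0.0, start
    for rate, target in ramps:
        total += (target - temp) / rate  # time needed to reach the target
        temp = target
    return total + hold

# 50°C -> 80°C at 1°C/s, then 80°C -> 250°C at 2°C/s, then hold 60 s.
total = program_duration(50, [(1, 80), (2, 250)], 60)
print(total)
```

Under that reading the program takes 30 s + 85 s + 60 s = 175 s, which fits within the 177 s acquisition time in row 10.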

Principal component analysis was used to reduce the dimensionality of the chromatographic peak data, giving the PCA plot (Fig. 2), in which 1-5 are live crayfish stored at 4°C for 1-5 days, 6 and 7 are dead crayfish stored at 25°C for 3 h and 24 h, and 8 and 9 are dead crayfish stored at 4°C for 6 h and 12 h. The PCA plot shows that PC1 contributes 21.8% and PC2 contributes 12.6%, for a total contribution of 34.4%. The sample points of the different classes overlap, so processing the sample odor information with principal component analysis cannot distinguish the freshness levels of crayfish.

Based on Example 1, the specific parameters of this method are set as follows:

(1) Preprocess the chromatographic data obtained from the Heracles II ultra-fast gas-phase electronic nose.

(2) Use the samples to calculate the empirical distribution of the peak height, find the most probable peak-height interval among the first 666 records, and estimate the baseline value as the mean of that interval.

The 177 s ultra-fast gas-phase electronic nose analysis yields 35,407 peak-height data points, of which the first 666 are used. Based on the maximum and minimum of these 666 points, the range is divided into 67 data segments with a step of 10; the number of points falling in each segment is counted, the segment containing the most points is identified, and the mean of the data in that segment is taken as the baseline value.
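To put the figures above in context, the short calculation below estimates how much of the run the first 666 points cover. It assumes the detector samples at a uniform rate over the 177 s acquisition, which the text does not state; the variable names are illustrative.

```python
n_points = 35407     # peak-height data points per run (from the text)
run_seconds = 177    # acquisition time (from the text)
n_baseline = 666     # leading points used for the baseline (from the text)

rate_hz = n_points / run_seconds          # assumes uniform sampling
baseline_window_s = n_baseline / rate_hz  # time span of the first 666 points

print(f"{rate_hz:.1f} Hz, first {baseline_window_s:.2f} s of the run")
```

Under the uniform-sampling assumption the baseline is therefore estimated from roughly the first 3.3 s of each chromatogram.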

(3) Define the initially annotated storage-time label, which may be erroneous, as $\tilde{y}$ and the true label as $y^*$. There are 380 samples in 9 classes: live crayfish (stored at 4°C for 1, 2, 3, 4, or 5 days) and dead crayfish (stored at 4°C for 6 h or 12 h, or at 25°C for 3 h or 24 h), with 60 samples in each live class and 20 samples in each dead class.

Further, the samples of each class are divided evenly into 5 parts; one part is taken as the test set and the other 4 parts as the training set, and the estimated probabilities $p = \{p_j, j = 0, 1, \ldots, 9\}$ of the test-set samples are calculated. This is repeated 5 times to obtain out-of-fold predictions for all samples, where $p_j = \hat{p}(\tilde{y} = j; x)$ is the probability that sample $x$ belongs to the $j$-th class.
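The per-class 5-way split described above amounts to a stratified fold assignment, sketched below. The function name `stratified_folds`, the fixed random seed, and the integer class codes 0 to 8 are assumptions for the example; the class sizes (5 classes of 60 and 4 classes of 20) come from the text.

```python
import random

def stratified_folds(labels, k, seed=0):
    """Assign every sample index to one of k folds, class by class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

# 5 live classes with 60 samples each, 4 dead classes with 20 samples each.
labels = [c for c in range(5) for _ in range(60)] + \
         [c for c in range(5, 9) for _ in range(20)]
folds = stratified_folds(labels, k=5)
print([len(f) for f in folds])
```

Each fold holds 12 samples from every live class and 4 from every dead class, 76 in total; training on four folds and predicting the fifth, five times over, yields one out-of-fold prediction per sample.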

Select $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples for filtering, sort them by the margin $\hat{p}(\tilde{y}=j; x) - \hat{p}(\tilde{y}=i; x)$, and filter the 22 samples with the largest margin in each category. Of the 22 filtered samples, 10 samples in which live and dead crayfish had been confused were removed according to the true labels, and the original labels of the remaining 12 samples were changed to the true labels.

(4) The 128 features from the multilayer perceptron and the 64 features from the sequence model are concatenated and classified into 9 classes by a feedforward neural network.

This method, LDA, RF, and SVM were each used to build models on the 300 live crayfish samples (stored at 4°C for 1, 2, 3, 4, or 5 days) and the 80 dead crayfish samples (stored at 4°C for 6 h or 12 h, or at 25°C for 3 h or 24 h). In each group, 80% of the crayfish samples were randomly selected as the training set and 20% as the validation set, and the prediction accuracy of each model was calculated from the confusion matrix (Fig. 3).

Table 2: Results of the crayfish freshness prediction models.

Figure BDA0003315159880000114

As Table 2 shows, the prediction accuracies of the five models built with this method are 92.00%, 93.33%, 95.95%, 97.30%, and 97.30%, and the overall prediction accuracy reaches 95.16%. This is well above the accuracies of the models built with LDA, RF, and SVM (79.16%, 75.99%, and 71.50%), which indicates that the method is stable and accurate.
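The relation between the five per-run accuracies and the overall figure can be checked with a short calculation. The validation counts below are hypothetical: the patent does not list them, and they were chosen only because they reproduce the reported percentages. They suggest the overall value may be pooled over all validation samples rather than averaged over runs.

```python
# (correct, total) per run: hypothetical counts consistent with the reported
# accuracies; the patent does not give the validation-set sizes.
runs = [(69, 75), (70, 75), (71, 74), (72, 74), (72, 74)]

per_run = [round(100 * c / n, 2) for c, n in runs]
pooled = round(100 * sum(c for c, _ in runs) / sum(n for _, n in runs), 2)
simple_mean = round(sum(per_run) / len(per_run), 2)

print(per_run, pooled, simple_mean)
```

With these counts the pooled accuracy matches the reported 95.16%, whereas the plain mean of the five percentages would round to 95.18%.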

It should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution may be modified or replaced with equivalents without departing from its spirit and scope, and all such modifications fall within the scope of the claims of the present invention.

Claims (7)

1. A crayfish freshness detection method based on a gas-phase electronic nose and machine learning, characterized by comprising:
placing a crayfish sample in a beaker, sealing the beaker with a double layer of plastic wrap, and letting the headspace equilibrate;
preheating an ultra-fast gas-phase electronic nose instrument, and inserting a sampling needle into the beaker for sampling to obtain a chromatogram;
performing max-min normalization preprocessing on the peak heights of the chromatogram;
preprocessing the baseline data of the peak heights, and eliminating the label noise of the crayfish samples with a confident learning strategy;
extracting features from the chromatogram with a sequence model to obtain chromatogram trend features that describe how the odor changes with freshness;
extracting, based on the chromatogram trend features, the volatile compound content features corresponding to each retention time with a multilayer perceptron, and concatenating the chromatogram trend features with the volatile compound content features;
classifying the concatenated features with a feedforward neural network;
wherein the sequence model first obtains a coarse trend feature X through several convolutions and then extracts the trend feature SLSTM(X) of X, namely the chromatogram trend feature, with LSTM networks:
Figure FDA0003883383250000011
wherein $LSTM_1$ and $LSTM_2$ are LSTM networks.
2. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to claim 1, characterized in that the normalization preprocessing comprises:
$$h_{scale} = \frac{h - h_{min}}{h_{max} - h_{min}}$$
wherein $h_{scale}$ is the normalized chromatogram peak height, $h$ is the chromatogram peak height, $h_{min}$ is the minimum chromatogram peak height, and $h_{max}$ is the maximum chromatogram peak height.
3. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to claim 2, characterized in that preprocessing the baseline data of the peak heights comprises:
calculating the empirical distribution of the peak height, $\hat{F}_n(k)$;
for the value range $R = \{h \mid 0 < h < +\infty\}$ of the peak height $h$ and any given positive constant $s$, there exists a partition $S = \{S_1, S_2, \ldots, S_r\}$ satisfying:
$$S_i = \{h \mid (i-1) \times s \le h \le i \times s,\ \sup(R) \le r \times s\}, \quad i = 1, 2, \ldots, r;$$
defining the event $A_i = \{h \mid h \in S_i\}$ that the peak height $h$ falls in the $i$-th data segment, and calculating the probability $P(A_i)$ of the event and the estimated baseline value $\hat{h}_b$:
$$P(A_i) = \hat{F}_n(i \times s) - \hat{F}_n((i-1) \times s)$$
$$\hat{h}_b = \frac{1}{n P(A_m)} \sum_{h_j \in S_m} h_j, \quad m = \arg\max_i P(A_i)$$
wherein $S_r$ is the $r$-th data segment of the partition; $m$ is the index of the interval corresponding to the event $A_i$ with the highest probability; $S_m$ is the segment corresponding to that event; $n$ is the total number of peak heights; $\hat{F}_n(i \times s)$ is the empirical distribution for the $i$-th segment; and $\hat{F}_n((i-1) \times s)$ is the empirical distribution for the $(i-1)$-th segment.
4. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to claim 3, characterized in that calculating the empirical distribution of the peak height $\hat{F}_n(k)$ comprises:
regarding the chromatogram peak heights $h_1, h_2, \ldots, h_n$ as independent and identically distributed real random variables with cumulative distribution function $F(k)$, and obtaining the empirical distribution of the peak height:
$$\hat{F}_n(k) = \frac{1}{n}\sum_{i=1}^{n} I_{\{h_i \le k\}}$$
wherein $I_{\{h_i \le k\}}$ is the indicator function of $\{h_i \mid h_i \le k\}$.
5. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to claim 4, characterized in that eliminating the label noise of the crayfish samples comprises:
defining the initially annotated storage-day label, which may be erroneous, as $\tilde{y}$ and the true label as $y^*$, the total number of samples being $N$ and the number of categories being $M$;
dividing the $N$ samples evenly into $a$ parts, taking one part as the test set and the remaining $a-1$ parts as the training set, calculating the estimated probabilities $p = \{p_j, j = 0, 1, \ldots, M\}$ of the test-set samples, and repeating $a$ times to obtain out-of-fold predictions for all samples;
calculating the average probability $t_j$ for each labelled category $j$ and using it as the confidence threshold:
$$t_j = \frac{1}{|X_{\tilde{y}=j}|} \sum_{x \in X_{\tilde{y}=j}} \hat{p}(\tilde{y} = j; x)$$
calculating the count matrix $C_{\tilde{y}, y^*}$:
$$C_{\tilde{y}, y^*}[i][j] = |\hat{X}_{\tilde{y}=i, y^*=j}|$$
$$\hat{X}_{\tilde{y}=i, y^*=j} = \{x \in X_{\tilde{y}=i} : \hat{p}(\tilde{y}=j; x) \ge t_j,\ j = \arg\max_{l:\ \hat{p}(\tilde{y}=l; x) \ge t_l} \hat{p}(\tilde{y}=l; x)\}$$
calibrating the count matrix:
$$\tilde{C}_{\tilde{y}=i, y^*=j} = \frac{C_{\tilde{y}=i, y^*=j}}{\sum_{j'} C_{\tilde{y}=i, y^*=j'}} \cdot |X_{\tilde{y}=i}|$$
estimating the joint distribution $\hat{Q}_{\tilde{y}, y^*}$ of the initial label $\tilde{y}$ and the true label $y^*$:
$$\hat{Q}_{\tilde{y}=i, y^*=j} = \frac{\tilde{C}_{\tilde{y}=i, y^*=j}}{\sum_{i', j'} \tilde{C}_{\tilde{y}=i', y^*=j'}}$$
for each off-diagonal entry of the count matrix $C_{\tilde{y}, y^*}$, selecting $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples for filtering, sorting them by the margin $\hat{p}(\tilde{y}=j; x) - \hat{p}(\tilde{y}=i; x)$, and filtering out the $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples with the largest margin in each category;
wherein the probability that sample $x$ belongs to the $j$-th category is $p_j = \hat{p}(\tilde{y} = j; x)$; $|X_{\tilde{y}=j}|$ is the number of samples whose initial label is $\tilde{y} = j$; $l$ denotes a label satisfying $\hat{p}(\tilde{y}=l; x) \ge t_l$; and $\tilde{C}_{\tilde{y}=i, y^*=j}$ is the calibrated value of the count matrix entry $C_{\tilde{y}=i, y^*=j}$.
6. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to claim 5, characterized in that:
the trend feature X is a sequence of length 65, each position $t$ of which contains the 64 numerical features $x_t$ of the corresponding time segment.
7. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to claim 6, characterized in that the volatile compound content features comprise:
$$layer_i(X) = ReLU(X W_i)$$
Figure FDA00038833832500000314
Figure FDA00038833832500000315
wherein $layer_i$ is the $i$-th network layer; $W_i$ is the parameter matrix of the $i$-th layer; $X$ is the design matrix of the position features; and $layer_o$ is the $o$-th network layer.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111228666.9A CN113984946B (en) 2021-10-21 2021-10-21 Crayfish freshness detection method based on gas phase electronic nose and machine learning

Publications (2)

Publication Number Publication Date
CN113984946A CN113984946A (en) 2022-01-28
CN113984946B true CN113984946B (en) 2022-12-27
