CN113984946B - Crayfish freshness detection method based on gas phase electronic nose and machine learning - Google Patents

Crayfish freshness detection method based on gas phase electronic nose and machine learning

Info

Publication number
CN113984946B
Authority
CN
China
Prior art keywords
chromatogram
peak height
crayfish
electronic nose
phase electronic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111228666.9A
Other languages
Chinese (zh)
Other versions
CN113984946A (en
Inventor
许艳顺
汤楚涵
颜孙洁
夏文水
余达威
姜启兴
杨方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University
Priority to CN202111228666.9A
Publication of CN113984946A
Application granted
Publication of CN113984946B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8675 Evaluation, i.e. decoding of the signal into analytical information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computational Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Pure & Applied Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Library & Information Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Biochemistry (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

The invention discloses a crayfish freshness detection method based on a gas-phase electronic nose and machine learning. The method comprises: placing a crayfish sample in a beaker, sealing it with a double layer of plastic wrap, and letting it stand to build up the headspace; preheating an ultra-fast gas-phase electronic nose and inserting the injection needle deep into the beaker to sample, obtaining a chromatogram; applying min-max normalization to the chromatogram peak heights; preprocessing the baseline data of the peak heights, and eliminating the label noise of the chromatograms with confident learning; extracting features from the chromatogram with a sequence model to obtain the chromatogram trend features of the odor changes at different freshness levels; extracting, with a multilayer perceptron and based on the chromatogram trend features, the content features of the volatile compounds corresponding to each retention time, and splicing the chromatogram trend features with the volatile compound content features; and classifying the spliced features with a feedforward neural network. The method accurately captures the odor information of crayfish at different freshness levels and classifies crayfish freshness accurately.

Description

Crayfish freshness detection method based on gas phase electronic nose and machine learning
Technical Field
The invention relates to the technical field of crayfish freshness detection, in particular to a crayfish freshness detection method based on a gas phase electronic nose and machine learning.
Background
The crayfish (Procambarus clarkii) is one of the important commercial freshwater aquatic products in China. It is popular with consumers for its tender meat, good flavor and rich nutrition. In recent years the crayfish industry in China has developed rapidly, with farming area and output both growing quickly; the total farmed crayfish output in China reached 2.3937 million tons in 2020. However, the farming environment is complex, and crayfish generally carry many microorganisms on the body surface and inside the body. Their freshness therefore declines to varying degrees during preservation, storage, transportation and processing, and they may even spoil after death, which creates a potential safety risk for processed crayfish products.
An electronic nose is a device that simulates the biological olfactory system and classifies and identifies samples by detecting their volatile compounds. It requires no sample pretreatment and no solvent, has a wide range of applications, a short detection time and high sensitivity, and gives comparatively comprehensive and objective results. Electronic noses can be divided into sensor-type, mass-spectrometry and ultra-fast gas-phase electronic noses. The sensor type is the most common and is mainly used in fields such as food, medicine and traditional Chinese medicine, but it is time-consuming, has redundant sensors and is strongly affected by external conditions. The Heracles II ultra-fast gas-phase electronic nose is a newer odor analysis instrument equipped with two chromatographic columns of different polarity. It replaces the sensor signals of a conventional sensor-type electronic nose with gas-chromatographic peaks, which yields more compound signals and allows volatile compounds of different polarity to be separated accurately. It offers high sensitivity, a short detection time and a wide range of applications, and has played an important role in classifying and identifying milk, baijiu (Chinese liquor), mutton and fruit. However, the odor differences between live crayfish stored for different lengths of time are small, so simple dimensionality reduction such as principal component analysis makes it difficult to judge crayfish freshness accurately at the data processing stage.
Disclosure of Invention
This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention of this application, simplifications or omissions may be made to avoid obscuring the purpose of the section, the abstract and the title, and such simplifications or omissions are not intended to limit the scope of the invention.
The present invention has been made in view of the above-mentioned conventional problems.
Therefore, the invention provides the crayfish freshness detection method based on the gas-phase electronic nose and machine learning, which can accurately obtain the odor information of crayfish with different freshness and accurately judge the freshness of the crayfish.
To solve the above technical problems, the invention provides the following technical solution, comprising: placing a crayfish sample in a beaker, sealing it with a double layer of plastic wrap, and letting it stand to build up the headspace; preheating an ultra-fast gas-phase electronic nose and inserting the injection needle into the beaker to sample, obtaining a chromatogram; applying min-max normalization to the chromatogram peak heights; preprocessing the baseline data of the peak heights, and eliminating the label noise of the crayfish samples with a confident learning strategy; extracting features from the chromatogram with a sequence model to obtain the chromatogram trend features of the odor changes at different freshness levels; extracting, with a multilayer perceptron and based on the chromatogram trend features, the content features of the volatile compounds corresponding to each retention time, and splicing the chromatogram trend features with the volatile compound content features; and classifying the spliced features with a feedforward neural network.
In a preferred embodiment of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning, the normalization preprocessing comprises:
$$h_{scale} = \frac{h - h_{min}}{h_{max} - h_{min}}$$
where $h_{scale}$ is the normalized chromatogram peak height, $h$ is the chromatogram peak height, $h_{min}$ is the minimum chromatogram peak height, and $h_{max}$ is the maximum chromatogram peak height.
In a preferred embodiment of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning, preprocessing the baseline data of the peak heights comprises: calculating the empirical distribution $\hat{F}_n(k)$ of the peak heights;
for the value range $R = \{h \mid 0 < h < +\infty\}$ of the peak height $h$ and any given positive constant $s$, there is a partition $S = \{S_1, S_2, \ldots, S_r\}$ satisfying:
$$S_i = \{h \mid (i-1) \times s \le h \le i \times s,\ \sup(R) \le r \times s\}, \quad i = 1, 2, \ldots, r;$$
defining the event $A_i = \{h \mid h \in S_i\}$ that the peak height $h$ falls in the $i$-th data segment, the probability of the event is
$$P(A_i) = \hat{F}_n(i \times s) - \hat{F}_n((i-1) \times s);$$
and calculating the estimated baseline value:
Figure BDA0003315159880000023
Figure BDA0003315159880000024
Figure BDA0003315159880000025
where $S_r$ is the $r$-th data segment of the partition; $m$ is the index of the segment whose event $A_i$ has the highest probability of occurrence; $S_m$ is the segment corresponding to that event; $n$ is the total number of peak heights; $\hat{F}_n(i \times s)$ is the empirical distribution at the $i$-th partition; and $\hat{F}_n((i-1) \times s)$ is the empirical distribution at the $(i-1)$-th partition.
In a preferred embodiment of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning, calculating the empirical distribution $\hat{F}_n(k)$ of the peak heights comprises: regarding the chromatogram peak heights $h_1, h_2, \ldots, h_n$ as independent and identically distributed real random variables with cumulative distribution function $F(k)$, and obtaining the empirical distribution of the peak heights:
$$\hat{F}_n(k) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{h_i \le k\}}$$
where $\mathbf{1}_{\{h_i \le k\}}$ is the indicator function of the set $\{h_i \mid h_i \le k\}$.
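The empirical distribution described above can be illustrated with a minimal Python sketch. The function name `empirical_cdf` and the sample heights are hypothetical and chosen only for illustration; the sketch simply counts the fraction of peak heights that do not exceed a threshold `k`.

```python
def empirical_cdf(heights, k):
    """Return the fraction of peak heights h_i that satisfy h_i <= k."""
    if not heights:
        raise ValueError("need at least one peak height")
    return sum(1 for h in heights if h <= k) / len(heights)


# Hypothetical normalized peak heights, for illustration only.
heights = [0.1, 0.1, 0.2, 0.5, 0.9]
cdf_at_02 = empirical_cdf(heights, 0.2)  # 3 of the 5 heights are <= 0.2
```

Evaluating the function on a grid of thresholds gives the step function that the partition-based baseline estimate works with.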
In a preferred embodiment of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning, eliminating the label noise comprises: defining the initially annotated, possibly erroneous storage-day label as $\tilde{y}$ and the true label as $y^*$, with $N$ samples in total and $M$ classes; dividing the $N$ samples evenly into $a$ parts, taking one part as the test set and the remaining $a-1$ parts as the training set, calculating the estimated probabilities $p = \{p_j, j = 0, 1, \ldots, M\}$ of the test-set samples, and repeating $a$ times to obtain out-of-fold predictions for all samples; calculating the average probability $t_j$ under each annotated class $j$ and using it as the confidence threshold:
$$t_j = \frac{1}{|X_{\tilde{y}=j}|} \sum_{x \in X_{\tilde{y}=j}} \hat{p}(\tilde{y}=j; x)$$
calculating the count matrix $C_{\tilde{y}, y^*}$:
$$C_{\tilde{y}, y^*}[i][j] = |\hat{X}_{\tilde{y}=i, y^*=j}|$$
$$\hat{X}_{\tilde{y}=i, y^*=j} = \{x \in X_{\tilde{y}=i} : \hat{p}(\tilde{y}=j; x) \ge t_j,\ j = \arg\max_{l:\ \hat{p}(\tilde{y}=l; x) \ge t_l} \hat{p}(\tilde{y}=l; x)\}$$
calibrating the count matrix:
$$\tilde{C}_{\tilde{y}=i, y^*=j} = \frac{C_{\tilde{y}=i, y^*=j}}{\sum_{j} C_{\tilde{y}=i, y^*=j}} \cdot |X_{\tilde{y}=i}|$$
estimating the joint distribution $\hat{Q}_{\tilde{y}, y^*}$ of the initial label $\tilde{y}$ and the true label $y^*$:
$$\hat{Q}_{\tilde{y}=i, y^*=j} = \frac{\tilde{C}_{\tilde{y}=i, y^*=j}}{\sum_{i,j} \tilde{C}_{\tilde{y}=i, y^*=j}}$$
and, for the off-diagonal cells of the count matrix $C_{\tilde{y}, y^*}$, selecting $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples for filtering: the samples are sorted by the margin $\hat{p}(\tilde{y}=j; x) - \hat{p}(\tilde{y}=i; x)$, and the $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples with the largest margin are filtered out of each class;
where $\hat{p}(\tilde{y}=j; x)$ is the probability that a sample $x$ belongs to the $j$-th class; $|X_{\tilde{y}=j}|$ is the number of samples whose initial label is $\tilde{y}=j$; $l$ denotes a label satisfying $\hat{p}(\tilde{y}=l; x) \ge t_l$; and $\tilde{C}_{\tilde{y}=i, y^*=j}$ is the calibrated value of the count matrix $C_{\tilde{y}, y^*}$.
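The first step of the label-noise procedure, obtaining a predicted probability for every sample from a model that never saw that sample during training, can be sketched as follows. The source does not say which classifier produces the probabilities, so the nearest-centroid model with a softmax over negative squared distances is purely a stand-in, and all function names and data are hypothetical. Class indices run from 0 to `num_classes - 1` here.

```python
import math


def fold_indices(n, a):
    """Split range(n) into a folds of near-equal size (round-robin)."""
    return [list(range(k, n, a)) for k in range(a)]


def fit_centroids(X, y, num_classes):
    """Stand-in classifier: the mean feature vector of each class."""
    dim = len(X[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for x, label in zip(X, y):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += x[d]
    return [[v / c for v in s] if c else None for s, c in zip(sums, counts)]


def predict_proba(centroids, x):
    """Softmax over negative squared distances to the class centroids."""
    scores = []
    for c in centroids:
        if c is None:
            scores.append(-math.inf)  # class absent from this training fold
        else:
            scores.append(-sum((u - v) ** 2 for u, v in zip(x, c)))
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def out_of_fold_probabilities(X, y, num_classes, a):
    """Each sample is predicted by a model trained on the other a-1 folds."""
    probs = [None] * len(X)
    for test_idx in fold_indices(len(X), a):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        centroids = fit_centroids(
            [X[i] for i in train], [y[i] for i in train], num_classes
        )
        for i in test_idx:
            probs[i] = predict_proba(centroids, X[i])
    return probs


# Hypothetical one-dimensional features with two well separated classes.
X = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
y = [0, 0, 0, 1, 1, 1]
oof = out_of_fold_probabilities(X, y, num_classes=2, a=3)
```

Because every probability comes from a held-out fold, a mislabelled sample cannot inflate its own predicted probability, which is what makes the later thresholding meaningful.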
In a preferred embodiment of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning, obtaining the chromatogram trend features comprises: the sequence model first obtains a coarse trend feature $X$ through multiple convolutions, and then extracts the trend feature SLSTM(X) of $X$ with LSTM networks; SLSTM(X) is the chromatogram trend feature:
Figure BDA0003315159880000043
where $LSTM_1$ and $LSTM_2$ are LSTM networks.
In a preferred embodiment of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning, the trend feature $X$ is a sequence of length 65, and each position $t$ contains 64 numerical features $x_t$ of the corresponding time period.
In a preferred embodiment of the crayfish freshness detection method based on a gas-phase electronic nose and machine learning, the volatile compound content features comprise:
$$layer_i(X) = \mathrm{ReLU}(X W_i)$$
Figure BDA0003315159880000041
Figure BDA0003315159880000042
where $layer_i$ is the $i$-th network layer; $W_i$ is the parameter of the $i$-th layer; $X$ is the design matrix of the position features; and $layer_o$ is the layer-$o$ network.
The beneficial effects of the invention are as follows. The method uses an ultra-fast gas-phase electronic nose to capture the odor changes of live crayfish at different freshness levels and of crayfish that have been dead for different lengths of time, so the odor information of crayfish at different freshness levels is obtained more directly and accurately. Label noise is eliminated with confident learning, which improves prediction accuracy. In addition, an LSTM and an MLP respectively extract the trend features and the relative volatile compound content features of the ultra-fast gas-phase electronic nose chromatographic data; after the extracted features are spliced, a feedforward neural network classifies the freshness of the crayfish, which gives better stability and accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise. Wherein:
FIG. 1 is a chromatogram from the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the first embodiment of the present invention;
FIG. 2 is a PCA plot of the peak height data from the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the second embodiment of the present invention;
FIG. 3 is a confusion matrix from the crayfish freshness detection method based on a gas-phase electronic nose and machine learning according to the second embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, specific embodiments accompanied with figures are described in detail below, and it is apparent that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present invention, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
Furthermore, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
The present invention will be described in detail with reference to the drawings, wherein the cross-sectional views illustrating the structure of the device are not enlarged partially in general scale for convenience of illustration, and the drawings are only exemplary and should not be construed as limiting the scope of the present invention. In addition, the three-dimensional dimensions of length, width and depth should be included in the actual fabrication.
Also in the description of the present invention, it should be noted that the terms "upper, lower, inner and outer" and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, which are only for convenience of description and simplification of description, but do not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms first, second, or third are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected and connected" in the present invention are to be understood broadly, unless otherwise explicitly specified or limited, for example: can be fixedly connected, detachably connected or integrally connected; they may be mechanically, electrically, or directly connected, or indirectly connected through intervening media, or may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Example 1
Referring to FIG. 1, a first embodiment of the present invention provides a crayfish freshness detection method based on a gas-phase electronic nose and machine learning, comprising:
S1: Place the crayfish sample in a beaker, seal it with a double layer of plastic wrap, and let it stand to build up the headspace.
S2: Preheat the ultra-fast gas-phase electronic nose and insert the injection needle into the beaker to sample, obtaining a chromatogram.
S3: Apply min-max normalization to the chromatogram peak heights.
Normalization preprocessing:
$$h_{scale} = \frac{h - h_{min}}{h_{max} - h_{min}}$$
where $h_{scale}$ is the normalized chromatogram peak height, $h$ is the chromatogram peak height, $h_{min}$ is the minimum chromatogram peak height, and $h_{max}$ is the maximum chromatogram peak height.
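Step S3 can be illustrated with a minimal Python sketch of min-max scaling. The function name, the sample peak heights and the handling of a completely flat chromatogram are assumptions made for illustration; the source does not specify what happens when the maximum and minimum coincide.

```python
def min_max_normalize(heights):
    """Scale peak heights to [0, 1] via (h - h_min) / (h_max - h_min)."""
    h_min, h_max = min(heights), max(heights)
    span = h_max - h_min
    if span == 0:
        # Assumption: a flat chromatogram maps to all zeros,
        # which avoids a division by zero.
        return [0.0 for _ in heights]
    return [(h - h_min) / span for h in heights]


# Hypothetical raw peak heights, for illustration only.
peaks = [120.0, 480.0, 300.0, 1020.0]
scaled = min_max_normalize(peaks)
```

After scaling, the smallest peak height maps to 0 and the largest to 1, so chromatograms with different absolute signal levels become comparable.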
S4: Preprocess the baseline data of the peak heights, and eliminate the label noise of the crayfish samples with a confident learning strategy.
(1) Preprocess the baseline data of the peak heights.
(1.1) Calculate the empirical distribution $\hat{F}_n(k)$ of the peak heights.
The chromatogram peak heights $h_1, h_2, \ldots, h_n$ are regarded as independent and identically distributed real random variables with cumulative distribution function $F(k)$, which gives the empirical distribution of the peak heights:
$$\hat{F}_n(k) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{h_i \le k\}}$$
where $\mathbf{1}_{\{h_i \le k\}}$ is the indicator function of the set $\{h_i \mid h_i \le k\}$.
(1.2) For the value range $R = \{h \mid 0 < h < +\infty\}$ of the peak height $h$ and any given positive constant $s$, there is a partition $S = \{S_1, S_2, \ldots, S_r\}$ satisfying:
$$S_i = \{h \mid (i-1) \times s \le h \le i \times s,\ \sup(R) \le r \times s\}, \quad i = 1, 2, \ldots, r;$$
where $S_r$ is the $r$-th data segment of the partition.
(1.3) Define the event $A_i = \{h \mid h \in S_i\}$ that the peak height $h$ falls in the $i$-th data segment; the probability of the event is
$$P(A_i) = \hat{F}_n(i \times s) - \hat{F}_n((i-1) \times s)$$
Calculate the estimated baseline value:
Figure BDA0003315159880000066
Figure BDA0003315159880000067
Figure BDA0003315159880000068
where $m$ is the index of the segment whose event $A_i$ has the highest probability of occurrence; $S_m$ is the segment corresponding to that event; $n$ is the total number of peak heights; $\hat{F}_n(i \times s)$ is the empirical distribution at the $i$-th partition; and $\hat{F}_n((i-1) \times s)$ is the empirical distribution at the $(i-1)$-th partition.
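The baseline preprocessing can be sketched in Python under two explicit assumptions, because the baseline equations themselves are not reproduced in the source text. First, the segments are treated as half-open intervals, so that segment $i$ covers $(i-1) \times s < h \le i \times s$ and no height is counted twice. Second, the estimated baseline is taken to be the mean of the peak heights that fall in the most populated segment $S_m$. The function name and the data are hypothetical.

```python
import math
from collections import defaultdict


def estimate_baseline(heights, s):
    """Return (m, P(A_m), baseline) for segment width s.

    Assumption: segment i covers (i-1)*s < h <= i*s, and the baseline is
    the mean of the heights in the most populated segment S_m.
    """
    segments = defaultdict(list)
    for h in heights:
        # Heights are assumed positive, so the smallest index is 1.
        i = max(1, math.ceil(h / s))
        segments[i].append(h)
    n = len(heights)
    # P(A_i) equals the increase of the empirical distribution over segment i.
    probabilities = {i: len(members) / n for i, members in segments.items()}
    m = max(probabilities, key=probabilities.get)
    baseline = sum(segments[m]) / len(segments[m])
    return m, probabilities[m], baseline


# Hypothetical signal: most points sit near 1.0, with two tall peaks.
heights = [1.02, 1.05, 0.98, 1.01, 3.4, 7.9, 1.04]
m, p_m, baseline = estimate_baseline(heights, s=0.1)
```

The idea behind this kind of estimate is that most sampled points of a chromatogram lie on the baseline rather than on a peak, so the most populated height segment is a reasonable proxy for it.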
(2) Eliminate the label noise of the crayfish samples.
(2.1) Define the initially annotated, possibly erroneous storage-day label as $\tilde{y}$ and the true label as $y^*$; the total number of samples is $N$ and the number of classes is $M$.
(2.2) Divide the $N$ samples evenly into $a$ parts, take one part as the test set and the remaining $a-1$ parts as the training set, calculate the estimated probabilities $p = \{p_j, j = 0, 1, \ldots, M\}$ of the test-set samples, and repeat $a$ times to obtain out-of-fold predictions for all samples;
where $\hat{p}(\tilde{y}=j; x)$ is the probability that a sample $x$ belongs to the $j$-th class.
(2.3) Calculate the average probability $t_j$ under each annotated class $j$ and use it as the confidence threshold:
$$t_j = \frac{1}{|X_{\tilde{y}=j}|} \sum_{x \in X_{\tilde{y}=j}} \hat{p}(\tilde{y}=j; x)$$
where $|X_{\tilde{y}=j}|$ is the number of samples whose initial label is $\tilde{y}=j$.
(2.4) Calculate the count matrix $C_{\tilde{y}, y^*}$:
$$C_{\tilde{y}, y^*}[i][j] = |\hat{X}_{\tilde{y}=i, y^*=j}|$$
$$\hat{X}_{\tilde{y}=i, y^*=j} = \{x \in X_{\tilde{y}=i} : \hat{p}(\tilde{y}=j; x) \ge t_j,\ j = \arg\max_{l:\ \hat{p}(\tilde{y}=l; x) \ge t_l} \hat{p}(\tilde{y}=l; x)\}$$
where $l$ denotes a label satisfying $\hat{p}(\tilde{y}=l; x) \ge t_l$.
(2.5) Calibrate the count matrix:
$$\tilde{C}_{\tilde{y}=i, y^*=j} = \frac{C_{\tilde{y}=i, y^*=j}}{\sum_{j} C_{\tilde{y}=i, y^*=j}} \cdot |X_{\tilde{y}=i}|$$
where $\tilde{C}_{\tilde{y}=i, y^*=j}$ is the calibrated value of the count matrix $C_{\tilde{y}, y^*}$.
(2.6) Estimate the joint distribution $\hat{Q}_{\tilde{y}, y^*}$ of the initial label $\tilde{y}$ and the true label $y^*$:
$$\hat{Q}_{\tilde{y}=i, y^*=j} = \frac{\tilde{C}_{\tilde{y}=i, y^*=j}}{\sum_{i,j} \tilde{C}_{\tilde{y}=i, y^*=j}}$$
(2.7) For the off-diagonal cells of the count matrix $C_{\tilde{y}, y^*}$, select $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples for filtering: sort the samples by the margin $\hat{p}(\tilde{y}=j; x) - \hat{p}(\tilde{y}=i; x)$ and filter the $N \cdot \hat{Q}_{\tilde{y}=i, y^*=j}$ samples with the largest margin out of each class.
Crayfish freshness labels can be wrong because of differences in origin, transport time and individual animals, so the manual labels can only serve as a prior estimate of freshness. This embodiment uses a confident learning strategy to eliminate incorrect manual labels, that is, samples whose labels are clearly wrong, which improves prediction accuracy.
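The thresholding, counting, calibration and filtering steps of the label-noise procedure can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the patented implementation: the out-of-fold probabilities are taken as given, the per-cell quota is rounded to the nearest integer, an empty class is given a threshold of 1.0, and all function names and data are hypothetical.

```python
def class_thresholds(probs, labels, num_classes):
    """t_j: mean predicted probability of class j over samples labelled j."""
    thresholds = []
    for j in range(num_classes):
        members = [p[j] for p, lab in zip(probs, labels) if lab == j]
        # Assumption: an empty class gets threshold 1.0.
        thresholds.append(sum(members) / len(members) if members else 1.0)
    return thresholds


def count_matrix(probs, labels, thresholds):
    """C[i][j]: samples labelled i that are confidently predicted as j."""
    m = len(thresholds)
    C = [[0] * m for _ in range(m)]
    for p, given in zip(probs, labels):
        candidates = [l for l in range(m) if p[l] >= thresholds[l]]
        if candidates:  # samples below every threshold are not counted
            C[given][max(candidates, key=lambda l: p[l])] += 1
    return C


def joint_distribution(C, labels):
    """Calibrate each row of C to the label counts, then normalize to 1."""
    calibrated = []
    for i, row in enumerate(C):
        row_sum = sum(row)
        n_i = labels.count(i)
        calibrated.append([c / row_sum * n_i if row_sum else 0.0 for c in row])
    total = sum(map(sum, calibrated))
    return [[c / total for c in row] for row in calibrated]


def samples_to_filter(probs, labels, Q):
    """For each off-diagonal cell (i, j), pick the round(N * Q[i][j])
    samples labelled i with the largest margin p_j - p_i."""
    n = len(labels)
    drop = set()
    for i in range(len(Q)):
        for j in range(len(Q)):
            if i == j:
                continue
            quota = round(n * Q[i][j])
            ranked = sorted(
                (k for k in range(n) if labels[k] == i),
                key=lambda k: probs[k][j] - probs[k][i],
                reverse=True,
            )
            drop.update(ranked[:quota])
    return sorted(drop)


# Hypothetical out-of-fold probabilities; the last sample looks mislabelled.
probs = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9], [0.1, 0.9]]
labels = [0, 0, 1, 1, 0]
t = class_thresholds(probs, labels, 2)
C = count_matrix(probs, labels, t)
Q = joint_distribution(C, labels)
noisy = samples_to_filter(probs, labels, Q)
```

In this toy data the sample at index 4 carries label 0 but is predicted as class 1 with high probability, so it lands in an off-diagonal cell and is the one selected for removal.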
S5: Extract features from the chromatogram with a sequence model to obtain the chromatogram trend features of the odor changes at different freshness levels.
The sequence model first obtains a coarse trend feature $X$ through multiple convolutions ($X$ is a sequence of length 65, and each position $t$ contains 64 numerical features $x_t$ of the corresponding time period), and then extracts the trend feature SLSTM(X) of $X$ with LSTM networks; SLSTM(X) is the chromatogram trend feature:
Figure BDA0003315159880000081
where $LSTM_1$ and $LSTM_2$ are LSTM networks.
Note that an LSTM is a recurrent neural network that can learn long-term dependencies and extract deep trend features from the odor information.
The LSTM network processes the sequence in time order. The feature $x_t$ at each position is fed into the input gate and the forget gate to obtain the control vectors $i_t$ and $f_t$:
$$i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi})$$
$$f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf})$$
Each LSTM network contains a memory vector $c$ that is passed between positions. The LSTM network uses the control vector $f_t$ from the forget gate to determine the information to forget; discarding that information from the previous memory leaves
$$f_t \odot c_{t-1}$$
The LSTM network also contains a hidden-layer feature $h$ that carries the sequence information. In the input gate, the LSTM network uses the hidden-layer feature $h_{t-1}$ of the previous time step to correct the deep representation $g_t$ extracted by the input gate, thereby capturing the interaction between features at different positions:
$$g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg})$$
where $g_t$ contains the odor quantity information at time $t$ and the overall information (trend and quantity) before time $t$.
The input gate uses the vector $i_t$ to determine the information the LSTM network needs to remember:
$$i_t \odot g_t$$
In summary, the updated memory vector $c_t$ of the LSTM network at time $t$ is
$$c_t = f_t \odot c_{t-1} + i_t \odot g_t$$
After the memory vector is updated, the LSTM network obtains $h_t$, which fuses the time information, through the output gate. As with the input gate, the LSTM network uses the hidden-layer feature of the previous time step to compute the output gate control vector $o_t$:
$$o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho})$$
Finally, the output gate control vector determines the information retained in the hidden-layer vector, updating $h_t$:
$$h_t = o_t \odot \tanh(c_t)$$
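The gate equations of S5 can be traced with a minimal pure-Python sketch of a single LSTM time step. It is an illustration only: the two bias vectors per gate in the text (for example $b_{ii}$ and $b_{hi}$) are merged into one bias vector, the parameter layout `params[name] = (W_x, W_h, b)` is a hypothetical convention, and the all-zero parameters are chosen so the result is easy to check by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_x, x, W_h, h, b):
    """Row-wise W_x @ x + W_h @ h + b for list-based matrices and vectors."""
    return [
        sum(w * v for w, v in zip(W_x[r], x))
        + sum(w * v for w, v in zip(W_h[r], h))
        + b[r]
        for r in range(len(b))
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; params[name] = (W_x, W_h, b) for 'i', 'f', 'g', 'o'.

    Simplification: the two biases per gate are merged into one vector b.
    """
    def gate(name, activation):
        W_x, W_h, b = params[name]
        return [activation(z) for z in affine(W_x, x_t, W_h, h_prev, b)]

    i_t = gate("i", sigmoid)    # input gate control vector
    f_t = gate("f", sigmoid)    # forget gate control vector
    g_t = gate("g", math.tanh)  # candidate representation
    o_t = gate("o", sigmoid)    # output gate control vector
    # c_t = f_t * c_{t-1} + i_t * g_t (element-wise)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # h_t = o_t * tanh(c_t) (element-wise)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical all-zero parameters: 2 input features, 1 hidden unit.
zero = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero for name in "ifgo"}
h_t, c_t = lstm_step([0.3, -0.7], [0.0], [2.0], params)
```

With all-zero parameters every sigmoid gate equals 0.5 and the candidate $g_t$ is 0, so half of the previous memory is kept and nothing new is written. Applying `lstm_step` position by position over the 65-step sequence $X$ would give the hidden states from which a trend feature is read.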
S6: Based on the chromatogram trend features, extract the content features of the volatile compounds corresponding to each retention time with a multilayer perceptron, and splice the chromatogram trend features with the volatile compound content features.
This embodiment uses a multilayer perceptron (MLP) to extract the volatile compound content features for each retention time:
$$layer_i(X) = \mathrm{ReLU}(X W_i)$$
Figure BDA0003315159880000091
Figure BDA0003315159880000092
where $layer_i$ is the $i$-th network layer; $W_i$ is the parameter of the $i$-th layer; $X$ is the design matrix of the position features; and $layer_o$ is the layer-$o$ network.
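The layer definition $layer_i(X) = \mathrm{ReLU}(X W_i)$ can be illustrated with a short pure-Python sketch. Because the equations that define how the layers are composed and what $layer_o$ computes are not reproduced in the source text, the sketch assumes the layers are simply applied one after another with the same ReLU form; the function names and weights are hypothetical.

```python
def matmul(A, B):
    """Multiply two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def relu_layer(X, W):
    """layer_i(X) = ReLU(X W_i), applied element-wise after the product."""
    return [[max(0.0, v) for v in row] for row in matmul(X, W)]


def mlp(X, weights):
    """Assumption: layers are chained, each one a bias-free ReLU layer."""
    for W in weights:
        X = relu_layer(X, W)
    return X


# Hypothetical design matrix (1 sample, 2 features) and two weight matrices.
X = [[1.0, -2.0]]
W1 = [[1.0, 0.0], [0.0, 1.0]]  # identity: ReLU then zeroes the negative entry
W2 = [[2.0], [3.0]]
features = mlp(X, [W1, W2])
```

Here the first layer turns `[1.0, -2.0]` into `[1.0, 0.0]`, and the second layer maps that to a single content feature.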
S7: Classify the spliced features with a feedforward neural network.
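Splicing and classification can be sketched as follows. The source does not give the depth or width of the feedforward classifier, so this illustration assumes a single linear layer followed by a softmax; the weights, biases and feature values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(trend_features, content_features, W, b):
    """Splice the two feature vectors, then apply one linear layer + softmax."""
    fused = list(trend_features) + list(content_features)  # feature splicing
    logits = [
        sum(w * v for w, v in zip(row, fused)) + bias
        for row, bias in zip(W, b)
    ]
    probabilities = softmax(logits)
    return probabilities.index(max(probabilities)), probabilities


# Hypothetical weights for 2 freshness classes and 3 spliced features.
W = [[1.0, 0.0, 0.0], [0.0, 0.0, 4.0]]
b = [0.0, 0.0]
label, probabilities = classify([1.0, 0.0], [0.5], W, b)
```

Concatenation keeps the trend features and the content features side by side, so the classifier can weigh both kinds of evidence when assigning a freshness class.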
Example 2
To verify and explain the technical effects of the method, this embodiment runs comparison tests between the present method and principal component analysis, LDA, RF and SVM algorithms, and compares the test results to verify the actual effect of the method.
Fresh crayfish samples are collected, and the day of capture is recorded as day 0; the samples are stored in a refrigerator at 4 °C for 1, 2, 3, 4 and 5 days. Dead crayfish samples are stored at 4 °C for 6 h and 12 h, and at room temperature (25 °C) for 3 h and 24 h; the group stored at 25 °C for 24 h is the spoiled group.
The dead crayfish samples are treated as follows:
(1) Crayfish of similar size are selected, and dead and damaged crayfish are removed; the crayfish are packed 5 per bag and suffocated under a vacuum of 0.1 MPa.
(2) Every day at a fixed time, 20 crayfish are taken from the 4 °C refrigerator and left at room temperature for 1 h to return to room temperature; each crayfish is placed in a 500 mL beaker, which is sealed with a double layer of plastic wrap and left to stand for 30 min.
(3) The instrument is preheated for 30 min, the injection needle is inserted 5 cm into the beaker, and 5000 μL is sampled; crayfish with different storage days are analyzed with the Heracles II ultra-fast gas-phase electronic nose, using the instrument parameters shown in Table 1.
Table 1: Analysis parameters.
No. | Parameter | Condition
1 | Injection volume | 5000 μL
2 | Inlet temperature | 200 ℃
3 | Injection duration | 45 s
4 | Initial trap temperature | 40 ℃
5 | Trap split flow rate | 10 mL/min
6 | Trapping duration | 50 s
7 | Final trap temperature | 240 ℃
8 | Initial column temperature | 50 ℃
9 | Column temperature program | ramp at 1 ℃/s to 80 ℃ and at 2 ℃/s to 250 ℃, with a 60 s hold
10 | Acquisition time | 177 s
11 | Detector temperature | 260 ℃
Principal component analysis was used to reduce the dimensionality of the chromatographic peak data, giving the PCA plot (Figure 2), in which groups 1-5 are live crayfish stored at 4 ℃ for 1-5 days, groups 6 and 7 are dead crayfish stored at 25 ℃ for 3 h and 24 h, and groups 8 and 9 are dead crayfish stored at 4 ℃ for 6 h and 12 h. The PCA visualization shows that PC1 contributes 21.8% and PC2 contributes 12.6%, for a cumulative contribution of 34.4%. The sample points of the different classes overlap, so the different freshness levels of the crayfish cannot be distinguished; processing the sample odor information with principal component analysis therefore cannot distinguish crayfish freshness.
Based on Example 1, the specific parameters of the method are set as follows:
(1) The acquired Heracles II ultra-fast gas-phase electronic nose chromatographic data are preprocessed.
(2) The empirical distribution of the peak heights is calculated from the samples, the most probable interval of the first 666 recorded peak heights is found, and the baseline value is estimated as the mean of that interval.
Within the 177 s analysis time of the ultra-fast gas-phase electronic nose, 35407 peak-height data points are obtained, and the first 666 are taken. According to the maximum and minimum of these 666 values, the data are divided into 67 segments with a step length of 10; the number of the 666 values falling into each segment is counted, the segment containing the most values is found, and the mean of the values in that segment is taken as the baseline value.
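The baseline estimate of step (2) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the segments are counted from the minimum of the leading values, the function name and its defaults (`n_head=666`, `step=10.0`) mirror the numbers quoted above, and the short signal is made up for demonstration.

```python
from collections import defaultdict
from statistics import mean

def estimate_baseline(peak_heights, n_head=666, step=10.0):
    """Estimate the baseline as the mean of the most populated height segment."""
    head = peak_heights[:n_head]
    lowest = min(head)
    segments = defaultdict(list)
    for h in head:
        # Segment index counted from the minimum value (an assumption).
        segments[int((h - lowest) // step)].append(h)
    fullest = max(segments.values(), key=len)  # segment holding the most values
    return mean(fullest)

# Hypothetical signal: mostly flat near 100 with two early spikes.
signal = [100.0, 102.0, 101.0, 250.0, 99.0, 103.0, 400.0, 98.0]
baseline = estimate_baseline(signal, n_head=8, step=10.0)
```

Because the estimate uses only the most populated segment, isolated spikes in the leading part of the chromatogram do not shift the baseline.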
(3) The initial annotation, i.e. the storage-time label that may be erroneous, is defined as
Figure BDA0003315159880000101
The real label is defined as y. The total number of samples is 380, divided into 9 classes: live shrimp (stored at 4 ℃ for 1, 2, 3, 4, and 5 days) and dead shrimp (stored at 4 ℃ for 6 h and 12 h, and at 25 ℃ for 3 h and 24 h), with 60 samples per live-shrimp class and 20 samples per dead-shrimp class.
Further, each class of the 380 samples is divided evenly into 5 parts; one part is taken as the test set and the other 4 parts as the training set, and the estimated probability $p = \{p_j,\ j = 0, 1, \ldots, 9\}$ of the test-set samples is calculated. This is repeated 5 times to obtain out-of-fold predictions for all samples, where
Figure BDA0003315159880000111
is the probability that sample x belongs to the j-th class.
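The 5-part split and the resulting out-of-fold probabilities can be sketched as follows. The model that produces the probabilities is not specified at this point, so `prior_classifier` is a deliberately trivial, hypothetical stand-in; `stratified_folds`, `out_of_fold_probabilities`, and the toy data are likewise illustrative names and values rather than the ones used in this example.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, n_folds=5):
    """Assign each sample index to a fold so every class is spread evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fold_of = [0] * len(labels)
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            fold_of[idx] = position % n_folds
    return fold_of

def out_of_fold_probabilities(samples, labels, fit_predict_proba, n_folds=5):
    """Predict each sample with a model trained on the other n_folds - 1 parts."""
    fold_of = stratified_folds(labels, n_folds)
    probabilities = [None] * len(samples)
    for fold in range(n_folds):
        train = [i for i, f in enumerate(fold_of) if f != fold]
        test = [i for i, f in enumerate(fold_of) if f == fold]
        predicted = fit_predict_proba(
            [samples[i] for i in train],
            [labels[i] for i in train],
            [samples[i] for i in test],
        )
        for i, p in zip(test, predicted):
            probabilities[i] = p
    return probabilities

def prior_classifier(train_x, train_y, test_x):
    """Hypothetical stand-in model: predicts the training class frequencies."""
    counts = Counter(train_y)
    prior = [counts[c] / len(train_y) for c in sorted(counts)]
    return [prior for _ in test_x]

labels = [c for c in range(3) for _ in range(10)]   # 3 toy classes, 10 samples each
samples = [[float(i)] for i in range(len(labels))]  # placeholder features
oof = out_of_fold_probabilities(samples, labels, prior_classifier)
```

Each sample is predicted exactly once, by a model that never saw it during training, which is what makes the probabilities usable for judging the given labels.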
Select the samples
Figure BDA0003315159880000112
for filtering, sort them by the maximum margin
Figure BDA0003315159880000113
and filter out the 22 largest-margin samples of each category. Of the 22 filtered samples obtained, the 10 samples misjudged between live and dead shrimp are removed according to the real labels, and the original labels of the remaining 12 samples are changed to the real labels.
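The selection and ranking of suspect samples can be sketched as below. Since the exact formulas are given only as figures, this sketch assumes the commonly described confident-learning procedure: a per-class threshold equal to the mean predicted probability of that class over the samples carrying that label, and a margin equal to the probability of the suggested class minus the probability of the given class. All names and the toy probabilities are hypothetical.

```python
def class_thresholds(probs, noisy_labels, n_classes):
    """t_j: mean predicted probability of class j over samples labelled j."""
    thresholds = []
    for j in range(n_classes):
        own = [p[j] for p, y in zip(probs, noisy_labels) if y == j]
        thresholds.append(sum(own) / len(own))
    return thresholds

def suspected_label_errors(probs, noisy_labels, n_classes):
    """Return (index, given_label, suggested_label, margin), largest margin first."""
    t = class_thresholds(probs, noisy_labels, n_classes)
    suspects = []
    for idx, (p, given) in enumerate(zip(probs, noisy_labels)):
        confident = [j for j in range(n_classes) if p[j] >= t[j]]
        if not confident:
            continue  # no class passes its threshold; sample is not counted
        suggested = max(confident, key=lambda j: p[j])
        if suggested != given:  # falls in an off-diagonal cell of the count matrix
            suspects.append((idx, given, suggested, p[suggested] - p[given]))
    return sorted(suspects, key=lambda s: s[3], reverse=True)

# Hypothetical out-of-fold probabilities for 6 samples and 2 classes.
probs = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7], [0.85, 0.15]]
noisy_labels = [0, 0, 1, 1, 0, 1]  # samples 4 and 5 look mislabelled
suspects = suspected_label_errors(probs, noisy_labels, n_classes=2)
```

Taking the first entries of `suspects` for each category corresponds to filtering the largest-margin samples; whether a suspect is then removed or relabelled is decided against the real labels, as described above.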
(4) The 128 features obtained from the multilayer perceptron are spliced with the 64 features obtained from the sequence model, and a feedforward neural network classifies the result into 9 classes.
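The splicing in step (4) amounts to concatenating the two feature vectors before classification. The sketch below uses a single linear layer followed by a softmax as a simplified stand-in for the feedforward classifier, whose architecture is not detailed here; the weights are random and untrained, and all names are hypothetical.

```python
import math
import random

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(content_features, trend_features, W, b):
    """Concatenate both feature vectors, then apply one linear layer and softmax."""
    spliced = content_features + trend_features  # 128 + 64 = 192 features
    scores = [
        sum(w * v for w, v in zip(row, spliced)) + bias
        for row, bias in zip(W, b)
    ]
    return softmax(scores)

rng = random.Random(0)
content = [rng.random() for _ in range(128)]  # placeholder MLP features
trend = [rng.random() for _ in range(64)]     # placeholder sequence-model features
W = [[rng.uniform(-0.1, 0.1) for _ in range(192)] for _ in range(9)]  # untrained
b = [0.0] * 9
probabilities = classify(content, trend, W, b)
predicted_class = probabilities.index(max(probabilities))
```

The output is one probability per freshness class, and the predicted class is the index of the largest one.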
The 300 live-shrimp samples (stored at 4 ℃ for 1, 2, 3, 4, and 5 days) and the 80 dead-shrimp samples (stored at 4 ℃ for 6 h and 12 h, and at 25 ℃ for 3 h and 24 h) were modeled using the present method and the LDA, RF, and SVM algorithms. In each group, 80% of the crayfish samples were randomly selected as the training set and 20% as the validation set, and the prediction accuracy of each model was calculated from the confusion matrix (Figure 3).
Table 2: Model prediction results for crayfish freshness.
Figure BDA0003315159880000114
As Table 2 shows, the prediction accuracies of the 5 modeling runs using the present method are 92.00%, 93.33%, 95.95%, 97.30%, and 97.30%, respectively, and the overall prediction accuracy reaches 95.16%. This is far higher than the accuracies obtained by modeling with LDA, RF, and SVM (79.16%, 75.99%, and 71.50%), showing that the method has good stability and accuracy.
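For reference, the accuracy derived from a confusion matrix is the share of samples on the diagonal. The sketch below shows the calculation on a made-up 3-class matrix; it does not reproduce the data of Table 2 or Figure 3, and the function name is hypothetical.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy: correctly classified samples (diagonal) over all samples."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical confusion matrix: rows are true classes, columns are predictions.
confusion = [
    [10, 1, 0],
    [2, 8, 0],
    [0, 0, 9],
]
accuracy = accuracy_from_confusion(confusion)
```

Here 27 of the 30 samples lie on the diagonal, so the accuracy is 27/30.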
It should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications and substitutions should be covered by the claims of the present invention.

Claims (7)

1. A crayfish freshness detection method based on a gas-phase electronic nose and machine learning, characterized by comprising:
placing a crayfish sample in a beaker, sealing the beaker with a double layer of plastic wrap, and letting it stand to form a headspace;
preheating an ultra-fast gas-phase electronic nose instrument, and inserting an injection needle into the beaker for sampling to obtain a chromatogram;
performing max-min normalization preprocessing on the peak height of the chromatogram;
preprocessing the baseline data of the peak height, and eliminating the label noise of the crayfish samples using a confident learning strategy;
performing feature extraction on the chromatogram using a sequence model to obtain the chromatogram trend features of the odor changes at different freshness levels;
extracting the volatile compound content features corresponding to each retention time through a multilayer perceptron according to the chromatogram trend features, and splicing the chromatogram trend features with the volatile compound content features;
classifying the spliced features using a feedforward neural network;
wherein the sequence model first obtains coarse trend features X through multiple convolutions, and then extracts the trend features SLSTM(X) of X based on an LSTM network, SLSTM(X) being the chromatogram trend features:
Figure FDA0003883383250000011
where $\mathrm{LSTM}_1$ and $\mathrm{LSTM}_2$ are LSTM networks.
2. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning as claimed in claim 1, characterized in that the normalization preprocessing comprises:
Figure FDA0003883383250000012
where $h_{scale}$ is the normalized chromatogram peak height, $h$ is the chromatogram peak height, $h_{min}$ is the minimum chromatogram peak height, and $h_{max}$ is the maximum chromatogram peak height.
3. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning as claimed in claim 2, characterized in that preprocessing the baseline data of the peak height comprises:
calculating the empirical distribution of the peak height
Figure FDA0003883383250000013
for the value range $R = \{h \mid 0 < h < +\infty\}$ of the peak height $h$ and any given positive constant $s$, there exists a partition $S = \{S_1, S_2, \ldots, S_r\}$ satisfying:
$S_i = \{h \mid (i-1) \times s \le h \le i \times s,\ \sup(R) \le r \times s\},\ i = 1, 2, \ldots, r$;
defining the event that the peak height $h$ falls in a given data segment as $A_i = \{h \mid h \in S_i\}$, the probability of occurrence of the event being
Figure FDA0003883383250000014
Calculating an estimated baseline value
Figure FDA0003883383250000015
Figure FDA0003883383250000021
Figure FDA0003883383250000022
where $S_r$ is the r-th data segment of the partition; $m$ is the index of the segment corresponding to the event $A_i$ with the maximum probability of occurrence; $S_m$ is the segment corresponding to the event with the maximum probability of occurrence; $n$ is the total number of peak heights;
Figure FDA0003883383250000023
is the empirical distribution of the i-th partition, and
Figure FDA0003883383250000024
is the empirical distribution of the (i-1)-th partition.
4. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning as claimed in claim 3, characterized in that calculating the empirical distribution of the peak height
Figure FDA0003883383250000025
comprises:
regarding the chromatogram peak heights $h_1, h_2, \ldots, h_n$ as independent and identically distributed real random variables with cumulative distribution function $F(k)$, and obtaining the empirical distribution of the peak height
Figure FDA0003883383250000026
Figure FDA0003883383250000027
where
Figure FDA0003883383250000028
is $\{h_i \mid h_i \le k\}$.
5. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning as claimed in claim 4, characterized in that eliminating the label noise comprises:
defining the initial annotation, i.e. the storage-time label that may be erroneous, as
Figure FDA0003883383250000029
the real label being defined as $y^*$, the total number of samples as N, and the number of categories as M;
dividing the N samples evenly into a parts, taking one part as the test set and the remaining a-1 parts as the training set, calculating the estimated probability $p = \{p_j,\ j = 0, 1, \ldots, M\}$ of the test-set samples, and repeating a times to obtain out-of-fold predictions for all samples;
calculating the average probability $t_j$ under each labeled category j and using it as the confidence threshold:
Figure FDA00038833832500000210
calculating a count matrix
Figure FDA00038833832500000211
Figure FDA00038833832500000212
Figure FDA00038833832500000213
Calibrating a counting matrix:
Figure FDA00038833832500000214
estimating, for the initial label
Figure FDA0003883383250000031
and the real label $y^*$, the joint distribution
Figure FDA0003883383250000032
Figure FDA0003883383250000033
for the off-diagonal cells of the count matrix
Figure FDA0003883383250000034
selecting the samples
Figure FDA0003883383250000035
for filtering, sorting them by the maximum margin
Figure FDA0003883383250000036
and filtering out, for each category, the
Figure FDA0003883383250000037
samples with the largest margin;
where the probability that sample x belongs to the j-th class is
Figure FDA0003883383250000038
Figure FDA0003883383250000039
is the number of samples whose initial label is
Figure FDA00038833832500000310
l denotes a label satisfying
Figure FDA00038833832500000311
and
Figure FDA00038833832500000312
is the calibrated value of the count matrix
Figure FDA00038833832500000313
6. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning as claimed in claim 5, characterized in that:
the trend feature X is a sequence of length 65, each position t containing 64 numerical features $x_t$ for the corresponding time segment.
7. The crayfish freshness detection method based on a gas-phase electronic nose and machine learning as claimed in claim 6, characterized in that the volatile compound content features comprise:
$\mathrm{layer}_i(X) = \mathrm{ReLU}(X W_i)$
Figure FDA00038833832500000314
Figure FDA00038833832500000315
where $\mathrm{layer}_i$ is the i-th layer network; $W_i$ is the parameter of the i-th layer; $X$ is the design matrix of the position features; and $\mathrm{layer}_o$ is the layer-o network.
CN202111228666.9A 2021-10-21 2021-10-21 Crayfish freshness detection method based on gas phase electronic nose and machine learning Active CN113984946B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111228666.9A CN113984946B (en) 2021-10-21 2021-10-21 Crayfish freshness detection method based on gas phase electronic nose and machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111228666.9A CN113984946B (en) 2021-10-21 2021-10-21 Crayfish freshness detection method based on gas phase electronic nose and machine learning

Publications (2)

Publication Number Publication Date
CN113984946A CN113984946A (en) 2022-01-28
CN113984946B true CN113984946B (en) 2022-12-27

Family

ID=79740055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111228666.9A Active CN113984946B (en) 2021-10-21 2021-10-21 Crayfish freshness detection method based on gas phase electronic nose and machine learning

Country Status (1)

Country Link
CN (1) CN113984946B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant