WO2023077569A1 - Deep learning-based method for updating spectral analysis model for fruit

Info

Publication number: WO2023077569A1
Application number: PCT/CN2021/132651
Authority: WO
Inventors: 应义斌; 杨杰; 林涛; 丁冠中
Original assignee: 浙江大学
Priority date: 2021-11-02
Filing date: 2021-11-24
Publication date: 2023-05-11
Also published as: CN114002167A

Abstract

A deep learning-based method for updating a spectral analysis model for fruit. The method comprises: inputting spectral data of fruit and a quality variable, which are collected in a historical batch, into a deep neural network, and optimizing model parameters by means of a random grid hyper-parameter search and a gradient descent algorithm, so as to obtain an optimal model structure and weights; keeping the weight of a weight frozen layer of the model fixed, and performing a fine adjustment on the weight of a weight variable layer of the model by using spectral data and a quality variable, which are obtained by using a small number of fruit samples in a new batch, so as to obtain an updated model; and inputting, into the updated model, a fruit spectrum having an unknown predicted value in a new batch, and outputting a quality variable prediction result of fruit in the new batch. In the method, a spectral model constructed by means of historical data is updated by using a small number of samples, and a prediction precision higher than that of various traditional model updating methods is provided; and common features of data of a historical batch and data of a new batch can be effectively reserved, and a better prediction precision and robustness are provided for fruit samples in a new batch.

Description

A Deep Learning Method for Updating the Spectral Analysis Model of Fruits

Technical field

The invention belongs to the field of spectral analysis and chemometrics, and specifically relates to a method for updating a deep learning fruit spectral analysis model.

Background art

The development of spectroscopy and chemometrics has promoted the application of on-site non-destructive testing in the food, pharmaceutical, petrochemical and other industries. In recent years, high-throughput fruit grading systems have developed rapidly; with a processing speed of more than one fruit per second, they can quickly sort fruit by internal quality and safety. Because biological objects such as fruit are affected by the environment and other factors during growth and development, fruits of different batches, years and sources usually show biological differences. These differences affect how light interacts with the fruit tissue and hence the collected spectral data, so that a previously developed fruit spectral analysis model becomes invalid and can hardly provide good decision support for quality prediction. It is therefore important to develop a reliable spectral model update method for different batches of tested fruit.

There are three main categories of traditional model update methods:

1) Global model: a global training set is built from multiple batches of data to widen the scope of application of the model, but because data variability and nonlinearity increase, this method usually loses prediction accuracy.

2) Remodeling: the model is redeveloped by collecting spectral and quality variable data for a large number of fruit samples in the new batch, but this method consumes considerable manpower and material resources and cannot make good use of historical batch data.

3) Slope/bias correction: the slope and bias of the existing model are corrected with some samples from the new batch, but this method is only suitable for linear prediction models and its reliability is poor; improper sample selection causes a leverage effect that significantly reduces the prediction accuracy of the model.

In view of the limitations of the existing methods, the present invention aims to propose a model update method suitable for nonlinear deep learning models, which makes good use of historical batch data and provides good model reliability when only a small number of labeled samples are available from the new batch.

Summary of the invention

To solve the problem that a previously developed fruit spectral analysis model predicts fruit samples from different batches with significantly reduced accuracy, the present invention proposes a deep learning method for updating the fruit spectral analysis model. The method addresses the technical problem that accurate detection cannot be achieved because of biological differences, and improves the accuracy of the model.

The invention uses a small number of samples to update the spectral model constructed by historical data, and provides higher prediction accuracy than multiple traditional model updating methods. This method can effectively preserve the common characteristics of historical batch data and new batch data, and provide better prediction accuracy and robustness for new batches of fruit samples.

The technical solution adopted by the present invention comprises the following steps:

Step 1) Use the fruit spectral data of historical batches as the sample set, and the fruit quality variable values corresponding to the sample set as the label set; construct a deep learning fruit spectral analysis model and train it with the sample set as input and the label set as output; the initial deep learning fruit spectral analysis model and its model weights are obtained through the gradient descent algorithm and the hyperparameter optimization method;

Step 2) Predict the quality variable value of the new batch of fruit:

2.1) Select a small number of representative samples from all fruit samples of the new batch, and collect the spectral data of the representative fruit samples and their corresponding fruit quality variable values as a training set; input the training set into the initial deep learning fruit spectral analysis model obtained in step 1), keep the weights of the weight freezing layers in the model fixed, and retrain the model to fine-tune the weights of the weight variable layers, so as to obtain an updated deep learning fruit spectral analysis model suitable for predicting the quality variables of the new batch of fruit;

2.2) Collect the spectral data of the remaining fruit in the new batch, whose quality variable values are unknown, and input them into the updated deep learning fruit spectral analysis model from step 2.1) to predict the quality variable values, thereby completing the detection of the quality variables of the new batch of fruit.

The deep neural network spectral analysis model is a convolutional neural network model, an autoencoder model, a recurrent neural network model or a Transformer model. The embodiment of the present invention adopts a convolutional neural network model, which usually consists of multiple convolutional layers, a stretching (flatten) layer and multiple fully connected layers; the original spectrum is input at the front end of the convolutional layers, and the predicted value of the quality variable is output after the last fully connected layer.

In the weight training of the deep neural network spectral analysis model, the loss function guides the gradient descent algorithm in learning the model weights. The loss function consists of the mean square error between the predicted values and the real values plus a regularization term. The final weights are determined through several rounds of iteration.
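By way of illustration only, the loss described above can be sketched as follows. The sketch assumes the regularization term is an L2 penalty (one of the strategies named in this description); the function name `regularized_mse` and the parameter `lam` are illustrative and do not appear in the original text.

```python
def regularized_mse(y_true, y_pred, weights, lam=0.05):
    """Mean square error plus an L2 penalty on the model weights.

    lam is the regularization coefficient. The exact form of the
    penalty is an assumption made for this sketch.
    """
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    l2 = sum(w ** 2 for w in weights)
    return mse + lam * l2


# Toy values, for illustration only.
loss = regularized_mse([1.0, 2.0], [1.0, 4.0], [1.0, 2.0], lam=0.1)
```

A larger `lam` penalizes large weights more strongly, which is the knob that the regularization strength optimization would adjust.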

The structural hyperparameter search of the deep neural network spectral analysis model is used to optimize the model structure of the neural network, such as the size of the convolution kernel: several hyperparameter combinations are generated within a preset hyperparameter search space, and the optimal model hyperparameters are determined according to the performance of the training set samples in network training.
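A minimal sketch of such a search is given below, assuming that combinations are drawn at random from a grid of candidate values. The search space, the helper `random_search` and the stand-in scoring function `toy_error` are hypothetical; in practice the scoring function would train the network with the given configuration and return its error.

```python
import random


def random_search(space, evaluate, n_trials=20, seed=0):
    """Sample n_trials combinations from `space` and keep the best one.

    space    : dict mapping hyperparameter name -> list of candidate values
    evaluate : callable(config) -> error after training (lower is better)
    """
    rng = random.Random(seed)
    best_cfg, best_err = None, float("inf")
    for _ in range(n_trials):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        err = evaluate(cfg)
        if err < best_err:
            best_cfg, best_err = cfg, err
    return best_cfg, best_err


# Hypothetical search space; the candidate values are illustrative only.
space = {
    "kernel_size": [3, 5, 7],
    "fc_units": [8, 16, 32],
    "dropout": [0.1, 0.2, 0.5],
    "learning_rate": [0.01, 0.001, 0.0001],
}


def toy_error(cfg):
    """Stand-in for 'train the network and return its error'."""
    return abs(cfg["kernel_size"] - 5) + abs(cfg["fc_units"] - 16) / 16 + cfg["dropout"]


best_cfg, best_err = random_search(space, toy_error, n_trials=50)
```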

The training of the deep learning model adopts one or a combination of the following four strategies: L2 norm regularization, learning rate decay, random deactivation (Dropout) and early stopping.
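Two of these strategies can be sketched as follows, purely as an illustration. The exponential form of the decay, the `patience` criterion and the names `decayed_lr` and `EarlyStopping` are assumptions of this sketch; the description does not specify them. The error sequence is simulated.

```python
def decayed_lr(initial_lr, epoch, decay_rate=0.95):
    """Exponential learning rate decay: the rate shrinks by decay_rate each epoch."""
    return initial_lr * decay_rate ** epoch


class EarlyStopping:
    """Signal a stop when the monitored error has not improved for `patience` epochs."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, error):
        if error < self.best:
            self.best = error
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Simulated errors: improvement stalls after the third epoch.
errors = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, err in enumerate(errors):
    lr = decayed_lr(0.001, epoch)  # learning rate that would be used this epoch
    if stopper.step(err):
        stopped_at = epoch
        break
```

In this simulated run training stops before the last value is reached, and the weights saved at the best epoch would be the ones kept.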

The fruit samples of the historical batches in step 1) are fruit samples obtained before the new batch, and come from different harvest years, different harvest seasons and different origins; the new batch of fruit in step 2) is the fruit whose quality variable is to be detected.

The fruit quality variable value in step 1) and step 2.1) is the value of one quality parameter among the sugar content, acidity and hardness of the fruit. After fruit juice is obtained through a destructive test, the sugar content and acidity of the fruit are measured with a sugar meter and a pH meter respectively; the hardness of the fruit is measured with a hardness meter.

In the step 2.1), representative samples are selected from the total samples of the new batch of fruits by the Kennard-Stone method for model update, and the representative samples account for 5% to 20% of the total samples of the new batch of fruits.
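For illustration, a plain-Python sketch of Kennard-Stone selection is given below. It assumes Euclidean distance between spectra and at least two samples; the function name `kennard_stone` and the toy data are illustrative.

```python
import math


def kennard_stone(X, n_select):
    """Return the indices of n_select samples chosen by the Kennard-Stone method.

    X is a list of spectra (each a list of floats); len(X) must be >= 2.
    """
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

    n = len(X)
    # Start with the two samples that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda p: dist(X[p[0]], X[p[1]]))
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in selected]
    # Distance from each remaining sample to its nearest selected sample.
    min_d = {k: min(dist(X[k], X[s]) for s in selected) for k in remaining}
    while len(selected) < n_select and remaining:
        # Add the sample whose nearest selected neighbour is farthest away.
        nxt = max(remaining, key=lambda k: min_d[k])
        selected.append(nxt)
        remaining.remove(nxt)
        for k in remaining:
            min_d[k] = min(min_d[k], dist(X[k], X[nxt]))
    return selected[:n_select]


# Toy one-variable "spectra", for illustration only.
X = [[0.0], [1.0], [2.0], [10.0], [5.0]]
picked = kennard_stone(X, 3)
```

Because each new sample is the one farthest from those already chosen, the selected subset spreads over the spectral space of the new batch, which is why a 5% to 20% share can still be representative.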

The weight freezing layer and the weight variable layer in the step 2.1) are optimized by the following method:

For an N-layer deep neural network model, the last 1 layer, the last 2 layers, ..., the last N-1 layers and all N layers of the model are taken in turn as the weight variable layers, and the remaining layers are taken as the weight freezing layers, so that N models with different weight variable layers and weight freezing layers are obtained;

Input the training set into the N models respectively, compare the errors between the predicted values output by the N models and the real values, and take the weight freezing layers and weight variable layers of the model with the smallest error as the optimized weight freezing layers and weight variable layers.
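The enumeration of the N candidate configurations can be sketched as follows. The callable `finetune_and_score` is hypothetical and stands in for an actual fine-tuning run that returns the prediction error; the toy scoring function in the usage lines is a placeholder, not a result.

```python
def choose_trainable_layers(n_layers, finetune_and_score):
    """Make the last k layers trainable for k = 1..N and keep the best k.

    finetune_and_score : callable(trainable_layer_indices) -> error of the
                         fine-tuned model (hypothetical stand-in)
    Returns (frozen_indices, trainable_indices, errors_by_k).
    """
    results = {}
    for k in range(1, n_layers + 1):
        trainable = list(range(n_layers - k, n_layers))  # the last k layers
        results[k] = finetune_and_score(trainable)
    best_k = min(results, key=results.get)
    frozen = list(range(0, n_layers - best_k))
    trainable = list(range(n_layers - best_k, n_layers))
    return frozen, trainable, results


# Placeholder score whose minimum lies at three trainable layers.
frozen, trainable, results = choose_trainable_layers(
    6, lambda layers: abs(len(layers) - 3)
)
```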

Taking the convolutional neural network as an example, the parameters of the convolutional layer are generally fixed, and the parameters of the fully connected layer are fine-tuned.

In the step 2.1), the gradient descent algorithm is used to fine-tune the weight of the variable weight layer.
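As a simplified illustration of fine-tuning with frozen weights, the sketch below uses a tiny two-stage model in plain Python: a fixed feature layer stands in for the weight freezing layers, and a linear head stands in for the weight variable layers. The model, the data and all names are assumptions made for this sketch and are far smaller than the network of the embodiment.

```python
def features(x, frozen_w):
    """Frozen layer: fixed linear projections followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in frozen_w]


def predict(x, frozen_w, head_w, head_b):
    f = features(x, frozen_w)
    return sum(w * fi for w, fi in zip(head_w, f)) + head_b


def mse(X, y, frozen_w, head_w, head_b):
    preds = [predict(x, frozen_w, head_w, head_b) for x in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def fine_tune_head(X, y, frozen_w, head_w, head_b, lr=0.01, epochs=200):
    """Update only the head by gradient descent on the mean square error.

    frozen_w is read but never modified, mirroring a weight freezing layer.
    """
    head_w = list(head_w)
    n = len(X)
    feats = [features(x, frozen_w) for x in X]  # frozen, so computed once
    for _ in range(epochs):
        grad_w = [0.0] * len(head_w)
        grad_b = 0.0
        for f, target in zip(feats, y):
            err = sum(w * fi for w, fi in zip(head_w, f)) + head_b - target
            for j, fj in enumerate(f):
                grad_w[j] += 2 * err * fj / n
            grad_b += 2 * err / n
        head_w = [w - lr * g for w, g in zip(head_w, grad_w)]
        head_b -= lr * grad_b
    return head_w, head_b


# Toy "new batch" data, for illustration only.
frozen_w = [[1.0, 0.0], [0.0, 1.0]]
X_new = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [0.5, 1.5]]
y_new = [4.5, 5.5, 9.5, 3.0]
old_head_w, old_head_b = [1.0, 1.0], 0.0  # head from the "historical" model

before = mse(X_new, y_new, frozen_w, old_head_w, old_head_b)
new_head_w, new_head_b = fine_tune_head(X_new, y_new, frozen_w, old_head_w, old_head_b)
after = mse(X_new, y_new, frozen_w, new_head_w, new_head_b)
```

Only the head parameters change between `before` and `after`; in a deep learning framework the same effect is usually obtained by excluding the frozen layers from the optimizer's update.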

The beneficial effects of the present invention are:

1) Compared with the global modeling method, the present invention can realize the generalization of the model by fine-tuning the weight of the model. Due to the good nonlinear fitting ability of the deep neural network model, a new batch of data is used to fine-tune the model, so that the model can extract the general and differential features of different batches of data and improve the prediction accuracy of the model.

2) Compared with the remodeling method, the present invention can fine-tune the historical model based on a small number of samples from a new batch, which avoids the manpower and material resources spent on collecting the large number of samples required for remodeling and makes effective use of historical data.

3) Compared with the slope/deviation correction method, the present invention is applicable to the weight adjustment of the nonlinear neural network model, and has better reliability in the case of different data volumes.

4) The present invention is applicable to the use of convolutional neural network models in the spectral analysis of different batches of fruit. It makes effective use of the model built from historical batches and, using the spectral data and quality variables of a small number of new-batch samples, fine-tunes some of the model parameters to increase the generalization ability of the model, so that it predicts a new batch of fruit samples well.

Description of drawings

Fig. 1 is the architecture diagram of the deep learning model used for spectral analysis in the present invention, taking the convolutional neural network as an example;

Fig. 2 is the flowchart of implementing model update;

Fig. 3 is the schematic diagram of deep neural network fine-tuning method;

Fig. 4 is a comparison chart of prediction performance of different model update methods in the embodiment.

Detailed description of the embodiments

The present invention is described in further detail below in conjunction with an example, but the protection scope of the present invention is not limited to the scope indicated by the example. The following embodiment is implemented in Python.

As shown in Figure 2, the technical solution adopted by the present invention is as follows:

Step 1): Build a deep learning fruit spectral analysis model (this description takes a convolutional neural network as an example): use the fruit spectral data collected in historical batches as input and the quality variable data obtained through destructive tests as labels, and feed them into the convolutional neural network model. The structure of the initial deep neural network spectral analysis model and its model weights are obtained through the gradient descent algorithm and the random grid hyperparameter search method. This model is suitable for predicting the quality variables of historical batches of fruit;

Here, the model weights are the connection weights of the neurons between the layers of the deep neural network spectral analysis model, and the structural hyperparameters are the parameters that determine the model structure and the number of weights.

Step 1) specifically comprises:

1.1) Deep neural network spectral analysis models include the convolutional neural network models, autoencoder models, recurrent neural network models and Transformer models used in spectral analysis. The structural hyperparameters of the model are optimized and determined through manual search, grid search or random grid search. This description mainly takes the convolutional neural network model as an example. The model usually consists of multiple convolutional layers, a stretching (flatten) layer and multiple fully connected layers; the original spectrum is input at the front end of the convolutional layers, and the predicted value of the quality variable is output after the last fully connected layer;

1.2) In the weight training of the deep neural network spectral analysis model, the loss function guides the gradient descent algorithm in learning the model weights. The loss function consists of the mean square error between the predicted values and the real values plus a regularization term, and the final weight parameters are determined through several rounds of iteration. The hyperparameter search of the model is used to optimize the model structure, such as the size of the convolution kernel: several hyperparameter combinations are generated within a preset hyperparameter search space, and the optimal model hyperparameters are determined according to the performance of the training set samples in network training;

Structural hyperparameters include the number of neurons in different layers of the neural network, learning rate of network training, learning rate decay, activation function, random deactivation rate, batch size, etc.

1.3) The training of the deep learning model uses four strategies to reduce overfitting and improve model accuracy: 1) add an L2 norm regularization term to the loss function and optimize the strength of the regularization term; 2) add a random deactivation (Dropout) layer to the model structure and optimize its rate, so that some neurons are randomly deactivated and the model does not depend excessively on specific parameter configurations; 3) use a learning rate decay strategy that gradually reduces the learning rate while the model weights are being learned, to keep the model from falling into a local optimum; 4) use an early stopping strategy to avoid the overfitting caused by over-training the model.

In a specific implementation, not all four strategies need to be used; one of them or a combination of some of them may be used.

Step 2): Fine-tune the parameters of the trained deep neural network spectral analysis model: collect spectral data and quality variable labels for a small number of fruit samples harvested in a new batch, and input them into the deep neural network spectral analysis model trained in step 1); with the model structure and the weights of some layers fixed, fine-tune the weights of the other layers to update the model weights, so that the model is suitable for predicting the quality variables of the new batch;

Step 2) specifically comprises:

2.1) When the parameters of the trained deep learning model are fine-tuned, the structural parameters of the model are kept fixed, the weights of some layers are fixed, and only the weights of the other layers are updated to adjust how the extracted features are combined, so that the model suits the data collected in the new batch, as shown in Figure 3. For a deep learning model with a multi-layer network structure, the number of layers whose weights are fixed is determined by optimization;

2.2) The number of weight-freezing layers with fixed weights in the process of parameter fine-tuning is determined after optimization. By comparing the impact of fixing different layers on the model update prediction results, the number of layers with fixed weights in model fine-tuning is determined. Taking the convolutional neural network as an example, the parameters of the convolutional layer are generally fixed, and the parameters of the fully connected layer are fine-tuned;

2.3) Selecting a new batch of representative samples in the parameter fine-tuning process is conducive to improving the prediction accuracy of the model. For example, the Kennard-Stone method can be used to select representative samples for model update.

Step 3): Collect spectral data for a large number of fruit samples with unknown quality variable values in the new batch, and predict them with the model updated in step 2), so as to obtain the quality variable prediction results for the new batch of fruit samples. This method updates the model by fine-tuning some of the weights of the historical model. Through the data-driven weight learning of the neural network, it automatically retains the features common to the different batches of data across the two training stages and adapts to the samples collected in the new batch, which improves the accuracy of predicting the quality variables of new batches of fruit.

The method proposed by the invention aims to update the model developed in the historical batch and apply it to the new batch of fruit samples, and the scope of application includes: different harvest years, different harvest seasons, different origins and so on. Aiming at the problem of model performance degradation caused by differences in the growth environment of different batches of fruits, a small amount of new batch data is used to update the model to be suitable for the prediction of the new batch of fruit quality.

Because fruits from different harvest years, harvest seasons and origins differ in growth environment and planting management, harvested fruits usually show biological differences in size, appearance, color and quality variable distribution. Models built from historical batches therefore cannot provide good prediction accuracy, which usually leads to inaccurate detection results. The present invention solves this technical problem.

Example:

This example is applied to the quantitative analysis of visible/near-infrared spectrum to predict the sugar content of Cuiguan pear. The selected data set is the Cuiguan pear data collected in two batches (harvest year) in Tonglu County, Zhejiang Province in 2017 and 2018 by the pear sugar detection system developed by the Intelligent Bio-Industrial Equipment Innovation (IBE) team of Zhejiang University.

In 2017, 477 samples were collected, with a sugar content ranging from 8.95% to 16.25%, and in 2018, 256 samples were collected, with a sugar content ranging from 9.00% to 13.50%. The sugar content (Brix) was measured with a Brix meter after juicing the edible part of each pear. The spectrum of each pear was measured with an Ocean Optics QE65pro spectrometer. After the noisy part was removed, the spectral band used ranged from 576.66 to 939.29 nm, and a total of 475 variables were used to construct the model.

Using the Kennard-Stone sampling algorithm, 80% of the 2017 samples (historical batch) are selected as the training set and 20% as the prediction set for developing the convolutional neural network model; from the 2018 samples (new batch), 5%, 10%, 15% and 20% are selected for model updating, and 80% are used to test the performance of the updated model.

The steps are shown in Figure 2; the process of the above embodiment is as follows:

1) Build the deep learning model. This embodiment uses a convolutional neural network model; as shown in Figure 1, the model comprises three convolutional layers, a stretching (flatten) layer, two fully connected layers and an output layer connected in sequence.

2) Input the training set data in 2017 and the known sugar content into the model.

3) Use the Adam optimizer combined with the stochastic gradient descent algorithm to train the weights of the model, and optimize the hyperparameters of the neural network model. The convolution kernel sizes of the three convolutional layers are 5, 7 and 3 respectively, and their strides are 5, 3 and 1 respectively; the number of neurons in each of the two fully connected layers is 16, the random deactivation rate is 0.2, the batch size is 32, the learning rate is 0.001, and the regularization coefficient is 0.05.
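As a worked illustration of these kernel sizes and strides, the sketch below computes the sequence length after each convolutional layer for the 475-variable input. It assumes no padding, a dilation of 1 and no pooling between layers, none of which is stated in this description; the number of channels is likewise not given and is not modelled.

```python
def conv1d_out_len(length, kernel, stride, padding=0):
    """Output length of a 1-D convolution with dilation 1."""
    return (length + 2 * padding - kernel) // stride + 1


length = 475  # number of spectral variables used in the embodiment
lengths = []
for kernel, stride in [(5, 5), (7, 3), (3, 1)]:
    length = conv1d_out_len(length, kernel, stride)
    lengths.append(length)
# lengths holds the per-channel length after each of the three layers
```

Under these assumptions the spectrum is reduced from 475 points to 95, 30 and then 28 points per channel before the stretching (flatten) layer.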

4) Save the optimal weight of the convolutional neural network model trained by the 2017 data, and fix the hyperparameters of the network structure and the weight of the convolutional layer;

5) Input a small number of samples in 2018 into the model, and use the stochastic gradient descent algorithm to fine-tune the weight parameters in the fully connected layer. Use 5%, 10%, 15%, and 20% sample sizes to update and obtain four updated models respectively, and save the optimal weights of each updated model.

6) Input the spectral data of 80% of the test samples in 2018 into the above four updated models respectively, and output the predicted value of sugar content.

7) Compare the prediction performance for update sets of 5%, 10%, 15% and 20% of the new-batch samples: as shown in Figure 4, with the fine-tuning method of the present invention the prediction-set RMSEP values are 0.481, 0.477, 0.476 and 0.407 respectively; with the global model method they are 0.516, 0.499, 0.501 and 0.448 respectively; with the slope/bias correction method they are 0.621, 0.554, 0.566 and 0.549 respectively; and with the remodeling method they are 0.843, 0.538, 0.737 and 0.530 respectively.
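For reference, the sketch below shows how RMSEP (root mean square error of prediction) is computed and tabulates the values reported above to find the best method at each update share. The helper names are illustrative; the numbers are copied from this embodiment.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# RMSEP values reported in the embodiment, by share of new-batch samples
# used for updating (5%, 10%, 15%, 20%).
reported = {
    "fine-tuning": [0.481, 0.477, 0.476, 0.407],
    "global model": [0.516, 0.499, 0.501, 0.448],
    "slope/bias correction": [0.621, 0.554, 0.566, 0.549],
    "remodeling": [0.843, 0.538, 0.737, 0.530],
}
shares = ["5%", "10%", "15%", "20%"]

# Method with the lowest reported RMSEP at each update share.
best = {s: min(reported, key=lambda m: reported[m][i]) for i, s in enumerate(shares)}
```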

The comparison shows that, at every update sample size tested, the method of the present invention achieves a lower RMSEP than the three traditional methods when updating the fruit spectral model between batches, and thus improves the accuracy of fruit sugar content prediction. The method is reliable across different sample sizes and has broad application prospects.

Claims

A method for updating a deep learning fruit spectrum analysis model, characterized in that it comprises the following steps:

Step 1) Use the fruit spectral data of historical batches as the sample set, and the fruit quality variable values corresponding to the sample set as the label set; construct a deep learning fruit spectral analysis model and train it with the sample set as input and the label set as output; the initial deep learning fruit spectral analysis model and its model weights are obtained through the gradient descent algorithm and the hyperparameter optimization method;

Step 2) Predict the quality variable value of the new batch of fruit:

2.1) Select a small number of representative samples from all fruit samples of the new batch, and collect the spectral data of the representative fruit samples and their corresponding fruit quality variable values as a training set; input the training set into the initial deep learning fruit spectral analysis model obtained in step 1), keep the weights of the weight freezing layers in the model fixed, and retrain the model to fine-tune the weights of the weight variable layers, so as to obtain an updated deep learning fruit spectral analysis model suitable for predicting the quality variables of the new batch of fruit;

2.2) Collect the spectral data of the remaining fruit in the new batch, whose quality variable values are unknown, and input them into the updated deep learning fruit spectral analysis model from step 2.1) to predict the quality variable values, thereby completing the detection of the quality variables of the new batch of fruit.

The method for updating a deep learning fruit spectral analysis model according to claim 1, characterized in that the deep neural network spectral analysis model is a convolutional neural network model, an autoencoder model, a recurrent neural network model or a Transformer model;

The training of the deep learning model adopts one or a combination of the following four strategies: L2 norm regularization, learning rate decay, random deactivation (Dropout) and early stopping.

The method for updating a deep learning fruit spectral analysis model according to claim 1, characterized in that the fruit samples of the historical batches in step 1) are fruit samples obtained before the new batch, and come from different harvest years, different harvest seasons and different origins; the new batch of fruit in step 2) is the fruit whose quality variable is to be detected.

The method for updating a deep learning fruit spectral analysis model according to claim 1, characterized in that the fruit quality variable value in step 1) and step 2.1) is the value of one quality parameter among the sugar content, acidity and hardness of the fruit;

After fruit juice is obtained through a destructive test, the sugar content and acidity of the fruit are measured with a sugar meter and a pH meter respectively; the hardness of the fruit is measured with a hardness meter.

The method for updating a deep learning fruit spectral analysis model according to claim 1, characterized in that, in step 2.1), representative samples are selected from all fruit samples of the new batch by the Kennard-Stone method for model updating, and the representative samples account for 5% to 20% of all fruit samples of the new batch.

The method for updating a deep learning fruit spectral analysis model according to claim 1, characterized in that the weight freezing layers and weight variable layers in step 2.1) are determined by the following optimization:

For an N-layer deep neural network model, the last 1 layer, the last 2 layers, ..., the last N-1 layers and all N layers of the model are taken in turn as the weight variable layers, and the remaining layers are taken as the weight freezing layers, so that N models with different weight variable layers and weight freezing layers are obtained;

Input the training set into the N models respectively, compare the errors between the predicted values output by the N models and the real values, and take the weight freezing layers and weight variable layers of the model with the smallest error as the optimized weight freezing layers and weight variable layers.

The method for updating a deep learning fruit spectral analysis model according to claim 1, characterized in that, in step 2.1), the gradient descent algorithm is used to fine-tune the weights of the weight variable layers.