CN114638291A - Food-borne pathogenic bacteria classification method based on multilayer feedforward neural network algorithm
- Publication number: CN114638291A (application number CN202210226118.0A)
- Authority
- CN
- China
- Prior art keywords
- layer
- neural network
- neuron
- input
- food
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; subspace methods based on approximation criteria, e.g. principal component analysis
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Abstract
The invention discloses a food-borne pathogenic bacteria classification method based on a multilayer feedforward neural network algorithm, comprising: step 1, preprocessing the obtained food-borne pathogenic bacteria data; step 2, constructing a multilayer feedforward neural network; and step 3, evaluating the constructed multilayer feedforward neural network model to obtain the classification accuracy of the current model. Through the structural configuration and optimization of the multilayer feedforward neural network algorithm, the method improves the accuracy of spectral classification of food-borne pathogenic bacteria. The invention mainly solves the technical problem of automating the classification of food-borne pathogenic bacteria spectral data by modeling with a multilayer feedforward neural network algorithm, providing a new method for detecting food-borne pathogenic bacteria.
Description
Technical Field
The invention relates to the fields of Raman spectroscopy, food-borne pathogenic bacteria, deep learning and machine learning, and in particular to a food-borne pathogenic bacteria classification method based on a multilayer feedforward neural network algorithm.
Background
The most common methods currently used for detecting food-borne pathogenic bacteria are microbiological and biochemical identification, enrichment-culture separation, simple polymerase chain reaction amplification (PCR), multiplex polymerase chain reaction amplification (mPCR) and real-time/quantitative polymerase chain reaction amplification (qPCR). A neural network algorithm applied to the classification and detection of food-borne pathogenic bacteria can achieve higher detection accuracy. The present invention therefore combines Raman spectroscopy with a neural network algorithm from deep learning.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a food-borne pathogenic bacteria classification method based on a multilayer feedforward neural network algorithm, solving problems such as the complex detection operations and long turnaround times of food-borne pathogenic bacteria detection.
In order to achieve the above purpose, the technical solution adopted to solve the technical problems is as follows:
a food-borne pathogenic bacteria classification method based on a multilayer feedforward neural network algorithm comprises the following steps:
step 1: carrying out data preprocessing on the obtained food-borne pathogenic bacteria data;
step 2: constructing a multilayer feedforward neural network;
and step 3: evaluating the constructed multilayer feedforward neural network model to obtain the classification accuracy of the current model.
Further, step 1 comprises the following:
step 1.1: applying a Savitzky-Golay convolution smoothing algorithm to the obtained raw Raman spectrum data, with the window size (number of points) and polynomial order set as parameters, to reduce noise interference and smooth the spectra;
step 1.2: normalizing the food-borne pathogenic bacteria data, mapping the data values into the interval [0, 1] with the Min-Max method;
step 1.3: reducing the dimensionality of the normalized experimental data, using the PCA method to reduce its feature dimension;
step 1.4: dividing the spectral data into a training set and a test set at a ratio of 7:3 to facilitate training of the model.
Further, step 2 comprises the following steps:
step 2.1: the model comprises an input layer, a hidden layer and an output layer; first the number of nodes in the input, hidden and output layers is determined, then the activation functions of the input layer and the hidden layer: the first layer uses a tanh function and the second layer uses a sigmoid function; the input is multiplied by the weights and summed with the bias to obtain the output, yielding the classification result;
step 2.2: initializing parameters: the weight w and the bias b are randomly initialized, and forward propagation proceeds as y = x·w + b;
step 2.3: constructing a loss function, and constructing the loss function in the classification problem by using cross entropy;
step 2.4: error back propagation, adopting a gradient descending method to search the minimum value of the loss function along the gradient descending direction of the loss function, thereby obtaining the optimal parameter and optimizing the network structure;
step 2.5: training the neural network using the training data.
Further, in step 2.1, the signals in the three-layer BP neural network first propagate forward, passing from the input layer through the hidden layer and finally reaching the output layer:

the output of the input layer is equal to the input signal of the whole network:

$x_m(n) = x(n)$

where $x_m(n)$ represents the input of the m-th neuron of the input layer and $x(n)$ represents the input value;

passing through the hidden layer, the input of the i-th neuron of the hidden layer is equal to the weighted sum of $x_m(n)$:

$u_i(n) = \sum_m \omega_{mi}(n)\, x_m(n)$

where $u_i(n)$ represents the input of the i-th neuron of the hidden layer and $\omega_{mi}(n)$ represents the weight of the connection between the two layers;

the output of the hidden layer passes through a Sigmoid function, and the output of the i-th neuron of the hidden layer is equal to:

$v_i(n) = f(u_i(n))$

where $v_i(n)$ represents the output of the i-th neuron of the hidden layer and $u_i(n)$ its input;

the input of the j-th neuron of the output layer is the weighted sum of the hidden-layer outputs:

$u_j(n) = \sum_i \omega_{ij}(n)\, v_i(n)$

where $u_j(n)$ represents the input of the j-th neuron of the output layer, $\omega_{ij}(n)$ represents the weight of the connection between the two layers, and $v_i(n)$ represents the output of the i-th neuron of the hidden layer;

the output of the j-th neuron of the output layer is equal to:

$y_j(n) = f(u_j(n))$

where $y_j(n)$ represents the output of the j-th neuron of the output layer and $u_j(n)$ its input;

error of the j-th neuron of the output layer:

$e_j(n) = d_j(n) - y_j(n)$

where $e_j(n)$ represents the error of the j-th neuron, $d_j(n)$ the actual (target) value of the j-th neuron, and $y_j(n)$ the value predicted by the network;

total error of the network:

$e(n) = \frac{1}{2} \sum_j e_j^2(n)$

where $e_j(n)$ represents the error of the j-th neuron and $e(n)$ the total error of the network;

error signal back propagation in the three-layer BP neural network:

first the error propagates backwards through the output layer and the weights between the hidden layer and the output layer are adjusted; in the weight-adjustment stage the adjustment proceeds layer by layer backwards along the network; then the errors of the input-layer and hidden-layer neurons are adjusted; the computation of a local gradient reuses the previously computed results, the local gradient of one layer being the weighted sum of the local gradients of the following layer.
Further, in step 2.3, the loss function in back propagation is a Cross Entropy loss function:

$H(y', y) = -\sum y' \ln y$

where $y'$ represents the true value of the data and $y$ represents the value predicted by the neural network.
Further, in step 2.4, a gradient descent method is adopted:

calculate the gradient of the loss function with respect to the current parameters:

$g_t = \nabla f(\omega_t)$

where each iteration processes one batch, $t$ denotes the current iteration count, $\omega_t$ the parameters to be optimized, $f(\omega)$ the loss function and $g_t$ the parameter gradient at step $t$;

calculate the first-order and second-order momentum from the historical gradients:

$m_t = \phi(g_1, g_2, \ldots, g_t), \quad V_t = \psi(g_1, g_2, \ldots, g_t)$

where $g$ is the gradient, $m_t$ the first-order momentum and $V_t$ the second-order momentum at step $t$;

calculate the descending gradient at the current moment:

$\eta_t = \alpha \cdot m_t / \sqrt{V_t}$

where $\alpha$ is the initial learning rate, $m_t$ the first-order momentum, $V_t$ the second-order momentum and $\eta_t$ the descending gradient at step $t$;

update the parameters according to the descending gradient:

$\omega_{t+1} = \omega_t - \eta_t$

where $\omega_t$ is the parameter to be optimized, $\eta_t$ the descending gradient at step $t$, and $\omega_{t+1}$ the updated parameter.
Further, in step 2.5, batch mode is selected as the training mode of the multilayer feedforward neural network: the network receives all training samples and the mean square error over all samples is computed as the total error; batch mode permits parallelization and accelerates the learning of the neural network.
Further, step 3 comprises the following steps:
step 3.1: setting k = 5 for cross validation, repeated 5 times: the original data are randomly shuffled and divided into a training set and a test set at a ratio of 7:3;
step 3.2: training the model of step 2 on each of the 5 training sets to obtain 5 different sets of test results, and finally averaging the test results to obtain the overall performance index; the cross validation method is introduced to assess the generalization ability of the model.
Due to the adoption of the technical scheme, compared with the prior art, the invention has the following advantages and positive effects:
1. by means of the multilayer feedforward neural network algorithm model, the invention selects food-borne pathogenic bacteria data for data preprocessing and then uses the model to alleviate, to a certain extent, the misjudgment problem of manually distinguishing two food-borne pathogenic bacteria with similar peaks, Escherichia coli O157:H7 and the Brucella S2 strain;
2. the method based on the multilayer feedforward neural network algorithm is used, the accuracy of the spectrum classification of the food-borne pathogenic bacteria is improved through the structural configuration and optimization of the multilayer feedforward neural network algorithm, the automation of the spectrum data classification of the food-borne pathogenic bacteria is realized through modeling of the multilayer feedforward neural network algorithm, and a new method is provided for detecting the food-borne pathogenic bacteria.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
FIG. 1 is a detailed flowchart of a food-borne pathogenic bacteria classification method based on a multi-layer feedforward neural network algorithm model according to the present invention;
FIG. 2 is a structural model diagram of a multi-layer feedforward neural network algorithm model according to an embodiment of the present invention;
FIG. 3 is a flow chart of a multi-layer feedforward neural network algorithm of an embodiment of the present invention;
FIG. 4 is a graph showing the spectrum of Escherichia coli O157: H7 and the spectrum of Brucella S2 strain in the examples of the present invention;
FIG. 5 is a Raman spectrum of two pathogens after pretreatment in accordance with an embodiment of the present invention;
FIG. 6 is a graph of a sigmoid function of an activation function used by an example of the present invention;
FIG. 7 is a diagram of the function of the activation function tanh used in an example of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1 to 3, the specific flow of the food-borne pathogenic bacteria classification method based on a Multilayer Feedforward Neural Network Algorithm provided by an embodiment of the present invention comprises the following steps:
step 1: carrying out data preprocessing on the obtained food-borne pathogenic bacteria data;
specifically, step 1 includes the following:
step 1.1: applying a Savitzky-Golay convolution smoothing algorithm to the obtained raw Raman spectrum data, with the window size (number of points) and polynomial order set as parameters, to reduce noise interference and smooth the spectra;
step 1.2: normalizing the food-borne pathogenic bacteria data, mapping the data values into the interval [0, 1] with the Min-Max method;
step 1.3: reducing the dimensionality of the normalized experimental data, using the PCA method to reduce its feature dimension;
step 1.4: dividing the spectral data into a training set and a test set at a ratio of 7:3 to facilitate training of the model.
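Steps 1.1 to 1.4 above can be sketched as a short preprocessing pipeline. This is an illustrative sketch only, not the patented implementation: the data are synthetic, and the window length (11), polynomial order (3) and number of principal components (20) are assumed values, since the patent leaves these parameters open.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
spectra = rng.random((100, 600))       # stand-in data: 100 spectra, 600 Raman shifts
labels = rng.integers(0, 2, size=100)  # two pathogen classes

# Step 1.1: Savitzky-Golay convolution smoothing (window points / polynomial order assumed)
smoothed = savgol_filter(spectra, window_length=11, polyorder=3, axis=1)

# Step 1.2: Min-Max normalization, mapping each spectrum into [0, 1]
mn = smoothed.min(axis=1, keepdims=True)
mx = smoothed.max(axis=1, keepdims=True)
normalized = (smoothed - mn) / (mx - mn)

# Step 1.3: PCA dimensionality reduction of the feature dimension
reduced = PCA(n_components=20).fit_transform(normalized)

# Step 1.4: 7:3 split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    reduced, labels, test_size=0.3, random_state=0)
print(X_train.shape, X_test.shape)  # (70, 20) (30, 20)
```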
Step 2: constructing a Multilayer Feedforward Neural Network;
specifically, step 2 includes the following:
step 2.1: the model comprises an input layer, a hidden layer and an output layer; first the number of nodes in the input, hidden and output layers is determined, then the activation functions of the input layer and the hidden layer: the first layer uses a tanh function and the second layer uses a sigmoid function; the input is multiplied by the weights and summed with the bias to obtain the output, yielding the classification result.
The procedure of the multilayer feedforward neural network is mainly divided into two stages: the first stage is the forward propagation of signals, which pass from the input layer through the hidden layer and finally reach the output layer; the second stage is the back propagation of the error, from the output layer back through the hidden layer and finally to the input layer, adjusting in turn the weights and biases from the hidden layer to the output layer and from the input layer to the hidden layer.
Specifically, in step 2.1, the signals in the three-layer BP neural network first propagate forward, passing from the input layer through the hidden layer and finally reaching the output layer:

the output of the input layer is equal to the input signal of the whole network:

$x_m(n) = x(n)$

where $x_m(n)$ represents the input of the m-th neuron of the input layer and $x(n)$ represents the input value;

passing through the hidden layer, the input of the i-th neuron of the hidden layer is equal to the weighted sum of $x_m(n)$:

$u_i(n) = \sum_m \omega_{mi}(n)\, x_m(n)$

where $u_i(n)$ represents the input of the i-th neuron of the hidden layer and $\omega_{mi}(n)$ represents the weight of the connection between the two layers;

the output of the hidden layer passes through a Sigmoid function, and the output of the i-th neuron of the hidden layer is equal to:

$v_i(n) = f(u_i(n))$

where $v_i(n)$ represents the output of the i-th neuron of the hidden layer and $u_i(n)$ its input;

the input of the j-th neuron of the output layer is the weighted sum of the hidden-layer outputs:

$u_j(n) = \sum_i \omega_{ij}(n)\, v_i(n)$

where $u_j(n)$ represents the input of the j-th neuron of the output layer, $\omega_{ij}(n)$ represents the weight of the connection between the two layers, and $v_i(n)$ represents the output of the i-th neuron of the hidden layer;

the output of the jth neuron of the output layer is equal to:

$y_j(n) = f(u_j(n))$

where $y_j(n)$ represents the output of the j-th neuron of the output layer and $u_j(n)$ its input;

error of the j-th neuron of the output layer:

$e_j(n) = d_j(n) - y_j(n)$

where $e_j(n)$ represents the error of the j-th neuron, $d_j(n)$ the actual (target) value of the j-th neuron, and $y_j(n)$ the value predicted by the network;

total error of the network:

$e(n) = \frac{1}{2} \sum_j e_j^2(n)$

where $e_j(n)$ represents the error of the j-th neuron and $e(n)$ the total error of the network;
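The forward pass just described can be written compactly with matrices. The sketch below is illustrative only, assuming (per step 2.1) a tanh hidden layer and a sigmoid output layer; the layer sizes and random weights are placeholders, not the patented configuration.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def forward(x, W1, b1, W2, b2):
    """Forward propagation through the three-layer network."""
    u_hidden = x @ W1 + b1        # u_i(n): weighted sum of the inputs
    v_hidden = np.tanh(u_hidden)  # v_i(n): hidden-layer output (tanh, per step 2.1)
    u_out = v_hidden @ W2 + b2    # u_j(n): weighted sum of the hidden outputs
    return sigmoid(u_out)         # y_j(n): output-layer prediction

rng = np.random.default_rng(0)
x = rng.random(20)                             # one PCA-reduced spectrum (assumed size)
W1, b1 = rng.normal(size=(20, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)
y = forward(x, W1, b1, W2, b2)                 # two class scores, each in (0, 1)
```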
in this embodiment, the number of neurons in each layer is determined as follows: the number of input-layer nodes depends on the dimension of the input vector, and the number of output-layer nodes depends on the number of classes. The number of hidden-layer neurons generally follows the empirical formula:

$h = \sqrt{m + n} + \alpha$

where $h$ is the number of hidden-layer nodes, $m$ is the number of input-layer nodes, $n$ is the number of output-layer nodes, and $\alpha$ is a constant between 1 and 10.
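The empirical formula can be evaluated directly; the helper below is a hypothetical illustration with an arbitrary choice of the constant α.

```python
import math

def hidden_neurons(m, n, alpha=5):
    """Empirical hidden-layer size h = sqrt(m + n) + alpha, with alpha in [1, 10]."""
    return round(math.sqrt(m + n)) + alpha

# e.g. 20 input nodes (PCA components) and 2 output classes
print(hidden_neurons(20, 2))  # sqrt(22) ~ 4.7 -> 5, plus alpha = 5 -> 10
```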
In this embodiment, the introduction of the activation function is to increase the nonlinearity of the neural network model, and the use of the activation function can effectively alleviate the problem of gradient disappearance. The common activation functions include nonlinear functions such as Sigmoid function, Tanh function, Relu function, Leaky Relu function, and the like. Two activation functions, tanh and sigmoid, are used in this embodiment:
(1) the tanh function is:
(2) the sigmoid function is:
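The two activation functions can be checked numerically; the snippet below also verifies the identity tanh(x) = 2·sigmoid(2x) − 1 that links them.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # tanh(x) = (e^x - e^-x) / (e^x + e^-x)
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

xs = np.linspace(-5.0, 5.0, 11)
assert np.allclose(tanh(xs), 2.0 * sigmoid(2.0 * xs) - 1.0)
print(sigmoid(0.0), tanh(0.0))  # 0.5 0.0
```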
error signal back propagation in the three-layer BP neural network:

first the error propagates backwards through the output layer and the weights between the hidden layer and the output layer are adjusted; in the weight-adjustment stage the adjustment proceeds layer by layer backwards along the network; then the errors of the input-layer and hidden-layer neurons are adjusted; the computation of a local gradient reuses the previously computed results, the local gradient of one layer being the weighted sum of the local gradients of the following layer.
Step 2.2: initializing parameters: the weight w and the bias b are randomly initialized, and forward propagation proceeds as y = x·w + b.
Step 2.3: constructing a loss function; in the classification problem the loss function is built with Cross Entropy. The loss function defines the effectiveness of the neural network model and the objective of optimization. Typical loss functions are the Mean Square Error loss function, the Cross Entropy loss function, and custom loss functions.
Specifically, in step 2.3, the loss function in back propagation is the Cross Entropy loss function:

$H(y', y) = -\sum y' \ln y$

where $y'$ represents the true value of the data and $y$ represents the value predicted by the neural network.
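A minimal numerical sketch of the cross-entropy loss; the small eps term is an implementation detail (an assumption, not from the patent) that guards against log(0).

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """H(y', y) = -sum y' * ln(y); y_true is the one-hot truth, y_pred the predicted probabilities."""
    return float(-np.sum(y_true * np.log(y_pred + eps)))

y_true = np.array([0.0, 1.0])
confident = cross_entropy(y_true, np.array([0.1, 0.9]))  # good prediction
uncertain = cross_entropy(y_true, np.array([0.6, 0.4]))  # poor prediction
print(confident < uncertain)  # True: better predictions give lower loss
```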
Step 2.4: error back propagation: using a gradient descent method, the minimum of the loss function is sought along its direction of gradient descent, obtaining the optimal parameters and optimizing the network structure.
Specifically, in step 2.4, a gradient descent method is adopted:

calculate the gradient of the loss function with respect to the current parameters:

$g_t = \nabla f(\omega_t)$

where each iteration processes one batch, $t$ denotes the current iteration count, $\omega_t$ the parameters to be optimized, $f(\omega)$ the loss function and $g_t$ the parameter gradient at step $t$;

calculate the first-order and second-order momentum from the historical gradients:

$m_t = \phi(g_1, g_2, \ldots, g_t), \quad V_t = \psi(g_1, g_2, \ldots, g_t)$

where $g$ is the gradient, $m_t$ the first-order momentum and $V_t$ the second-order momentum at step $t$;

calculate the descending gradient at the current moment:

$\eta_t = \alpha \cdot m_t / \sqrt{V_t}$

where $\alpha$ is the initial learning rate, $m_t$ the first-order momentum, $V_t$ the second-order momentum and $\eta_t$ the descending gradient at step $t$;

update the parameters according to the descending gradient:

$\omega_{t+1} = \omega_t - \eta_t$

where $\omega_t$ is the parameter to be optimized, $\eta_t$ the descending gradient at step $t$, and $\omega_{t+1}$ the updated parameter.
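The four update steps can be sketched for a one-dimensional quadratic loss. The choices φ = exponential moving average and V_t ≡ 1 (no second-order scaling) are assumptions, since the patent leaves φ and ψ unspecified; with them the scheme reduces to plain momentum gradient descent.

```python
def momentum_step(w, g, m, alpha=0.1, beta=0.9, V=1.0):
    """One generic update: g_t -> m_t -> eta_t -> w_{t+1} = w_t - eta_t."""
    m = beta * m + (1.0 - beta) * g  # m_t = phi(g_1, ..., g_t): EMA (assumption)
    eta = alpha * m / (V ** 0.5)     # eta_t = alpha * m_t / sqrt(V_t), V_t fixed at 1
    return w - eta, m                # omega_{t+1} = omega_t - eta_t

# minimize f(w) = (w - 3)^2, whose gradient is g = 2 * (w - 3)
w, m = 0.0, 0.0
for _ in range(200):
    w, m = momentum_step(w, 2.0 * (w - 3.0), m)
print(round(w, 2))  # converges toward the minimizer w = 3
```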
Step 2.5: training the neural network using the training data.
Specifically, in step 2.5, batch mode is selected as the training mode of the multilayer feedforward neural network: the network receives all training samples and the mean square error over all samples is computed as the total error; batch mode permits parallelization and accelerates the learning of the neural network.
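A one-line illustration of the batch-mode total error: all samples are presented together and the mean square error over the whole batch is the total error (the numbers are arbitrary placeholders).

```python
import numpy as np

predictions = np.array([0.9, 0.2, 0.8, 0.4])  # network outputs for one batch
targets     = np.array([1.0, 0.0, 1.0, 0.0])  # desired outputs
batch_mse = float(np.mean((targets - predictions) ** 2))  # total error over the batch
print(round(batch_mse, 4))  # 0.0625
```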
Step 3: evaluating the constructed multilayer feedforward neural network model to obtain the classification accuracy of the current model.
Specifically, step 3 includes the following:
step 3.1: setting k = 5 for cross validation, repeated 5 times: the original data are randomly shuffled and divided into a training set and a test set at a ratio of 7:3;
step 3.2: training the model of step 2 on each of the 5 training sets to obtain 5 different sets of test results, and finally averaging the test results to obtain the overall performance index; the cross validation method is introduced to assess the generalization ability of the model.
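Steps 3.1 and 3.2 can be sketched with scikit-learn. This is an assumption-laden illustration: ShuffleSplit reproduces the "5 repeats, 7:3 split" scheme, MLPClassifier stands in for the patent's multilayer feedforward network, and the data are synthetic placeholders.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((100, 20))         # stand-in PCA-reduced spectra
y = rng.integers(0, 2, size=100)  # two pathogen classes

# 5 repeats: shuffle, split 7:3, train, and record the test accuracy each time
splitter = ShuffleSplit(n_splits=5, test_size=0.3, random_state=0)
scores = []
for train_idx, test_idx in splitter.split(X):
    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))

# the overall performance index is the mean of the 5 test results
print(f"mean accuracy over 5 runs: {np.mean(scores):.2f}")
```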
FIG. 4 shows the spectra of the two pathogenic bacteria, Escherichia coli O157:H7 and the Brucella S2 strain; the spectral curves exhibit similar Raman peaks around 1200, 1400 and 1700 cm⁻¹.
FIG. 5 shows the spectra of the two pathogenic bacteria, Escherichia coli O157:H7 and the Brucella S2 strain, after Min-Max normalization and Savitzky-Golay smoothing and denoising.
Fig. 6 is a diagram of a sigmoid function of an activation function used in an example of the present invention, and fig. 7 is a diagram of a tanh function of an activation function used in an example of the present invention.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (8)
1. A food-borne pathogenic bacteria classification method based on a multilayer feedforward neural network algorithm is characterized by comprising the following steps:
step 1: carrying out data preprocessing on the obtained food-borne pathogenic bacteria data;
step 2: constructing a multilayer feedforward neural network;
and step 3: and evaluating the constructed multilayer feedforward neural network model to obtain the accuracy of the current model classification.
2. The food-borne pathogenic bacteria classification method based on the multi-layer feedforward neural network algorithm is characterized in that the step 1 comprises the following steps:
step 1.1: applying a Savitzky-Golay convolution smoothing algorithm to the obtained raw Raman spectrum data, with the window size (number of points) and polynomial order set as parameters, to reduce noise interference and smooth the spectra;
step 1.2: normalizing the food-borne pathogenic bacteria data, mapping the data values into the interval [0, 1] with the Min-Max method;
step 1.3: reducing the dimensionality of the normalized experimental data, using the PCA method to reduce its feature dimension;
step 1.4: dividing the spectral data into a training set and a test set at a ratio of 7:3 to facilitate training of the model.
3. The food-borne pathogenic bacteria classification method based on the multilayer feedforward neural network algorithm is characterized in that the step 2 comprises the following steps:
step 2.1: the model comprises an input layer, a hidden layer and an output layer; first the number of nodes in the input, hidden and output layers is determined, then the activation functions of the input layer and the hidden layer: the first layer uses a tanh function and the second layer uses a sigmoid function; the input is multiplied by the weights and summed with the bias to obtain the output, yielding the classification result;
step 2.2: initializing parameters: the weight w and the bias b are randomly initialized, and forward propagation proceeds as y = x·w + b;
step 2.3: constructing a loss function, and constructing the loss function in the classification problem by using cross entropy;
step 2.4: error back propagation, adopting a gradient descending method to search the minimum value of the loss function along the gradient descending direction of the loss function, thereby obtaining the optimal parameter and optimizing the network structure;
step 2.5: training the neural network using the training data.
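The network of steps 2.1–2.2 can be sketched as follows (the layer sizes are hypothetical; tanh on the first layer and sigmoid on the second, as step 2.1 specifies):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical sizes: 10 PCA features in, 16 hidden units, 3 bacteria classes
n_in, n_hidden, n_out = 10, 16, 3

# step 2.2: random initialization of the weights w and biases b
w1, b1 = rng.normal(0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden)
w2, b2 = rng.normal(0, 0.1, (n_hidden, n_out)), np.zeros(n_out)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    # forward propagation in the form y = x*w + b, layer by layer
    h = np.tanh(x @ w1 + b1)        # first layer: tanh activation
    return sigmoid(h @ w2 + b2)     # second layer: sigmoid activation

x = rng.normal(size=(4, n_in))      # a small batch of 4 samples
y = forward(x)                      # one score in (0, 1) per class
```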
4. The food-borne pathogenic bacteria classification method based on the multilayer feedforward neural network algorithm as claimed in claim 3, characterized in that, in step 2.1, the signal in the three-layer BP neural network first propagates forward, passing from the input layer through the hidden layer and finally reaching the output layer:
the output of the input layer is equal to the input signal of the whole network:

v_m(n) = x(n)

where v_m(n) represents the input of the m-th neuron of the input layer and x(n) represents the input value;

passing through the hidden layer, the input of the i-th neuron of the hidden layer is equal to the weighted sum of the input-layer outputs:

u_i(n) = Σ_m ω_mi(n) · v_m(n)

where u_i(n) represents the input of the i-th neuron of the hidden layer and ω_mi(n) represents the weight between the two layers;

the output of the hidden layer passes through a Sigmoid function, so the output of the i-th neuron of the hidden layer is equal to:

h_i(n) = 1 / (1 + e^(-u_i(n)))

where h_i(n) represents the output of the i-th neuron of the hidden layer and u_i(n) represents its input;

the input of the j-th neuron of the output layer is equal to the weighted sum of the hidden-layer outputs:

s_j(n) = Σ_i ω_ij(n) · h_i(n)

where s_j(n) represents the input of the j-th neuron of the output layer, ω_ij(n) represents the weight between the two layers, and h_i(n) represents the output of the i-th neuron of the hidden layer;

the output of the j-th neuron of the output layer is equal to:

y_j(n) = 1 / (1 + e^(-s_j(n)))

where y_j(n) represents the output of the j-th neuron of the output layer and s_j(n) represents its input;

the error of the j-th neuron of the output layer is:

e_j(n) = d_j(n) - y_j(n)

where e_j(n) represents the error of the j-th neuron, d_j(n) represents the actual value for the j-th neuron, and y_j(n) represents the value predicted by the neural network;

the total error of the network is:

E(n) = (1/2) · Σ_j e_j²(n)

where e_j(n) represents the error of the j-th neuron and E(n) represents the total error of the network;
the error signal of the three-layer BP neural network is then propagated backward:
first the error is propagated back through the output layer, adjusting the weights between the hidden layer and the output layer; in the weight-adjustment stage the adjustment proceeds layer by layer backward along the network, next adjusting the weights between the input-layer and hidden-layer neurons; the calculation of each local gradient uses the result of the previous calculation, the local gradient of a layer being the weighted sum of the local gradients of the following layer.
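The layer-by-layer back propagation just described can be sketched with NumPy (a hypothetical 4-5-3 network with Sigmoid activations and a single sample; variable names mirror the claim's notation):

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x = rng.normal(size=4)                # one input sample
w_mi = rng.normal(0, 0.5, (4, 5))     # input -> hidden weights, omega_mi
w_ij = rng.normal(0, 0.5, (5, 3))     # hidden -> output weights, omega_ij
d = np.array([1.0, 0.0, 0.0])         # target (actual) values d_j

def total_error(w_mi, w_ij):
    y = sigmoid(w_ij.T @ sigmoid(w_mi.T @ x))
    return 0.5 * np.sum((d - y) ** 2)           # E(n) = 1/2 * sum e_j^2

E_before = total_error(w_mi, w_ij)

# forward pass
h = sigmoid(w_mi.T @ x)               # hidden-layer outputs
y = sigmoid(w_ij.T @ h)               # output-layer outputs

# backward pass: output-layer local gradients first ...
e = d - y                             # e_j(n) = d_j(n) - y_j(n)
delta_out = e * y * (1 - y)           # delta_j = e_j * sigmoid'(s_j)
# ... then the hidden layer, whose local gradient uses the weighted sum
# of the next layer's local gradients, as the claim states
delta_hid = (w_ij @ delta_out) * h * (1 - h)

eta = 0.1                             # learning rate
w_ij += eta * np.outer(h, delta_out)
w_mi += eta * np.outer(x, delta_hid)

E_after = total_error(w_mi, w_ij)     # one step reduces the total error
```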
5. The food-borne pathogenic bacteria classification method based on the multilayer feed-forward neural network algorithm as claimed in claim 3, characterized in that, in step 2.3, the loss function in back propagation is the cross-entropy loss function:
H(y_,y)=-∑y_*lny
wherein y _ represents the true value of the data and y represents the predicted value of the neural network.
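As a numeric check of H(y_, y) = -∑ y_ · ln y (the one-hot label and prediction below are toy values, not data from the patent):

```python
import numpy as np

def cross_entropy(y_true, y_pred):
    # H(y_, y) = -sum(y_ * ln(y)); clipping avoids ln(0)
    return -np.sum(y_true * np.log(np.clip(y_pred, 1e-12, 1.0)))

y_true = np.array([0.0, 1.0, 0.0])    # one-hot true label y_
y_pred = np.array([0.1, 0.8, 0.1])    # neural network prediction y
loss = cross_entropy(y_true, y_pred)  # = -ln(0.8), about 0.2231
```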
6. The food-borne pathogenic bacteria classification method based on the multilayer feedforward neural network algorithm as claimed in claim 3, characterized in that in step 2.4, a gradient descent method is adopted:
calculating the gradient of the loss function with respect to the current parameter:

g_t = ∇f(ω_t)

where each iteration processes one batch, t denotes the current iteration count, ω_t is the parameter to be optimized, f(ω) denotes the loss function, and g_t is the gradient at iteration t;
calculating first-order momentum and second-order momentum according to historical gradients:
m_t = φ(g_1, g_2, ..., g_t),  V_t = ψ(g_1, g_2, ..., g_t)

where g is the gradient, m_t is the first-order momentum at iteration t, and V_t is the second-order momentum at iteration t;
calculating the descending gradient at the current moment:

η_t = α · m_t / √V_t

where α is the initial learning rate, m_t is the first-order momentum at iteration t, V_t the second-order momentum, and η_t the descending gradient at iteration t;
updating according to the gradient of descent:
ω_{t+1} = ω_t - η_t

where ω_t is the parameter to be optimized, η_t is the descending gradient at iteration t, and ω_{t+1} is the updated parameter.
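The four equations above describe the generic momentum-based optimizer framework, of which Adam is the best-known instance. A sketch on a one-dimensional quadratic, with the usual exponential-moving-average choices for φ and ψ (the β values are conventional defaults, not specified by the patent):

```python
import numpy as np

# minimize f(w) = (w - 3)^2 with an Adam-style update
grad = lambda w: 2.0 * (w - 3.0)

w, m, V = 0.0, 0.0, 0.0
alpha, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8

for t in range(1, 501):
    g = grad(w)                           # g_t: gradient at the current parameter
    m = beta1 * m + (1 - beta1) * g       # m_t = phi(g_1, ..., g_t)
    V = beta2 * V + (1 - beta2) * g**2    # V_t = psi(g_1, ..., g_t)
    eta = alpha * m / (np.sqrt(V) + eps)  # descending gradient eta_t
    w = w - eta                           # w_{t+1} = w_t - eta_t

# w ends up close to the minimizer w = 3
```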
7. The food-borne pathogenic bacteria classification method based on the multilayer feedforward neural network algorithm as claimed in claim 3, characterized in that, in step 2.5, the multilayer feedforward neural network is trained in batch mode: the network receives all training samples and computes the mean square error over all samples as the total error; batch mode can be parallelized, accelerating the learning of the neural network.
8. The food-borne pathogenic bacteria classification method based on the multi-layer feedforward neural network algorithm as claimed in claim 1, wherein the step 3 comprises the following steps:
step 3.1: taking k = 5 for cross-validation, repeating 5 times: each time the original data are randomly shuffled and divided into a training set and a test set at a ratio of 7:3;
step 3.2: training the model of step 2 on each of the 5 training sets to obtain 5 different sets of test results, and finally taking the average of the test results as the overall performance index; the cross-validation is introduced to determine the generalization capability of the model.
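The repeated-split validation of step 3 can be sketched as follows (a nearest-centroid rule stands in for the trained network of step 2 purely to keep the sketch self-contained; the class layout and sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(42)

# toy labeled data: 3 classes, 20 samples each, 10 features (hypothetical)
X = np.concatenate([rng.normal(c, 1.0, size=(20, 10)) for c in range(3)])
y = np.repeat(np.arange(3), 20)

def nearest_centroid(Xtr, ytr, Xte):
    # stand-in classifier: assign each test point to the closest class mean
    centroids = np.stack([Xtr[ytr == c].mean(axis=0) for c in range(3)])
    dist = ((Xte[:, None, :] - centroids[None]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

accs = []
for _ in range(5):                    # k = 5 repetitions
    idx = rng.permutation(len(X))     # randomly shuffle the original data
    cut = int(0.7 * len(X))           # 7:3 train/test split
    tr, te = idx[:cut], idx[cut:]
    pred = nearest_centroid(X[tr], y[tr], X[te])
    accs.append((pred == y[te]).mean())

overall = float(np.mean(accs))        # average of the 5 runs = overall index
```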
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210226118.0A CN114638291A (en) | 2022-03-08 | 2022-03-08 | Food-borne pathogenic bacteria classification method based on multilayer feedforward neural network algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114638291A true CN114638291A (en) | 2022-06-17 |
Family
ID=81947383
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114817847A (en) * | 2022-06-30 | 2022-07-29 | 广州兆和电力技术有限公司 | Energy storage power station intelligent monitoring method based on multilayer feedforward neural network |
CN117596462A (en) * | 2023-11-21 | 2024-02-23 | 长春理工大学中山研究院 | SMA motor, camera and anti-shake control method of camera |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015172560A1 (en) * | 2014-05-16 | 2015-11-19 | 华南理工大学 | Central air conditioner cooling load prediction method based on bp neural network |
CN109781706A (en) * | 2019-02-11 | 2019-05-21 | 上海应用技术大学 | Training method based on the PCA-Stacking food-borne pathogens Raman spectrum identification model established |
CN110245713A (en) * | 2019-06-19 | 2019-09-17 | 上海应用技术大学 | A kind of food-borne pathogens classification method |
CN111274874A (en) * | 2020-01-08 | 2020-06-12 | 上海应用技术大学 | Food-borne pathogenic bacteria Raman spectrum classification model training method based on adaboost |
Non-Patent Citations (1)
Title |
---|
Liu Ruifang et al.: "Programming Practice" (《程序设计实践》), 1 April 2020, pages 291-296 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||