CN112329439A

CN112329439A - Food safety event detection method and system based on graph convolution neural network model

Info

Publication number: CN112329439A
Application number: CN202011291703.6A
Authority: CN
Inventors: 段大高; 刘文文; 刘峥; 王东; 曹若湘; 韩忠明; 田雨薇; 芦月; 王福成
Original assignee: Beijing Center for Disease Prevention and Control; Beijing Technology and Business University
Current assignee: Beijing Center for Disease Prevention and Control; Beijing Technology and Business University
Priority date: 2020-11-18
Filing date: 2020-11-18
Publication date: 2021-02-05
Anticipated expiration: 2040-11-18
Also published as: CN112329439B

Abstract

The invention relates to a food safety event detection method and system based on a graph convolution neural network model. The method comprises the following steps: preprocessing data acquired from a food safety related website; constructing a text classification model according to the preprocessed data; acquiring food safety event data to be predicted; and inputting the data of the food safety event to be predicted into the text classification model to obtain a classification result of the food safety event. The method can quickly and accurately classify the obtained food safety events according to the national food safety event classification standard.

Description

Food safety event detection method and system based on graph convolution neural network model

Technical Field

The invention relates to the field of food safety event detection, in particular to a food safety event detection method and system based on a graph convolution neural network model.

Background

With economic development and rising living standards, people demand ever higher food safety, and food safety has become a frequently discussed issue. In recent years, food safety events such as excessive pesticide residues, excessive food additives, food contamination and food-borne diseases have occurred frequently across the country, and the state of food safety in China is not optimistic. In the big data era, a large amount of food safety related information exists on the Internet. Text classification methods from data mining can extract valuable, more targeted information from it, helping the relevant departments take preventive measures and reduce the losses caused by such events.

How to extract effective information from massive data has become a research hotspot. Traditional supervised text classification methods rely on manually designed features, which consumes a large amount of manpower and material resources. Moreover, because food safety event data contains a large amount of text, and text data carries context information and is semantically sparse, it is difficult to obtain useful classification features with traditional supervised learning methods. Food safety information also contains words that play a decisive role in classification; traditional supervised text classification cannot extract these words, nor can it effectively extract some proper nouns, so the classification results are not ideal.

Disclosure of Invention

The invention aims to provide a food safety event detection method and system based on a graph convolution neural network model, which can be used for rapidly and accurately classifying the obtained food safety events.

In order to achieve the purpose, the invention provides the following scheme:

a food safety event detection method based on a graph convolution neural network model comprises the following steps:

preprocessing data acquired from a food safety related website;

constructing a text classification model according to the preprocessed data;

acquiring food safety event data to be predicted;

and inputting the data of the food safety event to be predicted into the text classification model to obtain a classification result of the food safety event.

Optionally, the preprocessing the data acquired from the food safety-related website specifically includes:

and carrying out text extraction, stop word removal and jieba word segmentation on the data acquired from the food safety related website.

Optionally, the text classification model includes a word embedding layer, a bidirectional recurrent neural network layer, a convolutional neural network layer, a graph convolutional neural network layer and a classification layer, wherein the output end of the word embedding layer is connected to the input end of the bidirectional recurrent neural network layer and to the input end of the graph convolutional neural network layer, the output end of the bidirectional recurrent neural network layer is connected to the input end of the convolutional neural network layer, and the data from the output end of the graph convolutional neural network layer and the output end of the convolutional neural network layer are dot-multiplied and then input to the classification layer.

Optionally, the constructing a text classification model according to the preprocessed data specifically includes:

constructing a co-occurrence matrix and a topological graph of the vocabulary according to the preprocessed data;

inputting the co-occurrence matrix into a bidirectional long-short term memory network to obtain a text matrix, wherein the text matrix is the context of each vocabulary;

inputting the text matrix into a convolutional neural network to obtain a feature matrix, wherein the feature matrix is the relation between the features of the whole text and sentences;

inputting the topological graph into a graph convolution neural network to obtain a global structure matrix, wherein the global structure matrix is global structure information of a text graph network;

performing dot multiplication calculation on the feature matrix and the global structure matrix to obtain a dot multiplication result;

and determining the classification result of the food safety event according to the dot product result.

Optionally, the classification result of the food safety event comprises: general food safety events, large food safety events, major food safety events, and particularly major food safety events.

A food safety event detection system based on a graph convolution neural network model comprises:

the preprocessing module is used for preprocessing data acquired from a food safety related website;

the text classification model building module is used for building a text classification model according to the preprocessed data;

the to-be-predicted data acquisition module is used for acquiring food safety event data to be predicted;

and the classification result determining module is used for inputting the data of the food safety events to be predicted into the text classification model to obtain the classification result of the food safety events.

Optionally, the preprocessing module specifically includes:

and the preprocessing unit is used for performing text extraction, stop word removal and jieba word segmentation on the data acquired from the food safety related website.

Optionally, the text classification model building module specifically includes:

the co-occurrence matrix and topological graph constructing unit is used for constructing a co-occurrence matrix and a topological graph of the vocabulary according to the preprocessed data;

the text matrix determining unit is used for inputting the co-occurrence matrix into a bidirectional long-short term memory network to obtain a text matrix, and the text matrix is the context of each vocabulary;

the feature matrix determining unit is used for inputting the text matrix into a convolutional neural network to obtain a feature matrix, wherein the feature matrix is the relation between the features of the whole text and sentences;

the global structure matrix determining unit is used for inputting the topological graph into a graph convolution neural network to obtain a global structure matrix, and the global structure matrix is global structure information of the text graph network;

the dot multiplication calculation unit is used for performing dot multiplication calculation on the feature matrix and the global structure matrix to obtain a dot multiplication result;

and the classification unit is used for determining the classification result of the food safety event according to the dot product result.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

the invention uses the graph convolution neural network model based on deep learning for classifying food safety events, does not need to spend a large amount of manpower to extract features, and can comprehensively and effectively extract relevant features by combining the graph convolution neural network model with the bidirectional LSTM and the convolution neural network, thereby improving the classification performance of the model and being convenient for accurately acquiring effective food safety information.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow chart of a method for detecting food safety events based on a graph convolution neural network model;

FIG. 2 is a flow chart of a method of constructing a text classification model;

FIG. 3 is a schematic diagram of a text classification model;

FIG. 4 is a structural view of Bi-LSTM;

FIG. 5 is a view showing the structure of CNN;

FIG. 6 is a diagram of the GCN architecture;

FIG. 7 is a graph of classification criteria;

FIG. 8 is a diagram of a food safety event detection system based on a graph convolution neural network model.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

The graph convolution neural network model based on deep learning is used for classifying food safety events, does not need to spend a large amount of manpower to extract features, is combined with the bidirectional LSTM and the convolution neural network, can comprehensively and effectively extract relevant features, improves the classification performance of the model, and is convenient to accurately acquire effective food safety information.

Fig. 1 is a flow chart of a food safety event detection method based on a graph convolution neural network model. As shown in fig. 1, a method for detecting food safety events based on a graph-convolution neural network model includes:

step 101: preprocessing data acquired from a food safety related website, specifically comprising:

Invalid information is removed from the segmented corpus obtained after word segmentation, irregularly written words are normalized, abbreviations are restored, and emoticons and ambiguous Internet expressions are deleted.
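The preprocessing of step 101 can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the function name `preprocess`, the regular expressions and the pluggable `tokenize` argument are assumptions. For Chinese text, `jieba.lcut` could be passed as the tokenizer; a whitespace tokenizer stands in for it here so that the sketch has no external dependency.

```python
import re

def preprocess(raw_html, tokenize, stop_words):
    """Extract text, segment it into words, and drop stop words."""
    # Text extraction: strip markup tags and collapse whitespace.
    text = re.sub(r"<[^>]+>", " ", raw_html)
    text = re.sub(r"\s+", " ", text).strip()
    # Word segmentation: `tokenize` is any callable str -> list of str
    # (hypothetical parameter; jieba.lcut would fit for Chinese text).
    tokens = tokenize(text)
    # Stop-word removal, also discarding empty or whitespace-only tokens.
    return [t for t in tokens if t.strip() and t not in stop_words]

# Usage with a whitespace tokenizer standing in for jieba:
tokens = preprocess("<p>the milk powder was recalled</p>", str.split, {"the", "was"})
```

Normalizing irregular words and restoring abbreviations would need domain-specific lookup tables and are left out of the sketch.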

Step 102: constructing a text classification model according to the preprocessed data.

The text classification model comprises a word embedding layer, a bidirectional recurrent neural network layer, a convolutional neural network layer, a graph convolutional neural network layer and a classification layer. The output end of the word embedding layer is connected to the input end of the bidirectional recurrent neural network layer and to the input end of the graph convolutional neural network layer; the output end of the bidirectional recurrent neural network layer is connected to the input end of the convolutional neural network layer; and the outputs of the graph convolutional neural network layer and the convolutional neural network layer are dot-multiplied and then input to the classification layer. The word embedding layer produces a distributed representation of the text. The bidirectional recurrent neural network layer captures the context of each word in a sentence. The convolutional neural network layer captures the features of the text as a whole and the relations between sentences. The graph convolutional neural network layer captures the global structure information of the graph network formed by the texts. The result of the convolutional neural network and the result of the graph convolutional neural network are dot-multiplied and input to the classification layer to classify the food safety event. FIG. 3 is a schematic diagram of the text classification model.

Fig. 2 is a flowchart of a method for constructing a text classification model, that is, step 102 specifically includes:

Step 1021: constructing a co-occurrence matrix and a topological graph of the vocabulary according to the preprocessed data.

Word vectors are learned with GloVe from the segmented words obtained by data preprocessing. GloVe learns word vectors from global word co-occurrence statistics, thereby combining the advantages of statistical information with those of local context window methods.

A co-occurrence matrix $X$ of the vocabulary is constructed from the segmented corpus obtained by data preprocessing. Each element $X_{ij}$ is the number of times word $j$ appears in the context of word $i$. Let $X_i = \sum_k X_{ik}$ denote the total number of times any word appears in the context of word $i$, where $k$ ranges over the words in the context of word $i$, and let

$$P_{ij} = P(j \mid i) = \frac{X_{ij}}{X_i}$$

denote the probability that word $j$ appears in the context of word $i$.
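The co-occurrence statistics can be illustrated with a short sketch. It assumes a symmetric context window of fixed width and unweighted counts; the window size and the function names `cooccurrence` and `context_probability` are illustrative choices, not taken from the text.

```python
from collections import defaultdict

def cooccurrence(sentences, window=2):
    """Count X[i][j]: how often word j occurs within `window` tokens of word i."""
    X = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for pos, wi in enumerate(tokens):
            lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
            for ctx in range(lo, hi):
                if ctx != pos:
                    X[wi][tokens[ctx]] += 1.0
    return X

def context_probability(X, i, j):
    """P_ij = X_ij / X_i, where X_i is the row sum of X for word i."""
    row = X.get(i, {})
    X_i = sum(row.values())
    return row.get(j, 0.0) / X_i if X_i else 0.0

# Usage on a single segmented sentence with a window of one word:
X = cooccurrence([["pesticide", "residue", "exceeds", "limit"]], window=1)
p = context_probability(X, "residue", "pesticide")
```

Here "residue" has two context words, each seen once, so `p` is one half.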

Word vectors are learned from the co-occurrence matrix with the GloVe model: start → compute the co-occurrence matrix X → train the word vectors → end.

Cost function of the GloVe model:

$$J = \sum_{i,j=1}^{N} f(X_{ij}) \left( v_i^{T} v_j + b_i + b_j - \log X_{ij} \right)^2$$

where $v_i$ and $v_j$ are the word vectors of word $i$ and word $j$, $b_i$ and $b_j$ are two scalars (bias terms), and $N$ is the size of the vocabulary (the co-occurrence matrix has dimension $N \times N$). On the principle that the more frequently a word pair occurs, the higher its weight should be, a weight function $f$ is added:

$$f(x) = \begin{cases} (x / x_{\max})^{\alpha}, & x < x_{\max} \\ 1, & \text{otherwise} \end{cases}$$

where $x$ is the $X_{ij}$ defined above, $x_{\max}$ is the maximum number of occurrences in the context of word $i$, and $\alpha$ is the exponent of the standard GloVe weighting function. Training the word vectors aims to minimize the cost function; the smaller the cost, the more accurate the resulting word vectors.
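A minimal sketch of how this cost could be evaluated for given vectors is shown below. The values `x_max=100.0` and `alpha=0.75` are assumptions borrowed from the commonly used GloVe defaults, not values stated in this text, and the dictionary-based data layout is purely illustrative.

```python
import math

def weight(x, x_max=100.0, alpha=0.75):
    """Weighting function f: grows with the count and is capped at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_cost(X, v, b, x_max=100.0, alpha=0.75):
    """Sum f(X_ij) * (v_i . v_j + b_i + b_j - log X_ij)^2 over nonzero X_ij."""
    total = 0.0
    for i, row in X.items():
        for j, x_ij in row.items():
            if x_ij <= 0:
                continue  # log X_ij is undefined; f(0) = 0 drops the term anyway
            dot = sum(a * c for a, c in zip(v[i], v[j]))
            err = dot + b[i] + b[j] - math.log(x_ij)
            total += weight(x_ij, x_max, alpha) * err ** 2
    return total

# Usage: one co-occurring pair whose vectors give a unit error.
X = {"a": {"b": 1.0}}
v = {"a": [1.0, 0.0], "b": [1.0, 0.0]}
b = {"a": 0.0, "b": 0.0}
cost = glove_cost(X, v, b, x_max=1.0)
```

A training loop would adjust `v` and `b` by gradient descent to reduce this value; that part is omitted.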

The segmented corpus is constructed into a topological graph whose nodes are documents and words, that is, the number of nodes in the graph is |V| = |doc| + |voc|, where |doc| is the number of documents and |voc| is the total number of words.

The constructed topological graph is represented by an adjacency matrix A, defined by the following formula. TF-IDF is used for the weights between document nodes and word nodes. For the weights between word nodes, periods are retained during text processing to separate sentences, and line breaks are retained to separate paragraphs:

where #W(i) denotes the number of occurrences of word i under a fixed sliding window, #W₁(i, j) denotes the number of times words i and j appear together in the same sentence under a fixed sliding window, #W₂(i, j) denotes the number of times words i and j appear in the same paragraph but not in the same sentence under a fixed sliding window, M(i, j) is the number of sentences in which the two words appear together, and N(i, j) is the number of paragraphs in which the two words appear together. A positive SPMI(i, j) indicates strong semantic relevance between word i and word j, and a negative SPMI(i, j) indicates low semantic relevance, so edges are added only between word pairs with positive SPMI values. The adjacency matrix of the constructed graph is then input into the graph convolution neural network.
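The graph construction can be sketched as below. Two points are assumptions: the SPMI formula is not reproduced in this text, so ordinary pointwise mutual information over sliding windows is used as a stand-in for the word-word weights, and the TF-IDF variant (term frequency times log inverse document frequency) is one common choice. The function name `build_adjacency` is hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def build_adjacency(docs, window=3):
    """Return (nodes, A) for a graph whose nodes are documents followed by words."""
    vocab = sorted({w for doc in docs for w in doc})
    n_doc = len(docs)
    index = {w: n_doc + k for k, w in enumerate(vocab)}
    n = n_doc + len(vocab)                      # |V| = |doc| + |voc|
    A = [[0.0] * n for _ in range(n)]

    # Document-word edges weighted by TF-IDF.
    df = Counter(w for doc in docs for w in set(doc))
    for d, doc in enumerate(docs):
        for w, count in Counter(doc).items():
            tfidf = (count / len(doc)) * math.log(n_doc / df[w])
            A[d][index[w]] = A[index[w]][d] = tfidf

    # Word-word edges: plain PMI (stand-in for SPMI), kept only when positive.
    windows = [doc[s:s + window] for doc in docs
               for s in range(max(1, len(doc) - window + 1))]
    single = Counter(w for win in windows for w in set(win))
    pair = Counter(p for win in windows
                   for p in combinations(sorted(set(win)), 2))
    total = len(windows)
    for (wi, wj), c in pair.items():
        pmi = math.log(c * total / (single[wi] * single[wj]))
        if pmi > 0:
            A[index[wi]][index[wj]] = A[index[wj]][index[wi]] = pmi
    return ["doc%d" % d for d in range(n_doc)] + vocab, A

# Usage on two tiny segmented documents:
docs = [["milk", "melamine", "recall"], ["milk", "additive", "limit"]]
nodes, A = build_adjacency(docs, window=2)
```

With this TF-IDF variant a word that occurs in every document, such as "milk" here, gets weight zero on its document edges.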

Step 1022: inputting the co-occurrence matrix into a bidirectional long-short term memory network to obtain a text matrix, wherein the text matrix is the context of each word. FIG. 4 is a structural view of the Bi-LSTM.

The bidirectional long-short term memory network (Bi-LSTM) is a special RNN that can learn context information. The key to the LSTM is the cell state, which is analogous to a conveyor belt: it runs along the entire chain with only a few linear interactions. The LSTM can remove information from, or add information to, the cell state through structures called "gates".

The Bi-LSTM is composed of two LSTMs running in opposite directions. The forward and backward LSTMs each return a hidden state sequence, and the two sequences are concatenated into a fixed-dimension vector that is output as the text representation.

By learning to control the flow of information, the Bi-LSTM can capture long-term dependencies between the words of a sentence and thereby obtain the context of each word vector in the sentence. The text consists of H sentences, each of which consists of N word vectors. The word vectors of each sentence are input into the Bi-LSTM, one word vector per time step. The hidden state of the current time step is calculated and passed, together with the next word vector, as input to the LSTM unit at the next time step, whose hidden state is then calculated; this is repeated until all input word vectors have been processed. The hidden states of the forward and backward LSTMs at the last time step are concatenated, and the result is used as the input of the convolutional neural network. The embedding of a single word is a 1 × d row vector. The input to the Bi-LSTM is a sentence vector formed by concatenating the N word vectors of a sentence, with dimension 1 × M, where M = d × N; the output is a 1 × M' vector, where M' = 2d × N. Finally, processing the whole text with the Bi-LSTM yields H vectors of dimension 1 × M', which form an H × M' text matrix that is used as the input of the convolutional neural network (CNN).
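The dimension bookkeeping of this step can be checked with a small NumPy sketch. It is an untrained toy with random weights, and it follows the stated output size M' = 2d × N by concatenating the forward and backward hidden states for every word. The hidden size d, the stacked gate layout of the weight matrix and all names are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(xs, W, b, d):
    """Run one LSTM over a sequence of word vectors; return one hidden state per step."""
    h, c, out = np.zeros(d), np.zeros(d), []
    for x in xs:
        z = W @ np.concatenate([h, x]) + b        # all four gates at once, shape (4d,)
        i, f, o = sigmoid(z[:d]), sigmoid(z[d:2 * d]), sigmoid(z[2 * d:3 * d])
        g = np.tanh(z[3 * d:])
        c = f * c + i * g                         # gates remove or add cell-state information
        h = o * np.tanh(c)
        out.append(h)
    return out

def bilstm_sentence(xs, fwd, bwd, d):
    """Concatenate forward and backward hidden states per word: a vector of length 2d*N."""
    hf = lstm_pass(xs, fwd[0], fwd[1], d)
    hb = lstm_pass(xs[::-1], bwd[0], bwd[1], d)[::-1]
    return np.concatenate([np.concatenate([a, r]) for a, r in zip(hf, hb)])

# Usage: H sentences of N word vectors, each of dimension d.
rng = np.random.default_rng(0)
d, N, H = 3, 4, 2
fwd = (rng.normal(size=(4 * d, 2 * d)) * 0.1, np.zeros(4 * d))
bwd = (rng.normal(size=(4 * d, 2 * d)) * 0.1, np.zeros(4 * d))
text = [rng.normal(size=(N, d)) for _ in range(H)]
text_matrix = np.stack([bilstm_sentence(s, fwd, bwd, d) for s in text])
```

The resulting `text_matrix` has H rows and 2d × N columns, which is the H × M' matrix passed on to the CNN.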

Step 1023: inputting the text matrix into a convolutional neural network to obtain a feature matrix, wherein the feature matrix is the relation between the overall features of the text and sentences. FIG. 5 is a structural view of the CNN.

A convolutional neural network performs overall feature extraction on the H × M' text matrix produced by the Bi-LSTM, from which the relations between sentences can be obtained. The CNN is composed of a convolution layer, a pooling layer and a fully connected layer:

Convolution layer: different local regions of the input matrix are multiplied element by element with the convolution kernel matrix and the products are summed, which extracts sentence features. For the convolved output, the ReLU activation function is generally used to set to 0 the elements of the output tensor that are smaller than 0.

Pooling layer: the matrix S produced by the ReLU activation function is passed to the pooling layer, which compresses it to reduce the number of features; maximum pooling is used here. After pooling, regularization (to reduce the complexity of the model) and Dropout (to randomly remove neurons from the neural network and thus mitigate over-fitting) are applied.

Fully connected layer: the pooled data is flattened by a Flatten layer, and the result of the Flatten layer is then passed to a fully connected layer.

In the CNN processing procedure, the H × M' text matrix produced by the Bi-LSTM is input into the CNN (here d = 3 and N = 4) and convolved into a feature matrix whose dimension depends on the size of the selected convolution kernel; for example, with a 2 × 1 convolution kernel, a 4 × 4 feature matrix is obtained from the convolution layer. Pooling compresses the features to reduce their number; assume the pooling result is a 2 × 2 matrix. The fully connected layer splices the pooled data, for example converting the 2 × 2 matrices resulting from pooling into an 8 × 1 column vector.
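The convolution, ReLU, max pooling and flattening stages can be sketched in NumPy as follows. The 5 × 4 input and the kernel values are chosen only so that a 2 × 1 kernel yields the 4 × 4 feature matrix mentioned above; they are illustrative assumptions. Regularization and Dropout act only during training and are omitted, as is the trained fully connected projection.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Slide the kernel over x; each output is the sum of element-wise products."""
    kh, kw = kernel.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(x[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling; trailing rows or columns that do not fit are dropped."""
    rows, cols = x.shape[0] // size, x.shape[1] // size
    x = x[:rows * size, :cols * size]
    return x.reshape(rows, size, cols, size).max(axis=(1, 3))

def cnn_features(text_matrix, kernel):
    s = np.maximum(conv2d_valid(text_matrix, kernel), 0.0)   # ReLU
    return max_pool(s).reshape(-1)                            # pool, then flatten

# Usage: a 5 x 4 input with a 2 x 1 kernel gives a 4 x 4 feature matrix.
x = np.arange(20.0).reshape(5, 4)
kernel = np.array([[-1.0], [1.0]])
feature_matrix = conv2d_valid(x, kernel)
features = cnn_features(x, kernel)
```

Max pooling with a 2 × 2 window then reduces the 4 × 4 matrix to 2 × 2, and flattening gives a vector of four values.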

Step 1024: inputting the topological graph into a graph convolution neural network to obtain a global structure matrix, wherein the global structure matrix is global structure information of the text graph network.

The adjacency matrix is input into a graph convolution neural network (GCN) model, which can learn the relations between words connected by edges in the text. FIG. 6 is a diagram of the GCN structure.

The GCN comprises 3 modules: graph convolution 1, graph convolution 2 and a fully connected layer. The adjacency matrix of the graph is input into the first graph convolution layer, whose convolution operation captures the neighbor information in the graph.

Several definitions are given here. The initial feature matrix is $U \in \mathbb{R}^{n \times m}$, where $n$ is the number of nodes of the graph (word nodes and document nodes) and $m$ is the feature dimension of each node. $A$ denotes the adjacency matrix of the graph and $D$ is the degree matrix, where $D_{ii} = \sum_j A_{ij}$. With the normalized adjacency matrix $\tilde{A} = D^{-1/2} A D^{-1/2}$ and the weight $W_0$, the feature matrix $Z_1$ after one layer of graph convolution is:

$$Z_1 = \tilde{A} U W_0$$

For the convolved output $Z_1$, the ReLU activation function is generally used to set to 0 the elements of the output tensor that are smaller than 0:

$$Z_2 = \mathrm{ReLU}(Z_1)$$

The result $Z_2'$ obtained by applying Dropout to $Z_2$ is input into the second graph convolution layer, with $W_1$ as the weight:

$$Z_3 = \tilde{A} Z_2' W_1$$

The result $Z_3$ of the graph convolution calculation is input into a fully connected layer for unfolding and splicing, yielding a vector with the same dimension as the result of the CNN processing.
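A forward pass through the two graph convolutions and the fully connected layer can be sketched in NumPy. The symmetric normalization by the degree matrix is the usual GCN choice and is an assumption here, as are the function names and the treatment of the fully connected layer as a single matrix `W_fc` applied to the flattened output. Dropout is skipped, which corresponds to inference mode.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, with D_ii = sum_j A_ij."""
    deg = A.sum(axis=1)
    safe = np.where(deg > 0, deg, 1.0)                 # avoid dividing by zero
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(safe), 0.0)
    return A * inv_sqrt[:, None] * inv_sqrt[None, :]

def gcn_forward(A, U, W0, W1, W_fc):
    A_hat = normalize_adjacency(A)
    Z1 = A_hat @ U @ W0              # first graph convolution
    Z2 = np.maximum(Z1, 0.0)         # ReLU; dropout omitted (inference mode)
    Z3 = A_hat @ Z2 @ W1             # second graph convolution
    return Z3.reshape(-1) @ W_fc     # fully connected layer: flatten, then project

# Usage on a two-node graph with identity features and weights:
A = np.array([[0.0, 1.0], [1.0, 0.0]])
U = np.eye(2)
out = gcn_forward(A, U, np.eye(2), np.eye(2), np.ones((4, 3)))
```

The number of columns of `W_fc` would be chosen to match the length of the CNN output so that the two results can be combined element by element.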

Step 1025: performing dot multiplication calculation on the feature matrix and the global structure matrix to obtain a dot multiplication result, that is, the result of the CNN and the result of the GCN are dot-multiplied:

Z = dot_product(result(CNN), result(GCN))

Step 1026: determining the classification result of the food safety event according to the dot multiplication result.

The food safety events are classified into four categories: general food safety events, large food safety events, major food safety events, and particularly major food safety events. FIG. 7 is a classification criteria graph. The result of the CNN is dot-multiplied with the result of the GCN:

Z = dot_product(result(CNN), result(GCN))

The classification layer uses a softmax classifier, which is an application of logistic regression to multi-class problems. Taking the result Z of the dot multiplication of the CNN and the GCN as input, the softmax function calculates the probability that the food safety event is a general, large, major or particularly major food safety event:

$$P(y = k \mid Z) = \frac{\exp(w_k^{T} Z + b_k)}{\sum_{j=1}^{K} \exp(w_j^{T} Z + b_j)}$$

where $b_k$ is the bias of category $k$, $w_k$ is the weight of category $k$, $T$ denotes transposition, $k$ denotes the category, and $K$ denotes the number of categories.

From the above probability formula, a row vector such as [0.025, 0.37, 0.075, 0.53] is obtained; 0.53 is the largest value, and the corresponding category is a particularly major food safety event.
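The fusion and classification steps can be sketched in plain Python. The sketch reads "dot multiplication" as an element-wise product of two equal-length vectors, since the result Z is itself fed to the classifier as a vector; that reading, the category labels and the toy weights are assumptions.

```python
import math

CATEGORIES = ["general", "large", "major", "particularly major"]

def fuse(cnn_vec, gcn_vec):
    """Element-wise product of two equal-length feature vectors."""
    if len(cnn_vec) != len(gcn_vec):
        raise ValueError("CNN and GCN outputs must have the same dimension")
    return [a * b for a, b in zip(cnn_vec, gcn_vec)]

def softmax_classify(z, weights, biases):
    """Return (probabilities, label) using logits w_k^T z + b_k for each category k."""
    logits = [sum(w * x for w, x in zip(w_k, z)) + b_k
              for w_k, b_k in zip(weights, biases)]
    m = max(logits)                                  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs, CATEGORIES[probs.index(max(probs))]

# Usage with toy weights that favor the last category:
z = fuse([1.0, 2.0], [3.0, 0.5])
weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
probs, label = softmax_classify(z, weights, [0.0, 0.0, 0.0, 0.0])
```

The predicted category is the one with the largest probability, in the same way that 0.53 selects the fourth category in the example vector above.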

Step 103: food safety event data to be predicted is obtained.

Step 104: inputting the data of the food safety event to be predicted into the text classification model to obtain a classification result of the food safety event.

Fig. 8 is a diagram of a food safety event detection system based on a graph convolution neural network model. As shown in fig. 8, a system for detecting food safety events based on a graph-convolution neural network model includes:

A preprocessing module 201, configured to preprocess data acquired from a food safety related website.

A text classification model building module 202, configured to build a text classification model according to the preprocessed data. The text classification model comprises a word embedding layer, a bidirectional recurrent neural network layer, a convolutional neural network layer, a graph convolutional neural network layer and a classification layer, wherein the output end of the word embedding layer is connected to the input end of the bidirectional recurrent neural network layer and to the input end of the graph convolutional neural network layer, the output end of the bidirectional recurrent neural network layer is connected to the input end of the convolutional neural network layer, and the data from the output end of the graph convolutional neural network layer and the output end of the convolutional neural network layer are dot-multiplied and then input to the classification layer.

A to-be-predicted data acquisition module 203, configured to acquire food safety event data to be predicted.

A classification result determining module 204, configured to input the data of the food safety event to be predicted into the text classification model to obtain a classification result of the food safety event. The classification result of the food safety event comprises: general food safety events, large food safety events, major food safety events, and particularly major food safety events.

The preprocessing module 201 specifically includes:

a preprocessing unit, used for performing text extraction, stop word removal and jieba word segmentation on the data acquired from the food safety related website.

The text classification model building module 202 specifically includes:

a co-occurrence matrix and topological graph constructing unit, used for constructing the co-occurrence matrix and the topological graph of the vocabulary according to the preprocessed data;

a text matrix determining unit, used for inputting the co-occurrence matrix into a bidirectional long-short term memory network to obtain a text matrix, wherein the text matrix is the context of each word;

a feature matrix determining unit, used for inputting the text matrix into a convolutional neural network to obtain a feature matrix, wherein the feature matrix is the relation between the features of the whole text and sentences;

a global structure matrix determining unit, used for inputting the topological graph into a graph convolution neural network to obtain a global structure matrix, wherein the global structure matrix is global structure information of the text graph network;

a dot multiplication calculating unit, used for performing dot multiplication calculation on the feature matrix and the global structure matrix to obtain a dot multiplication result;

a classification unit, used for determining the classification result of the food safety event according to the dot multiplication result.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims

1. A food safety event detection method based on a graph convolution neural network model is characterized by comprising the following steps:

preprocessing data acquired from a food safety related website;

constructing a text classification model according to the preprocessed data;

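acquiring food safety event data to be predicted;

and inputting the data of the food safety event to be predicted into the text classification model to obtain a classification result of the food safety event.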

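2. The method for detecting food safety events based on the graph convolution neural network model according to claim 1, wherein the preprocessing of the data acquired from the food safety-related website specifically includes:

performing text extraction, stop word removal and jieba word segmentation on the data acquired from the food safety related website.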

3. The method of claim 1, wherein the text classification model comprises a word embedding layer, a bidirectional recurrent neural network layer, a convolutional neural network layer, a graph convolutional neural network layer and a classification layer, wherein the output end of the word embedding layer is connected to the input end of the bidirectional recurrent neural network layer and to the input end of the graph convolutional neural network layer, the output end of the bidirectional recurrent neural network layer is connected to the input end of the convolutional neural network layer, and the data from the output end of the graph convolutional neural network layer and the output end of the convolutional neural network layer are dot-multiplied and then input to the classification layer.

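4. The method for detecting food safety events based on the graph convolution neural network model according to claim 1, wherein the constructing a text classification model according to the preprocessed data specifically includes:

constructing a co-occurrence matrix and a topological graph of the vocabulary according to the preprocessed data;

inputting the co-occurrence matrix into a bidirectional long-short term memory network to obtain a text matrix, wherein the text matrix is the context of each word;

inputting the text matrix into a convolutional neural network to obtain a feature matrix, wherein the feature matrix is the relation between the features of the whole text and sentences;

inputting the topological graph into a graph convolution neural network to obtain a global structure matrix, wherein the global structure matrix is global structure information of the text graph network;

performing dot multiplication calculation on the feature matrix and the global structure matrix to obtain a dot multiplication result;

and determining the classification result of the food safety event according to the dot multiplication result.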

5. The method for detecting food safety events based on the graph convolution neural network model according to claim 1, wherein the classification result of the food safety events comprises: general food safety events, large food safety events, major food safety events, and particularly major food safety events.

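6. A food safety event detection system based on a graph convolution neural network model is characterized by comprising:

the preprocessing module is used for preprocessing data acquired from a food safety related website;

the text classification model building module is used for building a text classification model according to the preprocessed data;

the to-be-predicted data acquisition module is used for acquiring food safety event data to be predicted;

and the classification result determining module is used for inputting the data of the food safety event to be predicted into the text classification model to obtain the classification result of the food safety event.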

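7. The system for detecting food safety events based on the graph convolution neural network model of claim 6, wherein the preprocessing module specifically comprises:

the preprocessing unit is used for performing text extraction, stop word removal and jieba word segmentation on the data acquired from the food safety related website.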

8. The system of claim 6, wherein the text classification model comprises a word embedding layer, a bidirectional recurrent neural network layer, a convolutional neural network layer, a graph convolutional neural network layer and a classification layer, wherein the output end of the word embedding layer is connected to the input end of the bidirectional recurrent neural network layer and to the input end of the graph convolutional neural network layer, the output end of the bidirectional recurrent neural network layer is connected to the input end of the convolutional neural network layer, and the data from the output end of the graph convolutional neural network layer and the output end of the convolutional neural network layer are dot-multiplied and then input to the classification layer.

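9. The system for detecting food safety events based on the graph convolution neural network model of claim 6, wherein the text classification model building module specifically comprises:

the co-occurrence matrix and topological graph constructing unit is used for constructing a co-occurrence matrix and a topological graph of the vocabulary according to the preprocessed data;

the text matrix determining unit is used for inputting the co-occurrence matrix into a bidirectional long-short term memory network to obtain a text matrix, wherein the text matrix is the context of each word;

the feature matrix determining unit is used for inputting the text matrix into a convolutional neural network to obtain a feature matrix, wherein the feature matrix is the relation between the features of the whole text and sentences;

the global structure matrix determining unit is used for inputting the topological graph into a graph convolution neural network to obtain a global structure matrix, wherein the global structure matrix is global structure information of the text graph network;

the dot multiplication calculation unit is used for performing dot multiplication calculation on the feature matrix and the global structure matrix to obtain a dot multiplication result;

and the classification unit is used for determining the classification result of the food safety event according to the dot multiplication result.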

10. The system of claim 6, wherein the classification result of the food safety event comprises: general food safety events, large food safety events, major food safety events, and particularly major food safety events.