CN112784919A - Intelligent manufacturing multi-mode data oriented classification method - Google Patents


Info

Publication number
CN112784919A
Authority
CN
China
Prior art keywords
data
matrix
representing
modal
preprocessed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110146422.XA
Other languages
Chinese (zh)
Other versions
CN112784919B (en)
Inventor
黎志豪
余志文
杨楷翔
孟献兵
陈俊龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202110146422.XA priority Critical patent/CN112784919B/en
Publication of CN112784919A publication Critical patent/CN112784919A/en
Application granted granted Critical
Publication of CN112784919B publication Critical patent/CN112784919B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G06F18/24 Classification techniques
    • G06F16/182 Distributed file systems
    • G06F16/215 Improving data quality; data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system
    • G06F16/285 Clustering or classification (relational databases)
    • G06F16/353 Clustering; classification into predefined classes (unstructured textual data)
    • G06F18/10 Pre-processing; data cleansing
    • G06F18/253 Fusion techniques of extracted features
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • Y02P90/30 Computing systems specially adapted for manufacturing


Abstract

The invention discloses a classification method for intelligent manufacturing multi-modal data, comprising the following steps: 1) collecting and cleaning production data logs to obtain multi-modal data; 2) dividing the multi-modal data according to their composition form and performing the corresponding preprocessing; 3) performing feature extraction and feature fusion on the preprocessed multi-modal data, and classifying the fused features. The method preprocesses the multi-modal data, extracts surface features and mines deep features by combining a self-encoder with an embedding step, and classifies data in real time, displaying the results, with a fully connected feedforward deep neural network trained in an online-learning mode; it effectively improves the accuracy of multi-modal data classification and the AUC metric.

Description

Classification method for intelligent manufacturing multi-modal data
Technical Field
The invention relates to the field of computer artificial intelligence, in particular to a classification method for intelligent manufacturing multi-modal data.
Background
With the arrival of Industry 4.0 and the rapid development of artificial intelligence, production and manufacturing in many traditional industries, as well as the emerging pharmaceutical industry, are becoming intelligent. In the era of intelligent-manufacturing big data, industrial and pharmaceutical production generates large amounts of manufacturing data that are structurally complex and difficult to analyze. How to mine the hidden value behind this massive multi-modal production data and classify it effectively is a key direction of intelligent-manufacturing research at the present stage. Given the poor compatibility, low extensibility, high modal imbalance, and high-dimensional attributes of current intelligent-manufacturing multi-modal data, ensuring the consistency, accuracy, integrity, and reliability of the data while improving the real-time performance, compatibility, and extensibility of multi-modal data processing is key to classifying such data efficiently.
Disclosure of Invention
The invention aims to provide a classification method for intelligent manufacturing multi-modal data that overcomes the complexity of manual feature processing for multi-modal data, automatically extracting features so as to improve classification accuracy and the AUC (area under the ROC curve) metric.
To achieve this purpose, the technical scheme provided by the invention is as follows: a classification method for intelligent manufacturing multi-modal data, comprising the following steps:
1) collecting and cleaning production data logs to obtain multi-modal data;
2) dividing the multi-modal data according to their composition form and performing the corresponding preprocessing;
3) performing feature extraction and feature fusion on the preprocessed multi-modal data, and classifying the fused features.
In step 1), collecting and cleaning production data logs to obtain multi-modal data means collecting the data logs of the intelligent manufacturing production platform and screening out abnormal data and noise data. Abnormal data are records of the production log whose values in certain variable dimensions, within a given time slice, exceed a reasonable range or do not conform to a normal distribution under the 3σ rule; such data are considered unreasonable and abnormal. Noise data are samples that differ from normal samples by more than a threshold because of abnormal factors during log collection, such as network faults, missing data samples, timestamp deviations, and missing basic data features. Both kinds are screened and filtered in the data-cleaning stage; the filtered data samples are stored in a storage module based on the Hadoop Distributed File System (HDFS), and a corresponding Hive database table is created, yielding the original multi-modal data.
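As a concrete illustration of the 3σ screening described above, the following minimal sketch drops out-of-range log records with pandas; the function name and column layout are illustrative assumptions, not part of the patent:

```python
import pandas as pd

def filter_abnormal(df: pd.DataFrame, value_cols: list) -> pd.DataFrame:
    """Keep only records within mean +/- 3*std in every monitored dimension."""
    mask = pd.Series(True, index=df.index)
    for col in value_cols:
        mu, sigma = df[col].mean(), df[col].std()
        mask &= (df[col] - mu).abs() <= 3 * sigma
    return df[mask]
```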
In step 2), dividing the multi-modal data according to their composition form and performing the corresponding preprocessing means that the server, through a preprocessing layer, preprocesses data of different forms with different methods to obtain multi-modal data suitable for subsequent processing. This comprises the following steps:
2.1) dividing the multi-modal data into image data, text data and numerical data according to a data composition form;
2.2) Acquire the pixel-value matrix of the image data obtained in step 2.1) and standardize it:

$$\hat{a}_u = \frac{a_u - \mu_A}{\sigma_A}$$

where the set of pixel matrices of all original image data is $A = \{a_1, a_2, \dots, a_{n_1}\}$; $n_1$ is the number of original images; $a_u$ is the pixel matrix of the $u$-th original image, with $u$ ranging from 1 to $n_1$; $\mu_A$ and $\sigma_A$ are the mean and the standard deviation of the pixel matrices of all original images; and $\hat{a}_u$ is the pixel matrix of the $u$-th image after standardization.

Once every original image has been standardized, each original pixel matrix is replaced by its standardized counterpart, giving the preprocessed image data set $\hat{A} = \{\hat{a}_1, \hat{a}_2, \dots, \hat{a}_{n_1}\}$.
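A minimal sketch of this standardization, assuming a single global mean and standard deviation computed over all pixel matrices in the set:

```python
import numpy as np

def standardize_images(images):
    """Replace each pixel matrix a_u with (a_u - mu_A) / sigma_A."""
    pixels = np.concatenate([a.ravel() for a in images])
    mu_a, sigma_a = pixels.mean(), pixels.std()
    return [(a - mu_a) / sigma_a for a in images]
```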
Perform word-vector pretraining on the text data obtained in step 2.1): first apply preliminary word segmentation to the text, then train word vectors on the segmented lexicon with the Word2Vec method, converting the text data into numerical form. The preprocessed text data form the set $B = \{b_1, b_2, \dots, b_{n_2}\}$, where $n_2$ is the number of text records and $b_{n_2}$ denotes the $n_2$-th preprocessed text record.
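A sketch of this pretraining step with Gensim's Word2Vec (the embodiment names NLTK and Gensim); averaging each record's word vectors into one numeric vector is an illustrative assumption, as the patent does not fix the aggregation:

```python
import numpy as np
from gensim.models import Word2Vec

def texts_to_vectors(tokenized_docs, dim=100):
    """Train word vectors on the segmented corpus, then vectorize each record."""
    model = Word2Vec(sentences=tokenized_docs, vector_size=dim,
                     window=5, min_count=1, workers=4)
    return [np.mean([model.wv[w] for w in doc], axis=0)
            for doc in tokenized_docs]
```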
carrying out data regularization treatment on the numerical data obtained in the step 2.1):
Figure BDA0002930610270000034
Figure BDA0002930610270000035
in the formula, the set of all numerical data is represented as
Figure BDA0002930610270000036
n3Number of data representing numerical type, crRepresenting the r-th data, wherein r has a value ranging from 1 to n3(ii) a n represents the dimension of the data, RnRepresenting an n-dimensional real number space;
Figure BDA0002930610270000037
denotes crThe d-th dimension of (1), wherein d ranges from 1 to n; l isq(cr) Denotes crQ norm of (a), wherein the value of q is set by a user; c'rDenotes crThe result after regularization treatment;
after each numerical data is regularized, replacing the original numerical data with the obtained regularized data to obtain a preprocessed numerical data set
Figure BDA0002930610270000038
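A sketch of this $q$-norm regularization; the row-wise layout (one record per row) is an assumption for illustration:

```python
import numpy as np

def regularize_rows(c: np.ndarray, q: float = 2.0) -> np.ndarray:
    """Divide each record c_r by its q-norm L_q(c_r)."""
    norms = (np.abs(c) ** q).sum(axis=1, keepdims=True) ** (1.0 / q)
    return c / norms
```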
After all data have been preprocessed, they are integrated into the final multi-modal data set $X = \hat{A} \cup B \cup C' = \{x_1, x_2, \dots, x_m\}$, where $m = n_1 + n_2 + n_3$ is the size of the multi-modal data set and $x_k$ is the $k$-th record, with $k$ ranging from 1 to $m$.
In step 3), the method for extracting and fusing the features of the preprocessed multi-modal data and classifying the fused features comprises the following steps:
3.1) Input the preprocessed multi-modal data set $X = \{x_1, x_2, \dots, x_m\}$ into a self-encoder comprising an encoder and a decoder; after reconstruction by the encoder and generation by the decoder, take the decoder output as feature $F_1$. The reconstruction loss function is:

$$L_{AE} = \sum_{k=1}^{m} \left( \left\| g(h(x_k)) - x_k \right\|_1 + \lambda \left\| J_h(x_k) \right\|_F^2 \right)$$

where $h$ denotes the encoder; $g$ the decoder; $\lambda$ a hyper-parameter whose value is set by the user; $m$ the number of records; $x_k$ the $k$-th record, with $k$ ranging from 1 to $m$; $\| J_h(x_k) \|_F$ the Frobenius norm of the Jacobian matrix at $x_k$; $L_{AE}$ the loss function; $g(h(x_k))$ the result of reconstructing $x_k$ with the encoder $h$ and generating with the decoder $g$; and $\| g(h(x_k)) - x_k \|_1$ the 1-norm of the difference between $g(h(x_k))$ and $x_k$.

When the loss function $L_{AE}$ converges, feature $F_1$ is obtained: an $m \times L$ matrix, where $m$ is the number of records and $L$ the dimension of each feature.
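A minimal PyTorch sketch of this contractive-style loss; the one-layer encoder/decoder and the tanh nonlinearity are illustrative assumptions, since the patent does not fix the network structure:

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim: int, feat_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.Tanh())
        self.decoder = nn.Linear(feat_dim, in_dim)

    def loss(self, x: torch.Tensor, lam: float = 1e-3) -> torch.Tensor:
        x = x.clone().requires_grad_(True)
        z = self.encoder(x)
        l1 = (self.decoder(z) - x).abs().sum()   # ||g(h(x_k)) - x_k||_1
        jac_sq = x.new_zeros(())                 # accumulates ||J_h(x_k)||_F^2
        for j in range(z.shape[1]):              # one encoder output dim at a time
            g = torch.autograd.grad(z[:, j].sum(), x,
                                    create_graph=True, retain_graph=True)[0]
            jac_sq = jac_sq + (g ** 2).sum()
        return l1 + lam * jac_sq
```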
3.2) Make duplicate copies of the feature $F_1$ obtained in step 3.1) to yield three features $F_1$, $F_2$, $F_3$, and apply feature embedding to $F_2$ and $F_3$, denoted $F'_2$ and $F'_3$:

$$F'_2 = W_2 \cdot F_2, \qquad F'_3 = W_3 \cdot F_3$$

where $W_2$ and $W_3$ are $m \times L$ parameter matrices; $W_2 \cdot F_2$ and $W_3 \cdot F_3$ denote the element-wise (dot) products of $W_2$ with $F_2$ and of $W_3$ with $F_3$; and $F'_2$, $F'_3$ are the resulting embedded features, both $m \times L$ matrices.
3.3) After softmax processing, fuse the embedded features $F'_2$ and $F'_3$ with feature $F_1$ by weighting:

$$F = \operatorname{softmax}\!\left( \frac{F'_2 \, (F'_3)^{\mathsf T}}{\tau} \right) F_1$$

where $(F'_3)^{\mathsf T}$ is the transpose of $F'_3$, an $L \times m$ matrix; $F'_2 (F'_3)^{\mathsf T}$ is the $m \times m$ matrix obtained by matrix multiplication, each column of which is divided by a scaling constant $\tau$; $\operatorname{softmax}$ is applied to each row of the scaled matrix; and $F$, the matrix product of the softmax result with $F_1$, is an $m \times L$ matrix: the final fused feature.
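A sketch of this attention-style fusion in PyTorch. Here $\tau$ is the scaling constant above; the source extraction hides its value, so treating it as a parameter (by analogy with scaled dot-product attention one would take $\tau = \sqrt{L}$) is an assumption:

```python
import torch

def fuse_features(f1: torch.Tensor, w2: torch.Tensor, w3: torch.Tensor,
                  tau: float) -> torch.Tensor:
    """Weighted fusion F = softmax(F'2 @ F'3^T / tau) @ F1."""
    f2p = w2 * f1                                   # F'2 = W2 . F2 (element-wise)
    f3p = w3 * f1                                   # F'3 = W3 . F3
    attn = torch.softmax(f2p @ f3p.T / tau, dim=1)  # m x m, row-wise softmax
    return attn @ f1                                # m x L fused feature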
Feed the obtained fused feature $F$ into a $T$-layer fully connected feedforward deep neural network for training, expressed layer-wise (each layer having two weight matrices and two biases) as:

$$h_{t+1} = f\!\left( W_t^2 \, f\!\left( W_t^1 h_t + b_t^1 \right) + b_t^2 \right), \qquad h_1 = F$$

where $h_t$ and $h_{t+1}$ are the outputs of the $t$-th and $(t+1)$-th layers, with $t$ ranging from 1 to $T-1$; $W_t^1$ and $W_t^2$ are the weight parameters of the $t$-th layer; $b_t^1$ and $b_t^2$ the corresponding bias parameters; and $f(\cdot)$ is the Leaky-ReLU function:

$$f(x) = \begin{cases} x, & x \geq 0 \\ a x, & x < 0 \end{cases}$$

where $a$ ranges from 0 to 1.
Take the output $h_T$ of the last layer of the fully connected feedforward deep neural network, together with the weight parameter $W_T$ and the bias parameter $b_T$; the total number of categories of multi-modal data is $C$. Apply softmax processing to $h_T$, $W_T$, and $b_T$ to obtain the classification result for each record:

$$Z = h_T W_T + b_T$$

$$\operatorname{softmax}(z_i) = \frac{\exp(z_i)}{\sum_{v=1}^{C} \exp\!\big(z_i^{(v)}\big)}$$

$$p_i = \max_{v}\, \operatorname{softmax}(z_i)_v$$

where $h_T W_T$ denotes the matrix multiplication of $h_T$ and $W_T$; $Z$ is an $m \times C$ matrix and $z_i$ its $i$-th row, a $C$-dimensional vector, with $i$ ranging from 1 to $m$; $\exp(z_i)$ exponentiates each element of $z_i$ with base $e$, the result still a $C$-dimensional vector; $z_i^{(v)}$ is the $v$-th element of $z_i$; $\sum_{v=1}^{C} \exp(z_i^{(v)})$ sums the exponentiated elements, yielding a constant; dividing each element of $\exp(z_i)$ by this constant gives a $C$-dimensional vector; and $p_i$, the largest value of that vector, is selected as the classification probability value of the $i$-th record.
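A compact PyTorch sketch of the $T$-layer network and softmax head; the layer widths (kept equal to the feature dimension) and the Leaky-ReLU slope are illustrative assumptions:

```python
import torch
import torch.nn as nn

class FusedClassifier(nn.Module):
    """T-layer fully connected feedforward network over the fused feature F."""
    def __init__(self, feat_dim: int, n_classes: int, n_layers: int = 3,
                 a: float = 0.01):
        super().__init__()
        act = nn.LeakyReLU(negative_slope=a)        # f(x) = x if x >= 0 else a*x
        body = []
        for _ in range(n_layers - 1):               # per layer: W_t^1 then W_t^2
            body += [nn.Linear(feat_dim, feat_dim), act,
                     nn.Linear(feat_dim, feat_dim), act]
        self.body = nn.Sequential(*body)
        self.head = nn.Linear(feat_dim, n_classes)  # W_T, b_T -> Z (m x C)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        return self.head(self.body(f))              # logits Z

# p_i = max_v softmax(z_i)_v:
# probs = torch.softmax(model(F), dim=1); p = probs.max(dim=1).values
```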
3.4) Iteratively optimize the fully connected feedforward deep neural network of step 3.3) with a cross-entropy loss function and the ADAM algorithm: pre-train the network offline, then update its parameters in real time with an online-learning mechanism so that data are classified in real time. The process is as follows:
Read a preset number of data samples from HDFS and preprocess them via step 2); iterate training on the preprocessed data a preset number of times through steps 3.1), 3.2), and 3.3), solving the fully connected feedforward deep neural network with a cross-entropy loss function $J$ to which a regularization term is added:

$$J = -\frac{1}{N} \sum_{j=1}^{N} y_j \log p_j + \beta \sum_{j} \| w_j \|^2$$

where $p_j$ is the computed probability value of the $j$-th sample; $y_j$ its true class label; $N$ the total number of pre-training samples; $\beta$ the regularization parameter; and $w_j$ the weight parameter of the $j$-th sample. The formula is iteratively optimized with the ADAM algorithm.
When the pre-training process reaches convergence, the network is fine-tuned in an online-learning mode: a preset amount of data is read from the real-time samples batch by batch and, after the corresponding preprocessing, input into the fully connected feedforward deep neural network for training. The server receives the network's parameter updates in real time, updates the network, and uses the latest network to classify and identify the data samples of that batch, obtaining the classification result of each record and displaying it visually.
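A sketch of one online fine-tuning step under these assumptions (torch.optim.Adam as the ADAM optimizer; batch tensors and the β value are illustrative):

```python
import torch
import torch.nn as nn

def online_step(model, optimizer, batch_x, batch_y, beta=1e-4):
    """Fine-tune on one freshly read production batch, then classify it."""
    model.train()
    optimizer.zero_grad()
    logits = model(batch_x)
    loss = nn.functional.cross_entropy(logits, batch_y)
    # L2 regularization term (beta * sum ||w||^2), as in the loss J above.
    loss = loss + beta * sum((w ** 2).sum() for w in model.parameters())
    loss.backward()
    optimizer.step()                                # ADAM update
    return logits.softmax(dim=1).argmax(dim=1)      # labels for this batch

# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```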
Compared with the prior art, the invention has the following advantages:
1. The invention acquires data logs in real time and cleans them accurately according to actual production requirements to obtain the needed multi-modal data.
2. The invention automatically divides and preprocesses multi-modal data in an actual production environment, applying different preprocessing methods to data from different sources.
3. The method performs feature extraction and feature fusion on multi-modal data through a self-encoder and low-dimensional embedding, automatically extracting surface features and mining deep features without hand-designed feature engineering, which improves classification accuracy and the AUC metric.
4. The invention classifies data in real time through online learning, realizing visual display of data classification and visual management of intelligent-manufacturing data.
5. The method is widely applicable within intelligent manufacturing systems, simple to operate, highly adaptable, and promising for intelligent-manufacturing data analysis and decision-making.
Drawings
FIG. 1 is a logic flow diagram of the method of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples, but the embodiments of the present invention are not limited thereto.
As shown in FIG. 1, the classification method for intelligent manufacturing multi-modal data provided by this embodiment includes the following steps:
1) Collect and clean the data logs of the flexible pharmaceutical production platform to obtain multi-modal pharmaceutical data. This means collecting the pharmaceutical production data logs and screening out abnormal data and noise data across all records of the production log. Abnormal data are pharmaceutical data whose values in certain variable dimensions, within a set time granularity, exceed a reasonable range or do not comply with a normal distribution under the 3σ rule; such data are considered unreasonable and abnormal. Noise data are pharmaceutical data that differ from normal pharmaceutical data by more than a threshold because of abnormal factors that may occur while the flexible pharmaceutical production platform collects logs: network faults, missing data samples, timestamp deviations, and missing basic data features. Both kinds are removed in the data-cleaning stage.
The cleaned pharmaceutical data are stored in a storage module based on the Hadoop Distributed File System (HDFS), and a corresponding Hive database table is created.
2) Divide the pharmaceutical data according to their composition form and perform the corresponding preprocessing, as follows:
2.1) Read a certain amount of processed production logs from the Hive database table and divide them into image-class, text-class, and numerical-class pharmaceutical data according to their composition form.
2.2) Standardize the pixel matrices of the image-class pharmaceutical data:

$$\hat{a}_u = \frac{a_u - \mu_A}{\sigma_A}$$

where the set of pixel matrices of all original image-class pharmaceutical data is $A = \{a_1, a_2, \dots, a_{n_1}\}$; $n_1$ is the number of original image-class records; $a_u$ is the pixel matrix of the $u$-th record, with $u$ ranging from 1 to $n_1$; $\mu_A$ and $\sigma_A$ are the mean and the standard deviation of the pixel matrices of all original image-class pharmaceutical data; and $\hat{a}_u$ is the standardized pixel matrix of the $u$-th record.

Once every original image-class record has been standardized, each original pixel matrix is replaced by its standardized counterpart, giving the preprocessed image-class pharmaceutical data set $\hat{A} = \{\hat{a}_1, \hat{a}_2, \dots, \hat{a}_{n_1}\}$.
For the text-class pharmaceutical data, the preprocessing is as follows: perform preliminary word segmentation with the NLTK toolkit, then, based on the segmented lexicon, train word vectors with the Word2Vec tool of the Gensim library, converting the text into numerical form. The preprocessed text-class pharmaceutical data form the set $B = \{b_1, b_2, \dots, b_{n_2}\}$, where $b_{n_2}$ denotes the $n_2$-th preprocessed text-class record.
For the numerical-class pharmaceutical data, perform regularization preprocessing:

$$L_q(c_r) = \left( \sum_{d=1}^{n} \left| c_r^{(d)} \right|^q \right)^{1/q}$$

$$c'_r = \frac{c_r}{L_q(c_r)}$$

where the set of all numerical-class pharmaceutical data is $C = \{c_1, c_2, \dots, c_{n_3}\} \subset \mathbb{R}^n$; $n_3$ is the number of numerical-class records; $c_r$ is the $r$-th record, with $r$ ranging from 1 to $n_3$; $n$ is the data dimension and $\mathbb{R}^n$ the $n$-dimensional real space; $c_r^{(d)}$ is the $d$-th component of $c_r$, with $d$ ranging from 1 to $n$; $L_q(c_r)$ is the $q$-norm of $c_r$, where the value of $q$ is set by the user; and $c'_r$ is the result of regularizing $c_r$.

Once every numerical-class record has been regularized, each original record is replaced by its regularized version, giving the preprocessed numerical-class pharmaceutical data set $C' = \{c'_1, c'_2, \dots, c'_{n_3}\}$.
After all pharmaceutical data have been preprocessed, they are integrated into the final multi-modal pharmaceutical data set $X = \hat{A} \cup B \cup C' = \{x_1, x_2, \dots, x_m\}$, where $m = n_1 + n_2 + n_3$ is the size of the multi-modal pharmaceutical data set and $x_k$ is the $k$-th record, with $k$ ranging from 1 to $m$.
3) Perform feature extraction and feature fusion on the preprocessed pharmaceutical data and classify the fused features, as follows:
3.1) Input the preprocessed multi-modal pharmaceutical data set $X = \{x_1, x_2, \dots, x_m\}$ into a self-encoder comprising an encoder and a decoder; after reconstruction by the encoder and generation by the decoder, take the decoder output as feature $F_1$. The reconstruction loss function is:

$$L_{AE} = \sum_{k=1}^{m} \left( \left\| g(h(x_k)) - x_k \right\|_1 + \lambda \left\| J_h(x_k) \right\|_F^2 \right)$$

where $h$ denotes the encoder; $g$ the decoder; $\lambda$ a hyper-parameter whose value is set by the user; $m$ the number of pharmaceutical records; $x_k$ the $k$-th record, with $k$ ranging from 1 to $m$; $\| J_h(x_k) \|_F$ the Frobenius norm of the Jacobian matrix at $x_k$; $L_{AE}$ the loss function; $g(h(x_k))$ the result of reconstructing $x_k$ with the encoder $h$ and generating with the decoder $g$; and $\| g(h(x_k)) - x_k \|_1$ the 1-norm of the difference between $g(h(x_k))$ and $x_k$.

When the loss function $L_{AE}$ converges, feature $F_1$ is obtained: an $m \times L$ matrix, where $m$ is the number of pharmaceutical records and $L$ the dimension of each feature.
3.2) Make duplicate copies of the feature $F_1$ obtained in step 3.1) to yield three features $F_1$, $F_2$, $F_3$, and apply feature embedding to $F_2$ and $F_3$, denoted $F'_2$ and $F'_3$:

$$F'_2 = W_2 \cdot F_2, \qquad F'_3 = W_3 \cdot F_3$$

where $W_2$ and $W_3$ are $m \times L$ parameter matrices; $W_2 \cdot F_2$ and $W_3 \cdot F_3$ denote the element-wise (dot) products; and $F'_2$, $F'_3$ are the resulting embedded features, both $m \times L$ matrices.
3.3) After softmax processing, fuse the embedded features $F'_2$ and $F'_3$ with feature $F_1$:

$$F = \operatorname{softmax}\!\left( \frac{F'_2 \, (F'_3)^{\mathsf T}}{\tau} \right) F_1$$

where $(F'_3)^{\mathsf T}$ is the transpose of $F'_3$, an $L \times m$ matrix; $F'_2 (F'_3)^{\mathsf T}$ is the $m \times m$ matrix obtained by matrix multiplication, each column of which is divided by a scaling constant $\tau$; $\operatorname{softmax}$ is applied to each row of the scaled matrix; and $F$, the matrix product of the softmax result with $F_1$, is an $m \times L$ matrix: the final fused feature.
Adding the obtained fusion characteristic F into a full-connection feedforward deep neural network of the T layer for training, wherein the formula is expressed as follows:
Figure BDA0002930610270000119
in the formula, ht、ht+1Respectively representing output results of a T-th layer and a T + 1-th layer of fully-connected feedforward deep neural network, wherein the value range of T is from 1 to T-1; wt 1、Wt 2Representing a weight parameter of the t-th layer fully-connected feedforward deep neural network;
Figure BDA00029306102700001110
respectively represent and Wt 1、Wt 2Corresponding bias parameters; f (-) represents the Leaky-ReLU function, which is formulated as:
Figure BDA0002930610270000121
wherein a ranges from 0 to 1.
Take the output $h_T$ of the last layer of the fully connected feedforward deep neural network, together with the weight parameter $W_T$ and the bias parameter $b_T$; the total number of categories of multi-modal pharmaceutical data is $C$. Apply softmax processing to $h_T$, $W_T$, and $b_T$ to obtain the classification result for each pharmaceutical record:

$$Z = h_T W_T + b_T$$

$$\operatorname{softmax}(z_i) = \frac{\exp(z_i)}{\sum_{v=1}^{C} \exp\!\big(z_i^{(v)}\big)}$$

$$p_i = \max_{v}\, \operatorname{softmax}(z_i)_v$$

where $h_T W_T$ denotes the matrix multiplication of $h_T$ and $W_T$; $Z$ is an $m \times C$ matrix and $z_i$ its $i$-th row, a $C$-dimensional vector, with $i$ ranging from 1 to $m$; $\exp(z_i)$ exponentiates each element of $z_i$ with base $e$, the result still a $C$-dimensional vector; $z_i^{(v)}$ is the $v$-th element of $z_i$; $\sum_{v=1}^{C} \exp(z_i^{(v)})$ sums the exponentiated elements, yielding a constant; dividing each element of $\exp(z_i)$ by this constant gives a $C$-dimensional vector; and $p_i$, the largest value of that vector, is selected as the classification probability value of the $i$-th pharmaceutical record.
3.4) Iteratively optimize the fully connected feedforward deep neural network of step 3.3) with a cross-entropy loss function and the ADAM algorithm: pre-train the network offline, then update its parameters in real time with an online-learning mechanism so that data are classified in real time. The process is as follows:

Read a certain amount of pharmaceutical data from HDFS and obtain the original features via step 2); iterate training on these features a certain number of times through steps 3.1), 3.2), and 3.3), solving the fully connected feedforward neural network with a cross-entropy loss function $J$ to which a regularization term is added:

$$J = -\frac{1}{N} \sum_{j=1}^{N} y_j \log p_j + \beta \sum_{j} \| w_j \|^2$$

where $p_j$ is the computed probability value of the $j$-th pharmaceutical record; $y_j$ its true class label; $N$ the total number of pharmaceutical records used for pre-training; $\beta$ the regularization parameter; and $w_j$ the weight parameter of the $j$-th record. The formula is iteratively optimized with the ADAM algorithm.

When the pre-training process reaches convergence, the network is fine-tuned in an online-learning mode: a certain amount of data is read batch by batch from the pharmaceutical data produced in real time and, after the corresponding preprocessing, input into the fully connected feedforward deep neural network for training. The server receives the network's parameter updates in real time, updates the network, and uses the latest network to classify and identify the pharmaceutical data of that batch, obtaining the classification result of each record and displaying it visually.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (4)

1. A classification method for intelligent manufacturing multi-modal data, characterized by comprising the following steps:
1) collecting and cleaning production data logs to obtain multi-modal data;
2) dividing the multi-modal data according to their composition form and performing the corresponding preprocessing;
3) performing feature extraction and feature fusion on the preprocessed multi-modal data, and classifying the fused features.
2. The classification method for intelligent manufacturing multi-modal data according to claim 1, wherein in step 1), collecting and cleaning production data logs to obtain multi-modal data means collecting the data logs of the intelligent manufacturing production platform and screening out abnormal data and noise data; abnormal data are records of the production log whose values in certain variable dimensions, within a given time slice, exceed a reasonable range or do not conform to a normal distribution under the 3σ rule, and are considered unreasonable and abnormal; noise data are samples that differ from normal samples by more than a threshold because of abnormal factors during log collection, such as network faults, missing data samples, timestamp deviations, and missing basic data features; both are screened and filtered in the data-cleaning stage, the filtered data samples are stored in a storage module based on the Hadoop Distributed File System (HDFS), and a corresponding Hive database table is created to obtain the original multi-modal data.
3. The classification method for intelligent manufacturing multi-modal data according to claim 1, wherein in step 2), dividing the multi-modal data according to their composition form and performing the corresponding preprocessing means that the server, through a preprocessing layer, preprocesses data of different forms with different methods to obtain multi-modal data suitable for subsequent processing, comprising the following steps:
2.1) dividing the multi-modal data into image data, text data and numerical data according to a data composition form;
2.2) acquiring the pixel-value matrix of the image data obtained in step 2.1) and standardizing it:

$$\hat{a}_u = \frac{a_u - \mu_A}{\sigma_A}$$

where the set of pixel matrices of all original image data is $A = \{a_1, a_2, \dots, a_{n_1}\}$; $n_1$ is the number of original images; $a_u$ is the pixel matrix of the $u$-th original image, with $u$ ranging from 1 to $n_1$; $\mu_A$ and $\sigma_A$ are the mean and the standard deviation of the pixel matrices of all original images; and $\hat{a}_u$ is the pixel matrix of the $u$-th image after standardization;

once every original image has been standardized, each original pixel matrix is replaced by its standardized counterpart, giving the preprocessed image data set $\hat{A} = \{\hat{a}_1, \hat{a}_2, \dots, \hat{a}_{n_1}\}$;
performing word-vector pretraining on the text data obtained in step 2.1): first applying preliminary word segmentation to the text, then training word vectors on the segmented lexicon with the Word2Vec method, converting the text data into numerical form; the preprocessed text data form the set $B = \{b_1, b_2, \dots, b_{n_2}\}$, where $n_2$ is the number of text records and $b_{n_2}$ denotes the $n_2$-th preprocessed text record;
carrying out data regularization treatment on the numerical data obtained in the step 2.1):
Figure FDA0002930610260000027
Figure FDA0002930610260000028
in the formula, the set of all numerical data is represented as
Figure FDA0002930610260000029
n3Number of data representing numerical type, crRepresenting the r-th data, wherein r has a value ranging from 1 to n3(ii) a n represents the dimension of the data, RnRepresenting an n-dimensional real number space;
Figure FDA0002930610260000031
denotes crThe d-th dimension of (1), wherein d ranges from 1 to n; l isq(cr) Denotes crQ norm of (a), wherein the value of q is set by a user; c'rDenotes crThe result after regularization treatment;
after each numerical data is regularized, replacing the original numerical data with the obtained regularized data to obtain a preprocessed numerical data set
Figure FDA0002930610260000032
after all data have been preprocessed, they are integrated into the final multi-modal data set $X = \hat{A} \cup B \cup C' = \{x_1, x_2, \dots, x_m\}$, where $m = n_1 + n_2 + n_3$ is the size of the multi-modal data set and $x_k$ is the $k$-th record, with $k$ ranging from 1 to $m$.
4. The method for classifying multimodal data on intelligent manufacturing according to claim 1, wherein: in step 3), the method for extracting and fusing the features of the preprocessed multi-modal data and classifying the fused features comprises the following steps:
3.1) inputting the preprocessed multi-modal data set $X = \{x_1, x_2, \dots, x_m\}$ into a self-encoder comprising an encoder and a decoder; after reconstruction by the encoder and generation by the decoder, taking the decoder output as feature $F_1$, where the reconstruction loss function is:

$$L_{AE} = \sum_{k=1}^{m} \left( \left\| g(h(x_k)) - x_k \right\|_1 + \lambda \left\| J_h(x_k) \right\|_F^2 \right)$$

where $h$ denotes the encoder; $g$ the decoder; $\lambda$ a hyper-parameter whose value is set by the user; $m$ the number of records; $x_k$ the $k$-th record, with $k$ ranging from 1 to $m$; $\| J_h(x_k) \|_F$ the Frobenius norm of the Jacobian matrix at $x_k$; $L_{AE}$ the loss function; $g(h(x_k))$ the result of reconstructing $x_k$ with the encoder $h$ and generating with the decoder $g$; and $\| g(h(x_k)) - x_k \|_1$ the 1-norm of the difference between $g(h(x_k))$ and $x_k$;

when the loss function $L_{AE}$ converges, feature $F_1$ is obtained: an $m \times L$ matrix, where $m$ is the number of records and $L$ the dimension of each feature;
3.2) making duplicate copies of the feature $F_1$ obtained in step 3.1) to yield three features $F_1$, $F_2$, $F_3$, and applying feature embedding to $F_2$ and $F_3$, denoted $F'_2$ and $F'_3$:

$$F'_2 = W_2 \cdot F_2, \qquad F'_3 = W_3 \cdot F_3$$

where $W_2$ and $W_3$ are $m \times L$ parameter matrices; $W_2 \cdot F_2$ and $W_3 \cdot F_3$ denote the element-wise (dot) products; and $F'_2$, $F'_3$ are the resulting embedded features, both $m \times L$ matrices;
3.3) after softmax processing, fusing the embedded features $F'_2$ and $F'_3$ with feature $F_1$ by weighting:

$$F = \operatorname{softmax}\!\left( \frac{F'_2 \, (F'_3)^{\mathsf T}}{\tau} \right) F_1$$

where $(F'_3)^{\mathsf T}$ is the transpose of $F'_3$, an $L \times m$ matrix; $F'_2 (F'_3)^{\mathsf T}$ is the $m \times m$ matrix obtained by matrix multiplication, each column of which is divided by a scaling constant $\tau$; $\operatorname{softmax}$ is applied to each row of the scaled matrix; and $F$, the matrix product of the softmax result with $F_1$, is an $m \times L$ matrix: the final fused feature;
feeding the obtained fused feature $F$ into a $T$-layer fully connected feedforward deep neural network for training, expressed layer-wise as:

$$h_{t+1} = f\!\left( W_t^2 \, f\!\left( W_t^1 h_t + b_t^1 \right) + b_t^2 \right), \qquad h_1 = F$$

where $h_t$ and $h_{t+1}$ are the outputs of the $t$-th and $(t+1)$-th layers, with $t$ ranging from 1 to $T-1$; $W_t^1$ and $W_t^2$ are the weight parameters of the $t$-th layer; $b_t^1$ and $b_t^2$ the corresponding bias parameters; and $f(\cdot)$ is the Leaky-ReLU function:

$$f(x) = \begin{cases} x, & x \geq 0 \\ a x, & x < 0 \end{cases}$$

where $a$ ranges from 0 to 1;
taking the output $h_T$ of the last layer of the fully connected feedforward deep neural network, together with the weight parameter $W_T$ and the bias parameter $b_T$, the total number of categories of multi-modal data being $C$; applying softmax processing to $h_T$, $W_T$, and $b_T$ to obtain the classification result for each record:

$$Z = h_T W_T + b_T$$

$$\operatorname{softmax}(z_i) = \frac{\exp(z_i)}{\sum_{v=1}^{C} \exp\!\big(z_i^{(v)}\big)}$$

$$p_i = \max_{v}\, \operatorname{softmax}(z_i)_v$$

where $h_T W_T$ denotes the matrix multiplication of $h_T$ and $W_T$; $Z$ is an $m \times C$ matrix and $z_i$ its $i$-th row, a $C$-dimensional vector, with $i$ ranging from 1 to $m$; $\exp(z_i)$ exponentiates each element of $z_i$ with base $e$, the result still a $C$-dimensional vector; $z_i^{(v)}$ is the $v$-th element of $z_i$; $\sum_{v=1}^{C} \exp(z_i^{(v)})$ sums the exponentiated elements, yielding a constant; dividing each element of $\exp(z_i)$ by this constant gives a $C$-dimensional vector; and $p_i$, the largest value of that vector, is selected as the classification probability value of the $i$-th record;
3.4) iteratively optimizing the fully connected feedforward deep neural network of step 3.3) with a cross-entropy loss function and the ADAM algorithm: pre-training the network offline, then updating its parameters in real time with an online-learning mechanism so that data are classified in real time, as follows:

reading a preset number of data samples from HDFS and preprocessing them via step 2); iterating training on the preprocessed data a preset number of times through steps 3.1), 3.2), and 3.3), solving the fully connected feedforward deep neural network with a cross-entropy loss function $J$ to which a regularization term is added:

$$J = -\frac{1}{N} \sum_{j=1}^{N} y_j \log p_j + \beta \sum_{j} \| w_j \|^2$$

where $p_j$ is the computed probability value of the $j$-th sample; $y_j$ its true class label; $N$ the total number of pre-training samples; $\beta$ the regularization parameter; and $w_j$ the weight parameter of the $j$-th sample; the formula is iteratively optimized with the ADAM algorithm;

when the pre-training process reaches convergence, fine-tuning the network in an online-learning mode: a preset amount of data is read from the real-time samples batch by batch and, after the corresponding preprocessing, input into the fully connected feedforward deep neural network for training; the server receives the network's parameter updates in real time, updates the network, and uses the latest network to classify and identify the data samples of that batch, obtaining the classification result of each record and displaying it visually.
CN202110146422.XA 2021-02-03 2021-02-03 Classification method for intelligent manufacturing multi-mode data Active CN112784919B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110146422.XA CN112784919B (en) 2021-02-03 2021-02-03 Classification method for intelligent manufacturing multi-mode data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110146422.XA CN112784919B (en) 2021-02-03 2021-02-03 Classification method for intelligent manufacturing multi-mode data

Publications (2)

Publication Number Publication Date
CN112784919A 2021-05-11
CN112784919B 2023-09-05

Family

ID=75760650

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110146422.XA Active CN112784919B (en) 2021-02-03 2021-02-03 Classification method for intelligent manufacturing multi-mode data

Country Status (1)

Country Link
CN (1) CN112784919B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742428A (en) * 2021-09-20 2021-12-03 郑州颂和礼致信息科技有限公司 Neural network data set storage method based on block chain
CN114372181A (en) * 2021-12-27 2022-04-19 华南理工大学 Intelligent planning method for equipment production based on multi-mode data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108629630A (en) * 2018-05-08 2018-10-09 广州太平洋电脑信息咨询有限公司 A kind of feature based intersects the advertisement recommendation method of joint deep neural network
WO2019238976A1 (en) * 2018-06-15 2019-12-19 Université de Liège Image classification using neural networks
CN111444960A (en) * 2020-03-26 2020-07-24 上海交通大学 Skin disease image classification system based on multi-mode data input
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network
WO2021007812A1 (en) * 2019-07-17 2021-01-21 深圳大学 Deep neural network hyperparameter optimization method, electronic device and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108629630A (en) * 2018-05-08 2018-10-09 广州太平洋电脑信息咨询有限公司 A kind of feature based intersects the advertisement recommendation method of joint deep neural network
WO2019238976A1 (en) * 2018-06-15 2019-12-19 Université de Liège Image classification using neural networks
WO2021007812A1 (en) * 2019-07-17 2021-01-21 深圳大学 Deep neural network hyperparameter optimization method, electronic device and storage medium
CN111444960A (en) * 2020-03-26 2020-07-24 上海交通大学 Skin disease image classification system based on multi-mode data input
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
敬明: "Multi-modal feature adaptive clustering method based on deep neural networks" (基于深度神经网络的多模态特征自适应聚类方法), Computer Applications and Software (计算机应用与软件), no. 10, pp. 268-275

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742428A (en) * 2021-09-20 2021-12-03 郑州颂和礼致信息科技有限公司 Neural network data set storage method based on block chain
CN113742428B (en) * 2021-09-20 2023-11-03 易点生活电子商务有限公司 Neural network data set storage method based on blockchain
CN114372181A (en) * 2021-12-27 2022-04-19 华南理工大学 Intelligent planning method for equipment production based on multi-mode data
CN114372181B (en) * 2021-12-27 2024-06-07 华南理工大学 Equipment production intelligent planning method based on multi-mode data

Also Published As

Publication number Publication date
CN112784919B (en) 2023-09-05

Similar Documents

Publication Publication Date Title
CN111126386B (en) Sequence domain adaptation method based on countermeasure learning in scene text recognition
CN110597735A (en) Software defect prediction method for open-source software defect feature deep learning
EP4027300A1 (en) Apparatus, program, and method for anomaly detection and classification
CN108647226B (en) Hybrid recommendation method based on variational automatic encoder
CN104933444B (en) A kind of design method of the multi-level clustering syncretizing mechanism towards multidimensional property data
CN111882446A (en) Abnormal account detection method based on graph convolution network
CN112015863A (en) Multi-feature fusion Chinese text classification method based on graph neural network
CN112784919B (en) Classification method for intelligent manufacturing multi-mode data
CN115578137A (en) Agricultural product future price prediction method and system based on text mining and deep learning model
CN115205521A (en) Kitchen waste detection method based on neural network
CN114881173A (en) Resume classification method and device based on self-attention mechanism
CN110516064A (en) A kind of Aeronautical R&D paper classification method based on deep learning
Dharwadkar et al. Floriculture classification using simple neural network and deep learning
CN112434145A (en) Picture-viewing poetry method based on image recognition and natural language processing
CN114925196B (en) Auxiliary eliminating method for abnormal blood test value of diabetes under multi-layer sensing network
CN114119562B (en) Brake disc outer surface defect detection method and system based on deep learning
CN113065005B (en) Legal provision recommendation method based on knowledge graph and text classification model
CN115391523A (en) Wind power plant multi-source heterogeneous data processing method and device
CN114169433A (en) Industrial fault prediction method based on federal learning + image learning + CNN
Kong et al. A one-shot learning approach for similarity retrieval of wafer bin maps with unknown failure pattern
CN113094567A (en) Malicious complaint identification method and system based on text clustering
CN112179846A (en) Prefabricated convex window defect detection system based on improved Faster R-CNN
Angayarkanni et al. Recognition of Disease in Leaves Using Genetic Algorithm and Neural Network Based Feature Selection
Rao et al. Markov random field classification technique for plant leaf disease detection
CN116341990B (en) Knowledge management evaluation method and system for infrastructure engineering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant