CN108764314B - Structured data classification method and device, electronic equipment and storage medium

Info

Publication number: CN108764314B
Application number: CN201810475821.9A
Authority: CN
Inventors: 刘奎; 康桂霞; 张宁波; 侯蓓蓓
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2018-05-17
Filing date: 2018-05-17
Publication date: 2020-10-02
Anticipated expiration: 2038-05-17
Also published as: CN108764314A

Abstract

An embodiment of the invention provides a structured data classification method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring structured data to be classified; and inputting the structured data to be classified into a pre-trained convolutional neural network model to obtain a classification result for the structured data to be classified, wherein the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network, and the fully connected layer is the first layer of the convolutional neural network model. The method and apparatus can improve the accuracy with which a convolutional neural network model classifies structured data.

Description

Structured data classification method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of data processing technologies, and in particular, to a structured data classification method and apparatus, an electronic device, and a storage medium.

Background

Structured data generally refers to data stored in a database with a logical structure and a physical structure, and can usually be expressed as a two-dimensional table. For example, a patient's routine blood test report includes the white blood cell count, red blood cell count, platelet count, lymphocyte percentage and the like; these values are structured data. A doctor can judge whether the patient has a certain disease by analyzing the structured data in the test report. This is a process of classifying structured data: the object being classified is the test report sample containing the structured data, and the classification result is that the patient either has or does not have the disease.

The multi-layer perceptron (MLP) is a machine learning algorithm from the artificial neural network family that is commonly used for data classification. All layers of an MLP are fully connected layers, which are not sensitive to local features in the data. The classification performance of an MLP is therefore unaffected by whether the input is unstructured data that has local features or structured data that has none. The convolutional neural network (CNN) improves on the MLP by introducing convolutional layers with local connections. A convolutional layer extracts local features from the data, and the multi-layer structure of the network combines these features into more discriminative ones, which improves classification of data that has local features. Compared with an MLP, a conventional convolutional neural network performs significantly better on data with local features, such as image data.

However, structured data itself has no local features, so applying a conventional convolutional neural network model directly to structured data cannot improve classification accuracy over an MLP.

Disclosure of Invention

The embodiment of the invention aims to provide a structured data classification method, a structured data classification device, electronic equipment and a storage medium, so as to improve the accuracy of a convolutional neural network model in classifying structured data. The specific technical scheme is as follows:

In a first aspect, an embodiment of the present invention provides a structured data classification method, where the method includes:

acquiring structured data to be classified;

and inputting the structured data to be classified into a convolutional neural network model obtained by pre-training to obtain a classification result of the structured data to be classified, wherein the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network, and the fully connected layer is the first layer in the convolutional neural network model.

Further, the convolutional neural sub-network is a one-dimensional convolutional neural sub-network;

the training mode of the convolutional neural network model comprises the following steps:

obtaining a structured data sample set, wherein the structured data sample set comprises a plurality of structured data samples;

for each structured data sample, the following operations are performed:

inputting the structured data sample into the full connection layer to obtain a first transition sample;

inputting the first transition sample into the one-dimensional convolution neural sub-network to obtain neural network output;

and training the full-connection layer and the one-dimensional convolution neural sub-network based on the difference between the neural network output and the structured data sample to obtain the convolution neural network model.

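Alternatively, the training mode of the convolutional neural network model comprises:

obtaining a structured data sample set, wherein the structured data sample set comprises a plurality of structured data samples;

based on the structured data sample set, carrying out parameter adjustment through a softmax loss function, and training the full connection layer;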

respectively inputting each structured data sample in the structured data sample set into a trained full-connection layer to obtain a first transition data sample set;

training the one-dimensional convolutional neural subnetwork based on the first set of transitional data samples;

and training a first transition network model consisting of the trained full connection layer and the trained one-dimensional convolutional neural sub-network based on the structured data sample set to obtain the convolutional neural network model.

Further, the convolutional neural sub-network is a two-dimensional convolutional neural sub-network; the convolutional neural network model also comprises a deformation layer;

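the training mode of the convolutional neural network model comprises:

obtaining a structured data sample set, wherein the structured data sample set comprises a plurality of structured data samples;

establishing two-dimensional image data corresponding to the structured data sample, and expanding the two-dimensional image data into one-dimensional image data;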

for each structured data sample, the following operations are performed:

inputting the structured data sample into the full connection layer to obtain a second transition sample;

inputting the second transition sample into the deformation layer to obtain a third transition sample;

inputting the third transition sample into the two-dimensional convolution neural sub-network to obtain neural network output;

and training the full connection layer and the two-dimensional convolution neural sub-network based on the difference between the second transition sample and the one-dimensional image data and the difference between the neural network output and the structured data sample to obtain the convolution neural network model.

Further, the convolutional neural sub-network is a two-dimensional convolutional neural sub-network; the convolutional neural network model also comprises a deformation layer;

alternatively, the training mode of the convolutional neural network model comprises:

obtaining a structured data sample set, wherein the structured data sample set comprises a plurality of structured data samples;

establishing two-dimensional image data corresponding to the structured data sample, and expanding the two-dimensional image data into one-dimensional image data;

based on the structured data sample set and the one-dimensional image data, carrying out parameter adjustment through a mean square error loss function, and training the full connection layer;

respectively inputting each structured data sample in the structured data sample set into the trained full connection layer to obtain a second transition data sample set;

respectively inputting samples in the second transition data sample set into the deformation layer to obtain a third transition data sample set;

training the two-dimensional convolutional neural sub-network based on the third transition data sample set;

and training a second transition network model consisting of the trained full connection layer and the trained two-dimensional convolutional neural sub-network based on the structured data sample set and the one-dimensional image data to obtain the convolutional neural network model.

Further, the convolutional neural network model comprises a first sub-network model, a second sub-network model, a third sub-network model and a fourth sub-network model; wherein the first and second sub-network models each comprise: a full connection layer and a one-dimensional convolutional neural sub-network; and the third and fourth sub-network models each comprise: a full connection layer, a deformation layer and a two-dimensional convolutional neural sub-network;

the inputting the structured data to be classified into a convolutional neural network model obtained by pre-training to obtain the classification result of the structured data comprises the following steps:

inputting the structured data to be classified into the first sub-network model, and obtaining first one-dimensional vector data with local features through the operation of a full connection layer in the first sub-network model; inputting the first one-dimensional vector data into a one-dimensional convolution neural subnetwork in the first subnetwork model to obtain a first confidence coefficient of the structured data for each data type;

inputting the structured data to be classified into the second sub-network model, and obtaining second one-dimensional vector data with local features through the operation of a full connection layer in the second sub-network model; inputting the second one-dimensional vector data into a one-dimensional convolution neural subnetwork in the second subnetwork model to obtain a second confidence coefficient of the structured data for each data type;

inputting the structured data to be classified into the third sub-network model, and obtaining third one-dimensional vector data with local features through the operation of a full connection layer in the third sub-network model; inputting the third one-dimensional vector data into a deformation layer in the third sub-network model to obtain first two-dimensional image data; inputting the first two-dimensional image data into a two-dimensional convolution neural subnetwork in the third subnetwork model to obtain a third confidence coefficient of the structured data for each data type;

inputting the structured data to be classified into the fourth sub-network model, and obtaining fourth one-dimensional vector data with local features through the operation of a full connection layer in the fourth sub-network model; inputting the fourth one-dimensional vector data into a deformation layer in the fourth sub-network model to obtain second two-dimensional image data; inputting the second two-dimensional image data into a two-dimensional convolution neural sub-network in the fourth sub-network model to obtain a fourth confidence coefficient of the structured data for each data type;

calculating the comprehensive confidence of the structured data to be classified for each data type according to the first confidence, the second confidence, the third confidence and the fourth confidence;

and obtaining a classification result of the structured data to be classified according to the comprehensive confidence.

Further, the calculating, according to the first confidence, the second confidence, the third confidence and the fourth confidence, a comprehensive confidence of the structured data to be classified for each data type includes:

calculating, from the first confidence, the second confidence, the third confidence and the fourth confidence, the comprehensive confidence of the structured data to be classified for each data type as the average of the confidences for that data type, using a preset relational expression, wherein the preset relational expression is:

$$y_i = \frac{1}{4}\sum_{j=1}^{4} y_{ij}$$

wherein: $y_i$ is the comprehensive confidence of the structured data to be classified for the data type corresponding to $i$; and $y_{ij}$ is the confidence for the data type corresponding to $i$ obtained by inputting the structured data to be classified into the $j$-th sub-network model obtained by pre-training.

The obtaining of the classification result of the structured data to be classified according to the comprehensive confidence includes:

and determining the data type corresponding to the maximum comprehensive confidence coefficient in the comprehensive confidence coefficients as the data type of the structured data to be classified.
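The averaging and maximum-selection steps above can be sketched as follows. This is a minimal illustration in plain Python, not the patented implementation; the function names `combined_confidence` and `classify` and the sample confidence values are assumptions introduced here.

```python
from typing import List, Sequence


def combined_confidence(per_model: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class confidences across sub-network models.

    per_model[j][i] is the confidence y_ij that sub-network model j
    assigns to data type i.
    """
    if not per_model:
        raise ValueError("at least one sub-network output is required")
    num_classes = len(per_model[0])
    if any(len(conf) != num_classes for conf in per_model):
        raise ValueError("all sub-networks must score the same data types")
    num_models = len(per_model)
    return [sum(conf[i] for conf in per_model) / num_models
            for i in range(num_classes)]


def classify(per_model: Sequence[Sequence[float]]) -> int:
    """Return the index of the data type with the largest combined confidence."""
    y = combined_confidence(per_model)
    return max(range(len(y)), key=y.__getitem__)


# Hypothetical confidences from four sub-network models over two data types.
outputs = [
    [0.60, 0.40],
    [0.30, 0.70],
    [0.20, 0.80],
    [0.50, 0.50],
]
print(combined_confidence(outputs))
print(classify(outputs))
```

With these hypothetical values the second data type wins, even though the first sub-network model on its own would have chosen the first data type.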

In a second aspect, an embodiment of the present invention provides a structured data classification apparatus, where the apparatus includes:

the data to be classified acquisition module is used for acquiring structured data to be classified;

and the classification result acquisition module is used for inputting the structured data to be classified into a convolutional neural network model obtained by pre-training to obtain a classification result of the structured data to be classified, wherein the convolutional neural network model comprises a full connection layer and a convolutional neural sub-network, and the full connection layer is the first layer in the convolutional neural network model.

the device further comprises:

a first sample set obtaining module, configured to obtain a structured data sample set, where the structured data sample set includes a plurality of structured data samples;

a first model training module, configured to perform the following operations for each structured data sample:

the device further comprises:

a second sample set obtaining module, configured to obtain a structured data sample set, where the structured data sample set includes a plurality of structured data samples;

a second model training module to:

the device further comprises:

a third sample set obtaining module, configured to obtain a structured data sample set, where the structured data sample set includes a plurality of structured data samples;

the first image data establishing module is used for establishing two-dimensional image data corresponding to the structured data sample and expanding the two-dimensional image data into one-dimensional image data;

a third model training module to: for each structured data sample, the following operations are performed:

the device further comprises:

a fourth sample set obtaining module, configured to obtain a structured data sample set, where the structured data sample set includes a plurality of structured data samples;

the second image data establishing module is used for establishing two-dimensional image data corresponding to the structured data sample and expanding the two-dimensional image data into one-dimensional image data;

a fourth model training module to:

the classification result acquisition module comprises: a first confidence coefficient obtaining sub-module, a second confidence coefficient obtaining sub-module, a third confidence coefficient obtaining sub-module, a fourth confidence coefficient obtaining sub-module, a comprehensive confidence coefficient obtaining sub-module and a classification result obtaining sub-module;

the first confidence coefficient obtaining submodule is used for inputting the structured data to be classified into the first sub-network model and obtaining first one-dimensional vector data with local features through operation of a full connection layer in the first sub-network model; inputting the first one-dimensional vector data into a one-dimensional convolution neural subnetwork in the first subnetwork model to obtain a first confidence coefficient of the structured data for each data type;

the second confidence coefficient obtaining submodule is used for inputting the structured data to be classified into the second sub-network model, and obtaining second one-dimensional vector data with local features through operation of a full connection layer in the second sub-network model; inputting the second one-dimensional vector data into a one-dimensional convolution neural subnetwork in the second subnetwork model to obtain a second confidence coefficient of the structured data for each data type;

the third confidence coefficient obtaining submodule is used for inputting the structured data to be classified into the third sub-network model, and obtaining third one-dimensional vector data with local features through operation of a full connection layer in the third sub-network model; inputting the third one-dimensional vector data into a deformation layer in the third sub-network model to obtain first two-dimensional image data; inputting the first two-dimensional image data into a two-dimensional convolution neural subnetwork in the third subnetwork model to obtain a third confidence coefficient of the structured data for each data type;

the fourth confidence coefficient obtaining submodule is used for inputting the structured data to be classified into the fourth sub-network model, and obtaining fourth one-dimensional vector data with local features through operation of a full connection layer in the fourth sub-network model; inputting the fourth one-dimensional vector data into a deformation layer in the fourth sub-network model to obtain second two-dimensional image data; inputting the second two-dimensional image data into a two-dimensional convolution neural sub-network in the fourth sub-network model to obtain a fourth confidence coefficient of the structured data for each data type;

the comprehensive confidence coefficient obtaining sub-module is used for calculating the comprehensive confidence coefficient of the structured data to be classified aiming at each data type according to the first confidence coefficient, the second confidence coefficient, the third confidence coefficient and the fourth confidence coefficient;

and the classification result acquisition submodule is used for acquiring a classification result of the structured data to be classified according to the comprehensive confidence.

Further, the comprehensive confidence obtaining sub-module is specifically configured to calculate, from the first confidence, the second confidence, the third confidence and the fourth confidence, the comprehensive confidence of the structured data to be classified for each data type as the average of the confidences for that data type, using a preset relational expression, where the preset relational expression is:

$$y_i = \frac{1}{4}\sum_{j=1}^{4} y_{ij}$$

where $y_i$ is the comprehensive confidence for the data type corresponding to $i$, and $y_{ij}$ is the confidence for that data type obtained from the $j$-th sub-network model.

The classification result obtaining sub-module is specifically configured to determine a data type corresponding to a maximum comprehensive confidence coefficient in the comprehensive confidence coefficients as the data type of the structured data to be classified.

In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

and the processor is used for implementing the steps of any one of the above structured data classification methods when executing the program stored in the memory.

In a fourth aspect, the present invention further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute any one of the above structured data classification methods.

Embodiments of the invention provide a structured data classification method and apparatus, an electronic device, and a storage medium. Structured data to be classified is acquired and input into a convolutional neural network model obtained by pre-training to obtain a classification result, where the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network. In this scheme, after structured data is input into the convolutional neural network, the fully connected layer first converts the structured data, which has no local features, into data that has local features. The trained convolutional neural network then extracts local features from the converted data and produces the classification result of the structured data to be classified, which improves the accuracy of the convolutional neural network model in classifying structured data.

Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a flow chart illustrating a structured data classification method according to an embodiment of the present invention;

FIG. 2 is an architecture diagram of a convolutional neural network provided by an embodiment of the present invention;

FIG. 3 is an architecture diagram of a convolutional neural network provided in accordance with another embodiment of the present invention;

FIG. 4 is a flowchart illustrating a structured data classification method according to another embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a structured data classification apparatus according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As shown in fig. 1, the method for classifying structured data provided in the embodiment of the present invention may specifically include the following steps:

Step 101, obtaining structured data to be classified.

Step 102, inputting the structured data to be classified into a convolutional neural network model obtained by pre-training to obtain a classification result of the structured data to be classified.

The convolutional neural network model comprises a full connection layer and a convolutional neural sub-network, wherein the full connection layer is the first layer in the convolutional neural network model.

The fully-connected layer is arranged on the first layer in the whole convolutional neural network, so that the structured data without local features can be converted into the data with the local features after passing through the fully-connected layer, the convolutional neural sub-network is used for classifying the converted unstructured data with the local features, and the classification result of the structured data to be classified is finally obtained.

The convolutional neural sub-network in the present embodiment is any convolutional neural network that can classify unstructured data having local features, and may be, for example, a one-dimensional convolutional neural network, a two-dimensional convolutional neural network, or the like, which is not limited herein.

Further, the convolutional neural sub-network may be a one-dimensional convolutional neural sub-network;

the training mode of the convolutional neural network model can include:

acquiring a structured data sample set, wherein the structured data sample set comprises a plurality of structured data samples;

for each structured data sample, the following operations are performed:

inputting the structured data sample into a full connection layer to obtain a first transition sample;

inputting the first transition sample into a one-dimensional convolution neural sub-network to obtain neural network output;

and training the full-connection layer and the one-dimensional convolution neural sub-network based on the difference between the neural network output and the structured data sample to obtain a convolution neural network model.

Further, in another embodiment provided by the present invention, the convolutional neural sub-network is a one-dimensional convolutional neural sub-network. The architecture of the convolutional neural network in this embodiment is shown in FIG. 2, where x represents the input structured data. After passing through the fully connected layer, the data is converted into unstructured data with local features, which is then processed by the one-dimensional convolutional neural network (1D-CNN) to obtain the classification result.
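The data flow of this architecture (fully connected layer first, then one-dimensional convolution, then a classification output) can be sketched in plain Python. This is only an illustrative forward pass under stated assumptions: the layer sizes, the ReLU activation, the global max pooling, the final dense scoring layer, the random weights and the input values are all hypothetical choices made here, since the source does not specify the internals of the 1D-CNN.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def fully_connected(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Dense layer: each output element is a weighted sum of all inputs."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def conv1d_relu(h: Vector, kernel: Vector) -> Vector:
    """Valid 1D convolution (cross-correlation) over h, followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(kernel[t] * h[s + t] for t in range(k)))
            for s in range(len(h) - k + 1)]


def softmax(z: Vector) -> Vector:
    """Turn raw scores into confidences that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


def forward(x: Vector, fc_w: Matrix, fc_b: Vector,
            kernels: Matrix, out_w: Matrix, out_b: Vector) -> Vector:
    h = fully_connected(x, fc_w, fc_b)                      # first layer of the model
    pooled = [max(conv1d_relu(h, k)) for k in kernels]      # 1D conv + global max pool
    return softmax(fully_connected(pooled, out_w, out_b))   # per-class confidence


def random_matrix(rng: random.Random, rows: int, cols: int) -> Matrix:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
x = [0.61, 0.47, 0.25, 0.32]           # hypothetical normalised test-report values
fc_w, fc_b = random_matrix(rng, 8, 4), [0.0] * 8
kernels = random_matrix(rng, 3, 3)     # three kernels of width 3
out_w, out_b = random_matrix(rng, 2, 3), [0.0] * 2
confidences = forward(x, fc_w, fc_b, kernels, out_w, out_b)
print(confidences)
```

The weights here are untrained, so the printed confidences carry no meaning; the sketch only shows where the fully connected layer sits relative to the convolution.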

The training mode of the convolutional neural network model can include:

based on the structured data sample set, parameter adjustment is carried out through a softmax loss function, and a full connection layer is trained;

respectively inputting each structured data sample in the structured data sample set into the trained full-connection layer to obtain a first transition data sample set;

training a one-dimensional convolutional neural subnetwork based on the first transition data sample set;

and training a first transition network model consisting of the trained full-connection layer and the trained one-dimensional convolutional neural subnetwork based on the structured data sample set to obtain a convolutional neural network model.
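The softmax loss used in the first step above is commonly computed as the cross-entropy between the softmax of the class scores and the true class. The sketch below shows that computation and its gradient with respect to the scores. How the class scores are derived from the fully connected layer during this pre-training stage is not stated in the source, so the `logits` argument is an assumption, as are the function names.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over raw class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def softmax_loss(logits: List[float], label: int) -> float:
    """Cross-entropy of the softmax distribution against the true class index."""
    return -math.log(softmax(logits)[label])


def softmax_loss_grad(logits: List[float], label: int) -> List[float]:
    """Gradient of the loss w.r.t. the logits: softmax(logits) - one_hot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


# Two classes with equal scores: the loss equals ln(2).
print(softmax_loss([0.0, 0.0], label=0))
print(softmax_loss_grad([0.0, 0.0], label=0))
```

The gradient is what a training loop would propagate back into the layer parameters when adjusting them to reduce the loss.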

Further, in yet another embodiment provided by the present invention, the convolutional neural subnetwork may be a two-dimensional convolutional neural subnetwork; the convolutional neural network model also comprises a deformation layer;

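the training mode of the convolutional neural network model can include:

acquiring a structured data sample set, wherein the structured data sample set comprises a plurality of structured data samples;

establishing two-dimensional image data corresponding to the structured data sample, and expanding the two-dimensional image data into one-dimensional image data;

for each structured data sample, the following operations are performed:

inputting the structured data sample into the full connection layer to obtain a second transition sample;

inputting the second transition sample into the deformation layer to obtain a third transition sample;

inputting the third transition sample into the two-dimensional convolutional neural sub-network to obtain neural network output;

and training the full connection layer and the two-dimensional convolutional neural sub-network based on the difference between the second transition sample and the one-dimensional image data and the difference between the neural network output and the structured data sample, to obtain the convolutional neural network model.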

Further, in another embodiment provided by the present invention, the convolutional neural sub-network may be a two-dimensional convolutional neural sub-network, and the convolutional neural network model further includes a deformation layer. The architecture of the convolutional neural network in this embodiment is shown in FIG. 3, where x represents the input structured data. After passing through the fully connected layer, the data is converted into unstructured data with local features. This unstructured data is one-dimensional, so the deformation layer converts it into two-dimensional unstructured data, which is then processed by the two-dimensional convolutional neural network (2D-CNN) to obtain the classification result.
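The deformation layer described here only rearranges a one-dimensional vector into a two-dimensional array. A minimal sketch follows, assuming row-major ordering (the source does not state the ordering); `deformation_layer` and `flatten` are hypothetical names, and `flatten` corresponds to expanding two-dimensional image data into one-dimensional image data.

```python
from typing import List


def deformation_layer(vector: List[float], height: int, width: int) -> List[List[float]]:
    """Reshape a 1D vector into a height x width 2D array in row-major order."""
    if len(vector) != height * width:
        raise ValueError("vector length must equal height * width")
    return [vector[r * width:(r + 1) * width] for r in range(height)]


def flatten(image: List[List[float]]) -> List[float]:
    """Inverse operation: expand a 2D array into a 1D vector, row by row."""
    return [value for row in image for value in row]


transition = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical fully connected output
image = deformation_layer(transition, height=2, width=3)
print(image)
print(flatten(image) == transition)
```

Because the layer has no trainable parameters, the two functions are exact inverses of each other for any matching shape.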

The training mode of the convolutional neural network model can include:

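based on the structured data sample set and the one-dimensional image data, performing parameter adjustment through a mean square error (MSE) loss function to train the full connection layer;

respectively inputting each structured data sample in the structured data sample set into the trained full connection layer to obtain a second transition data sample set;

respectively inputting samples in the second transition data sample set into the deformation layer to obtain a third transition data sample set;

training the two-dimensional convolutional neural sub-network based on the third transition data sample set;

and training a second transition network model consisting of the trained full connection layer and the trained two-dimensional convolutional neural sub-network, based on the structured data sample set and the one-dimensional image data, to obtain the convolutional neural network model.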

The mean square error loss function is formulated as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{j=1}^{n}\left(c_j - h_j\right)^2$$

wherein: $c_j$ is the $j$-th element of the one-dimensional image data $c$; $n$ is the number of elements in the one-dimensional image data; and $h_j$ is the $j$-th element of the output data of the fully connected layer.

The process of training the fully connected layer by parameter adjustment through the mean square error (MSE) loss function is: continuously adjusting the parameters of the fully connected layer so as to minimize the value of the mean square error loss function.
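As a rough illustration of this minimization, the sketch below trains a single linear fully connected layer by gradient descent so that its output h approaches the one-dimensional image data c. It assumes the loss is the mean of the squared element-wise differences between c and h; the linear layer without activation, the learning rate, the zero initialisation and the sample values are all hypothetical.

```python
from typing import List


def mse_loss(c: List[float], h: List[float]) -> float:
    """Mean of squared differences between target c and layer output h."""
    if len(c) != len(h):
        raise ValueError("c and h must have the same length")
    return sum((cj - hj) ** 2 for cj, hj in zip(c, h)) / len(c)


def train_fc_step(x: List[float], c: List[float],
                  weights: List[List[float]], bias: List[float],
                  lr: float) -> float:
    """One gradient-descent step; returns the loss measured before the update."""
    h = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    n = len(c)
    for j in range(n):
        grad_h = 2.0 * (h[j] - c[j]) / n          # d(MSE) / d(h_j)
        for i in range(len(x)):
            weights[j][i] -= lr * grad_h * x[i]   # d(h_j) / d(w_ji) = x_i
        bias[j] -= lr * grad_h                    # d(h_j) / d(b_j) = 1
    return mse_loss(c, h)


x = [1.0, 2.0]                     # hypothetical structured data sample
c = [0.0, 1.0, 0.0, 1.0]           # hypothetical flattened 2x2 target image
weights = [[0.0, 0.0] for _ in c]
bias = [0.0] * len(c)
losses = [train_fc_step(x, c, weights, bias, lr=0.05) for _ in range(200)]
print(losses[0], losses[-1])
```

The loss starts at 0.5 for this sample and shrinks at every step, which is the behaviour the training process above aims for.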

In the structured data classification method shown in FIG. 1 provided in the embodiment of the present invention, structured data to be classified is acquired and input into a convolutional neural network model obtained by pre-training to obtain a classification result, where the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network. In this scheme, after structured data is input into the convolutional neural network, the fully connected layer first converts the structured data, which has no local features, into data that has local features. The trained convolutional neural network then extracts local features from the converted data and produces the classification result of the structured data to be classified, which improves the accuracy of the convolutional neural network model in classifying structured data.

FIG. 4 shows another structured data classification method provided in an embodiment of the present invention, which specifically includes the following steps:

Step 201, obtaining structured data to be classified.

Step 202, inputting the structured data to be classified into a convolutional neural network model obtained by pre-training, and obtaining a first confidence, a second confidence, a third confidence and a fourth confidence of the structured data for each data type.

The convolutional neural network model comprises a first sub-network model, a second sub-network model, a third sub-network model and a fourth sub-network model. The first and second sub-network models each comprise a full connection layer and a one-dimensional convolutional neural sub-network; the third and fourth sub-network models each comprise a full connection layer, a deformation layer and a two-dimensional convolutional neural sub-network. The first confidence is the confidence of the structured data for each data type obtained by inputting the structured data to be classified into the first sub-network model; the second confidence is the confidence of the structured data for each data type obtained by inputting the structured data to be classified into the second sub-network model; the third confidence is the confidence of the structured data for each data type obtained by inputting the structured data to be classified into the third sub-network model; and the fourth confidence is the confidence of the structured data for each data type obtained by inputting the structured data to be classified into the fourth sub-network model.

The model integration method is a common method for mixing a plurality of models to obtain higher performance, and in this step, the convolutional neural network model is integrated based on the four sub-network models. The network architecture and the model training method used by the four sub-network models are different, so that the correlation among the sub-network models can be reduced, and the integrated model formed by the four sub-network models has better performance than any single sub-network model.

In particular, the first sub-network model and the second sub-network model may each be composed of a fully connected layer and a one-dimensional convolutional neural sub-network; the third sub-network model and the fourth sub-network model may each be composed of a fully connected layer, a deformation layer, and a two-dimensional convolutional neural sub-network.
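As an illustration of the one-dimensional path (a fully connected layer followed by a one-dimensional convolution), the following minimal Python sketch may be considered. It is not the implementation of the embodiment: the function names `fully_connected` and `conv1d_valid`, the weights, the kernel and the three-feature sample are all hypothetical values chosen for illustration, and a single "valid" convolution stands in for a whole convolutional sub-network.

```python
def fully_connected(x, weights, bias):
    """Dense layer: output k is sum_i weights[k][i] * x[i] + bias[k]."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def conv1d_valid(signal, kernel):
    """'Valid' 1-D cross-correlation, the sliding-window operation
    applied by a one-dimensional convolutional layer."""
    n_out = len(signal) - len(kernel) + 1
    return [sum(signal[i + k] * kernel[k] for k in range(len(kernel)))
            for i in range(n_out)]


# Hypothetical structured sample with three unordered features.
x = [1.0, 2.0, 3.0]

# Hypothetical weights mapping the 3 features to a length-5 vector in
# which neighbouring entries share inputs (a stand-in for learned weights).
W = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
b = [0.0] * 5

h = fully_connected(x, W, b)               # [1.0, 3.0, 2.0, 5.0, 3.0]
features = conv1d_valid(h, [1.0, -1.0])    # [-2.0, 1.0, -3.0, 2.0]
```

In a trained model the weights of the fully connected layer would be learned, so that the ordering of the entries of `h` becomes meaningful to the convolution kernel.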

The first sub-network model and the second sub-network model have the same network architecture but are trained by different methods. For example, the training method of the first sub-network model may be: acquiring a structured data sample set comprising a plurality of structured data samples; for each structured data sample, inputting the structured data sample into the fully connected layer of the first sub-network model to obtain a first transition sample, and inputting the first transition sample into the one-dimensional convolutional neural sub-network of the first sub-network model to obtain a neural network output; and training the fully connected layer and the one-dimensional convolutional neural sub-network of the first sub-network model based on the difference between the neural network output and the structured data sample, to obtain the first sub-network model. The training method of the second sub-network model may be: acquiring a structured data sample set comprising a plurality of structured data samples; training the fully connected layer of the second sub-network model on the structured data sample set, with parameter adjustment through a softmax loss function; inputting each structured data sample in the structured data sample set into the trained fully connected layer to obtain a first transition data sample set; training the one-dimensional convolutional neural sub-network of the second sub-network model based on the first transition data sample set; and training, based on the structured data sample set, a first transition network model consisting of the trained fully connected layer and the trained one-dimensional convolutional neural sub-network, to obtain the second sub-network model.
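The softmax loss function mentioned above is not written out in this description. The following Python sketch assumes that it is the usual softmax cross-entropy; the function names and the example values are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(logits, label):
    """Cross-entropy between softmax(logits) and the true class index `label`."""
    return -math.log(softmax(logits)[label])


# Hypothetical scores for two data types (e.g. "disease" / "no disease").
confident_and_right = softmax_loss([5.0, 0.0], 0)
confident_and_wrong = softmax_loss([0.0, 5.0], 0)
```

The loss is small when the score of the true data type dominates and large otherwise, which is the property used to adjust the parameters of the fully connected layer.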

Similarly, the third sub-network model and the fourth sub-network model have the same network architecture but are trained by different methods. For example, the training method of the third sub-network model may be: acquiring a structured data sample set; establishing two-dimensional image data corresponding to each structured data sample, and expanding the two-dimensional image data into one-dimensional image data; for each structured data sample, inputting the structured data sample into the fully connected layer to obtain a second transition sample, inputting the second transition sample into the deformation layer to obtain a third transition sample, and inputting the third transition sample into the two-dimensional convolutional neural sub-network of the third sub-network model to obtain a neural network output; and training the fully connected layer and the two-dimensional convolutional neural sub-network of the third sub-network model based on the difference between the second transition sample and the one-dimensional image data and on the difference between the neural network output and the structured data sample, to obtain the third sub-network model.
The training method of the fourth sub-network model may be: acquiring a structured data sample set; establishing two-dimensional image data corresponding to each structured data sample, and expanding the two-dimensional image data into one-dimensional image data; training the fully connected layer of the fourth sub-network model based on the structured data sample set and the one-dimensional image data, with parameter adjustment through a mean square error loss function; inputting each structured data sample in the structured data sample set into the trained fully connected layer to obtain a second transition data sample set; inputting the samples in the second transition data sample set into the deformation layer of the fourth sub-network model to obtain a third transition data sample set; training the two-dimensional convolutional neural sub-network of the fourth sub-network model based on the third transition data sample set; and training, based on the structured data sample set and the one-dimensional image data, a second transition network model consisting of the trained fully connected layer and the trained two-dimensional convolutional neural sub-network, to obtain the fourth sub-network model.
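The two operations that the above training methods rely on, namely expanding two-dimensional image data into one-dimensional image data and measuring a mean square error against it, can be sketched in Python as follows. The row-by-row expansion order and the names `flatten` and `mean_squared_error` are assumptions made for illustration; the description does not fix them.

```python
def flatten(image):
    """Expand 2-D image data (a list of rows) into 1-D image data, row by row."""
    return [pixel for row in image for pixel in row]


def mean_squared_error(predicted, target):
    """Mean of the squared element-wise differences between two vectors."""
    if len(predicted) != len(target):
        raise ValueError("vectors must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical 2x2 image built for one structured data sample.
target_image = [[1.0, 2.0],
                [3.0, 4.0]]
target_vector = flatten(target_image)            # [1.0, 2.0, 3.0, 4.0]

# Hypothetical output of the fully connected layer for the same sample.
fc_output = [1.0, 2.0, 3.0, 2.0]
loss = mean_squared_error(fc_output, target_vector)   # (4.0 - 2.0)**2 / 4 = 1.0
```

Driving this loss towards zero makes the output of the fully connected layer resemble the one-dimensional image data, so that the deformation layer can subsequently rearrange it into an image-like input.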

The first confidence, the second confidence, the third confidence and the fourth confidence in this step are the confidences of the structured data to be classified for each data type, obtained by inputting the structured data to be classified into the first sub-network model, the second sub-network model, the third sub-network model and the fourth sub-network model, respectively.

Further, the following method may be adopted to obtain a first confidence, a second confidence, a third confidence and a fourth confidence of the structured data for each data type:

inputting the structured data to be classified into the first sub-network model, and obtaining first one-dimensional vector data with local features through the operation of the fully connected layer in the first sub-network model; inputting the first one-dimensional vector data into the one-dimensional convolutional neural sub-network in the first sub-network model to obtain a first confidence of the structured data for each data type;

inputting the structured data to be classified into the second sub-network model, and obtaining second one-dimensional vector data with local features through the operation of the fully connected layer in the second sub-network model; inputting the second one-dimensional vector data into the one-dimensional convolutional neural sub-network in the second sub-network model to obtain a second confidence of the structured data for each data type;

inputting the structured data to be classified into the third sub-network model, and obtaining third one-dimensional vector data with local features through the operation of the fully connected layer in the third sub-network model; inputting the third one-dimensional vector data into the deformation layer in the third sub-network model to obtain first two-dimensional image data; inputting the first two-dimensional image data into the two-dimensional convolutional neural sub-network in the third sub-network model to obtain a third confidence of the structured data for each data type;

inputting the structured data to be classified into the fourth sub-network model, and obtaining fourth one-dimensional vector data with local features through the operation of the fully connected layer in the fourth sub-network model; inputting the fourth one-dimensional vector data into the deformation layer in the fourth sub-network model to obtain second two-dimensional image data; and inputting the second two-dimensional image data into the two-dimensional convolutional neural sub-network in the fourth sub-network model to obtain a fourth confidence of the structured data for each data type.
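The two-dimensional path used by the third and fourth sub-network models can be sketched in Python as follows. The sketch assumes that the deformation layer is a row-by-row reshape of the one-dimensional vector data, and it uses a single "valid" two-dimensional convolution in place of a whole convolutional sub-network; the function names, the 3x3 image size and the kernel are hypothetical.

```python
def deformation_layer(vector, height, width):
    """Rearrange 1-D vector data into height x width 2-D image data, row by row."""
    if len(vector) != height * width:
        raise ValueError("vector length must equal height * width")
    return [vector[r * width:(r + 1) * width] for r in range(height)]


def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation of an image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


# Hypothetical output of the fully connected layer (length 9).
vector = [1, 2, 3, 4, 5, 6, 7, 8, 9]

image = deformation_layer(vector, 3, 3)
# image == [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

feature_map = conv2d_valid(image, [[1, 0],
                                   [0, 1]])
# feature_map == [[6, 8], [12, 14]]
```

The deformation layer has no trainable parameters in this sketch; it only changes the layout of the data so that a two-dimensional kernel can slide over it.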

Step 203, calculating the comprehensive confidence of the structured data to be classified for each data type according to the first confidence, the second confidence, the third confidence and the fourth confidence.

Further, the comprehensive confidence of the structured data to be classified for each data type may be calculated from the first confidence, the second confidence, the third confidence and the fourth confidence by averaging the confidences for that data type, using a preset relational expression, which may be:

$$y_{\cdot i} = \frac{1}{4}\sum_{j=1}^{4} y_{ij}$$

wherein $y_{\cdot i}$ is the comprehensive confidence of the structured data to be classified for the data type corresponding to $i$, and $y_{ij}$ is the confidence for the data type corresponding to $i$ obtained after the structured data to be classified is input into the $j$-th pre-trained sub-network model.

Step 204, obtaining a classification result of the structured data to be classified according to the comprehensive confidence.

Further, the data type corresponding to the maximum comprehensive confidence in the comprehensive confidences can be determined as the data type of the structured data to be classified.
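Steps 203 and 204 (averaging the four confidences for each data type and selecting the data type with the maximum comprehensive confidence) can be sketched in Python as follows. The function names and the confidence values are hypothetical and chosen only for illustration.

```python
def comprehensive_confidence(confidences):
    """Average the confidences over the sub-network models.

    confidences[j][i] is the confidence output by sub-network model j
    for data type i; the result holds one averaged value per data type.
    """
    n_models = len(confidences)
    n_types = len(confidences[0])
    return [sum(model[i] for model in confidences) / n_models
            for i in range(n_types)]


def classify(confidences):
    """Return the index of the data type with the maximum averaged confidence."""
    averaged = comprehensive_confidence(confidences)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical outputs of the four sub-network models for two data types.
confidences = [
    [0.50, 0.50],   # first sub-network model
    [0.75, 0.25],   # second sub-network model
    [0.25, 0.75],   # third sub-network model
    [1.00, 0.00],   # fourth sub-network model
]

averaged = comprehensive_confidence(confidences)   # [0.625, 0.375]
predicted_type = classify(confidences)             # 0
```

In this example the third sub-network model alone would select data type 1, but the averaged result selects data type 0, which illustrates how the integration can override a single dissenting sub-network model.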

In the structured data classification method shown in Fig. 4 provided in the embodiment of the present invention, structured data to be classified is acquired; the structured data to be classified is input into a pre-trained convolutional neural network model to obtain a first confidence, a second confidence, a third confidence and a fourth confidence of the structured data for each data type, wherein the convolutional neural network model comprises a first sub-network model, a second sub-network model, a third sub-network model and a fourth sub-network model, the first and second sub-network models each comprise a fully connected layer and a one-dimensional convolutional neural sub-network, and the third and fourth sub-network models each comprise a fully connected layer, a deformation layer and a two-dimensional convolutional neural sub-network; the comprehensive confidence of the structured data to be classified for each data type is calculated according to the first confidence, the second confidence, the third confidence and the fourth confidence; and a classification result of the structured data to be classified is obtained according to the comprehensive confidence. In this scheme, after the structured data is input into the convolutional neural network model, the fully connected layer first converts the structured data, which has no local features, into data that has local features. The trained convolutional neural sub-network then extracts local features from the converted data and finally yields the classification result of the structured data to be classified. This improves the accuracy with which the convolutional neural network model classifies structured data.

Based on the same inventive concept and corresponding to the structured data classification method provided in the above embodiment of the present invention, an embodiment of the present invention provides a structured data classification apparatus, a schematic structural diagram of which is shown in Fig. 5, including:

a to-be-classified data acquisition module 301, configured to acquire to-be-classified structured data;

a classification result obtaining module 302, configured to input the structured data to be classified into a pre-trained convolutional neural network model to obtain a classification result of the structured data to be classified, where the convolutional neural network model includes a fully connected layer and a convolutional neural sub-network, and the fully connected layer is the first layer in the convolutional neural network model.

The apparatus further includes:

the second sample set acquisition module is used for acquiring a structured data sample set, and the structured data sample set comprises a plurality of structured data samples;

a second model training module to:

Further, the convolutional neural sub-network is a two-dimensional convolutional neural sub-network; the convolutional neural network model also comprises a deformation layer;

The apparatus further includes:

the third sample set acquisition module is used for acquiring a structured data sample set, and the structured data sample set comprises a plurality of structured data samples;

The apparatus further includes:

the fourth sample set acquisition module is used for acquiring a structured data sample set, and the structured data sample set comprises a plurality of structured data samples;

a fourth model training module to:

train the fully connected layer based on the structured data sample set and the one-dimensional image data, with parameter adjustment through a mean square error loss function;

the classification result obtaining module 302 includes: a first confidence obtaining sub-module, a second confidence obtaining sub-module, a third confidence obtaining sub-module, a fourth confidence obtaining sub-module, a comprehensive confidence obtaining sub-module and a classification result obtaining sub-module;

the first confidence obtaining sub-module is used for inputting the structured data to be classified into the first sub-network model, and obtaining first one-dimensional vector data with local features through the operation of the fully connected layer in the first sub-network model; inputting the first one-dimensional vector data into the one-dimensional convolutional neural sub-network in the first sub-network model to obtain a first confidence of the structured data for each data type;

the second confidence obtaining sub-module is used for inputting the structured data to be classified into the second sub-network model, and obtaining second one-dimensional vector data with local features through the operation of the fully connected layer in the second sub-network model; inputting the second one-dimensional vector data into the one-dimensional convolutional neural sub-network in the second sub-network model to obtain a second confidence of the structured data for each data type;

the third confidence obtaining sub-module is used for inputting the structured data to be classified into the third sub-network model, and obtaining third one-dimensional vector data with local features through the operation of the fully connected layer in the third sub-network model; inputting the third one-dimensional vector data into the deformation layer in the third sub-network model to obtain first two-dimensional image data; inputting the first two-dimensional image data into the two-dimensional convolutional neural sub-network in the third sub-network model to obtain a third confidence of the structured data for each data type;

the fourth confidence obtaining sub-module is used for inputting the structured data to be classified into the fourth sub-network model, and obtaining fourth one-dimensional vector data with local features through the operation of the fully connected layer in the fourth sub-network model; inputting the fourth one-dimensional vector data into the deformation layer in the fourth sub-network model to obtain second two-dimensional image data; inputting the second two-dimensional image data into the two-dimensional convolutional neural sub-network in the fourth sub-network model to obtain a fourth confidence of the structured data for each data type;

Further, the comprehensive confidence obtaining sub-module is specifically configured to calculate, from the first confidence, the second confidence, the third confidence and the fourth confidence, the comprehensive confidence of the structured data to be classified for each data type by averaging the confidences for that data type, using a preset relational expression, where the preset relational expression is:

$$y_{\cdot i} = \frac{1}{4}\sum_{j=1}^{4} y_{ij}$$

where $y_{\cdot i}$ is the comprehensive confidence of the structured data to be classified for the data type corresponding to $i$, and $y_{ij}$ is the confidence for the data type corresponding to $i$ obtained after the structured data to be classified is input into the $j$-th pre-trained sub-network model.

The classification result obtaining sub-module is specifically configured to determine the data type corresponding to the maximum comprehensive confidence among the comprehensive confidences as the data type of the structured data to be classified.

In the structured data classification apparatus provided by the embodiment of the present invention, the to-be-classified data acquisition module 301 acquires structured data to be classified; the classification result obtaining module 302 inputs the structured data to be classified acquired by the to-be-classified data acquisition module 301 into a pre-trained convolutional neural network model to obtain a classification result of the structured data to be classified, where the convolutional neural network model includes a fully connected layer and a convolutional neural sub-network, and the fully connected layer is the first layer in the convolutional neural network model. In this scheme, after the structured data is input into the convolutional neural network model, the fully connected layer first converts the structured data, which has no local features, into data that has local features. The trained convolutional neural sub-network then extracts local features from the converted data and finally yields the classification result of the structured data to be classified. This improves the accuracy with which the convolutional neural network model classifies structured data.

Based on the same inventive concept and corresponding to the structured data classification method provided by the above embodiment of the present invention, an embodiment of the present invention further provides an electronic device, as shown in Fig. 6, including a processor 401, a communication interface 402, a memory 403, and a communication bus 404, where the processor 401, the communication interface 402, and the memory 403 communicate with one another through the communication bus 404.

A memory 403 for storing a computer program;

the processor 401, when executing the program stored in the memory 403, at least implements the following steps:

acquiring structured data to be classified;

and inputting the structured data to be classified into a pre-trained convolutional neural network model to obtain a classification result of the structured data to be classified, wherein the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network, and the fully connected layer is the first layer in the convolutional neural network model.

Further, the processor may also implement the other processing flows of the structured data classification method provided by the embodiment of the present invention, which are not described in detail here.

The communication bus mentioned for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the electronic equipment and other equipment.

The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.

The processor may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

The electronic device provided by the embodiment of the present invention adopts the following method: acquiring structured data to be classified; and inputting the structured data to be classified into a pre-trained convolutional neural network model to obtain a classification result of the structured data to be classified, wherein the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network. In this scheme, after the structured data is input into the convolutional neural network model, the fully connected layer first converts the structured data, which has no local features, into data that has local features. The trained convolutional neural sub-network then extracts local features from the converted data and finally yields the classification result of the structured data to be classified. This improves the accuracy with which the convolutional neural network model classifies structured data.

In yet another embodiment of the present invention, a computer-readable storage medium is further provided, which stores instructions that, when executed on a computer, cause the computer to perform any of the above-mentioned structured data classification methods in the above-mentioned embodiments.

The computer-readable storage medium provided by the embodiment of the present invention adopts the following method: acquiring structured data to be classified; and inputting the structured data to be classified into a pre-trained convolutional neural network model to obtain a classification result of the structured data to be classified, wherein the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network. In this scheme, after the structured data is input into the convolutional neural network model, the fully connected layer first converts the structured data, which has no local features, into data that has local features. The trained convolutional neural sub-network then extracts local features from the converted data and finally yields the classification result of the structured data to be classified. This improves the accuracy with which the convolutional neural network model classifies structured data.

In yet another embodiment, the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to perform any of the above structured data classification methods in the above embodiments.

The computer program product containing instructions provided by the embodiment of the present invention adopts the following method: acquiring structured data to be classified; and inputting the structured data to be classified into a pre-trained convolutional neural network model to obtain a classification result of the structured data to be classified, wherein the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network. In this scheme, after the structured data is input into the convolutional neural network model, the fully connected layer first converts the structured data, which has no local features, into data that has local features. The trained convolutional neural sub-network then extracts local features from the converted data and finally yields the classification result of the structured data to be classified. This improves the accuracy with which the convolutional neural network model classifies structured data.

The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the invention are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a Digital Video Disc (DVD)), a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.

It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

The embodiments in the present specification are described in a related manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the apparatus, electronic device, storage medium, and computer program product embodiments are substantially similar to the method embodiment, they are described relatively briefly; for relevant details, reference may be made to the corresponding description of the method embodiment.

The above description covers only preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method of structured data classification, comprising:

acquiring structured data to be classified;

inputting the structured data to be classified into a convolutional neural network model obtained by pre-training to obtain a classification result of the structured data to be classified, wherein the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network, and the fully connected layer is a first layer in the convolutional neural network model;

the convolutional neural network model comprises a first sub-network model, a second sub-network model, a third sub-network model and a fourth sub-network model; wherein the first and second sub-network models each comprise: a fully connected layer and a one-dimensional convolutional neural sub-network; the third and fourth sub-network models each comprise: a fully connected layer, a deformation layer and a two-dimensional convolutional neural sub-network;

2. The method according to claim 1, wherein the calculating a comprehensive confidence of the structured data to be classified for each data type according to the first confidence, the second confidence, the third confidence and the fourth confidence comprises:

$$y_{\cdot i} = \frac{1}{4}\sum_{j=1}^{4} y_{ij}$$

wherein: $y_{\cdot i}$ is the comprehensive confidence of the structured data to be classified for the data type corresponding to $i$; $y_{ij}$ is the confidence for the data type corresponding to $i$ obtained after the structured data to be classified is input into the $j$-th sub-network model obtained by pre-training;

3. A structured data classification apparatus, comprising:

the classification result acquisition module is used for inputting the structured data to be classified into a convolutional neural network model obtained by pre-training to obtain a classification result of the structured data to be classified, wherein the convolutional neural network model comprises a fully connected layer and a convolutional neural sub-network, and the fully connected layer is a first layer in the convolutional neural network model;

4. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

a processor for implementing the method steps of any of claims 1-2 when executing a program stored in the memory.

5. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-2.