CN114218462A - Data classification method, device, equipment and storage medium based on LSTM - Google Patents

Data classification method, device, equipment and storage medium based on LSTM

Info

Publication number
CN114218462A
CN114218462A (application CN202111623061.XA)
Authority
CN
China
Prior art keywords
data
preset
vector
lstm
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111623061.XA
Other languages
Chinese (zh)
Inventor
陈锦泉
戴立明
钟致民
孔勇平
黄龙飞
陈博
李小刚
曾祥宇
任勇强
杨剑
叶青
王一博
万红阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi IoT Technology Co Ltd
Original Assignee
Tianyi IoT Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi IoT Technology Co Ltd filed Critical Tianyi IoT Technology Co Ltd
Priority to CN202111623061.XA priority Critical patent/CN114218462A/en
Publication of CN114218462A publication Critical patent/CN114218462A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses a data classification method, device, equipment and storage medium based on LSTM. The method comprises the following steps: acquiring data to be classified; determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor; inputting the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories; determining the confidence with the maximum value among the confidences as a target confidence; and determining the preset category corresponding to the target confidence as the target category of the data to be classified. In this scheme, data are classified automatically by the field feature extractor and the hyperparameter LSTM classifier, which improves data classification efficiency.

Description

Data classification method, device, equipment and storage medium based on LSTM
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a data classification method, apparatus, device, and storage medium based on LSTM.
Background
To ensure data security, different management measures generally need to be adopted for data of different security levels.
In the prior art, the level of data is generally determined by classifying the acquired data: data classification criteria are constructed manually, and the category to which the acquired data belongs under those criteria is then determined by manual analysis.
However, an ordinary enterprise has a large amount of data to classify, and when such a large amount of data needs to be classified, manual classification is very inefficient.
Disclosure of Invention
The embodiment of the application provides a data classification method, device, equipment and storage medium based on LSTM, which can improve the efficiency of data classification.
In a first aspect, an embodiment of the present application provides an LSTM-based data classification method, which includes:
acquiring data to be classified;
determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor;
inputting the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories;
determining the confidence with the maximum value among the confidences as a target confidence;
and determining the preset category corresponding to the target confidence as the target category of the data to be classified.
In a second aspect, an embodiment of the present application further provides an LSTM-based data classification apparatus, which includes:
the acquiring unit is used for acquiring data to be classified;
the processing unit is used for determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor; inputting the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories; determining the confidence with the maximum value among the confidences as a target confidence; and determining the preset category corresponding to the target confidence as the target category of the data to be classified.
In a third aspect, an embodiment of the present application further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the above method when executing the computer program.
In a fourth aspect, the present application also provides a computer-readable storage medium, in which a computer program is stored, the computer program including program instructions, which when executed by a processor, implement the above method.
The embodiment of the application provides a data classification method, device, equipment and storage medium based on LSTM. The method comprises the following steps: acquiring data to be classified; determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor; inputting the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories; determining the confidence with the maximum value among the confidences as a target confidence; and determining the preset category corresponding to the target confidence as the target category of the data to be classified. In this scheme, data are classified automatically by the field feature extractor and the hyperparameter LSTM classifier, which improves data classification efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
FIG. 1 is a schematic diagram of an application scenario of an LSTM-based data classification method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a data classification method based on LSTM according to an embodiment of the present application;
FIG. 3 is a sub-flow diagram of a method for classifying data based on LSTM according to an embodiment of the present application;
FIG. 4 is a block diagram of a flow framework for a field feature extractor provided in an embodiment of the present application;
FIG. 5 is a block diagram of a flow framework of the hyper-parametric LSTM classifier provided in an embodiment of the present application;
FIG. 6 is a block diagram of an overall process framework provided by an embodiment of the present application;
FIG. 7 is a schematic block diagram of an LSTM-based data classification apparatus provided by an embodiment of the present application;
fig. 8 is a schematic block diagram of a computer device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. It is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
The embodiment of the application provides a data classification method, a device, equipment and a storage medium based on LSTM.
The execution body of the LSTM-based data classification method may be the LSTM-based data classification apparatus provided in the embodiment of the application, or a computer device integrating the LSTM-based data classification apparatus. The LSTM-based data classification apparatus may be implemented in hardware or software; the computer device may be a terminal or a server, and the terminal may be a smart phone, a tablet computer, a palmtop computer, a notebook computer, or the like.
Referring to fig. 1, fig. 1 is a schematic view of an application scenario of the LSTM-based data classification method according to an embodiment of the present application. The LSTM-based data classification method is applied to the computer device 10 in fig. 1. The computer device 10 acquires data to be classified; determines a sentence vector corresponding to the data to be classified through a preset trained field feature extractor; inputs the sentence vector into a preset trained hyperparameter Long Short-Term Memory (LSTM) classifier to obtain confidences corresponding to a plurality of preset categories; determines the confidence with the maximum value among the confidences as a target confidence; and determines the preset category corresponding to the target confidence as the target category of the data to be classified.
Fig. 2 is a schematic flowchart of an LSTM-based data classification method according to an embodiment of the present application. As shown in fig. 2, the method includes the following steps S110-S150.
S110, acquiring data to be classified.
The data to be classified may be enterprise data assets or other data that needs to be classified, and is not limited herein.
S120, determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor.
In some embodiments, during database design, the purpose of each field of the data to be classified is recorded as a remark, so the features of a field can be obtained by performing feature extraction on the remark information, where the remark is written using an SQL statement.
In this case, the trained field feature extractor first intercepts the field remark in the field, and then calculates the sentence vector corresponding to the remark, for example as sketched below.
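The patent does not specify how field remarks are intercepted. Purely as a hedged illustration, the sketch below pulls MySQL-style COMMENT remarks out of a CREATE TABLE statement; the function name and regular expression are hypothetical, and other SQL dialects would need a different pattern:

```python
import re

def extract_field_remarks(create_table_sql: str) -> dict:
    """Intercept the remark (COMMENT clause) attached to each field definition.

    Hypothetical helper: assumes MySQL-style `COMMENT '...'` clauses.
    """
    pattern = re.compile(r"`?(\w+)`?\s+\w+[^,\n]*?COMMENT\s+'([^']*)'",
                         re.IGNORECASE)
    return dict(pattern.findall(create_table_sql))

ddl = """
CREATE TABLE user_info (
  `id_card` VARCHAR(18) COMMENT '用户身份证号',
  `nickname` VARCHAR(64) COMMENT '用户昵称'
);
"""
print(extract_field_remarks(ddl))
# {'id_card': '用户身份证号', 'nickname': '用户昵称'}
```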
Referring to fig. 3, in some embodiments, step S120 includes:
and S121, performing word segmentation processing on the data to be classified to obtain a plurality of words.
For example, the remarks of the data to be classified are intercepted, and then word segmentation processing is performed on the remarks of the data to be classified, so that a plurality of word segments are obtained.
It should be noted that, in this embodiment, word segmentation processing may also be directly performed on the data to be classified.
And S122, obtaining word segmentation vectors corresponding to the plurality of word segmentations respectively through the embedding layer of the trained field feature extractor.
For example, after obtaining a plurality of participles corresponding to remarks of data to be classified, the participles are input into the embedding layer to obtain participle vectors corresponding to the participles respectively.
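A minimal sketch of S121-S122 follows. The jieba segmenter, PyTorch framework, embedding dimension, and per-remark vocabulary are all assumptions, since the patent names neither a segmenter nor a framework:

```python
import jieba                     # assumed Chinese word segmenter
import torch
import torch.nn as nn

remark = "用户身份证号"
segments = list(jieba.cut(remark))           # S121: word segmentation

# S122: map each word segment to a vocabulary index, then embed it
# (a toy vocabulary built from this one remark, for demonstration only)
vocab = {word: idx for idx, word in enumerate(sorted(set(segments)))}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=128)
indices = torch.tensor([vocab[w] for w in segments])
segment_vectors = embedding(indices)         # shape: (num_segments, 128)
```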
S123, generating the sentence vector according to the word segment vectors.
In some embodiments, specifically, after the word segment vector corresponding to each word segment is obtained, the actual sentence length of the data to be classified (or of the corresponding remark) is compared with a preset standard sentence length; if the actual sentence length is longer than the standard sentence length, the word segment vectors are truncated according to the standard sentence length to obtain truncated word segment vectors, and the sentence vector is generated from the truncated word segment vectors; if the standard sentence length is longer than the actual sentence length, the word segment vectors are padded according to the standard sentence length to obtain padded word segment vectors, and the sentence vector is generated from the padded word segment vectors.
The actual length in this embodiment is marked by a mask vector generated after the trained field feature extractor acquires the corresponding data to be classified (or the corresponding remark); the mask vector marks the actual length of each input text.
Truncating the word segment vectors according to the standard sentence length to obtain the truncated word segment vectors comprises: determining the importance of each word segment vector according to a preset importance determination rule; and truncating the word segment vectors according to the importance and the standard sentence length to obtain the truncated word segment vectors.
That is, input texts of different lengths are padded or truncated to the same length: too-short inputs are padded with placeholders, and too-long inputs are cut off. In this way, sentence vector inputs of consistent dimension are constructed for the trained hyperparameter LSTM classifier, as sketched below.
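A hedged sketch of the padding/truncation in S123 follows. The standard sentence length of 32 is an illustrative assumption, and position-based truncation is shown only for brevity where the patent actually truncates by an importance rule:

```python
import torch

def to_fixed_length(segment_vectors: torch.Tensor, std_len: int = 32):
    """Truncate or pad word segment vectors to a standard sentence length.

    Also returns a mask vector marking the actual length of the input,
    i.e. which positions carry real content rather than padding.
    """
    actual_len, dim = segment_vectors.shape
    mask = torch.zeros(std_len)
    if actual_len >= std_len:
        fixed = segment_vectors[:std_len]            # truncation
        mask[:] = 1.0
    else:
        pad = torch.zeros(std_len - actual_len, dim)
        fixed = torch.cat([segment_vectors, pad])    # padding with placeholders
        mask[:actual_len] = 1.0
    return fixed, mask

# segment_vectors from the embedding sketch above
sentence_vector, mask = to_fixed_length(segment_vectors)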
Wherein, before step S120, the method further comprises: acquiring an extractor training sample set and an extractor verification sample set; and training a preset field feature extractor according to the extractor training sample set and the extractor verification sample set to obtain the trained field feature extractor.
For ease of understanding, please refer to fig. 4, which is a schematic view of the flow framework of the field feature extractor in this embodiment. After the field feature extractor acquires the field remark of the data to be classified, it performs word segmentation on the remark to obtain the segmented sentence, then generates the word vector of each word segment, and finally generates the sentence vector from the word vectors.
S130, inputting the sentence vector into the preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories.
For ease of understanding, refer to fig. 5, which is a schematic diagram of the flow framework of the hyperparameter LSTM classifier in this embodiment. After the sentence vector is input into the trained hyperparameter LSTM classifier, the hidden vectors h of the n LSTM units are obtained over the time sequence; these vectors pass through a mean-pooling layer to yield a single vector, which is then fed to a Softsign layer to obtain the confidences corresponding to the plurality of preset categories.
The preset categories may be level categories or other classification categories, which is not limited herein.
S140, determining the confidence with the maximum value among the confidences as the target confidence.
After the confidences corresponding to the plurality of preset categories are obtained through the Softsign layer, the confidence with the maximum value is determined as the target confidence.
S150, determining the preset category corresponding to the target confidence as the target category of the data to be classified.
Since each confidence corresponds to a preset category, in this embodiment the preset category corresponding to the target confidence is determined as the target category of the data to be classified.
Steps S140-S150 are also implemented by the trained hyperparameter LSTM classifier; a sketch of the whole inference path follows.
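Tying steps S130-S150 together, a possible PyTorch sketch of the classifier's inference path is shown below. The layer sizes, category count and framework are assumptions; only the LSTM units over the time sequence, the mean pooling, the Softsign output, and the maximum-confidence selection come from the description above:

```python
import torch
import torch.nn as nn

class HyperparamLSTMClassifier(nn.Module):
    """Assumed architecture: LSTM units over the time sequence,
    mean pooling over the n hidden vectors, then a Softsign layer."""

    def __init__(self, embed_dim=128, hidden_dim=64, num_categories=5):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_categories)
        self.softsign = nn.Softsign()     # faster, less saturating than tanh

    def forward(self, x):                 # x: (batch, std_len, embed_dim)
        h, _ = self.lstm(x)               # hidden vectors h of the n LSTM units
        pooled = h.mean(dim=1)            # mean-pooling layer
        return self.softsign(self.fc(pooled))  # S130: per-category confidences

model = HyperparamLSTMClassifier()
# fixed-length sentence vector from the padding/truncation sketch above
confidences = model(sentence_vector.unsqueeze(0))
target_conf, target_idx = confidences.max(dim=1)   # S140: maximum confidence
print(int(target_idx))                             # S150: target category index
```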
In some embodiments, before step S130, the method further comprises: acquiring a standard LSTM classifier; replacing the hyperbolic tangent function of the standard LSTM classifier with the Softsign function, and applying data standardization, an MSE (mean square error) loss function and an identity activation function for regression, together with Xavier weight initialization, to the standard LSTM classifier to obtain the hyperparameter LSTM classifier; acquiring a classifier test set; and training the hyperparameter LSTM classifier according to the test set to obtain the trained hyperparameter LSTM classifier.
Specifically, this embodiment sets aside a separate test set while training the network; trains over a plurality of epochs (each epoch traverses the training data set); evaluates performance on the test set after each epoch to determine the optimal stopping time; uses the faster, less saturating Softsign instead of the hyperbolic tangent function; and uses data standardization, the MSE loss function and the identity activation function for regression, together with Xavier weight initialization, as sketched below.
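These training modifications might be sketched as follows, reusing the classifier above. The Adam optimizer, epoch count, and the random tensors standing in for the training and test sets are hypothetical:

```python
import torch
import torch.nn as nn

model = HyperparamLSTMClassifier()            # from the sketch above

# Xavier weight initialization on all weight matrices
for param in model.parameters():
    if param.dim() > 1:
        nn.init.xavier_uniform_(param)

criterion = nn.MSELoss()   # MSE loss with identity output activation (regression-style)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # assumed optimizer

# hypothetical stand-ins for the labelled training set and separate test set
train_x, train_y = torch.randn(64, 32, 128), torch.rand(64, 5)
test_x, test_y = torch.randn(16, 32, 128), torch.rand(16, 5)

best_acc, best_epoch = 0.0, 0
for epoch in range(50):                       # train over a plurality of epochs
    optimizer.zero_grad()
    loss = criterion(model(train_x), train_y)
    loss.backward()
    optimizer.step()
    with torch.no_grad():                     # evaluate the test set after each epoch
        acc = (model(test_x).argmax(1) == test_y.argmax(1)).float().mean().item()
    if acc > best_acc:                        # track the optimal stopping time
        best_acc, best_epoch = acc, epoch
```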
In some embodiments, after step S130, the method further comprises: inputting the confidences into a preset classification performance evaluator to obtain performance parameters of the trained hyperparameter LSTM classifier; and calibrating the trained hyperparameter LSTM classifier according to the performance parameters.
The performance parameters include, but are not limited to, accuracy, precision, recall, f1_score (F1 score), confusion matrix, KS (Kolmogorov-Smirnov) statistic and KS curve, receiver operating characteristic (ROC) curve, and/or population stability index (PSI).
Therefore, for the performance evaluation indexes commonly used with a classification model, generation of performance parameters including, but not limited to, accuracy, precision, recall, f1_score, confusion matrix, KS curve, ROC curve and PSI can be implemented through sklearn, a mainstream Python framework. Each index output by the evaluator is fed back as a model calibration parameter. In this way, the hyperparameter LSTM classifier can be optimized through the classification performance evaluator, improving its classification performance; a hedged sketch of such an evaluator follows.
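The sketch below builds the evaluator on top of sklearn, as the passage suggests, for the binary case for simplicity. sklearn provides no KS or PSI function, so those helpers are hand-rolled assumptions (KS as the maximum TPR-FPR gap, PSI as a binned comparison of a baseline and a current score distribution):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, roc_curve, auc)

def population_stability_index(expected, actual, bins=10):
    """Hand-rolled PSI between a baseline and a current score distribution."""
    edges = np.histogram_bin_edges(np.concatenate([expected, actual]), bins=bins)
    e = np.histogram(expected, bins=edges)[0] / len(expected) + 1e-6
    a = np.histogram(actual, bins=edges)[0] / len(actual) + 1e-6
    return float(np.sum((a - e) * np.log(a / e)))

def evaluate_classifier(y_true, y_pred, y_score, baseline_score):
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1_score": f1_score(y_true, y_pred),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
        "ks": float(np.max(tpr - fpr)),   # KS statistic from the ROC curve
        "roc_auc": auc(fpr, tpr),
        "psi": population_stability_index(baseline_score, y_score),
    }

y_true = np.array([0, 0, 1, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.8, 0.7, 0.9, 0.3])
print(evaluate_classifier(y_true, (y_score > 0.5).astype(int), y_score, y_score))
```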
For ease of understanding, refer to fig. 6, which is a schematic diagram of the overall flow framework of this embodiment. After the data classification system acquires the data to be classified, the data are input into the field feature extractor, which obtains the corresponding sentence vector; the sentence vector is input into the hyperparameter LSTM classifier, which analyzes it and outputs a classification result; the output classification is then input into the classification performance evaluator, which performs feedback calibration on the hyperparameter LSTM classifier according to the classification performance.
In summary, the scheme has the following advantages:
1. Improved generality of the classification model: based on the trained hyperparameter LSTM classifier, deep learning over the n hidden LSTM units obtained through the time sequence automatically yields the category with the maximum probability value as the final prediction result.
2. Consistent data acquisition: the field feature extractor embeds the words of the input, representing each word as a numeric word vector, and constructs model sentence-vector inputs of consistent dimension through truncation and padding.
3. Improved objectivity of the classification model: the classification model is self-calibrated based on machine learning, achieving higher classification accuracy than manual analysis.
4. Security guarantee: the classification process minimizes manual participation, protecting the data confidentiality of enterprises.
In summary, this embodiment acquires data to be classified; determines a sentence vector corresponding to the data to be classified through a preset trained field feature extractor; inputs the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories; determines the confidence with the maximum value among the confidences as a target confidence; and determines the preset category corresponding to the target confidence as the target category of the data to be classified. In this scheme, data are classified automatically by the field feature extractor and the hyperparameter LSTM classifier, which improves data classification efficiency.
Fig. 7 is a schematic block diagram of an LSTM-based data classification apparatus according to an embodiment of the present application. As shown in fig. 7, the present application also provides an LSTM-based data classification apparatus corresponding to the above LSTM-based data classification method. The LSTM-based data classification apparatus includes units for performing the above LSTM-based data classification method, and may be configured in a desktop computer, a tablet computer, a portable computer, or the like. Specifically, referring to fig. 7, the LSTM-based data classification apparatus includes an obtaining unit 701 and a processing unit 702.
An obtaining unit 701, configured to obtain data to be classified;
a processing unit 702, configured to determine a sentence vector corresponding to the data to be classified through a preset trained field feature extractor; input the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories; determine the confidence with the maximum value among the confidences as a target confidence; and determine the preset category corresponding to the target confidence as the target category of the data to be classified.
In some embodiments, when executing the step of determining the sentence vector corresponding to the data to be classified through the preset trained field feature extractor, the processing unit 702 is specifically configured to:
perform word segmentation processing on the data to be classified to obtain a plurality of word segments;
obtain word segment vectors corresponding to the plurality of word segments through the embedding layer of the trained field feature extractor;
and generate the sentence vector according to the word segment vectors.
In some embodiments, when executing the step of generating the sentence vector according to the word segment vectors, the processing unit 702 is specifically configured to:
compare the actual sentence length of the data to be classified with a preset standard sentence length;
if the actual sentence length is longer than the standard sentence length, truncate the word segment vectors according to the standard sentence length to obtain truncated word segment vectors;
generate the sentence vector according to the truncated word segment vectors;
if the standard sentence length is longer than the actual sentence length, pad the word segment vectors according to the standard sentence length to obtain padded word segment vectors;
and generate the sentence vector according to the padded word segment vectors.
In some embodiments, when executing the step of truncating the word segment vectors according to the standard sentence length to obtain truncated word segment vectors, the processing unit 702 is specifically configured to:
determine the importance of each word segment vector according to a preset importance determination rule;
and truncate the word segment vectors according to the importance and the standard sentence length to obtain the truncated word segment vectors.
In some embodiments, after the step of inputting the sentence vector into the preset trained hyperparameter LSTM classifier to obtain the confidences corresponding to the plurality of preset categories, the processing unit 702 is further configured to:
input the confidences into a preset classification performance evaluator to obtain performance parameters of the trained hyperparameter LSTM classifier;
and calibrate the trained hyperparameter LSTM classifier according to the performance parameters.
In some embodiments, before the step of determining the sentence vector corresponding to the data to be classified through the preset trained field feature extractor, the processing unit 702 is further configured to:
acquiring an extractor training sample set and an extractor verification sample set;
and training a preset field feature extractor according to the extractor training sample set and the extractor verification sample set to obtain the trained field feature extractor.
In some embodiments, before the step of inputting the sentence vector into the preset trained hyperparameter LSTM classifier to obtain the confidences corresponding to the plurality of preset categories, the processing unit 702 is further configured to:
acquire a standard LSTM classifier;
replace the hyperbolic tangent function of the standard LSTM classifier with the Softsign function, and apply data standardization, an MSE loss function and an identity activation function for regression, together with Xavier weight initialization, to the standard LSTM classifier to obtain the hyperparameter LSTM classifier;
acquire a classifier test set;
and train the hyperparameter LSTM classifier according to the test set to obtain the trained hyperparameter LSTM classifier.
It should be noted that, as can be clearly understood by those skilled in the art, the specific implementation process of the LSTM-based data classification apparatus and of each unit may refer to the corresponding description in the foregoing method embodiment, and for convenience and brevity of description, is not repeated herein.
The LSTM-based data classification apparatus described above may be implemented in the form of a computer program which may be run on a computer device as shown in fig. 8.
Referring to fig. 8, fig. 8 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 800 may be a terminal or a server, where the terminal may be an electronic device with a communication function, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device. The server may be an independent server or a server cluster composed of a plurality of servers.
Referring to fig. 8, the computer device 800 includes a processor 802, memory and network interface 805 connected by a system bus 801, wherein the memory may include a non-volatile storage medium 803 and an internal memory 804.
The non-volatile storage medium 803 may store an operating system 8031 and computer programs 8032. The computer program 8032 includes program instructions that, when executed, cause the processor 802 to perform an LSTM-based data classification method.
The processor 802 is used to provide computing and control capabilities to support the operation of the overall computer device 800.
The internal memory 804 provides an environment for the operation of a computer program 8032 on the non-volatile storage medium 803, which computer program 8032, when executed by the processor 802, causes the processor 802 to perform an LSTM-based data classification method.
The network interface 805 is used for network communication with other devices. Those skilled in the art will appreciate that the configuration shown in fig. 8 is a block diagram of only a portion of the configuration associated with the present application and does not constitute a limitation of the computing device 800 to which the present application is applied, and that a particular computing device 800 may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
Wherein the processor 802 is configured to execute a computer program 8032 stored in the memory to implement the steps of:
acquiring data to be classified;
determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor;
inputting the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories;
determining the confidence with the maximum value among the confidences as a target confidence;
and determining the preset category corresponding to the target confidence as the target category of the data to be classified.
In some embodiments, when implementing the step of determining the sentence vector corresponding to the data to be classified through the preset trained field feature extractor, the processor 802 specifically implements the following steps:
performing word segmentation processing on the data to be classified to obtain a plurality of word segments;
obtaining word segment vectors corresponding to the plurality of word segments through the embedding layer of the trained field feature extractor;
and generating the sentence vector according to the word segment vectors.
In some embodiments, when implementing the step of generating the sentence vector according to the word segment vectors, the processor 802 specifically implements the following steps:
comparing the actual sentence length of the data to be classified with a preset standard sentence length;
if the actual sentence length is longer than the standard sentence length, truncating the word segment vectors according to the standard sentence length to obtain truncated word segment vectors;
generating the sentence vector according to the truncated word segment vectors;
if the standard sentence length is longer than the actual sentence length, padding the word segment vectors according to the standard sentence length to obtain padded word segment vectors;
and generating the sentence vector according to the padded word segment vectors.
In some embodiments, when implementing the step of truncating the word segment vectors according to the standard sentence length to obtain truncated word segment vectors, the processor 802 specifically implements the following steps:
determining the importance of each word segment vector according to a preset importance determination rule;
and truncating the word segment vectors according to the importance and the standard sentence length to obtain the truncated word segment vectors.
In some embodiments, after implementing the step of inputting the sentence vector into the preset trained hyperparameter LSTM classifier to obtain the confidences corresponding to the plurality of preset categories, the processor 802 further implements the following steps:
inputting the confidences into a preset classification performance evaluator to obtain performance parameters of the trained hyperparameter LSTM classifier;
and calibrating the trained hyperparameter LSTM classifier according to the performance parameters.
In some embodiments, before implementing the step of determining the sentence vector corresponding to the data to be classified through the preset trained field feature extractor, the processor 802 further implements the following steps:
acquiring an extractor training sample set and an extractor verification sample set;
and training a preset field feature extractor according to the extractor training sample set and the extractor verification sample set to obtain the trained field feature extractor.
In some embodiments, before implementing the step of inputting the sentence vector into the preset trained hyperparameter LSTM classifier to obtain the confidences corresponding to the plurality of preset categories, the processor 802 further implements the following steps:
acquiring a standard LSTM classifier;
replacing the hyperbolic tangent function of the standard LSTM classifier with the Softsign function, and applying data standardization, an MSE loss function and an identity activation function for regression, together with Xavier weight initialization, to the standard LSTM classifier to obtain the hyperparameter LSTM classifier;
obtaining a classifier test set;
and training the super-parameter LSTM classifier according to the test set to obtain the trained super-parameter LSTM classifier.
It should be understood that in this embodiment, the processor 802 may be a central processing unit (CPU), and the processor 802 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
It will be understood by those skilled in the art that all or part of the flow of the method implementing the above embodiments may be implemented by a computer program instructing associated hardware. The computer program includes program instructions, and the computer program may be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present application also provides a storage medium. The storage medium may be a computer-readable storage medium. The storage medium stores a computer program, wherein the computer program comprises program instructions. The program instructions, when executed by the processor, cause the processor to perform the steps of:
acquiring data to be classified;
determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor;
inputting the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories;
determining the confidence with the maximum value among the confidences as a target confidence;
and determining the preset category corresponding to the target confidence as the target category of the data to be classified.
In some embodiments, when executing the program instructions to implement the step of determining the sentence vector corresponding to the data to be classified through the preset trained field feature extractor, the processor specifically implements the following steps:
performing word segmentation processing on the data to be classified to obtain a plurality of word segments;
obtaining word segment vectors corresponding to the plurality of word segments through the embedding layer of the trained field feature extractor;
and generating the sentence vector according to the word segment vectors.
In some embodiments, when executing the program instructions to implement the step of generating the sentence vector according to the word segment vectors, the processor specifically implements the following steps:
comparing the actual sentence length of the data to be classified with a preset standard sentence length;
if the actual sentence length is longer than the standard sentence length, truncating the word segment vectors according to the standard sentence length to obtain truncated word segment vectors;
generating the sentence vector according to the truncated word segment vectors;
if the standard sentence length is longer than the actual sentence length, padding the word segment vectors according to the standard sentence length to obtain padded word segment vectors;
and generating the sentence vector according to the padded word segment vectors.
In some embodiments, when executing the program instructions to implement the step of truncating the word segment vectors according to the standard sentence length to obtain truncated word segment vectors, the processor specifically implements the following steps:
determining the importance of each word segment vector according to a preset importance determination rule;
and truncating the word segment vectors according to the importance and the standard sentence length to obtain the truncated word segment vectors.
In some embodiments, after executing the program instructions to implement the step of inputting the sentence vector into the preset trained hyperparameter LSTM classifier to obtain the confidences corresponding to the plurality of preset categories, the processor further implements the following steps:
inputting the confidences into a preset classification performance evaluator to obtain performance parameters of the trained hyperparameter LSTM classifier;
and calibrating the trained hyperparameter LSTM classifier according to the performance parameters.
In some embodiments, before executing the program instructions to implement the step of determining the sentence vector corresponding to the data to be classified through the preset trained field feature extractor, the processor further implements the following steps:
acquiring an extractor training sample set and an extractor verification sample set;
and training a preset field feature extractor according to the extractor training sample set and the extractor verification sample set to obtain the trained field feature extractor.
In some embodiments, before executing the program instructions to implement the step of inputting the sentence vector into the preset trained hyperparameter LSTM classifier to obtain the confidences corresponding to the plurality of preset categories, the processor further implements the following steps: acquiring a standard LSTM classifier;
replacing the hyperbolic tangent function of the standard LSTM classifier with the Softsign function, and applying data standardization, an MSE loss function and an identity activation function for regression, together with Xavier weight initialization, to the standard LSTM classifier to obtain the hyperparameter LSTM classifier;
obtaining a classifier test set;
and training the super-parameter LSTM classifier according to the test set to obtain the trained super-parameter LSTM classifier.
The storage medium may be a USB flash disk, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both; to clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative. For example, the division of each unit is only one logic function division, and there may be another division manner in actual implementation. For example, various elements or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented.
The steps in the method of the embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the application can be combined, divided and deleted according to actual needs. In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present application may be substantially or partially implemented in the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application.
While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. An LSTM-based data classification method, characterized by comprising the following steps:
acquiring data to be classified;
determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor;
inputting the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories;
determining the confidence with the maximum value among the confidences as a target confidence;
and determining the preset category corresponding to the target confidence as the target category of the data to be classified.
2. The method according to claim 1, wherein the determining, through a preset trained field feature extractor, a sentence vector corresponding to the data to be classified comprises:
performing word segmentation processing on the data to be classified to obtain a plurality of word segments;
obtaining word segment vectors corresponding to the plurality of word segments through the embedding layer of the trained field feature extractor;
and generating the sentence vector according to the word segment vectors.
3. The method according to claim 2, wherein the generating the sentence vector according to the word segment vectors comprises:
comparing the actual sentence length of the data to be classified with a preset standard sentence length;
if the actual sentence length is longer than the standard sentence length, truncating the word segment vectors according to the standard sentence length to obtain truncated word segment vectors;
generating the sentence vector according to the truncated word segment vectors;
if the standard sentence length is longer than the actual sentence length, padding the word segment vectors according to the standard sentence length to obtain padded word segment vectors;
and generating the sentence vector according to the padded word segment vectors.
4. The method according to claim 3, wherein the truncating the word segment vectors according to the standard sentence length to obtain truncated word segment vectors comprises:
determining the importance of each word segment vector according to a preset importance determination rule;
and truncating the word segment vectors according to the importance and the standard sentence length to obtain the truncated word segment vectors.
5. The method according to claim 1, wherein after inputting the sentence vector into the preset trained hyperparameter LSTM classifier and obtaining the confidences corresponding to the plurality of preset categories, the method further comprises:
inputting the confidences into a preset classification performance evaluator to obtain performance parameters of the trained hyperparameter LSTM classifier;
and calibrating the trained hyperparameter LSTM classifier according to the performance parameters.
6. The method according to claim 1, wherein before the sentence vector corresponding to the data to be classified is determined through the preset trained field feature extractor, the method further comprises:
acquiring an extractor training sample set and an extractor verification sample set;
and training a preset field feature extractor according to the extractor training sample set and the extractor verification sample set to obtain the trained field feature extractor.
7. The method according to any one of claims 1 to 6, wherein before inputting the sentence vector into the preset trained hyperparameter LSTM classifier and obtaining the confidences corresponding to the plurality of preset categories, the method further comprises:
acquiring a standard LSTM classifier;
replacing the hyperbolic tangent function of the standard LSTM classifier with the Softsign function, and applying data standardization, an MSE loss function and an identity activation function for regression, together with Xavier weight initialization, to the standard LSTM classifier to obtain the hyperparameter LSTM classifier;
acquiring a classifier test set;
and training the hyperparameter LSTM classifier according to the test set to obtain the trained hyperparameter LSTM classifier.
8. An LSTM-based data classification apparatus, comprising:
the acquiring unit is used for acquiring data to be classified;
the processing unit is used for determining a sentence vector corresponding to the data to be classified through a preset trained field feature extractor; inputting the sentence vector into a preset trained hyperparameter LSTM classifier to obtain confidences corresponding to a plurality of preset categories; determining the confidence with the maximum value among the confidences as a target confidence; and determining the preset category corresponding to the target confidence as the target category of the data to be classified.
9. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program, and the processor implementing the method according to any one of claims 1-7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program comprising program instructions which, when executed by a processor, implement the method according to any one of claims 1-7.
CN202111623061.XA 2021-12-28 2021-12-28 Data classification method, device, equipment and storage medium based on LSTM Pending CN114218462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111623061.XA CN114218462A (en) 2021-12-28 2021-12-28 Data classification method, device, equipment and storage medium based on LSTM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111623061.XA CN114218462A (en) 2021-12-28 2021-12-28 Data classification method, device, equipment and storage medium based on LSTM

Publications (1)

Publication Number Publication Date
CN114218462A true CN114218462A (en) 2022-03-22

Family

ID=80706465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111623061.XA Pending CN114218462A (en) 2021-12-28 2021-12-28 Data classification method, device, equipment and storage medium based on LSTM

Country Status (1)

Country Link
CN (1) CN114218462A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117455666A (en) * 2023-10-16 2024-01-26 厦门国际银行股份有限公司 Transaction technical index prediction method, device and equipment based on neural network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination