CN113139057A - Domain-adaptive chemical potential safety hazard short text classification method and system - Google Patents
- Publication number: CN113139057A
- Application number: CN202110511224.9A
- Authority: CN (China)
- Prior art keywords: text, short text, vector, classified, vectors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/35 — Information retrieval of unstructured textual data; Clustering; Classification
- G06F18/2415 — Classification techniques relating to the classification model, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F40/30 — Handling natural language data; Semantic analysis
- G06N3/045 — Neural networks; Combinations of networks
- G06N3/048 — Neural networks; Activation functions
- G06N3/08 — Neural networks; Learning methods
Abstract
The invention discloses a domain-adaptive chemical potential safety hazard short text classification method and system, which acquire a plurality of short texts to be classified from the field of chemical potential safety hazard investigation; carry out vector extraction on each short text to be classified to obtain its corresponding initial text vector; and input the initial text vectors corresponding to all the texts to be classified into a trained short text classification model, which outputs the short text classification results. A GRU + HAN model learns short-text representations that fuse character-, word- and sentence-level information in the specific field, mitigating the domain-information deviation problem of general-corpus short texts and showing a better classification effect on the classification task of chemical engineering potential safety hazard investigation.
Description
Technical Field
The invention relates to the technical field of short text classification, and in particular to a domain-adaptive method and system for classifying chemical potential safety hazard short texts.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
With the rapid development of deep learning, many researchers have tried to solve the text classification problem with deep models; CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks) in particular have produced many novel and fruitful classification methods. These methods handle problems such as internet news classification and sentiment analysis well, but in specific application domains the text characteristics differ: professional terms, abbreviations, non-standard expressions and similar practical problems exist, so the practical application effect is only average.
In particular, in the potential safety hazard texts summarized during chemical potential safety hazard troubleshooting, the investigation reports provided by workers often contain a large number of professional terms, numbers, mixed Chinese-English technical nouns and irregular language expressions, and mostly consist of sentences whose lengths vary greatly. Mainstream text classification models struggle to capture accurate classification feature information from text that lacks context, so the classification of potential safety hazards is often inaccurate. Therefore, strengthening the domain semantic information in short texts is the key to effectively solving the text classification problem for potential safety hazards in the chemical domain, and is of great significance for the safety management, early warning and potential safety hazard investigation of chemical enterprises.
Short text classification is an important research direction in natural language processing. Its difficulty lies mainly in the fact that sentences are short and terse, each word can carry rich meanings, and the semantics of the sentence are closely intertwined with them. In many text classification tasks, traditional classification methods such as the naive Bayes model assume that attributes are mutually independent and do not consider the contextual association of the text, so their support for the semantic features of short texts is poor.
KIM Y proposed applying CNNs to the text classification task, using multiple convolution kernels of different sizes to capture locally correlated feature information; however, the fixed window sizes also mean that the length of context dependence a CNN can extract is relatively fixed.
The RNN proposed by LAI S et al. can use the information of context words in a sentence by splicing word embedding vectors with each word, which effectively alleviates the problem that a CNN cannot dynamically change its window size to adapt to the context lengths of different texts, but it also brings the problems of gradient vanishing and gradient explosion during training.
The LSTM (Long Short-Term Memory) network proposed by Nguyen et al. is a further extension of the RNN that addresses the long-term dependence problem by adding a cell state, and performs better when training on long-sequence text.
The GRU (Gated Recurrent Unit) proposed by Cho et al. improves on the LSTM by merging the cell state and the hidden state, which greatly improves the computational performance of the model.
In the process of implementing the invention, the inventor finds that the following technical problems exist in the prior art:
Short texts are characterized by large variation in text length, missing context information, sparse text features, and word semantics with obviously domain-dependent features; general short text classification techniques therefore have low classification accuracy, since they struggle to capture the domain-related feature information of short texts.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a domain-adaptive chemical potential safety hazard short text classification method and system;
in a first aspect, the invention provides a domain-adaptive chemical potential safety hazard short text classification method;
a domain-adaptive chemical potential safety hazard short text classification method comprises the following steps:
acquiring a plurality of short texts to be classified in the field of chemical potential safety hazard investigation;
vector extraction is carried out on each short text to be classified to obtain an initial text vector corresponding to each short text to be classified;
and inputting the initial text vectors corresponding to all the texts to be classified into the trained short text classification model, and outputting the short text classification result.
In a second aspect, the invention provides a domain-adapted chemical potential safety hazard short text classification system;
a domain-adapted chemical potential safety hazard short text classification system comprises:
an acquisition module configured to: acquiring a plurality of short texts to be classified in the field of chemical potential safety hazard investigation;
an extraction module configured to: vector extraction is carried out on each short text to be classified to obtain an initial text vector corresponding to each short text to be classified;
a classification module configured to: and inputting the initial text vectors corresponding to all the texts to be classified into the trained short text classification model, and outputting the short text classification result.
In a third aspect, the present invention further provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs are stored in the memory, and when the electronic device is running, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first aspect.
In a fourth aspect, the present invention also provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the method of the first aspect.
Compared with the prior art, the invention has the beneficial effects that:
A GRU (Gated Recurrent Unit) short text classification model fused with a Hierarchical Attention Network (HAN) is provided. Word vector representations of the general knowledge in short texts are generated based on BERT (Bidirectional Encoder Representations from Transformers), enhancing the general feature representation of short-text words and sequences; GRU + HAN then learns short-text representations that fuse character-, word- and sentence-level information in the specific field, mitigating the domain-information deviation problem of general-corpus short texts and showing a better classification effect on the classification task of chemical engineering potential safety hazard investigation.
The invention transforms the sentence-level attention mechanism, converting the attention branch between sentences into an implicit attention expression between words and sentences; this retains the semantic features captured by hierarchical attention over sentences while aggregating the more divergent text feature expressions produced by BERT. Compared with short texts, long-text classification can extract semantic information from the context of a sentence and achieves a good classification effect; however, chemical potential safety hazard texts contain both long texts and a great number of short-sequence texts, and how to use a single model to automatically capture semantic features at different levels of text granularity is the core problem in classifying chemical potential safety hazards.
The invention provides a domain-adaptive short text classification model that can effectively solve these problems. GRU-HAN takes BERT as the word embedding method for the text, effectively leveraging BERT's pre-training on massive Chinese text datasets to obtain embedded representations of the common knowledge in short texts. Potential safety hazard texts from the chemical field and their labeled classifications are selected as the training and test datasets, and the method outperforms mainstream text classification methods.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and together with the description serve to explain, not limit, the invention.
FIG. 1 is a flow chart of a method of the first embodiment;
FIG. 2 is a GRU-HAN overall network model architecture of the first embodiment;
FIG. 3 is three levels of the BERT model of the first embodiment;
FIG. 4 is a diagram of a GRU model structure of the first embodiment;
fig. 5 is a diagram illustrating a connection relationship between a GRU and an HAN according to the first embodiment.
Detailed Description
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and "comprising", and any variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
Example one
The embodiment provides a domain-adaptive chemical potential safety hazard short text classification method;
as shown in fig. 1, a domain-adaptive chemical potential safety hazard short text classification method includes:
s101: acquiring a plurality of short texts to be classified in the field of chemical potential safety hazard investigation;
s102: vector extraction is carried out on each short text to be classified to obtain an initial text vector corresponding to each short text to be classified;
s103: and inputting the initial text vectors corresponding to all the texts to be classified into the trained short text classification model, and outputting the short text classification result.
Further, the S102: vector extraction is carried out on each short text to be classified to obtain an initial text vector corresponding to each short text to be classified; the method specifically comprises the following steps:
and based on the BERT model, vector extraction is carried out on each short text to be classified to obtain an initial text vector corresponding to each short text to be classified.
Further, the inputting the initial text vectors corresponding to all the short texts to be classified into the trained short text classification model and outputting the short text classification result specifically includes:
s1031: the trained short text classification model encodes each initial text vector to obtain text vectors considering the association of the front time sequence and the rear time sequence;
s1032: the trained short text classification model gives the weight between words to each text vector considering the time sequence association before and after the training to obtain a text vector after the first weighting;
s1033: the trained short text classification model splices the text vectors after the first weighting to obtain sentence embedding vectors;
s1034: based on the sentence embedding vector, a trained short text classification model is utilized to endow each text vector considering the time sequence association before and after the consideration with the weight between words and sentences; obtaining a text vector after the second weighting;
s1035: the trained short text classification model splices the text vectors after the second weighting to obtain vectors to be classified;
s1036: and the trained short text classification model classifies the vectors to be classified to obtain the classification result of each short text to be classified.
Further, the short text classification model has a network structure comprising:
the Word embedding structure layer BERT, the Word Encoder, the Word and Word attention mechanism layer, the first splicing unit, the Word and sentence attention mechanism layer, the second splicing unit and the Softmax classification layer are sequentially connected.
Wherein, the word embedding construction layer BERT works as follows: vector extraction is carried out on each short text to be classified to obtain the Token Embeddings, Segment Embeddings and Position Embeddings corresponding to each short text to be classified;
the Token Embeddings, Segment Embeddings and Position Embeddings are fused into the initial text vector.
Wherein the token embedding represents the text feature vector; the segment embedding represents the context feature vector of the text; the position embedding represents where each token is located in the text.
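The fusion of the three embeddings (token, segment, position) is an element-wise sum, as in the standard BERT input layer. Not part of the patent; a minimal numpy sketch, with randomly initialized lookup tables standing in for BERT's learned embedding matrices and all sizes chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, max_pos, hidden = 100, 32, 8  # toy sizes, not BERT's real ones

# Hypothetical lookup tables standing in for trained embedding matrices.
token_table = rng.normal(size=(vocab_size, hidden))
segment_table = rng.normal(size=(2, hidden))
position_table = rng.normal(size=(max_pos, hidden))

def bert_input_embedding(token_ids, segment_ids):
    """Fuse token, segment and position embeddings by element-wise sum,
    as the BERT embedding layer does, to form the initial text vector."""
    positions = np.arange(len(token_ids))
    return (token_table[token_ids]
            + segment_table[segment_ids]
            + position_table[positions])

emb = bert_input_embedding([5, 17, 42], [0, 0, 0])
print(emb.shape)  # (3, 8)
```

One row per token; a real BERT layer would additionally apply layer normalization and dropout before the encoder stack.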
The Word Encoder is structurally characterized in that a GRU unit is added on the basis of the Word Encoder of the HAN model.
As shown in fig. 5, the Word Encoder has the following structure.

First, the Word Encoder of the original HAN model comprises a forward chain of coding units h→(1), h→(2), …, h→(p), …, h→(n) connected from left to right, and a backward chain of coding units h←(n), …, h←(q), …, h←(2), h←(1) connected from right to left.

For the forward chain, the first input of coding unit h→(p) receives the output value of coding unit h→(p−1), and its second input receives the p-th output value of the word embedding construction layer BERT; the first output of h→(p) is connected to the input of the p-th Concat unit, whose output is the text vector that takes the forward and backward temporal association into account; the second output of h→(p) is connected to the input of coding unit h→(p+1).

Symmetrically, the first input of coding unit h←(q) receives the output value of coding unit h←(q+1), and its second input receives the q-th output value of the word embedding construction layer BERT; the first output of h←(q) is connected to the input of the q-th Concat unit, whose output is the text vector that takes the forward and backward temporal association into account; the second output of h←(q) is connected to the input of coding unit h←(q−1).
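The bidirectional encoding described above can be sketched as follows. This is not the patent's implementation, only a minimal numpy illustration in which a forward and a backward GRU chain each consume the BERT outputs, and a Concat unit joins the two hidden states at each position; the random weights stand in for trained parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell with randomly initialized weights (stand-ins)."""
    def __init__(self, d_in, d_h, seed=0):
        rng = np.random.default_rng(seed)
        self.Wz, self.Uz = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))
        self.Wr, self.Ur = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))
        self.Wh, self.Uh = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))

    def step(self, x, h):
        z = sigmoid(self.Wz @ x + self.Uz @ h)          # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h)          # reset gate
        h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * h))
        return (1 - z) * h + z * h_tilde

def bi_gru_encode(xs, d_h=4):
    """Encode a sequence with a left-to-right and a right-to-left GRU chain,
    then concatenate the two hidden states at each position (Concat unit)."""
    fwd, bwd = GRUCell(xs.shape[1], d_h, 0), GRUCell(xs.shape[1], d_h, 1)
    h = np.zeros(d_h); hs_f = []
    for x in xs:                       # forward chain h->(1) ... h->(n)
        h = fwd.step(x, h); hs_f.append(h)
    h = np.zeros(d_h); hs_b = []
    for x in xs[::-1]:                 # backward chain h<-(n) ... h<-(1)
        h = bwd.step(x, h); hs_b.append(h)
    hs_b = hs_b[::-1]
    return np.stack([np.concatenate([f, b]) for f, b in zip(hs_f, hs_b)])

H = bi_gru_encode(np.ones((5, 3)))     # 5 toy embedding vectors of width 3
print(H.shape)  # (5, 8)
```

Each row of `H` corresponds to one Concat-unit output: a text vector that takes the forward and backward temporal association into account.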
Wherein, the word-word attention mechanism layer works as follows: each text vector that takes the forward and backward temporal association into account is given inter-word weights, yielding the first-weighted text vector.
Wherein, the first splicing unit specifically performs series concatenation.
Wherein, the word-sentence attention mechanism layer works as follows: each text vector that takes the forward and backward temporal association into account is given word-sentence weights, yielding the second-weighted text vector.
Wherein, the second splicing unit specifically performs series concatenation.
Further, the training of the trained short text classification model comprises:
constructing a training set, wherein the training set is a plurality of short texts to be classified in the chemical potential safety hazard troubleshooting field of known classification labels;
and inputting the training set into a short text classification model, training the short text classification model, and stopping training when the loss function reaches the minimum value or the training reaches the set iteration number to obtain the trained short text classification model.
Further, the S1031: the trained short text classification model encodes each initial text vector to obtain text vectors considering the association of the front time sequence and the rear time sequence; the method specifically comprises the following steps:
and inputting the initial text vectors corresponding to all the texts to be classified into a Word Encoder of the trained short text classification model, and encoding each initial text vector by the Word Encoder of the trained short text classification model to obtain the text vectors considering the time sequence association before and after the encoding.
Further, the S1032: the trained short text classification model gives the weight between words to each text vector considering the time sequence association before and after the training to obtain a text vector after the first weighting; the method specifically comprises the following steps:
and the attention mechanism layer of the words of the trained short text classification model gives weights between the words to each text vector considering the time sequence association before and after the training to obtain the text vector after the first weighting.
Further, the S1032: the trained short text classification model gives the weight between words to each text vector considering the time sequence association before and after the training to obtain a text vector after the first weighting; the specific working principle comprises:
u_t = tanh(W_w h_t + b_w)    (5)

α_t = exp(u_tᵀ u_w) / Σ_t exp(u_tᵀ u_w)    (6)

wherein exp is the exponential function with the natural constant e as base; u denotes the word weight matrix and u_t the representation of the t-th word; the transpose of u_t is multiplied with u_w, a randomly initialized context vector, and the scores are normalized to finally obtain the weighted text vector matrix α_t.
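The normalization above is a softmax over the scores u_tᵀu_w. A small numpy sketch of this word-level attention, with random values standing in for the trained weight matrix, bias and context vector:

```python
import numpy as np

def word_attention(H, Ww, bw, uw):
    """HAN-style word attention: u_t = tanh(Ww @ h_t + bw), then
    alpha_t = softmax over t of (u_t . uw)."""
    U = np.tanh(H @ Ww.T + bw)          # word representations u_t
    scores = U @ uw                     # correlation with context vector u_w
    e = np.exp(scores - scores.max())   # numerically stable exp
    return e / e.sum()                  # normalized weights alpha_t

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 6))             # 4 encoded word vectors h_t
Ww = rng.normal(size=(6, 6))            # word weight matrix W_w (stand-in)
bw = rng.normal(size=6)                 # bias b_w (stand-in)
uw = rng.normal(size=6)                 # randomly initialized context vector u_w
alpha = word_attention(H, Ww, bw, uw)
print(round(float(alpha.sum()), 6))     # 1.0
```

The weights α_t are non-negative and sum to 1, so they can be interpreted as the relative importance of each word.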
Further, the S1033: the trained short text classification model splices the text vectors after the first weighting to obtain sentence embedding vectors; the method specifically comprises the following steps:
and the trained short text classification model is used for serially splicing the text vectors after the first weighting to obtain sentence embedding vectors.
Further, the S1033: the trained short text classification model splices the text vectors after the first weighting to obtain sentence embedding vectors; the working principle comprises the following steps:
S = Concat(α_1, …, α_t, …, α_n)    (7)

wherein the Concat function is used to splice vectors: the weight matrices α_1 to α_n obtained in the previous step are concatenated to synthesize the sentence vector S.
Further, the S1034: based on the sentence embedding vector, a trained short text classification model is utilized to endow each text vector considering the time sequence association before and after the consideration with the weight between words and sentences; obtaining a text vector after the second weighting; the method specifically comprises the following steps:
based on the sentence embedding vector, giving a weight between a word and a sentence to each text vector considering the time sequence association before and after the consideration by using the attention mechanism layer of the word and the sentence of the trained short text classification model; and obtaining the text vector after the second weighting.
Further, the S1034: based on the sentence embedding vector, a trained short text classification model is utilized to endow each text vector considering the time sequence association before and after the consideration with the weight between words and sentences; obtaining a text vector after the second weighting; the working principle comprises the following steps:
β_t = exp(u_tᵀ S) / Σ_t exp(u_tᵀ S)    (8)

wherein u_t represents the representation of the t-th word; its transposed implicit representation is no longer correlated with u_w, but is instead multiplied with the spliced sentence vector S, and the result is normalized through the exp function to obtain β_t, the second-weighted text vector of the t-th word.
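The second attention pass differs from the first only in what the word representations are scored against. A minimal numpy sketch (sizes and values illustrative only):

```python
import numpy as np

def word_sentence_attention(U, S):
    """Score each word representation u_t against the spliced sentence
    vector S (instead of the context vector u_w), then normalize with
    exp (softmax) to obtain the second weights beta_t."""
    scores = U @ S                      # u_t^T . S for every word t
    e = np.exp(scores - scores.max())   # numerically stable exp
    return e / e.sum()

rng = np.random.default_rng(1)
U = rng.normal(size=(4, 6))   # word representations u_1..u_4
S = rng.normal(size=6)        # spliced sentence vector S (same width assumed)
beta = word_sentence_attention(U, S)
print(beta.shape)  # (4,)
```

Replacing u_w with S is what ties each word's weight to the sentence as a whole rather than to a learned global context.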
Further, the S1035: the trained short text classification model splices the text vectors after the second weighting to obtain vectors to be classified; the method specifically comprises the following steps:
and the trained short text classification model is used for serially splicing the text vectors after the second weighting to obtain the vectors to be classified.
Further, the S1035: the trained short text classification model splices the text vectors after the second weighting to obtain vectors to be classified; the working principle comprises the following steps:
β = Σ_i β_i h_i    (9)

wherein h_i denotes the hidden implicit vector of the i-th word; the context feature vectors β_i, which fuse all the words in the short text and semantically associate words with sentences, are fused by accumulation, and each accumulated semantic feature also carries the implicit vector h_i generated by the single-layer perceptron.
Further, the S1036: the trained short text classification model classifies vectors to be classified to obtain a classification result of each short text to be classified; the method specifically comprises the following steps:
and the Softmax classification layer of the trained short text classification model classifies the text vectors weighted for the second time to obtain the classification result of each short text to be classified.
Further, the S1036: the trained short text classification model classifies vectors to be classified to obtain a classification result of each short text to be classified; the working principle comprises the following steps:
p = softmax(W_c β + b_c)    (10)

wherein W_c and b_c are the matrix coefficient and bias of the classification layer; the input β is the context vector generated from the text, and the softmax function maps it to the final classification score matrix p.
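Equation (10) is a plain linear layer followed by softmax. A minimal numpy sketch with random stand-ins for W_c and b_c and an assumed 3-class output:

```python
import numpy as np

def classify(beta, Wc, bc):
    """p = softmax(Wc @ beta + bc): map the context vector to a
    normalized vector of class scores."""
    logits = Wc @ beta + bc
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(2)
beta = rng.normal(size=6)               # context vector beta from attention
Wc = rng.normal(size=(3, 6))            # matrix coefficient W_c (stand-in)
bc = rng.normal(size=3)                 # bias b_c (stand-in)
p = classify(beta, Wc, bc)
print(round(float(p.sum()), 6))         # 1.0
```

The predicted class is simply the argmax of `p`.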
Further, assume that probability distribution p is the desired output and probability distribution q is the actual output.
Further, the loss function H(p, q) is:

H(p, q) = −(1/N) Σ_{i=1}^{N} Σ_{j=1}^{M} p(x_ij) · log q(x_ij)    (11)

where N denotes the number of samples in a batch and M denotes the number of classes; p(x_ij) denotes the expected output variable, which takes the value 1 if class j is the same as the class of sample i and 0 otherwise; q(x_ij) denotes the predicted probability that observed sample i belongs to class j, used here through its log value.
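This is the standard batch cross-entropy between the expected one-hot distribution p and the predicted distribution q. A minimal numpy sketch with illustrative values:

```python
import numpy as np

def cross_entropy(p_true, q_pred, eps=1e-12):
    """H(p, q) = -(1/N) * sum_i sum_j p_ij * log(q_ij) over a batch of
    N samples and M classes; eps guards against log(0)."""
    N = p_true.shape[0]
    return float(-np.sum(p_true * np.log(q_pred + eps)) / N)

# Two samples, two classes: expected (one-hot) vs predicted distributions.
p = np.array([[1.0, 0.0], [0.0, 1.0]])
q = np.array([[0.9, 0.1], [0.2, 0.8]])
print(round(cross_entropy(p, q), 4))  # 0.1643
```

Because p is one-hot, only the log-probability assigned to each sample's true class contributes to the loss.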
The trained short text classification model comprises a GRU-HAN network model, which fuses an improved HAN with a GRU network. As shown in fig. 2, word embedding of the text uses the BERT construction method; each generated word vector models the information between word vectors and sentence vectors in the input text sequence through the Word-Word Level Attention of the HAN; the implicit semantic vectors attended to by the HAN are fed back to the classifier network, and the feature classification information is output through Softmax.
The GRU-HAN network model includes a deep GRU network in which the number of sequence control units is variable, using the inputs of the GRU cells to implement an attention connection from the encoder network to the hierarchy; to improve parallelism and reduce training time, the attention mechanism of the present invention connects the bottom layer of the decoder to the top layer of the encoder. To accelerate model fitting, the coding level of a sentence is kept unchanged while a sentence sequence is processed.
In GRU-HAN, the semantic richness of the text is enhanced in two respects:
(1) BERT is used for word-embedding construction of the text, and a GRU (Gated Recurrent Unit) performs bidirectional semantic encoding. As shown by the Word Encoder in FIG. 2, two layers of encoder coding yield the latent semantics and the forward and backward time-series memory information of the text vectors, enhancing each word's own semantics while simultaneously encoding and fusing its context;
(2) Word-Word Level Attention at the HAN level perceives the higher-ranked, higher-weighted words among all words; the word vectors are spliced to obtain the sentence embedding vector, and Word-Sentence Level Attention then continues to focus on context awareness over the global text. Combining these two layers of attention effectively alleviates BERT's semantic ambiguity problem in a specific domain. The GRU has a strong ability to capture language sequence information, and the hierarchical attention mechanism tempers the time-series information it captures, so that freely expressed text admits more modes of semantic interpretation.
Traditional natural language processing tasks adopt character encodings with static semantic information for vector expression, such as the Word2Vec and One-Hot word-embedding schemes. These vector encoding methods do not consider context information: each word is mapped to a unique dense vector, so the polysemy problem cannot be solved. In an actual short text classification task, a single word often has multiple meanings, and traditional word vectors cannot represent the semantic features of a word in a short text well, so better text features need to be learned through a deep model.
The pretrained BERT model is obtained through self-supervised learning on a large corpus; the meanings of its word vectors fuse the textual features of that corpus, so it serves well as a word-embedding feature representation method for each text task.
The feature representation of the BERT model is divided into three levels: a character information vector (Token Embeddings), a segment information vector (Segment Embeddings), and a position information vector (Position Embeddings). As shown in FIG. 3, Token Embeddings represents each word after segmentation as a vector; Segment Embeddings labels the segments within a sentence and marks them with [CLS] and [SEP]; Position Embeddings adds positional timing information to each input unit. Finally, the representation information of the three levels of vectors is superposed to obtain the word vector represented by the BERT model.
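As a hedged illustration of how the three embedding levels combine, the following NumPy sketch sums token, segment and position vectors per input unit; the vocabulary size, dimensions and random lookup tables are hypothetical stand-ins for BERT's learned tables:

```python
import numpy as np

def bert_input_embedding(token_ids, segment_ids, tok_emb, seg_emb, pos_emb):
    """Superpose the three representation levels: Token + Segment + Position
    Embeddings, yielding one combined vector per input unit."""
    n = len(token_ids)
    return (tok_emb[token_ids]      # Token Embeddings: one row per (sub)word
            + seg_emb[segment_ids]  # Segment Embeddings: sentence A/B label
            + pos_emb[:n])          # Position Embeddings: absolute position

# Hypothetical lookup tables (BERT learns these during pretraining).
rng = np.random.default_rng(1)
vocab_size, dim, max_len = 100, 8, 32
tok_emb = rng.standard_normal((vocab_size, dim))
seg_emb = rng.standard_normal((2, dim))
pos_emb = rng.standard_normal((max_len, dim))

# Toy sequence: [CLS] w1 w2 [SEP], all in segment 0.
x = bert_input_embedding([0, 7, 9, 1], [0, 0, 0, 0], tok_emb, seg_emb, pos_emb)
```

Each row of `x` is the element-wise sum of the three lookups, matching the superposition described above.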
To learn the multiple feature expressions for each word in the short text, the word embedding layer performs a linear mapping of the word vectors after they are constructed using BERT:
A text sequence X = [x_1, x_2, …, x_n] of length n is input, and the generated word embedding matrix is B = [b_1, b_2, …, b_n]. A query matrix Q, a key matrix K and a value matrix V are established from the word embedding vectors to model the hidden semantic association relations among the words.
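The linear mapping into query, key and value matrices described above can be sketched as follows; the scaled dot-product normalization and the toy dimensions are illustrative assumptions rather than the patent's exact formulation:

```python
import numpy as np

def hidden_semantic_association(B, W_q, W_k, W_v):
    """Map the word embedding matrix B = [b_1, ..., b_n] to query, key and
    value matrices, then form a normalized word-to-word association matrix A
    and the association-weighted values."""
    Q, K, V = B @ W_q, B @ W_k, B @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[1])            # pairwise associations
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    A = e / e.sum(axis=1, keepdims=True)              # softmax per row
    return A, A @ V

rng = np.random.default_rng(2)
n, d = 5, 8                                           # toy sequence length / dim
B = rng.standard_normal((n, d))
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))
A, out = hidden_semantic_association(B, W_q, W_k, W_v)
```

Each row of A gives one word's association weights over the whole sequence and sums to 1.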
The GRU model was proposed to address the long-term memory and gradient problems in back-propagation. As shown in FIG. 4, the GRU obtains two gate states (Gate States) as gate units (Gate Units) from the hidden state h_{t-1} passed from the previous node and the input b_t of the current node; this input combination contains both the history information of the previous node and the information of the current node.
After receiving the input information, the update gate z_t converts the data into values in the range 0-1 through σ (the Sigmoid function), which serve as the gating signal:
z_t = σ(W_z · [h_{t-1}, b_t])  (1)
The reset gate r_t is obtained in the same way:
r_t = σ(W_r · [h_{t-1}, b_t])  (2)
After the gating signals are obtained, the reset gate is spliced with the input information, and the data are scaled to the range -1 to 1 through the tanh activation function to obtain the candidate state h̃_t:

h̃_t = tanh(W · [r_t ⊙ h_{t-1}, b_t])  (3)

Here h̃_t contains the various signal data, and the signals added to the data memorize the state at the current time t. The subsequent memory-updating stage performs the two processes of forgetting and memorizing simultaneously:

h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t  (4)

The update gate z_t ranges over 0-1; values closer to 1 mean more data are memorized, and conversely more data are forgotten. Because the two processes are carried out simultaneously, the GRU has fewer parameters than the LSTM and higher operation efficiency. In formula (4), (1 - z_t) ⊙ h_{t-1} indicates selective forgetting of the originally unimportant parts of the hidden state, and z_t ⊙ h̃_t indicates selective memorizing of the candidate state that contains the current node's information; forgetting (1 - z_t) and memorizing z_t are linked, so a certain memory compensation is applied to the forgetting weight in a constant state.
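Formulas (1)-(4) can be checked with a minimal NumPy sketch of one GRU step; the weight shapes and random inputs are hypothetical, and bias terms are omitted as in the formulas above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell_step(h_prev, b_t, W_z, W_r, W_h):
    """One GRU step following formulas (1)-(4): update gate z_t, reset gate
    r_t, candidate state h_tilde, then the joint forget/memorize update."""
    concat = np.concatenate([h_prev, b_t])          # [h_{t-1}, b_t]
    z_t = sigmoid(W_z @ concat)                     # formula (1): update gate
    r_t = sigmoid(W_r @ concat)                     # formula (2): reset gate
    concat_r = np.concatenate([r_t * h_prev, b_t])  # reset applied to history
    h_tilde = np.tanh(W_h @ concat_r)               # formula (3): in (-1, 1)
    return (1.0 - z_t) * h_prev + z_t * h_tilde     # formula (4)

rng = np.random.default_rng(0)
d_h, d_in = 4, 3                                    # toy hidden / input sizes
h_prev = rng.standard_normal(d_h)
b_t = rng.standard_normal(d_in)
W_z = rng.standard_normal((d_h, d_h + d_in))
W_r = rng.standard_normal((d_h, d_h + d_in))
W_h = rng.standard_normal((d_h, d_h + d_in))
h_t = gru_cell_step(h_prev, b_t, W_z, W_r, W_h)
```

Because h_t is a convex combination of h_{t-1} and a tanh-bounded candidate, each component stays within the larger of |h_{t-1}| and 1, which is one way to see the linked forget/memorize trade-off.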
After BERT and the GRU generate vectors containing sequence context memory information and rich word-meaning information, a limitation remains: because the word vectors generated by BERT carry massive corpus characteristics, their strong generalization ability prevents them from focusing on the semantic characteristics of a particular aspect within a specific domain.
Not all words contribute equally to the expression of a sentence's meaning. We therefore introduce an attention mechanism to extract the words that are important to the sentence meaning, feeding h_t into a single-layer perceptron (MLP) to obtain u_t as an implicit representation of h_t:
u_t = tanh(W_w h_t + b_w)  (5)
Meanwhile, in order to compare the importance of words in the text, the invention expresses the weight by the similarity between u_t and a randomly initialized context vector u_w, and then obtains the normalized attention weight α_t through a softmax operation, representing the weight of the t-th word in the text:

α_t = exp(u_t^T · u_w) / Σ_t exp(u_t^T · u_w)  (6)
With the Word-Word Level Attention weight matrix of the above formula, the semantic focus characteristics between words in the text sequence can be retained in the matrix; the α_t generated over the text are summarized by Concat to form the global sentence feature vector S of the text sequence:
S = Concat(α_1, …, α_t, …, α_n)  (7)
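A hedged NumPy sketch of Eqs. (5)-(7): each h_t is mapped to u_t, compared with a randomly initialized context vector u_w, normalized by softmax, and the resulting weights concatenated into S. All dimensions are illustrative:

```python
import numpy as np

def word_word_attention(H, W_w, b_w, u_w):
    """Eqs. (5)-(7): u_t = tanh(W_w h_t + b_w) for each word, similarity with
    the context vector u_w normalized by softmax into alpha_t, and the alpha_t
    concatenated into the global sentence feature vector S."""
    U = np.tanh(H @ W_w.T + b_w)       # Eq. (5): rows are the u_t
    scores = U @ u_w                   # similarity with context vector u_w
    e = np.exp(scores - scores.max())
    alpha = e / e.sum()                # normalized attention weights
    S = np.concatenate([alpha[t:t + 1] for t in range(len(alpha))])  # Eq. (7)
    return alpha, S

rng = np.random.default_rng(4)
n, d = 6, 8                            # toy word count and hidden size
H = rng.standard_normal((n, d))        # encoder outputs h_1..h_n
W_w = rng.standard_normal((d, d))
b_w = rng.standard_normal(d)
u_w = rng.standard_normal(d)           # randomly initialized context vector
alpha, S = word_word_attention(H, W_w, b_w, u_w)
```

Note that Eq. (7)'s concatenation of the scalar weights simply assembles them into the vector S, whose entries sum to 1.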
The attention association between words and sentences is established through the word vectors and sentence vectors, so that an attention feedback mechanism is built between each word and the whole sentence; this measures the importance of a word to the whole text and strengthens the semantic association characteristics between the text and its sentences. Similarly, the implicit representation u_t of a word is no longer compared for similarity with the randomly initialized context vector u_w, but is instead weighed against the sentence feature vector S to obtain the Word-Sentence Level Attention weight matrix (to keep the matrix operation valid, the u_t vector is expanded by one dimension with unsqueeze):
The attention matrices generated by each word and sentence are spliced and combined to form the joint word-sentence context feature vector β, which carries the semantic association between words and sentences; softmax then performs text classification on these features:
p = softmax(W_c β + b_c)  (10)
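Eq. (10) can be sketched directly; the four-category output and all dimensions below are hypothetical:

```python
import numpy as np

def classify_context(beta, W_c, b_c):
    """Eq. (10): p = softmax(W_c beta + b_c), mapping the joint context
    vector beta to a classification score vector p."""
    logits = W_c @ beta + b_c
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(3)
beta = rng.standard_normal(6)          # toy context vector
W_c = rng.standard_normal((4, 6))      # 4 hypothetical hazard categories
b_c = rng.standard_normal(4)
p = classify_context(beta, W_c, b_c)   # scores over the 4 categories
```

The predicted class is the argmax of p, and the scores form a probability distribution.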
Cross entropy is mainly used to judge how close the actual output is to the expected output. It characterizes the distance between the actual output (probability) and the expected output (probability): the smaller the cross entropy value, the closer the two probability distributions are.
Assuming that the probability distribution p is the expected output and the probability distribution q is the actual output, the cross entropy loss is calculated as follows:

H(p, q) = -(1/N) · Σ_{i=1}^{N} Σ_{j=1}^{M} p(x_ij) · log q(x_ij)  (11)
the method can be well adapted to multi-label classification tasks by adopting a cross entropy loss calculation method, in formula 11, N represents the number of samples of a batch, M represents the number of categories, and p (x)ij) Representing the desired variable output, takes the value 1 if the class is the same as the class of sample i, and 0 otherwise. q (x)ij) Representing the predicted probability that the observed sample i belongs to the class j.
The second embodiment provides a domain-adaptive chemical potential safety hazard short text classification system;
a domain-adapted chemical potential safety hazard short text classification system comprises:
an acquisition module configured to: acquiring a plurality of short texts to be classified in the field of chemical potential safety hazard investigation;
an extraction module configured to: vector extraction is carried out on each short text to be classified to obtain an initial text vector corresponding to each short text to be classified;
a classification module configured to: and inputting the initial text vectors corresponding to all the texts to be classified into the trained short text classification model, and outputting the short text classification result.
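The three modules above can be sketched as a minimal pipeline, with trivial stand-ins (toy strings, a length feature, a threshold) in place of the real acquisition source, BERT extraction and trained model; all names and logic here are hypothetical illustrations of the module boundaries only:

```python
class AcquisitionModule:
    """Stand-in for acquiring short texts from hazard-investigation records."""
    def acquire(self):
        return ["valve left open near storage tank", "warning sign faded"]

class ExtractionModule:
    """Stand-in for BERT-based vector extraction; here a toy length feature."""
    def extract(self, texts):
        return [[float(len(t))] for t in texts]

class ClassificationModule:
    """Stand-in for the trained short text classification model."""
    def classify(self, vectors):
        return ["hazard" if v[0] > 25.0 else "minor" for v in vectors]

def run_pipeline():
    # acquire -> extract -> classify, mirroring the module order above
    texts = AcquisitionModule().acquire()
    vectors = ExtractionModule().extract(texts)
    return ClassificationModule().classify(vectors)

results = run_pipeline()
```

The real system would replace each stub body while keeping the same three-stage interface.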
It should be noted here that the above acquisition module, extraction module and classification module correspond to steps S101 to S103 in the first embodiment; the examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the first embodiment. It should also be noted that the above modules, as part of a system, may be implemented in a computer system such as a set of computer-executable instructions.
In the foregoing embodiments, the descriptions of the embodiments have different emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The proposed system can be implemented in other ways. For example, the above-described system embodiments are merely illustrative, and for example, the division of the above-described modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules may be combined or integrated into another system, or some features may be omitted, or not executed.
Example Three
The present embodiment also provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein, a processor is connected with the memory, the one or more computer programs are stored in the memory, and when the electronic device runs, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first embodiment.
It should be understood that in this embodiment, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software.
The method in the first embodiment may be directly implemented by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in a memory; the processor reads the information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, this is not described in detail here.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
Example Four
The present embodiments also provide a computer-readable storage medium for storing computer instructions, which when executed by a processor, perform the method of the first embodiment.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A domain-adaptive chemical potential safety hazard short text classification method is characterized by comprising the following steps:
acquiring a plurality of short texts to be classified in the field of chemical potential safety hazard investigation;
vector extraction is carried out on each short text to be classified to obtain an initial text vector corresponding to each short text to be classified;
and inputting the initial text vectors corresponding to all the texts to be classified into the trained short text classification model, and outputting the short text classification result.
2. The method as claimed in claim 1, wherein the step of inputting the initial text vectors corresponding to all the texts to be classified into the trained short text classification model and outputting the short text classification result specifically comprises:
the trained short text classification model encodes each initial text vector to obtain text vectors considering the association of the front time sequence and the rear time sequence;
the trained short text classification model gives a weight between words to each text vector considering the front and rear time sequence association, to obtain a text vector after the first weighting;
the trained short text classification model splices the text vectors after the first weighting to obtain sentence embedding vectors;
based on the sentence embedding vector, the trained short text classification model is utilized to give each text vector considering the front and rear time sequence association a weight between words and sentences, obtaining a text vector after the second weighting;
the trained short text classification model splices the text vectors after the second weighting to obtain vectors to be classified;
and the trained short text classification model classifies the vectors to be classified to obtain the classification result of each short text to be classified.
3. The method as claimed in claim 1 or 2, wherein the network structure of the short text classification model comprises:
the word embedding structure layer, the word encoder, the word and word attention mechanism layer, the first splicing unit, the word and sentence attention mechanism layer, the second splicing unit and the classification layer are sequentially connected.
4. The method as claimed in claim 3, wherein the word encoder is structured as follows:

First, the Word Encoder of the original HAN model is assumed to comprise forward coding units connected in sequence from left to right, and backward coding units connected in sequence from right to left;

wherein the first input end of the p-th forward coding unit is used for inputting the output value of the preceding forward coding unit, and its second input end is used for inputting the p-th output value of the word embedding construction layer BERT;

the first output end of the p-th forward coding unit is connected with the input end of the p-th Concat unit; the output end of the p-th Concat unit outputs a text vector considering the front and rear time sequence association; the second output end of the p-th forward coding unit is connected with the input end of the next forward coding unit;

wherein the first input end of the q-th backward coding unit is used for inputting the output value of the succeeding backward coding unit, and its second input end is used for inputting the q-th output value of the word embedding construction layer BERT;

the first output end of the q-th backward coding unit is connected with the input end of the q-th Concat unit; the output end of the q-th Concat unit outputs a text vector considering the front and rear time sequence association; the second output end of the q-th backward coding unit is connected with the input end of the preceding backward coding unit.
5. The method as claimed in claim 2, wherein the trained short text classification model encodes each initial text vector to obtain text vectors considering the front and rear time sequence association; the method specifically comprises the following steps:
inputting the initial text vectors corresponding to all the texts to be classified into a word encoder of a short text classification model after training, and encoding each initial text vector by the word encoder of the short text classification model after training to obtain text vectors considering front and rear time sequence association;
Alternatively,
the trained short text classification model gives a weight between words to each text vector considering the front and rear time sequence association, to obtain a text vector after the first weighting; the method specifically comprises the following steps:
the word attention mechanism layer of the trained short text classification model gives weights between words to each text vector considering the front and rear time sequence association, to obtain the text vector after the first weighting.
6. The method for classifying the short texts of the chemical safety hazards in a domain adaptation manner as claimed in claim 2, wherein the trained short text classification model is used for splicing the text vectors after the first weighting to obtain sentence embedding vectors; the method specifically comprises the following steps:
the trained short text classification model is used for serially splicing the text vectors weighted for the first time to obtain sentence embedding vectors;
Alternatively,
based on the sentence embedding vector, the trained short text classification model is utilized to give each text vector considering the front and rear time sequence association a weight between words and sentences, obtaining a text vector after the second weighting; the method specifically comprises the following steps:
based on the sentence embedding vector, the word-and-sentence attention mechanism layer of the trained short text classification model gives each text vector considering the front and rear time sequence association a weight between words and sentences, obtaining the text vector after the second weighting.
7. The method for classifying the short texts of the chemical safety hazards in a domain adaptation manner as claimed in claim 2, wherein the trained short text classification model is used for splicing the text vectors after the second weighting to obtain the vectors to be classified; the method specifically comprises the following steps:
the trained short text classification model is used for serially splicing the text vectors weighted for the second time to obtain vectors to be classified;
Alternatively,
the trained short text classification model classifies vectors to be classified to obtain a classification result of each short text to be classified; the method specifically comprises the following steps:
and the classification layer of the trained short text classification model classifies the text vectors weighted for the second time to obtain the classification result of each short text to be classified.
8. A domain-adapted chemical potential safety hazard short text classification system, characterized by comprising:
an acquisition module configured to: acquiring a plurality of short texts to be classified in the field of chemical potential safety hazard investigation;
an extraction module configured to: vector extraction is carried out on each short text to be classified to obtain an initial text vector corresponding to each short text to be classified;
a classification module configured to: and inputting the initial text vectors corresponding to all the texts to be classified into the trained short text classification model, and outputting the short text classification result.
9. An electronic device, comprising: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs being stored in the memory, the processor executing the one or more computer programs stored in the memory when the electronic device is running, to cause the electronic device to perform the method of any of the preceding claims 1-7.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110511224.9A CN113139057A (en) | 2021-05-11 | 2021-05-11 | Domain-adaptive chemical potential safety hazard short text classification method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110511224.9A CN113139057A (en) | 2021-05-11 | 2021-05-11 | Domain-adaptive chemical potential safety hazard short text classification method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113139057A true CN113139057A (en) | 2021-07-20 |
Family
ID=76818004
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110511224.9A Pending CN113139057A (en) | 2021-05-11 | 2021-05-11 | Domain-adaptive chemical potential safety hazard short text classification method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113139057A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298326A (en) * | 2021-07-27 | 2021-08-24 | 成都西辰软件有限公司 | Intelligent electronic event supervision method, equipment and storage medium |
CN113688239A (en) * | 2021-08-20 | 2021-11-23 | 平安国际智慧城市科技股份有限公司 | Text classification method and device under few samples, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108829818A (en) * | 2018-06-12 | 2018-11-16 | 中国科学院计算技术研究所 | A kind of file classification method |
CN110489545A (en) * | 2019-07-09 | 2019-11-22 | 平安科技(深圳)有限公司 | File classification method and device, storage medium, computer equipment |
CN111225277A (en) * | 2018-11-27 | 2020-06-02 | 北京达佳互联信息技术有限公司 | Transcoding method, transcoding device and computer readable storage medium |
CN112417098A (en) * | 2020-11-20 | 2021-02-26 | 南京邮电大学 | Short text emotion classification method based on CNN-BiMGU model |
2021-05-11: CN202110511224.9A patent/CN113139057A/en, active, Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108829818A (en) * | 2018-06-12 | 2018-11-16 | 中国科学院计算技术研究所 | A kind of file classification method |
CN111225277A (en) * | 2018-11-27 | 2020-06-02 | 北京达佳互联信息技术有限公司 | Transcoding method, transcoding device and computer readable storage medium |
CN110489545A (en) * | 2019-07-09 | 2019-11-22 | 平安科技(深圳)有限公司 | File classification method and device, storage medium, computer equipment |
CN112417098A (en) * | 2020-11-20 | 2021-02-26 | 南京邮电大学 | Short text emotion classification method based on CNN-BiMGU model |
Non-Patent Citations (1)
Title |
---|
GE YAN ET AL.: "基于 BLSTM-Attention 神经网络模型的..." (Based on the BLSTM-Attention Neural Network Model..., title truncated), 《计算机系统应用》 (Computer Systems & Applications) *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298326A (en) * | 2021-07-27 | 2021-08-24 | 成都西辰软件有限公司 | Intelligent electronic event supervision method, equipment and storage medium |
CN113298326B (en) * | 2021-07-27 | 2021-10-26 | 成都西辰软件有限公司 | Intelligent electronic event supervision method, equipment and storage medium |
CN113688239A (en) * | 2021-08-20 | 2021-11-23 | 平安国际智慧城市科技股份有限公司 | Text classification method and device under few samples, electronic equipment and storage medium |
CN113688239B (en) * | 2021-08-20 | 2024-04-16 | 平安国际智慧城市科技股份有限公司 | Text classification method and device under small sample, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109902293B (en) | Text classification method based on local and global mutual attention mechanism | |
CN109284506B (en) | User comment emotion analysis system and method based on attention convolution neural network | |
CN110334354B (en) | Chinese relation extraction method | |
CN112528672B (en) | Aspect-level emotion analysis method and device based on graph convolution neural network | |
CN107506414B (en) | Code recommendation method based on long-term and short-term memory network | |
Xiao et al. | Research progress of RNN language model | |
CN109214006B (en) | Natural language reasoning method for image enhanced hierarchical semantic representation | |
CN116415654A (en) | Data processing method and related equipment | |
Mukherjee et al. | Utilization of oversampling for multiclass sentiment analysis on amazon review dataset | |
CN113591483A (en) | Document-level event argument extraction method based on sequence labeling | |
CN113139057A (en) | Domain-adaptive chemical potential safety hazard short text classification method and system | |
WO2023004528A1 (en) | Distributed system-based parallel named entity recognition method and apparatus | |
CN113987187A (en) | Multi-label embedding-based public opinion text classification method, system, terminal and medium | |
CN111145914B (en) | Method and device for determining text entity of lung cancer clinical disease seed bank | |
Liang et al. | A double channel CNN-LSTM model for text classification | |
Yan et al. | Implicit emotional tendency recognition based on disconnected recurrent neural networks | |
CN108875024B (en) | Text classification method and system, readable storage medium and electronic equipment | |
CN112560440B (en) | Syntax dependency method for aspect-level emotion analysis based on deep learning | |
CN113806543A (en) | Residual jump connection-based text classification method for gated cyclic unit | |
CN113377844A (en) | Dialogue type data fuzzy retrieval method and device facing large relational database | |
CN116644760A (en) | Dialogue text emotion analysis method based on Bert model and double-channel model | |
WO2023159759A1 (en) | Model training method and apparatus, emotion message generation method and apparatus, device and medium | |
CN115964497A (en) | Event extraction method integrating attention mechanism and convolutional neural network | |
CN113779244B (en) | Document emotion classification method and device, storage medium and electronic equipment | |
Gupta et al. | Detailed study of deep learning models for natural language processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |