CN112131887A - Semantic element-based low-resource text recognition algorithm - Google Patents

Semantic element-based low-resource text recognition algorithm

Info

Publication number
CN112131887A
Authority
CN
China
Prior art keywords
sentence
semantic element
representation
semantic
processing
Prior art date
Legal status
Granted
Application number
CN202011001618.1A
Other languages
Chinese (zh)
Other versions
CN112131887B (en)
Inventor
付勇
井友鼎
杜创胜
王旭峰
甘志芳
王顺智
Current Assignee
Henan Hezhongwei Qiyunzhi Technology Co ltd
Original Assignee
Henan Hezhongwei Qiyunzhi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Henan Hezhongwei Qiyunzhi Technology Co ltd filed Critical Henan Hezhongwei Qiyunzhi Technology Co ltd
Priority to CN202011001618.1A priority Critical patent/CN112131887B/en
Publication of CN112131887A publication Critical patent/CN112131887A/en
Application granted granted Critical
Publication of CN112131887B publication Critical patent/CN112131887B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/205 - Parsing
    • G06F 40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a low-resource text recognition algorithm based on semantic elements, belonging to the technical field of natural language understanding. The method comprises the following steps: acquiring a text sentence and encoding it to obtain an encoded sentence tensor representation; performing semantic element recognition on the sentence tensor representation to obtain a semantic element recognition result; scaling the sentence tensor representation using the semantic element recognition result; processing the scaled sentence tensor representation by mean pooling to obtain a semantic element vector representation; processing the sentence tensor representation by mean pooling to obtain a sentence vector representation; concatenating the sentence vector representation with the semantic element vector representation to obtain the final sentence representation; and processing the final sentence representation to obtain the final text type probability. By introducing a semantic element recognition task, the invention gives the model the ability to recognize different semantic elements, greatly reducing the learning difficulty of the instruction text classification task.

Description

Semantic element-based low-resource text recognition algorithm
Technical Field
The invention belongs to the technical field of natural language understanding, and particularly relates to a low-resource text recognition algorithm based on semantic elements.
Background
In recent years, deep learning models have achieved remarkable results on many natural language understanding tasks; however, deep-learning-based methods often require a large amount of labeled data. Moreover, natural language understanding applications are strongly scene-specific: public corpus resources cannot be used directly for development and application, and the cost of annotating corpora across the different scenarios found in many fields is high.
Voice command control is a human-machine interaction mode that uses a voice control system. A common implementation converts voice information into text using speech recognition technology, and then recognizes the command intention of the text using text classification technology. The range of language expressions in such a scenario is relatively small, but the accuracy requirements on the classification model are high. At present, most technicians in the field use neural network language models such as BERT for this processing; such models can ensure convergence and generalization with little data, but it is difficult for them to meet the accuracy requirements, and they are too large to be deployed offline on a mobile terminal.
Disclosure of Invention
The invention aims to solve the technical problem of the defects of the prior art by providing a low-resource text recognition algorithm based on semantic elements, which increases the generalization capability of the model by utilizing semantic information while ensuring a certain level of accuracy.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows: a semantic element based low-resource text recognition algorithm, the method comprising:
s1, acquiring a text sentence, and coding the text sentence to obtain a tensor expression of the coded sentence;
s2, performing semantic element recognition processing on the sentence tensor expression obtained in the step S1 to obtain a semantic element recognition result;
s3, scaling the sentence tensor representation obtained in step S1 by the semantic element recognition result obtained in step S2;
s4, processing the scaled sentence tensor representation in the step S3 by using a mean value pooling method to obtain semantic element vector representation;
s5, processing the sentence tensor expression obtained in the step S1 by using a mean pooling method to obtain sentence vector expression;
s6, carrying out splicing processing on the sentence vector representation obtained in the step S5 and the semantic element vector representation obtained in the step S4 to obtain final sentence representation;
and S7, processing the final sentence representation obtained in the step S6 to obtain the final text type probability.
As a further scheme of the invention: in step S1, the text sentence is encoded using LSTM or Transformer.
As a further scheme of the invention: the semantic element identification processing method in the step S2 includes a sigmoid function.
As a further scheme of the invention: the scaling processing method in step S3 is element level multiplication.
As a further scheme of the invention: the processing method for final expression of the sentence in the step S7 includes a softmax function or a sigmoid function.
Compared with the prior art, the invention has the following beneficial effects. The sentence representation method based on semantic elements greatly reduces the learning difficulty of the instruction text classification task: compared with a word-based sentence representation, semantic elements have coarser granularity and fewer combinations, and because the semantic element recognition task is isolated from the text classification task, the model reasons about and classifies the text mainly from the recognition of semantic elements. Compared with text classification based on BERT and other neural network language models, the network structure designed by the invention is more interpretable: developers can analyze the inference logic of the text classification task through the semantic element recognition results, and further improve the model's text classification effect by optimizing the semantic element recognition accuracy. Finally, because classification is based on the binary results of semantic element recognition rather than on an implicit sentence representation, semantic relationships that do not co-occur in the training data can still be modeled.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings.
FIG. 1: flow diagram of the invention.
FIG. 2: flow chart of the invention.
Detailed Description
For a better understanding of the invention, the following description is given in conjunction with the examples and the accompanying drawings, but the invention is not limited to the examples. In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without one or more of these specific details.
As shown in fig. 1 and fig. 2, the implementation process of the present invention includes the following steps:
1. Step S1: a text sentence is obtained and encoded to obtain the encoded sentence tensor representation E.
Step S1 first encodes the text sentence using an LSTM or a Transformer. LSTM and Transformer are popular sequence processing methods; compared with a fully-connected network they require fewer parameters and thus save computing resources, and both are conventional techniques in the field.
The correlation formula is as follows:
E = Encoder(V)    (1)
E = reshape(E, (N, K, H))    (2)
In formula (1), V represents the sequence of indexes of single characters in the text sentence, with shape N × 1, where N is the length of the text sentence; E represents the tensor representation of the encoded sentence. Formula (2) reshapes the tensor representation E using the reshape function from the Python NumPy scientific computing library, where K represents the number of semantic element types and H represents the size of the model's hidden layer. The purpose of the reshaping is to convert semantic element recognition into a binary classification task. In the reshaped sentence tensor representation E, the first dimension is the sequence order, the second dimension is the semantic element category, and the third dimension is the numeric dimension of the tensor.
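A minimal NumPy sketch of this step is given below. The shapes N = 6, K = 4, H = 8 are illustrative values, and the encoder output is mocked with random numbers, since the patent specifies only the encoder family (LSTM or Transformer):

```python
import numpy as np

# Illustrative shapes (hypothetical values, not fixed by the patent):
N, K, H = 6, 4, 8   # sentence length, semantic element types, hidden size

# V: index sequence of single characters, shape N x 1 (input of formula (1)).
V = np.random.randint(0, 100, size=(N, 1))

# A real implementation would run V through an LSTM or Transformer encoder;
# here the encoder output is mocked with random values of width K*H.
E = np.random.randn(N, K * H).astype(np.float32)

# Formula (2): reshape so dim 0 is the sequence order, dim 1 the semantic
# element category, and dim 2 the numeric (hidden) dimension.
E = E.reshape(N, K, H)
print(E.shape)   # (6, 4, 8)
```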
2. In step S2, semantic element recognition processing is performed on the sentence tensor expression E obtained in step S1 to obtain a semantic element recognition result B.
Step S2 introduces a subtask module for semantic element recognition so that the model can capture the different semantic elements in a sentence. Semantic element recognition adopts an entity recognition method.
The correlation formula is as follows:
B = σ(f(E))    (3)
In formula (3), f is a linear transformation function, σ represents the sigmoid activation function, and B represents the semantic element recognition result. B is a tensor of shape N × K × 1, and the semantic element data labels take the form 0 or 1, where 1 indicates that the current character belongs to the semantic element and 0 indicates that it does not. Slicing B along its second dimension, the semantic element type dimension, yields the recognition results for the different semantic elements.
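The following sketch illustrates formula (3) under the same illustrative shapes; the linear map f is realized with hypothetical trainable parameters W and b, mocked here:

```python
import numpy as np

N, K, H = 6, 4, 8
E = np.random.randn(N, K, H).astype(np.float32)   # from step S1

# Formula (3): B = sigmoid(f(E)); f is a linear map from H to 1 whose
# weights are hypothetical trainable parameters, mocked with random values.
W = np.random.randn(H, 1).astype(np.float32)
b = np.zeros(1, dtype=np.float32)
B = 1.0 / (1.0 + np.exp(-(E @ W + b)))   # shape N x K x 1, values in (0, 1)

# Slicing the semantic element type dimension gives per-element results.
scores_element_0 = B[:, 0, 0]   # recognition scores for element type 0
```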
3. In step S3, the sentence tensor representation E obtained in step S1 is scaled by the semantic element recognition result B obtained in step S2, so that a scaled tensor E' is obtained.
The semantic element recognition result B is a probabilistic recognition result. Scaling the sentence tensor representation E with B shrinks the representations of non-semantic-element words toward smaller numerical values, so that the data distribution on each semantic element type slice becomes more relevant to the vector representation of the current semantic element.
The correlation formula is as follows:
E' = B ⊙ E    (4)
Formula (4) performs the scaling by element-level multiplication. The last dimension of the semantic element recognition result B (size 1) is smaller than that of the sentence tensor representation E (size H); the missing part is supplied by array broadcasting, as in the Python NumPy scientific computing library.
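A short sketch of the broadcast multiplication in formula (4), with mocked inputs:

```python
import numpy as np

N, K, H = 6, 4, 8
E = np.random.randn(N, K, H).astype(np.float32)   # sentence tensor, step S1
B = np.random.rand(N, K, 1).astype(np.float32)    # recognition result, step S2

# Formula (4): element-level multiplication. NumPy broadcasting stretches
# B's trailing dimension of size 1 across E's hidden dimension of size H,
# shrinking the representations of non-semantic-element characters.
E_scaled = B * E   # shape N x K x H
```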
4. Step S4: the sentence tensor representation E' scaled in step S3 is processed by mean pooling to obtain the semantic element vector representation M.
Pooling is an important concept in convolutional neural networks: it performs dimensionality-reduction (down-sampling) operations, reduces information redundancy, promotes invariance of the model to scale and rotation, and helps prevent overfitting. Mean pooling reduces the variance of the estimates and improves the robustness of the model; it is a routine technique in the art.
The scaled sentence tensor representation E' is transformed by mean pooling in order to reduce the model size and the computational complexity of the subsequent linear transformation.
The correlation formula is as follows:
E' = transpose(E', (1, 0, 2))    (5)
M = mean(E', axis=1)    (6)
M = reshape(M, (1, K·H))    (7)
equation (5) transposes the sequence order of the tensor representation E' and the semantic element dimension. And (6) performing mean value reduction on the sequence order dimension to obtain the semantic element vector expression M. And (4) reshaping the semantic element vector expression M by using a formula (7), and integrating the semantic element category dimension into the numerical dimension.
5. Step S5: the sentence tensor representation E obtained in step S1 is processed by mean pooling to obtain the sentence vector representation S.
The purpose of step S5 is to encode the non-semantic-element words that may appear in a sentence, such as particles and function words ("of", "ah", etc.). The appearance of these words at different positions may change the classification of the text; for example, two instruction sentences about a handset failure that differ only in such words may describe two different failures.
The correlation formula is as follows:
S = MeanPool(E)    (8)
Formula (8) uses mean pooling to obtain the vector representation S of the sentence. S is then reshaped to shape 1 × H so that it can be concatenated with the semantic element vector representation M.
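A sketch of formula (8) follows; note that the pooling axes are an assumption, chosen as one reading that reproduces the 1 × H shape stated above:

```python
import numpy as np

N, K, H = 6, 4, 8
E = np.random.randn(N, K, H).astype(np.float32)   # sentence tensor, step S1

# Formula (8): mean pooling to a single sentence vector. Averaging over
# both the sequence and semantic element dimensions (an assumption) yields
# the 1 x H shape stated in the text.
S = E.mean(axis=(0, 1)).reshape(1, H)   # shape 1 x H
```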
6. Step S6: the sentence vector representation S obtained in step S5 and the semantic element vector representation M obtained in step S4 are concatenated to obtain the final sentence representation M'.
The correlation formula is as follows:
M' = concat(S, M)    (9)
Formula (9) concatenates the semantic element vector representation M and the word-based sentence vector representation S using the concat function, obtaining the final sentence representation M'.
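A sketch of the concatenation in formula (9):

```python
import numpy as np

K, H = 4, 8
S = np.random.randn(1, H).astype(np.float32)       # sentence vector, step S5
M = np.random.randn(1, K * H).astype(np.float32)   # element vector, step S4

# Formula (9): concatenate along the feature axis to form the final
# sentence representation M' of shape 1 x (H + K*H).
M_final = np.concatenate([S, M], axis=1)
```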
7. Step S7, the final representation of the sentence obtained in step S6 is processed to obtain the final text type probability.
Step S7 calculates the probability of the text type using a linear transformation plus an activation function.
The correlation formula is as follows:
P = σ'(f'(M'))    (10)
In formula (10), f' represents a linear transformation function and P the resulting text type probability. Depending on whether the text types are mutually exclusive, σ' is chosen as softmax or sigmoid to perform the probability conversion.
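A sketch of formula (10) with a hypothetical number of text types C and mocked parameters; the softmax branch is shown:

```python
import numpy as np

K, H, C = 4, 8, 5   # C: number of text types (hypothetical)
M_final = np.random.randn(1, H + K * H).astype(np.float32)   # from step S6

# Formula (10): linear transformation f' followed by an activation.
# W and b are hypothetical trainable parameters, mocked with random values.
W = np.random.randn(H + K * H, C).astype(np.float32)
b = np.zeros(C, dtype=np.float32)
logits = M_final @ W + b

# softmax when the text types are mutually exclusive; sigmoid otherwise.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
```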
The deep learning text classification methods commonly used in the art employ LSTM, Transformer, etc. to convert the text into a sentence vector, and then apply a linear transformation to obtain a probability estimate of the text type. Without considering grammar and semantic rules, sentences composed at word granularity admit D^N different combinations, where D is the vocabulary size and N the sentence length; for example, a vocabulary of 10,000 words and a sentence length of 10 already allow 10^40 combinations. Sentence classification at word granularity therefore requires more training corpora, containing as many ways of expressing each intention as possible. A human, by contrast, after learning "turn on" from the sentence "turn on bluetooth", naturally infers that "turn on the flashlight" expresses the intention of turning on the flashlight, even without having seen the new expression.
Instructional language generally has an obvious predicate and description object; for example, in "turn on a flashlight" the predicate "turn on" acts on the described object "flashlight", while a relatively complicated sentence such as "I want to execute a to-do work order" has, in addition to the predicate "execute" and the described object "work order", the modifiers "I want" and "to-do". The present invention refers to these predicates, description objects, modifiers, and the like as semantic elements; an instruction is typically a diverse combination of such semantic elements.
These linguistic expressions are complicated by two factors: combination diversity and element expression diversity. However, the diversity of semantic element combinations is far lower than that of word combinations. The invention therefore reduces the difficulty of low-resource text classification by converting the word-based sentence representation into a semantic-element-based representation.
The semantic element recognition task introduced by the invention gives the model the ability to recognize different semantic elements. Because instructional language has few corpus resources and small amounts of training data across scenarios, the entity labeling of the data does not cost much time or money, and model training and inference do not consume large computing resources. The semantic elements provide a modular sentence representation for sentence text recognition, so that the model can make correct inferences when it encounters unseen sentences; the isolation between the two tasks of semantic element recognition and text classification is preserved, so the model focuses more on the differences between semantic elements than on the differences between expressions of the same meaning.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single technical solution; this manner of description is merely for clarity. Those skilled in the art should take the specification as a whole, and the technical solutions of the embodiments may be suitably combined to form other embodiments understandable to those skilled in the art.

Claims (5)

1. A low-resource text recognition algorithm based on semantic elements is characterized in that: the algorithm comprises the following steps:
s1, acquiring a text sentence, and coding the text sentence to obtain a tensor expression of the coded sentence;
s2, performing semantic element recognition processing on the sentence tensor expression obtained in the step S1 to obtain a semantic element recognition result;
s3, scaling the sentence tensor representation obtained in step S1 by the semantic element recognition result obtained in step S2;
s4, processing the scaled sentence tensor representation in the step S3 by using a mean value pooling method to obtain semantic element vector representation;
s5, processing the sentence tensor expression obtained in the step S1 by using a mean pooling method to obtain sentence vector expression;
s6, carrying out splicing processing on the sentence vector representation obtained in the step S5 and the semantic element vector representation obtained in the step S4 to obtain final sentence representation;
and S7, processing the final sentence representation obtained in the step S6 to obtain the final text type probability.
2. The semantic element-based low-resource text recognition algorithm of claim 1, wherein: in step S1, the text sentence is encoded using LSTM or Transformer.
3. The semantic element-based low-resource text recognition algorithm of claim 1, wherein: the semantic element identification processing method in the step S2 includes a sigmoid function.
4. The semantic element-based low-resource text recognition algorithm of claim 1, wherein: the scaling processing method in step S3 is element level multiplication.
5. The semantic element-based low-resource text recognition algorithm of claim 1, wherein: the processing method for final expression of the sentence in the step S7 includes a softmax function or a sigmoid function.
CN202011001618.1A 2020-09-22 2020-09-22 Low-resource text recognition algorithm based on semantic elements Active CN112131887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011001618.1A CN112131887B (en) 2020-09-22 2020-09-22 Low-resource text recognition algorithm based on semantic elements

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011001618.1A CN112131887B (en) 2020-09-22 2020-09-22 Low-resource text recognition algorithm based on semantic elements

Publications (2)

Publication Number Publication Date
CN112131887A true CN112131887A (en) 2020-12-25
CN112131887B CN112131887B (en) 2024-03-08

Family

ID=73842144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011001618.1A Active CN112131887B (en) 2020-09-22 2020-09-22 Low-resource text recognition algorithm based on semantic elements

Country Status (1)

Country Link
CN (1) CN112131887B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110196978A (en) * 2019-06-04 2019-09-03 重庆大学 A kind of entity relation extraction method for paying close attention to conjunctive word
CN111177383A (en) * 2019-12-24 2020-05-19 上海大学 Text entity relation automatic classification method fusing text syntactic structure and semantic information
CN111666766A (en) * 2019-03-05 2020-09-15 阿里巴巴集团控股有限公司 Data processing method, device and equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111666766A (en) * 2019-03-05 2020-09-15 阿里巴巴集团控股有限公司 Data processing method, device and equipment
CN110196978A (en) * 2019-06-04 2019-09-03 重庆大学 A kind of entity relation extraction method for paying close attention to conjunctive word
CN111177383A (en) * 2019-12-24 2020-05-19 上海大学 Text entity relation automatic classification method fusing text syntactic structure and semantic information

Also Published As

Publication number Publication date
CN112131887B (en) 2024-03-08

Similar Documents

Publication Publication Date Title
CN112989834B (en) Named entity identification method and system based on flat grid enhanced linear converter
CN112417877B (en) Text inclusion relation recognition method based on improved BERT
CN110609891A (en) Visual dialog generation method based on context awareness graph neural network
Deng An overview of deep-structured learning for information processing
CN110928997A (en) Intention recognition method and device, electronic equipment and readable storage medium
CN111522839A (en) Natural language query method based on deep learning
CN115964467A (en) Visual situation fused rich semantic dialogue generation method
CN113128232B (en) Named entity identification method based on ALBERT and multiple word information embedding
CN112784604A (en) Entity linking method based on entity boundary network
CN115496072A (en) Relation extraction method based on comparison learning
CN115358289A (en) Text generation algorithm fusing multi-type knowledge base and inference technology
CN113177113B (en) Task type dialogue model pre-training method, device, equipment and storage medium
CN113705315A (en) Video processing method, device, equipment and storage medium
CN112307179A (en) Text matching method, device, equipment and storage medium
CN114239575B (en) Statement analysis model construction method, statement analysis method, device, medium and computing equipment
CN112131887B (en) Low-resource text recognition algorithm based on semantic elements
CN115221315A (en) Text processing method and device, and sentence vector model training method and device
CN113010676B (en) Text knowledge extraction method, device and natural language inference system
CN113792120B (en) Graph network construction method and device, reading and understanding method and device
CN114298022A (en) Subgraph matching method for large-scale complex semantic network
CN114637852A (en) Method, device and equipment for extracting entity relationship of medical text and storage medium
CN111292741A (en) Intelligent voice interaction robot
Yin et al. Speech Recognition for Power Customer Service Based on DNN and CNN Models
CN114626362B (en) Controllable open type combination rule knowledge generation method and system
CN113536798B (en) Multi-instance document key information extraction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant