CN112131887B - Low-resource text recognition algorithm based on semantic elements - Google Patents

Low-resource text recognition algorithm based on semantic elements

Info

Publication number
CN112131887B
CN112131887B (application CN202011001618.1A)
Authority
CN
China
Prior art keywords
sentence
representation
semantic
semantic element
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011001618.1A
Other languages
Chinese (zh)
Other versions
CN112131887A (en)
Inventor
付勇
井友鼎
杜创胜
王旭峰
甘志芳
王顺智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Hezhongwei Qiyunzhi Technology Co ltd
Original Assignee
Henan Hezhongwei Qiyunzhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan Hezhongwei Qiyunzhi Technology Co ltd filed Critical Henan Hezhongwei Qiyunzhi Technology Co ltd
Priority to CN202011001618.1A priority Critical patent/CN112131887B/en
Publication of CN112131887A publication Critical patent/CN112131887A/en
Application granted granted Critical
Publication of CN112131887B publication Critical patent/CN112131887B/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The invention provides a low-resource text recognition algorithm based on semantic elements, and belongs to the technical field of natural language understanding. The method comprises the following steps: acquiring a text sentence, and performing coding processing on the text sentence to obtain a coded sentence tensor representation; carrying out semantic element recognition processing on the sentence tensor representation to obtain a semantic element recognition result; scaling the sentence tensor representation by using the semantic element recognition result; processing the scaled sentence tensor representation by using a mean pooling method to obtain a semantic element vector representation; processing the sentence tensor representation by using a mean value pooling method to obtain sentence vector representation; splicing the sentence vector representation and the semantic element vector representation to obtain a final sentence representation; and processing the final sentence representation to obtain the final text type probability. According to the invention, a semantic element recognition task is introduced, so that the model has the capability of recognizing different semantic elements, and the learning difficulty of the instruction text classification task is greatly reduced.

Description

Low-resource text recognition algorithm based on semantic elements
Technical Field
The invention belongs to the technical field of natural language understanding, and particularly relates to a low-resource text recognition algorithm based on semantic elements.
Background
In recent years, deep learning models have achieved significant results in many natural language understanding tasks; however, deep learning-based methods often require relatively large amounts of labeled data. Moreover, natural language understanding applications are strongly scenario-specific: they cannot be built directly on public corpus resources, and the cost of corpus annotation is high across the different scenarios of many fields.
Voice command control is a human-machine interaction method based on a voice control system. A common implementation first uses speech recognition to convert the voice signal into text, and then recognizes the command intention of the text through text classification. In this scenario the range of language expressions is small, but the accuracy requirement on the classification model is high. At present, those skilled in the art mostly use neural network language models such as BERT, which can converge and generalize with little data; however, they barely meet the accuracy requirements, and at the same time the models are too large to be deployed offline on mobile terminals.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a low-resource text recognition algorithm based on semantic elements, which uses semantic information to increase the generalization capability of the model while still ensuring a certain level of accuracy.
In order to solve the technical problems, the invention adopts the following technical scheme: a semantic-element-based low-resource text recognition algorithm, the algorithm comprising:
s1, acquiring a text sentence, and performing coding processing on the text sentence to obtain a coded sentence tensor representation;
s2, carrying out semantic element recognition processing on the sentence tensor representation obtained in the step S1 to obtain a semantic element recognition result;
s3, scaling the sentence tensor representation obtained in the step S1 by using the semantic element recognition result obtained in the step S2;
s4, processing the scaled sentence tensor representation in the step S3 by using a mean value pooling method to obtain semantic element vector representation;
s5, processing the sentence tensor representation obtained in the step S1 by using a mean value pooling method to obtain sentence vector representation;
s6, performing splicing processing on the sentence vector representation obtained in the step S5 and the semantic element vector representation obtained in the step S4 to obtain a final sentence representation;
s7, processing the final sentence representation obtained in the step S6 to obtain the final text type probability.
As a further scheme of the invention: in the step S1, an LSTM or a Transformer is used to encode the text sentence.
As a further scheme of the invention: the semantic element identification processing method in the step S2 comprises a sigmoid function.
As a further scheme of the invention: the scaling processing method in the step S3 is element-level multiplication.
As a further scheme of the invention: the processing method for final sentence representation in step S7 includes a softmax function or a sigmoid function.
Compared with the prior art, the invention has the following beneficial effects. The semantic-element-based sentence representation greatly reduces the learning difficulty of the instruction text classification task: compared with a word-based sentence representation, semantic elements are coarser in granularity and have far fewer combinations, and because the semantic element recognition task is isolated from the text classification task, the model reasons about the text class mainly from how the semantic elements are recognized. Compared with text classification based on neural network language models such as BERT, the network structure designed by the invention is more interpretable: a developer can analyze the reasoning logic of the text classification task through the semantic element recognition results, and can therefore improve the model's text classification by optimizing the semantic element recognition accuracy. In addition, because classification is based on the implicit sentence representation rather than on the binary results of semantic element recognition, non-co-occurrence semantic relationships can also be modeled.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings.
Fig. 1: schematic flow chart of the invention.
Fig. 2: flow chart of the invention.
Detailed Description
For a better understanding of the present invention, the content of the present invention will be further clarified below with reference to the examples and the accompanying drawings, but the scope of the present invention is not limited to the following examples only. In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the invention may be practiced without one or more of these details.
As shown in fig. 1 and 2, the implementation process of the present invention includes the following steps:
1. Step S1: a text sentence is obtained and encoded, giving the encoded sentence tensor representation E.
Step S1 first encodes the text sentence using an LSTM or a Transformer. Both are currently popular sequence information processing methods and conventional technology in the field, and they are economical in computing resources relative to their total number of network parameters.
The correlation formulas are as follows:

E₀ = Encoder(V) (1)
E = reshape(E₀, (N, K, H)) (2)

In formula (1), V represents the sequence of indexes of the individual characters in the text sentence, with shape N×1, where N is the length of the text sentence, and E₀ represents the encoded output of the sentence. Formula (2) reshapes this output using the reshape function from the Python NumPy scientific computing library, where K represents the number of semantic element types and H represents the model hidden layer size. The purpose of the reshaping is to turn semantic element recognition into a binary classification task. The first dimension of the tensor is the sequence dimension, the second is the semantic element type dimension, and the third is the numerical dimension. The result of the reshaping is the sentence tensor representation E of shape N×K×H.
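As an illustration, the following is a minimal PyTorch sketch of step S1 under stated assumptions: an LSTM encoder is chosen, and the sentence length N, element type count K, hidden size H, vocabulary size and embedding size are hypothetical values, since the patent fixes none of them.

```python
# Minimal sketch of step S1 (assumptions: LSTM encoder; N, K, H, vocabulary and
# embedding sizes are hypothetical, not taken from the patent).
import torch
import torch.nn as nn

N, K, H = 12, 4, 64            # sentence length, semantic element types, hidden size
vocab_size, emb_size = 5000, 128

embed = nn.Embedding(vocab_size, emb_size)
encoder = nn.LSTM(input_size=emb_size, hidden_size=K * H, batch_first=True)

V = torch.randint(0, vocab_size, (1, N))   # formula (1): character index sequence V
E0, _ = encoder(embed(V))                  # encoder output, shape 1 x N x (K*H)
E = E0.reshape(N, K, H)                    # formula (2): sentence tensor representation E
```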
2. Step S2: carrying out semantic element recognition processing on the sentence tensor representation E obtained in step S1 to obtain the semantic element recognition result B.
Step S2 introduces a subtask module for semantic element recognition so that the model can capture the different semantic elements in the sentence. The semantic element recognition adopts an entity recognition method.
The correlation formula is as follows:

B = σ(f(E)) (3)

In formula (3), f is a linear transformation function, σ represents the sigmoid activation function, and B represents the semantic element recognition result, a tensor of shape N×K×1. The semantic element data are labeled in the form of 0 or 1, where 1 indicates that the current character belongs to the semantic element and 0 indicates that it does not. Slicing the second dimension of B, the semantic element type dimension, yields the recognition results for the different semantic elements.
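Continuing the same sketch (E, K, H as above), step S2 reduces to a per-character, per-element-type binary classifier; the linear layer f below is an assumed stand-in for the linear transformation in formula (3):

```python
# Sketch of step S2, continuing the step S1 variables.
f = nn.Linear(H, 1)                # linear transformation f of formula (3)
B = torch.sigmoid(f(E))            # formula (3): recognition result B, shape N x K x 1
elem0 = B[:, 0, 0]                 # slice along the element-type dimension:
                                   # per-character probabilities for element type 0
```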
3. Step S3: scaling the sentence tensor representation E obtained in step S1 using the semantic element recognition result B obtained in step S2, giving the scaled tensor E'.
Because the semantic element recognition result B is probabilistic, scaling the sentence tensor representation E with B shrinks the characters that are not semantic elements into a smaller numerical representation, so that the data distribution on each slice along the semantic element type dimension is more relevant to the vector representation of the current semantic element.
The correlation formula is as follows:

E' = B · E (4)

Formula (4) performs the scaling by element-level multiplication. The last dimension of the semantic element recognition result B is smaller than that of the sentence tensor representation E; the missing part can be filled in by array broadcasting, as in the Python NumPy scientific computing library.
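In the same sketch, step S3 is a single element-level multiplication; PyTorch broadcasting fills in the missing last dimension of B, mirroring the NumPy-style broadcasting mentioned above:

```python
# Sketch of step S3, continuing the variables above.
E_scaled = B * E   # formula (4): B (N x K x 1) broadcast against E (N x K x H)
```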
4. Step S4: processing the scaled sentence tensor representation E' from step S3 using the mean pooling method to obtain the semantic element vector representation M.
Pooling is an important concept in convolutional neural networks: it performs dimensionality reduction (downsampling), reduces information redundancy, improves the scale and rotation invariance of the model, and prevents overfitting. Mean pooling is one such pooling function; it reduces the deviation of the estimated mean and improves the robustness of the model. Mean pooling is a routine technique in the art.
The scaled sentence tensor representation E' is transformed in a mean-pooling manner in order to reduce the model size and reduce the computational complexity of the subsequent linear transformation.
The correlation formulas are as follows:

E'' = transpose(E', (1, 0, 2)) (5)
M = mean(E'', axis=1) (6)
M = reshape(M, (1, K·H)) (7)

Formula (5) transposes the sequence-order and semantic-element dimensions of the tensor representation E'. Formula (6) reduces the sequence-order dimension by taking the mean, giving the semantic element vector representation M. Formula (7) reshapes M, merging the semantic element type dimension into the numerical dimension.
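Formulas (5)-(7) translate directly into a transpose, a mean reduction, and a reshape; continuing the sketch:

```python
# Sketch of step S4, continuing the variables above.
M = E_scaled.transpose(0, 1)   # formula (5): shape K x N x H
M = M.mean(dim=1)              # formula (6): mean over the sequence axis, shape K x H
M = M.reshape(1, K * H)        # formula (7): semantic element vector representation M
```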
5. Step S5: processing the sentence tensor representation E obtained in step S1 using the mean pooling method to obtain the sentence vector representation S.
The purpose of step S5 is to encode the words in the sentence that are not semantic elements, such as modal particles (rendered "bar" and "o" in this translation). The presence of these words at different positions may change the classification result of the text; for example, two fault descriptions that differ only in such particles may describe two different faults of the handset.
The correlation formula is as follows:

S = mean(E) (8)

Formula (8) uses mean pooling to obtain the vector representation S of the sentence. S is then reshaped into a 1×H shape so that it can be concatenated with the semantic element vector representation M.
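The patent does not spell out which axes formula (8) reduces before the reshape; the sketch below assumes the mean is taken over both the sequence and element-type axes of E so that S ends up with the stated 1×H shape:

```python
# Sketch of step S5, continuing the variables above (the axis choice is an assumption).
S = E.mean(dim=(0, 1)).reshape(1, H)   # formula (8): sentence vector representation S, 1 x H
```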
6. Step S6: splicing the sentence vector representation S obtained in step S5 with the semantic element vector representation M obtained in step S4 to obtain the final sentence representation M'.
The correlation formula is as follows:

M' = concat(S, M) (9)

Formula (9) uses a concat function to splice the word-based sentence vector representation S with the semantic element vector representation M, obtaining the final sentence representation M'.
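Step S6 is then one concatenation along the numerical dimension; continuing the sketch:

```python
# Sketch of step S6, continuing the variables above.
M_final = torch.cat([S, M], dim=1)   # formula (9): final representation M', shape 1 x (H + K*H)
```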
7. Step S7: processing the final sentence representation obtained in step S6 to obtain the final text type probability.
Step S7 calculates the probability of the text type using a linear transformation plus an activation function.
The correlation formula is as follows:

P = σ'(f'(M')) (10)

In formula (10), f' represents a linear transformation function. Depending on whether the text types are mutually exclusive, σ' may be chosen as softmax or sigmoid for the probability conversion.
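Finally, a sketch of step S7; the number of text types is a hypothetical value, and softmax is chosen here on the assumption that the types are mutually exclusive (sigmoid would be used otherwise):

```python
# Sketch of step S7, continuing the variables above (num_types is hypothetical).
num_types = 10
f_out = nn.Linear(H + K * H, num_types)        # linear transformation f' of formula (10)
probs = torch.softmax(f_out(M_final), dim=1)   # formula (10): text type probabilities
```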
The deep learning text classification methods currently common in the art use LSTM, Transformer and similar methods to convert the text into a sentence vector, and then use linear transformation methods to obtain probability estimates for the text types. Without considering grammar and semantic rules, sentences composed at word granularity have D^N different combinations, where D represents the vocabulary size and N the sentence length. Sentence classification at word granularity therefore requires a relatively large training corpus, which must include as many expressions of the various intents as possible. A human, however, having learned from the sentence "turn on Bluetooth" another expression of "turn on", would naturally infer the intent of "turn on flashlight" without having to re-learn this new expression.
Instructional language generally has an obvious predicate and description object. For example, in "turn on flashlight" the object described by the predicate "turn on" is "flashlight"; a more complex instruction such as "I want to execute the to-do worksheet" contains the modifiers "I want" and "to-do" in addition to the predicate "execute" and the description object "worksheet". The invention refers to predicates, modifiers and the like as semantic elements; instructional language is typically a varied combination of such semantic elements.
What complicates these language expressions is the diversity of semantic element combinations and of the expressions of the elements themselves; this diversity, however, is far lower than the diversity of word combinations. The invention therefore reduces the difficulty of low-resource text classification by converting the word-based sentence representation into a representation based on semantic elements.
The semantic element recognition task introduced by the invention gives the model the ability to recognize different semantic elements. Because instructional language in different scenarios is characterized by small corpus resources and small training data volumes, the entity labeling of the data does not cost much time, and model training and inference consume few computing resources. The semantic elements provide a modularized sentence representation for sentence text recognition, so that the model can make correct inferences when it encounters sentences with unseen expressions; the isolation between the two tasks of semantic element recognition and text classification ensures that the model focuses more on the distinctions between semantic elements than on the distinctions between different expressions of them.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Furthermore, it should be understood that although the present specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution. This manner of description is adopted for clarity only; the embodiments described herein may be combined as appropriate to form other embodiments that will be apparent to those skilled in the art.

Claims (5)

1. A low-resource text recognition algorithm based on semantic elements, characterized in that: the algorithm comprises the following steps:
s1, acquiring a text sentence, and performing coding processing on the text sentence to obtain a coded sentence tensor representation; the sentence tensor representation is E ∈ R^(N×K×H), wherein N represents the length of the text sentence, K represents the number of semantic element types, and H represents the model hidden layer size; N, K and H correspond respectively to the three dimensions of the sentence tensor representation, wherein the first dimension is the sequence dimension, the second dimension is the semantic element type dimension, and the third dimension is the numerical dimension of the tensor;
s2, carrying out semantic element recognition processing on the sentence tensor representation obtained in the step S1 to obtain a semantic element recognition result; the semantic element recognition result is B ∈ R^(N×K×1), wherein the relation between each character in the text sentence and the different semantic element types is represented by 0 or 1;
s3, scaling the sentence tensor representation obtained in the step S1 by using the semantic element recognition result obtained in the step S2, scaling the elements of non-semantic elements in the sentence tensor representation into a smaller numerical representation, so as to obtain the scaled sentence tensor representation E' = B·E;
S4, processing the scaled sentence tensor representation in the step S3 by using a mean value pooling method to obtain semantic element vector representation;
s5, processing the sentence tensor representation obtained in the step S1 by using a mean value pooling method to obtain sentence vector representation;
s6, performing splicing processing on the sentence vector representation obtained in the step S5 and the semantic element vector representation obtained in the step S4 to obtain a final sentence representation;
s7, processing the final sentence representation obtained in the step S6 to obtain the final text type probability.
2. The semantic element-based low-resource text recognition algorithm of claim 1, wherein: in the step S1, an LSTM or a Transformer is used to encode the text sentence.
3. The semantic element-based low-resource text recognition algorithm of claim 1, wherein: the semantic element identification processing method in the step S2 comprises a sigmoid function.
4. The semantic element-based low-resource text recognition algorithm of claim 1, wherein: the scaling processing method in the step S3 is element-level multiplication.
5. The semantic element-based low-resource text recognition algorithm of claim 1, wherein: the processing method for final sentence representation in step S7 includes a softmax function or a sigmoid function.
CN202011001618.1A 2020-09-22 2020-09-22 Low-resource text recognition algorithm based on semantic elements Active CN112131887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011001618.1A CN112131887B (en) 2020-09-22 2020-09-22 Low-resource text recognition algorithm based on semantic elements

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011001618.1A CN112131887B (en) 2020-09-22 2020-09-22 Low-resource text recognition algorithm based on semantic elements

Publications (2)

Publication Number Publication Date
CN112131887A CN112131887A (en) 2020-12-25
CN112131887B (en) 2024-03-08

Family

ID=73842144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011001618.1A Active CN112131887B (en) 2020-09-22 2020-09-22 Low-resource text recognition algorithm based on semantic elements

Country Status (1)

Country Link
CN (1) CN112131887B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111666766A (en) * 2019-03-05 2020-09-15 阿里巴巴集团控股有限公司 Data processing method, device and equipment
CN110196978A (en) * 2019-06-04 2019-09-03 重庆大学 A kind of entity relation extraction method for paying close attention to conjunctive word
CN111177383A (en) * 2019-12-24 2020-05-19 上海大学 Text entity relation automatic classification method fusing text syntactic structure and semantic information

Also Published As

Publication number Publication date
CN112131887A (en) 2020-12-25

Similar Documents

Publication Publication Date Title
CN109918671B (en) Electronic medical record entity relation extraction method based on convolution cyclic neural network
CN112989834B (en) Named entity identification method and system based on flat grid enhanced linear converter
CN111783462A (en) Chinese named entity recognition model and method based on dual neural network fusion
CN110928997A (en) Intention recognition method and device, electronic equipment and readable storage medium
CN110609891A (en) Visual dialog generation method based on context awareness graph neural network
CN112417877A (en) Text inclusion relation recognition method based on improved BERT
Mishra et al. The understanding of deep learning: A comprehensive review
WO2022198750A1 (en) Semantic recognition method
CN115599901B (en) Machine question-answering method, device, equipment and storage medium based on semantic prompt
CN114419642A (en) Method, device and system for extracting key value pair information in document image
Liu et al. A multi-label text classification model based on ELMo and attention
Shan et al. Robust encoder-decoder learning framework towards offline handwritten mathematical expression recognition based on multi-scale deep neural network
CN114153973A (en) Mongolian multi-mode emotion analysis method based on T-M BERT pre-training model
CN115496072A (en) Relation extraction method based on comparison learning
Gangadia et al. Indian sign language interpretation and sentence formation
CN110175330B (en) Named entity recognition method based on attention mechanism
CN111597816A (en) Self-attention named entity recognition method, device, equipment and storage medium
CN114154504A (en) Chinese named entity recognition algorithm based on multi-information enhancement
CN113177113A (en) Task type dialogue model pre-training method, device, equipment and storage medium
Lai et al. A Chinese multi-modal relation extraction model for internet security of finance
Bender et al. Learning fine-grained image representations for mathematical expression recognition
CN112307179A (en) Text matching method, device, equipment and storage medium
CN112131887B (en) Low-resource text recognition algorithm based on semantic elements
CN112257432A (en) Self-adaptive intention identification method and device and electronic equipment
CN114239575B (en) Statement analysis model construction method, statement analysis method, device, medium and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant