CN113822019A - Text normalization method, related equipment and readable storage medium - Google Patents
- Publication number
- CN113822019A (application number CN202111108530.4A)
- Authority
- CN
- China
- Prior art keywords
- text
- sentence
- features
- level
- normalization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application discloses a text normalization method, related equipment, and a readable storage medium. For a text to be normalized, global features are determined that characterize the association relations among the sentences in the text, and the text is normalized based on these global features to obtain a normalized text. Because the association relations among the sentences in the text are taken into account during normalization, the scheme can improve the text normalization effect.
Description
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a text normalization method, a related device, and a readable storage medium.
Background
With the rapid development of internet technology, people encounter more and more text, such as text in web pages and text produced by speech recognition, and such text usually contains a large amount of spoken-language description. For example, text obtained from speech recognition often contains repeated content caused by disfluent speech and meaningless content arising from everyday speaking habits (e.g., filler words, response words, and pet phrases). Because of these spoken-language descriptions, the text lacks a written style.
Therefore, how to normalize a text so as to remove its spoken-language descriptions and give the normalized text a more written style has become a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the foregoing problems, the present application provides a text normalization method, a related device and a readable storage medium. The specific scheme is as follows:
a method of text normalization, the method comprising:
acquiring a text to be normalized;
determining global features of the text; the global features are used for representing the association relations among sentences in the text;
and normalizing the text based on the global features of the text to obtain a normalized text.
Optionally, the determining the global feature of the text includes:
determining a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text, wherein the sentence-level feature of each sentence in the text is used for representing the association relationship between the sentence and other sentences except the sentence in the text;
and splicing the word-level features of each sentence in the text, the sentence-level features of each sentence in the text, and the semantic type features of each sentence in the text according to the sentence correspondence, to obtain the global features of the text.
Optionally, the determining the word-level features of each sentence in the text, the sentence-level features of each sentence in the text, and the semantic type features of each sentence in the text includes:
inputting the text into a text sentence classification model, wherein the text sentence classification model carries out feature extraction on the text to obtain word level features of each sentence in the text, sentence level features of each sentence in the text and semantic type features of each sentence in the text;
the text sentence classification model is trained by taking the training text before normalization as training samples and the sentence label of each sentence in the training text before normalization as sample labels, wherein the sentence labels are used for representing the semantic types of the sentences.
Optionally, the inputting the text into a text sentence classification model, where the text sentence classification model performs feature extraction on the text to obtain a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text, includes:
the word coding network in the text sentence classification model carries out word level coding on each sentence in the text to obtain the word level characteristics of each sentence in the text;
the text sentence feature extraction module in the text sentence classification model performs sentence-level feature extraction based on the word-level features of each sentence in the text, to obtain the sentence-level features of each sentence in the text;
and the text sentence classification module in the text sentence classification model identifies sentence-level characteristics of each sentence in the text to obtain semantic type characteristics of each sentence in the text.
Optionally, the performing, by the text sentence feature extraction module in the text sentence classification model, sentence-level feature extraction based on the word-level features of each sentence in the text to obtain the sentence-level features of each sentence in the text includes:
compressing the word level characteristics of each sentence in the text to obtain first characteristics of each sentence in the text;
aggregating the first features of each sentence in the text by adopting an attention mechanism to obtain a second feature of each sentence in the text;
aggregating the first features of the sentences in the text based on the interaction information among the sentences in the text to obtain a third feature of each sentence in the text;
and splicing the first characteristic of each sentence in the text, the second characteristic of each sentence in the text and the third characteristic of each sentence in the text according to the sentence corresponding relation to obtain the sentence-level characteristic of each sentence in the text.
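The four steps above (compress, attend, interact, splice) can be sketched numerically. The patent fixes neither the dimensions nor the operators, so this sketch assumes mean-pooling as the compression step and takes the two aggregation functions as pluggable arguments; only the shapes and the per-sentence concatenation are illustrated:

```python
import numpy as np

def sentence_level_features(word_feats, aggregate_attn, aggregate_interact):
    """Sketch of the four claim steps. word_feats is a list of
    (num_words_i, d) arrays, one array per sentence."""
    # Step 1: "compression" -- here mean-pooling over word-level features
    # (one plausible choice; the patent does not specify the operator).
    first = np.stack([w.mean(axis=0) for w in word_feats])      # (S, d)
    # Step 2: attention-based aggregation across sentences.
    second = aggregate_attn(first)                              # (S, d)
    # Step 3: interaction-information-based aggregation across sentences.
    third = aggregate_interact(first)                           # (S, d)
    # Step 4: splice the three features per sentence.
    return np.concatenate([first, second, third], axis=1)       # (S, 3d)

# Toy usage: identity "aggregators" just to show the shapes.
feats = sentence_level_features(
    [np.ones((3, 4)), np.zeros((5, 4))],
    aggregate_attn=lambda x: x,
    aggregate_interact=lambda x: x,
)
print(feats.shape)  # (2, 12)
```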
Optionally, the aggregating the first features of the sentences in the text by using an attention mechanism to obtain the second features of each sentence in the text includes:
constructing an inter-sentence graph attention network by taking the first feature of each sentence in the text as a node;
for each node in the inter-sentence graph attention network, taking the node as the query in the attention mechanism and the other nodes in the network as the keys in the attention mechanism, calculating the attention coefficients of the node over the other nodes, and weighting the other nodes by these attention coefficients to obtain a new feature of the node;
and the new feature of each node in the inter-sentence graph attention network is the second feature of the corresponding sentence in the text.
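A minimal sketch of this inter-sentence attention, under stated assumptions: the patent does not specify the scoring function, so scaled dot-product scores over a fully connected graph are used here, with each node excluded from its own key set:

```python
import numpy as np

def inter_sentence_attention(first):
    """Each sentence's first feature is a node. Each node acts as the query,
    the other nodes act as keys, and the node's new (second) feature is the
    attention-weighted sum of the other nodes. Needs at least 2 sentences."""
    S, d = first.shape
    scores = first @ first.T / np.sqrt(d)       # (S, S) query-key scores (assumed form)
    np.fill_diagonal(scores, -np.inf)           # exclude the node itself
    # Softmax over the other nodes = attention coefficients.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ first                      # weighted sum of the other nodes

second = inter_sentence_attention(np.array([[1., 0.], [0., 1.], [1., 1.]]))
print(second.shape)  # (3, 2)
```

Node 2 is equidistant from nodes 0 and 1, so its new feature is their even mixture.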
Optionally, the aggregating the first features of the sentences in the text based on the interaction information among the sentences in the text to obtain the third feature of each sentence in the text includes:
constructing a sentence-to-sentence graph interaction network by taking the first feature of each sentence in the text as a node;
for each node in the sentence-to-sentence graph interaction network, calculating the interaction information between the node and each of the other nodes in the network, and compressing this interaction information to obtain a new feature of the node;
and the new feature of each node in the sentence-to-sentence graph interaction network is the third feature of the corresponding sentence in the text.
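A sketch of this interaction step under stated assumptions: the interaction information of two nodes is taken to be the elementwise product of their first features, and the compression is a mean over the other nodes. The patent fixes neither operator:

```python
import numpy as np

def inter_sentence_interaction(first):
    """Each sentence's first feature is a node; each node's new (third)
    feature compresses its pairwise interaction information with every
    other node. first: (S, d) array, S >= 2."""
    S, d = first.shape
    third = np.empty_like(first)
    for i in range(S):
        others = [j for j in range(S) if j != i]
        # Interaction information of node i with each other node j (assumed
        # to be the elementwise product of the two first features).
        interactions = first[i] * first[others]     # (S-1, d)
        # Compression of the pairwise interactions into one feature.
        third[i] = interactions.mean(axis=0)
    return third

third = inter_sentence_interaction(np.array([[1., 2.], [3., 4.], [5., 6.]]))
print(third)
```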
Optionally, the normalizing the text based on the global features of the text to obtain a normalized text includes:
inputting the global features of the text into a text normalization model, where the text normalization model encodes and decodes the global features of the text to obtain a normalized text;
the text normalization model is trained by taking the global features of the training text before normalization as training samples, with the training objective that the text output by the model approaches the corresponding training text after normalization.
A text normalization apparatus, the apparatus comprising:
the acquiring unit is used for acquiring a text to be normalized;
a determining unit, configured to determine global features of the text; the global features are used for representing the association relations among sentences in the text;
and the normalization unit is used for normalizing the text based on the global features of the text to obtain a normalized text.
Optionally, the determining unit includes:
a multi-level feature determination unit, configured to determine a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text, where the sentence-level feature of each sentence in the text is used to represent an association relationship between the sentence and another sentence in the text except the sentence;
and the first splicing unit is used for splicing the word level characteristics of each sentence in the text, the sentence level characteristics of each sentence in the text and the semantic type characteristics of each sentence in the text according to the sentence corresponding relation to obtain the global characteristics of the text.
Optionally, the multi-stage feature determination unit includes:
a text sentence classification model processing unit, configured to input the text into a text sentence classification model, where the text sentence classification model performs feature extraction on the text to obtain a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text;
the text sentence classification model is trained by taking the training text before normalization as training samples and the sentence label of each sentence in the training text before normalization as sample labels, wherein the sentence labels are used for representing the semantic types of the sentences.
Optionally, the text sentence classification model processing unit includes:
a word-level feature determination unit, configured for the word coding network in the text sentence classification model to perform word-level coding on each sentence in the text to obtain the word-level features of each sentence in the text;
a sentence-level feature determination unit, configured for the text sentence feature extraction module in the text sentence classification model to perform sentence-level feature extraction based on the word-level features of each sentence in the text, to obtain the sentence-level features of each sentence in the text;
and the semantic type characteristic determining unit is used for a text sentence classification module in the text sentence classification model to identify sentence-level characteristics of each sentence in the text to obtain the semantic type characteristics of each sentence in the text.
Optionally, the sentence-level feature determination unit includes:
the compression unit is used for compressing the word-level characteristics of each sentence in the text to obtain first characteristics of each sentence in the text;
the first aggregation unit is used for aggregating the first characteristics of each sentence in the text by adopting an attention mechanism to obtain the second characteristics of each sentence in the text;
the second aggregation unit is used for aggregating the first characteristics of the sentences in the text based on the interactive information among the sentences in the text to obtain third characteristics of each sentence in the text;
and the second splicing unit is used for splicing the first characteristic of each sentence in the text, the second characteristic of each sentence in the text and the third characteristic of each sentence in the text according to the sentence corresponding relation to obtain the sentence-level characteristic of each sentence in the text.
Optionally, the first aggregation unit includes:
an inter-sentence graph attention network construction unit, configured to construct an inter-sentence graph attention network by taking the first feature of each sentence in the text as a node;
a second feature determination unit, configured to, for each node in the inter-sentence graph attention network, take the node as the query in the attention mechanism and the other nodes in the network as the keys in the attention mechanism, calculate the attention coefficients of the node over the other nodes, and weight the other nodes by these attention coefficients to obtain a new feature of the node; the new feature of each node in the inter-sentence graph attention network is the second feature of the corresponding sentence in the text.
Optionally, the second aggregation unit includes:
a sentence-to-sentence graph interaction network construction unit, configured to construct a sentence-to-sentence graph interaction network by taking the first feature of each sentence in the text as a node;
a third feature determination unit, configured to, for each node in the sentence-to-sentence graph interaction network, calculate the interaction information between the node and each of the other nodes in the network, and compress this interaction information to obtain a new feature of the node; the new feature of each node in the sentence-to-sentence graph interaction network is the third feature of the corresponding sentence in the text.
Optionally, the normalization unit includes:
the text normalization model processing unit is used for inputting the global features of the text into a text normalization model, and the text normalization model encodes and decodes the global features of the text to obtain a normalized text;
the text normalization model is trained by taking the global features of the training text before normalization as training samples, with the training objective that the text output by the model approaches the corresponding training text after normalization.
A text normalization apparatus comprising a memory and a processor;
the memory is used for storing programs;
the processor is configured to execute the program to implement the steps of the text normalization method.
A readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps of the text normalization method described above.
By means of the above technical scheme, the application discloses a text normalization method, related equipment, and a readable storage medium. For a text to be normalized, global features are determined that characterize the association relations among the sentences in the text, and the text is normalized based on these global features to obtain a normalized text. Because the association relations among the sentences in the text are taken into account during normalization, the scheme can improve the text normalization effect.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, throughout the drawings, like reference characters will be used to refer to like parts. In the drawings:
FIG. 1 is a schematic flow chart of a text normalization method disclosed in an embodiment of the present application;
FIG. 2 is a schematic diagram of determining global features of a text disclosed in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a text sentence classification model disclosed in an embodiment of the present application;
fig. 4 is a schematic diagram of a process in which a feature extraction module in a text sentence classification model disclosed in an embodiment of the present application performs sentence-level feature extraction on each sentence in the text based on a word-level feature of each sentence in the text to obtain a sentence-level feature of each sentence in the text;
FIG. 5 is a schematic structural diagram of a text normalization model disclosed in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a text normalization device disclosed in an embodiment of the present application;
fig. 7 is a block diagram of a hardware structure of a text normalization apparatus disclosed in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In order to normalize a text, that is, to remove the spoken-language descriptions in it and make the result more written in style, the inventors of the present application first considered the straightforward idea of normalizing the text sentence by sentence. However, sentence-by-sentence normalization ignores the association relations among the sentences in the text, so its normalization effect is limited.
In view of the problems with the above idea, the inventors of the present application conducted further research and finally proposed a text normalization method that considers the association relations among the sentences in a text when normalizing it, thereby improving the text normalization effect. The text normalization method provided in the present application is described through the following embodiments.
Referring to fig. 1, fig. 1 is a schematic flow chart of a text normalization method disclosed in an embodiment of the present application, where the method may include:
step S101: and acquiring a text to be normalized.
In this application, the text to be normalized is mainly text containing spoken-language descriptions, for example text obtained by recognizing spontaneous speech data. It may be text in a general domain or text customized according to application requirements; the present application is not limited in this respect.
Step S102: global features of the text are determined.
In the present application, the global features of the text are used to characterize the association between sentences in the text.
As an implementation manner, the global features of the text include global features of each sentence in the text, and the global features of each sentence in the text are used for representing association relations between the sentence and other sentences except the sentence in the text.
Step S103: and based on the global features of the text, the text is normalized to obtain a normalized text.
In the application, after the global feature of the text is obtained, the text can be normalized based on the global feature of the text, so that the normalized text is obtained.
In this application, the global features of the text may be encoded and decoded based on a neural network model to normalize the text and obtain the normalized text. The specific implementation will be described in detail in the following embodiments.
This embodiment discloses a text normalization method, which, for a text to be normalized, determines global features representing the association relations among the sentences in the text and normalizes the text based on these global features to obtain a normalized text. Because the association relations among the sentences in the text are considered during normalization, the text normalization effect can be improved.
In an embodiment of the present application, a specific implementation process of determining the global feature of the text in step S102 is described, and the process may include the following steps:
step S201: determining a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text.
It should be noted that, in the present application, the sentence-level feature of each sentence in the text is used to represent the association relationship between the sentence and the other sentences except the sentence in the text, and is obtained by aggregating the sentence and the other sentences except the sentence in the text.
In the present application, word-level features of each sentence in the text, sentence-level features of each sentence in the text, and semantic type features of each sentence in the text may be determined based on a neural network. The specific implementation will be described in detail by the following examples.
It should be noted that the semantic type feature of a sentence is used to represent the semantic type to which the sentence belongs. In the present application, the semantic types may be predefined; as one implementable manner, this application provides a semantic type classification scheme in which the semantic type of a sentence is one of five types: insertion, semantic confusion, semantic truncation, redundancy, and normal sentence. For ease of understanding, these semantic types are explained below.
Insertions refer to content appearing in the text that is unrelated to it, such as the content of a phone call taken by a participant in the middle of a meeting, or remarks interjected by others that are unrelated to the overall theme of the text. For example: "The wind is strong. What did I just say? For example, consumers can more systematically recognize what the aesthetic is..." (the first two sentences are the insertion);
Semantic confusion refers to sentences whose semantics cannot be understood, typically because the transcription result is faulty and the corresponding audio content is unclear. For example: "the teacher has a relation of what relatives are related to the country or nothing";
redundancy, meaning the presence of repeated expressions in the text or later amendments to the preamble. If' one or one piece is read, then one piece is read. "or" our Friday, not, or Saturday about Bar. "
Semantic truncation refers to the phenomenon that the semantics of a sentence in the text are incomplete and cut off, due to factors such as hesitation or thinking when the speaker is not fluent, or equipment failure. For example: "He." or "We plan a meeting tomorrow, then."
Normal sentences refer to sentences with definite, easily understood semantics and without redundancy. In particular, filler words without practical meaning (e.g., "um", "ah", "uh", "hey") and pause words without practical meaning (e.g., "this", "that", "that is", "well", "then", "so") are allowed in normal sentences.
It should be noted that the above semantic type classification scheme is one feasible scheme provided by this application; it was obtained, with good effect, by analyzing and summarizing a large number of pre-collected texts, but it is not the only theoretically feasible scheme. For example, the five types given above may be refined (semantic confusion, say, may be further divided at a finer granularity according to its cause), and types beyond the five given may be added, such as sentence inversion, ambiguous reference, and missing components.
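The five-way scheme can be captured as a small label set used as the sentence labels during training. The English names below are translations chosen for illustration, not identifiers from the patent:

```python
from enum import Enum

class SentenceType(Enum):
    """The five semantic types of the classification scheme (names translated)."""
    INSERTION = 0            # interjected content unrelated to the text
    SEMANTIC_CONFUSION = 1   # ununderstandable or mis-transcribed content
    SEMANTIC_TRUNCATION = 2  # incomplete, cut-off sentences
    REDUNDANCY = 3           # repetitions and self-corrections
    NORMAL = 4               # clear sentences (fillers and pause words allowed)

print(len(SentenceType))  # 5
```

Refinements or extra types (inversion, ambiguous reference, missing components) would simply add members to this set.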
Step S202: and splicing the word level characteristics of each sentence in the text, the sentence level characteristics of each sentence in the text and the semantic type characteristics of each sentence in the text according to sentence corresponding relations to obtain the global characteristics of the text.
For ease of understanding, refer to fig. 2, which is a schematic diagram of determining the global features of a text as disclosed in an embodiment of the present application. As can be seen from fig. 2, the word-level features, sentence-level features, and semantic type features of each sentence in the text, shown on the left side of fig. 2, are spliced according to the sentence correspondence to obtain the global features of the text shown on the right side of fig. 2. It should be noted that, for every word in the i-th sentence, the corresponding sentence-level feature is c_i and the semantic type feature is p_i.
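The splicing of fig. 2 can be sketched as follows: every word-level feature in sentence i is concatenated with that sentence's sentence-level feature c_i and semantic type feature p_i. The dimension names dw, dc, dp are illustrative assumptions; the patent does not fix them:

```python
import numpy as np

def global_features(word_feats, sent_feats, type_feats):
    """word_feats: list of (num_words_i, dw) arrays, one per sentence;
    sent_feats: (S, dc) sentence-level features c_i;
    type_feats: (S, dp) semantic type features p_i."""
    rows = []
    for i, words in enumerate(word_feats):
        n = words.shape[0]
        c_i = np.tile(sent_feats[i], (n, 1))  # same c_i for every word of sentence i
        p_i = np.tile(type_feats[i], (n, 1))  # same p_i for every word of sentence i
        rows.append(np.concatenate([words, c_i, p_i], axis=1))
    return np.concatenate(rows, axis=0)       # (total_words, dw + dc + dp)

g = global_features([np.ones((2, 3)), np.zeros((4, 3))],
                    np.arange(4).reshape(2, 2).astype(float),
                    np.array([[7.], [8.]]))
print(g.shape)  # (6, 6)
```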
In the present application, a large amount of pre-normalization text may be collected in advance. For example, a large amount of spontaneous speech data (e.g., interview recordings) may be collected and transcribed by speech recognition, or text containing spoken-language descriptions (such as character dialogue in novels) may be collected directly. The collected pre-normalization text is normalized (e.g., by removing its spoken-language descriptions) to obtain the corresponding post-normalization text. Some or all of the collected pre-normalization texts are then selected as pre-normalization training texts, the corresponding post-normalization texts are selected as post-normalization training texts, and the text sentence classification model and the text normalization model are trained on them.
In the present application, before the text sentence classification model and the text normalization model are trained based on the pre-normalization training texts and the corresponding post-normalization training texts, both sets of texts need to be preprocessed.
Preprocessing a pre-normalization training text includes splitting it into sentences, processing each sentence into a word-segmented input sequence, and attaching a sentence label to each sentence, where different sentence labels represent different semantic types. Preprocessing a post-normalization training text includes processing each of its sentences into a word-segmented input sequence.
In the present application, sentence splitting of the pre-normalization training text may divide clauses or whole sentences according to punctuation marks, or divide the text according to the VAD (Voice Activity Detection) segments produced during speech recognition; this scheme imposes no specific requirement. Processing each sentence of the pre- and post-normalization training texts into a word-segmented input sequence may adopt existing word segmentation techniques, which are not detailed here.
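A punctuation-based splitter of the kind just mentioned might look as follows (a sketch only; the patent imposes no specific requirement, and the clause/whole-sentence switch is an assumption):

```python
import re

def split_sentences(text, by_clause=False):
    # whole sentences break on terminal punctuation; clause mode also
    # breaks on commas (both full-width and ASCII forms)
    pattern = r'(?<=[。！？!?，,])' if by_clause else r'(?<=[。！？!?])'
    return [s for s in re.split(pattern, text) if s.strip()]

print(split_sentences("他这个。一边有一个。"))  # two whole sentences
```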
For ease of understanding, suppose the pre-normalization training text is "for kay, all of the other brands, but this, he. He does this. What all that said, this software is more beautiful flower, so he can cut that I add the intermediate form and finish. One in each case, or one in each case." After sentence splitting, the following sentences are obtained:
"other brands are all because of a kah, but this is him, this is him. "
"he this. "
What's more, the software is more beautiful, so that the user can delete the added intermediate form and finish the process. "
"have one, read one, have one read one. "
Taking the sentence "for kay, all of the other brands are, but this is, he is." as an example, its word-segmented input sequence is:
"because/Thy/,/other/Brand/is/all/has/The/but/he/this/,/he/this/. ".
After the preprocessing of the pre- and post-normalization training texts is completed, the text sentence classification model and the text normalization model can be trained. The text sentence classification model is trained with the pre-normalization training texts as training samples and the sentence label of each sentence in those texts as sample labels; a cross-entropy loss, a negative log-likelihood loss, or the like may be adopted, the specific implementation being consistent with existing practice and not limited by this scheme.
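The cross-entropy objective for the sentence classifier can be sketched as follows (toy probabilities; in practice the model produces the label distributions):

```python
import numpy as np

def sentence_label_loss(pred_probs, gold_labels):
    """Mean cross-entropy over sentences: pred_probs[i] is the predicted
    distribution over sentence labels for sentence i, gold_labels[i] the
    index of its gold sentence label."""
    return float(np.mean([-np.log(p[y]) for p, y in zip(pred_probs, gold_labels)]))

probs = [np.array([0.9, 0.1]), np.array([0.25, 0.75])]
loss = sentence_label_loss(probs, [0, 1])  # small when gold labels get high probability
```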
The text normalization model is trained by taking the global features of the pre-normalization training texts as training samples, with the training target that the text output by the model approaches the corresponding post-normalization training texts.
As an implementation manner, determining the word-level features, the sentence-level features, and the semantic type features of each sentence in the text based on the trained text sentence classification model may be: inputting the text into the text sentence classification model, which performs feature extraction on the text to obtain the word-level features of each sentence in the text, the sentence-level features of each sentence in the text, and the semantic type features of each sentence in the text.
It should be noted that before the text is input into the text sentence classification model, the text needs to be preprocessed; specifically, the text is split into sentences and each sentence is processed into a word-segmented input sequence. For the specific manner, reference may be made to the foregoing description of the preprocessing of the pre-normalization training texts, which is not repeated here.
As an implementation manner, based on the trained text normalization model, the process of normalizing the text based on its global features to obtain a normalized text may be: inputting the global features of the text into the text normalization model, which encodes and decodes them to obtain the normalized text.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a text sentence classification model disclosed in an embodiment of the present application, where the model includes a word coding network, a text sentence feature extraction module, and a text sentence classification module. Wherein:
the word coding network takes the preprocessed pre-normalization text as input and outputs the word-level features of each sentence in that text, where the word-level features of a sentence comprise the word features of each word in the sentence. The network structure may use the encoder part of a Transformer model; the specific process is the same as in the prior art and is not detailed here.
The text sentence characteristic extraction module inputs the word-level characteristics of each sentence in the text before normalization and outputs the sentence-level characteristics of each sentence in the text before normalization. The specific implementation of the text sentence feature extraction module will be described in detail through the following embodiments.
The text sentence classification module inputs the sentence-level characteristics of each sentence in the text before normalization and outputs the semantic type characteristics of each sentence in the text before normalization.
In the present application, the semantic type feature of each sentence in the pre-normalization text is the probability that the sentence corresponds to each predefined sentence label. The network structure of the text sentence classification module is not limited in this application; existing techniques such as a nonlinear transformation followed by a softmax layer may be adopted.
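A nonlinear transformation followed by a softmax layer, as just described, can be sketched as follows (the weight names and dimensions are illustrative assumptions):

```python
import numpy as np

def classify_sentence(c, W1, b1, W2, b2):
    """Map a sentence-level feature c to a probability over the
    predefined sentence labels via a nonlinear transform and softmax."""
    h = np.tanh(W1 @ c + b1)           # nonlinear transformation
    logits = W2 @ h + b2
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

# toy weights: feature dim 4, hidden dim 4, three sentence labels
p = classify_sentence(np.ones(4), np.eye(4), np.zeros(4), np.ones((3, 4)), np.zeros(3))
# identical logits here, so the three labels get equal probability
```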
As an implementation manner, based on the text sentence classification model shown in fig. 3, the process of inputting the text into the text sentence classification model, where the text sentence classification model performs feature extraction on the text to obtain a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text may include the following steps:
Step 301: the word coding network in the text sentence classification model performs word-level coding on each sentence in the text to obtain the word-level features of each sentence in the text.
Step 302: the text sentence feature extraction module in the text sentence classification model performs sentence-level feature extraction on each sentence based on the word-level features of each sentence in the text to obtain the sentence-level features of each sentence in the text.
Step 303: the text sentence classification module in the text sentence classification model identifies the sentence-level features of each sentence in the text to obtain the semantic type features of each sentence in the text.
An embodiment of the present application describes in detail the process in step 302, in which the text sentence feature extraction module in the text sentence classification model performs sentence-level feature extraction on each sentence based on the word-level features of each sentence in the text to obtain the sentence-level features of each sentence. For ease of understanding, refer to fig. 4, which is a schematic diagram of this process disclosed in an embodiment of the present application. In conjunction with fig. 4, the steps involved are explained as follows:
Step S401: compressing the word-level features of each sentence in the text to obtain the first feature of each sentence in the text.
In fig. 4, W11, W12, …, W1m1 represent the word features of the words in the 1st sentence of the text, where m1 denotes the number of words contained in the 1st sentence; similarly, W21, W22, …, W2m2 represent the word features of the words in the 2nd sentence, where m2 denotes the number of words contained in the 2nd sentence, and so on up to the mn word features of the n-th sentence; n denotes that the preprocessed text contains n sentences in total.
In the present application, the word-level features of each sentence in the input text may be compressed into a fixed-length first feature. This corresponds, in fig. 4, to compressing all the word features W11, W12, …, W1m1 of the 1st sentence into the first feature s1 of the 1st sentence. By analogy, the first features of the 1st through n-th sentences can be represented as the sequence s1, s2, …, sn. The network structure used to compress multiple word features is not limited; existing techniques such as an attention mechanism or pooling may be adopted.
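One of the compression options just mentioned, attention pooling, might look like this (the scoring vector v is an illustrative learned parameter; mean pooling would simply replace the weights with 1/m):

```python
import numpy as np

def compress_words(W, v):
    """Compress the m word features of one sentence (rows of W) into a
    fixed-length first feature s_i via attention weights scored by v."""
    scores = W @ v
    e = np.exp(scores - scores.max())
    alpha = e / e.sum()      # one attention weight per word
    return alpha @ W         # weighted sum over words -> shape (d,)

W = np.array([[1.0, 0.0], [3.0, 0.0]])   # two word features, dim 2
s = compress_words(W, np.zeros(2))        # zero scores -> uniform weights -> mean
```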
Step S402: aggregating the first features of the sentences in the text by adopting an attention mechanism to obtain the second feature of each sentence in the text.
There may be many associations between different sentences in a text, and it is difficult to describe all of them with a single attention weight. In the present application, the first feature of each sentence in the text can be used as a node to construct a text inter-sentence graph attention network. For each node in this network, the node is used as the query in the attention mechanism, the other nodes are used as the keys, and the attention coefficients of the node on the other nodes are calculated; the other nodes are then weighted by these attention coefficients to obtain the new feature of the node. The new features of the nodes in the text inter-sentence graph attention network are the second features of the sentences in the text.
For ease of understanding, referring to fig. 4, the text inter-sentence graph attention network is formed by treating the first feature si of each sentence in the text as a node in a graph neural network and establishing a fully connected graph. For each node, an attention mechanism aggregates information from its neighbor nodes, and the network finally outputs, for each sentence in the text, a second feature h1, h2, …, hn that attends to the other sentences. Taking sentence i as an example: si is used as the query in the attention mechanism, the first features of the other n−1 sentences are used as keys, the attention coefficient on each other sentence is calculated, the first features of the other sentences are weighted by the calculated coefficients, and the network finally outputs the second feature hi of the i-th sentence. The attention calculation method and network structure are not specifically required in the present application; multi-head attention, dot-product attention, or additive attention may be adopted, with the same implementation as in the prior art, not detailed here.
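A minimal dot-product version of this inter-sentence aggregation (the patent leaves the attention form open, so this is one possible instantiation) could be:

```python
import numpy as np

def graph_attention(S):
    """S holds the first features s_1..s_n as rows. For node i, s_i is the
    query, the other nodes are keys, and h_i is the attention-weighted sum
    of the other nodes' features."""
    n, d = S.shape
    H = np.zeros_like(S)
    for i in range(n):
        keys = np.delete(S, i, axis=0)          # all nodes except i
        scores = keys @ S[i] / np.sqrt(d)       # dot-product attention
        e = np.exp(scores - scores.max())
        H[i] = (e / e.sum()) @ keys             # weight and sum the others
    return H

H = graph_attention(np.eye(3))  # orthogonal features -> uniform attention
```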
Step S403: aggregating the first features of the sentences in the text based on the interaction information among the sentences to obtain the third feature of each sentence in the text.
Specifically, in the application, the first feature of each sentence in the text can be used as a node to construct a text inter-sentence graph interaction network; for each node in the text sentence-to-sentence graph interaction network, calculating interaction information of the node and other nodes except the node in the text sentence-to-sentence graph interaction network; compressing the interaction information of the nodes and other nodes except the nodes in the text sentence-to-sentence graph interaction network to obtain new characteristics of the nodes; and the new characteristics of each node in the text sentence-sentence graph interaction network are the third characteristics of each sentence in the text.
The text inter-sentence graph interaction network, similar to the text inter-sentence graph attention network described above, still treats the first feature si of each sentence in the text as a node in a graph neural network and establishes a fully connected graph. The difference is that the graph attention network aggregates the first features of all sentences using graph-based attention weights, whereas the graph interaction network explicitly describes the association between different sentences by computing the interaction of every two sentence features. The interaction information between the i-th sentence and the j-th sentence is obtained through concatenation, difference, and product; the calculation formula is as follows:
uij = RELU(Wu[si; sj; si − sj; dot(si, sj)])
where uij represents the interaction information between the i-th and j-th sentences computed by the text inter-sentence graph interaction network, RELU(·) denotes the activation function, Wu denotes the learnable linear transformation parameters of the network, si; sj denotes the concatenation of the first features of sentences i and j, si − sj their difference, and dot(si, sj) the dot product of the first features of sentences i and j.
As shown in fig. 4, the text inter-sentence graph interaction network finally obtains, for each sentence, the third feature ui of the i-th sentence, which is a fixed-length feature obtained by compressing the n−1 interaction features uij (j = 1, …, n, j ≠ i). The network structure used for this compression is not limited; existing techniques such as an attention mechanism or pooling may be adopted.
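The interaction formula and the compression into ui can be sketched as follows (following the formula literally, so the scalar dot product contributes one extra dimension; mean pooling stands in for the unconstrained compression network):

```python
import numpy as np

def interaction_feature(si, sj, Wu):
    """u_ij = RELU(Wu [si; sj; si - sj; dot(si, sj)])"""
    z = np.concatenate([si, sj, si - sj, [np.dot(si, sj)]])  # length 3d + 1
    return np.maximum(Wu @ z, 0.0)

def third_features(S, Wu):
    """Compress the n-1 interaction features of each sentence i into its
    third feature u_i (mean pooling as the stand-in compressor)."""
    n = len(S)
    return np.stack([
        np.mean([interaction_feature(S[i], S[j], Wu)
                 for j in range(n) if j != i], axis=0)
        for i in range(n)
    ])

S = np.zeros((3, 2))                     # three sentences, first-feature dim 2
U = third_features(S, np.ones((2, 7)))   # Wu maps 3*2+1 input dims to 2
```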
Step S404: splicing the first feature, the second feature, and the third feature of each sentence in the text according to the sentence correspondence to obtain the sentence-level feature of each sentence in the text.
As shown in fig. 4, through the processing of steps S401, S402, and S403, the first feature si, the second feature hi, and the third feature ui are extracted for the i-th sentence in the text; through the splicing operation, the sentence-level feature ci of the sentence is obtained, namely: ci = [si, hi, ui].
Referring to fig. 5, fig. 5 is a schematic structural diagram of a text normalization model disclosed in an embodiment of the present application. As shown in fig. 5, the model includes a text normalization coding module and a text normalization decoding module; its input is the global features of a text determined as in the above embodiments, and its output is the normalized text.
The following describes a text-normalization device disclosed in an embodiment of the present application, and the text-normalization device described below and the text-normalization method described above may be referred to correspondingly.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a text normalization device disclosed in an embodiment of the present application. As shown in fig. 6, the text-normalization apparatus may include:
an obtaining unit 61, configured to obtain a text to be normalized;
a determining unit 62, configured to determine global features of the text, the global features being used for representing the association relationships among sentences in the text;
and a normalization unit 63, configured to normalize the text based on the global features of the text to obtain a normalized text.
As an implementable manner, the determining unit includes:
a multi-level feature determination unit, configured to determine a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text, where the sentence-level feature of each sentence in the text is used to represent an association relationship between the sentence and another sentence in the text except the sentence;
and the first splicing unit is used for splicing the word level characteristics of each sentence in the text, the sentence level characteristics of each sentence in the text and the semantic type characteristics of each sentence in the text according to the sentence corresponding relation to obtain the global characteristics of the text.
As an implementation, the multi-stage feature determination unit includes:
a text sentence classification model processing unit, configured to input the text into a text sentence classification model, where the text sentence classification model performs feature extraction on the text to obtain a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text;
the text sentence classification model is obtained by training by taking training texts before normalization as training samples and taking sentence labels of each sentence in the training texts before normalization as sample labels, wherein the sentence labels are used for representing semantic types of the sentences.
As an implementable manner, the text sentence classification model processing unit includes:
a word-level feature determining unit, configured for the word coding network in the text sentence classification model to perform word-level coding on each sentence in the text to obtain the word-level features of each sentence in the text;
a sentence-level feature determining unit, configured for the text sentence feature extraction module in the text sentence classification model to perform sentence-level feature extraction on each sentence in the text based on the word-level features of each sentence to obtain the sentence-level features of each sentence in the text;
and a semantic type feature determining unit, configured for the text sentence classification module in the text sentence classification model to identify the sentence-level features of each sentence in the text to obtain the semantic type features of each sentence in the text.
As an implementable embodiment, the sentence-level feature determination unit includes:
the compression unit is used for compressing the word-level characteristics of each sentence in the text to obtain first characteristics of each sentence in the text;
the first aggregation unit is used for aggregating the first characteristics of each sentence in the text by adopting an attention mechanism to obtain the second characteristics of each sentence in the text;
the second aggregation unit is used for aggregating the first characteristics of the sentences in the text based on the interactive information among the sentences in the text to obtain third characteristics of each sentence in the text;
and the second splicing unit is used for splicing the first characteristic of each sentence in the text, the second characteristic of each sentence in the text and the third characteristic of each sentence in the text according to the sentence corresponding relation to obtain the sentence-level characteristic of each sentence in the text.
As an embodiment, the first polymerization unit includes:
the text inter-sentence attention network construction unit is used for constructing a text inter-sentence attention network by taking the first characteristic of each sentence in the text as a node;
a second feature determination unit, configured to, for each node in the inter-text sentence graph attention network, calculate attention coefficients of the node on other nodes by using the node as a query in the attention mechanism and using other nodes except the node in the inter-text sentence graph attention network as keys in the attention mechanism; weighting other nodes by the attention coefficients of the nodes on the other nodes to obtain new characteristics of the nodes; and the new characteristics of each node in the text sentence-to-sentence attention network are the second characteristics of each sentence in the text.
As an embodiment, the second polymerization unit comprises:
the text sentence-to-sentence graph interactive network construction unit is used for constructing a text sentence-to-sentence graph interactive network by taking the first characteristic of each sentence in the text as a node;
a third feature determination unit, configured to calculate, for each node in the text sentence-to-sentence graph interaction network, interaction information between the node and a node other than the node in the text sentence-to-sentence graph interaction network; compressing the interaction information of the nodes and other nodes except the nodes in the text sentence-to-sentence graph interaction network to obtain new characteristics of the nodes; and the new characteristics of each node in the text sentence-sentence graph interaction network are the third characteristics of each sentence in the text.
As an embodiment, the normalization unit comprises:
the text normalization model processing unit is used for inputting the global features of the text into a text normalization model, and the text normalization model encodes and decodes the global features of the text to obtain a normalized text;
the text normalization model is trained by taking the global features of the pre-normalization training texts as training samples, with the training target that the text output by the model approaches the corresponding post-normalization training texts.
It should be noted that, for the detailed description of the functions of the above units, reference may be made to the related description of the method embodiment, and details are not repeated here.
Referring to fig. 7, fig. 7 is a block diagram of a hardware structure of a text normalization device according to an embodiment of the present application, and referring to fig. 7, the hardware structure of the text normalization device may include: at least one processor 1, at least one communication interface 2, at least one memory 3 and at least one communication bus 4;
in the embodiment of the application, the number of the processor 1, the communication interface 2, the memory 3 and the communication bus 4 is at least one, and the processor 1, the communication interface 2 and the memory 3 complete mutual communication through the communication bus 4;
the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention;
the memory 3 may include a high-speed RAM memory, and may further include a non-volatile memory (non-volatile memory) or the like, such as at least one disk memory;
wherein the memory stores a program and the processor can call the program stored in the memory, the program for:
acquiring a text to be normalized;
determining global features of the text, the global features being used for representing the association relationships among sentences in the text;
and normalizing the text based on the global features of the text to obtain a normalized text.
Alternatively, the detailed function and the extended function of the program may be as described above.
Embodiments of the present application further provide a readable storage medium, where a program suitable for being executed by a processor may be stored, where the program is configured to:
acquiring a text to be normalized;
determining global features of the text, the global features being used for representing the association relationships among sentences in the text;
and normalizing the text based on the global features of the text to obtain a normalized text.
Alternatively, the detailed function and the extended function of the program may be as described above.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (11)
1. A method for text normalization, the method comprising:
acquiring a text to be normalized;
determining global features of the text, the global features being used for representing the association relationships among sentences in the text;
and normalizing the text based on the global features of the text to obtain a normalized text.
2. The method of claim 1, wherein the determining the global feature of the text comprises:
determining a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text, wherein the sentence-level feature of each sentence in the text is used for representing the association relationship between the sentence and other sentences except the sentence in the text;
and splicing the word level characteristics of each sentence in the text, the sentence level characteristics of each sentence in the text and the semantic type characteristics of each sentence in the text according to sentence corresponding relations to obtain the global characteristics of the text.
3. The method of claim 2, wherein determining the word-level features of each sentence in the text, the sentence-level features of each sentence in the text, and the semantic type features of each sentence in the text comprises:
inputting the text into a text sentence classification model, wherein the text sentence classification model carries out feature extraction on the text to obtain word level features of each sentence in the text, sentence level features of each sentence in the text and semantic type features of each sentence in the text;
the text sentence classification model is obtained by training by taking training texts before normalization as training samples and taking sentence labels of each sentence in the training texts before normalization as sample labels, wherein the sentence labels are used for representing semantic types of the sentences.
4. The method of claim 3, wherein inputting the text into a text sentence classification model, wherein the text sentence classification model performs feature extraction on the text to obtain a word-level feature of each sentence in the text, a sentence-level feature of each sentence in the text, and a semantic type feature of each sentence in the text, comprises:
the word coding network in the text sentence classification model carries out word level coding on each sentence in the text to obtain the word level characteristics of each sentence in the text;
a text sentence feature extraction module in the text sentence classification model, which extracts sentence-level features of each sentence in the text based on the word-level features of each sentence in the text to obtain the sentence-level features of each sentence in the text;
and the text sentence classification module in the text sentence classification model identifies sentence-level characteristics of each sentence in the text to obtain semantic type characteristics of each sentence in the text.
5. The method of claim 4, wherein the step of extracting the sentence-level features of each sentence in the text based on the word-level features of each sentence in the text by the text-sentence feature extraction module in the text-sentence classification model comprises:
compressing the word level characteristics of each sentence in the text to obtain first characteristics of each sentence in the text;
aggregating the first features of each sentence in the text by adopting an attention mechanism to obtain a second feature of each sentence in the text;
aggregating the first characteristics of the sentences in the text based on the interactive information among the sentences in the text to obtain third characteristics of each sentence in the text;
and splicing the first characteristic of each sentence in the text, the second characteristic of each sentence in the text and the third characteristic of each sentence in the text according to the sentence corresponding relation to obtain the sentence-level characteristic of each sentence in the text.
6. The method of claim 5, wherein the aggregating the first features of the sentences in the text using the attention mechanism to obtain the second features of each sentence in the text comprises:
constructing a text inter-sentence graph attention network by taking the first feature of each sentence in the text as a node;
for each node in the text inter-sentence attention network, taking the node as a query in the attention mechanism, taking other nodes except the node in the text inter-sentence attention network as keys in the attention mechanism, and calculating attention coefficients of the nodes on the other nodes; weighting other nodes by the attention coefficients of the nodes on the other nodes to obtain new characteristics of the nodes;
and the new characteristics of each node in the text sentence-to-sentence attention network are the second characteristics of each sentence in the text.
7. The method of claim 5, wherein the aggregating the first features of the sentences in the text based on the interaction information among the sentences in the text to obtain the third feature of each sentence in the text comprises:
constructing an inter-sentence graph interaction network for the text by taking the first feature of each sentence in the text as a node;
for each node in the inter-sentence graph interaction network, calculating the interaction information between the node and the other nodes in the network, and compressing this interaction information to obtain a new feature of the node;
and taking the new feature of each node in the inter-sentence graph interaction network as the third feature of the corresponding sentence in the text.
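Claim 7 leaves the interaction operator and the compression open; the sketch below models pairwise interaction as the element-wise product of node features and compresses the interactions with a mean over the other nodes, both of which are illustrative assumptions.

```python
import numpy as np

def inter_sentence_interaction(first_feats):
    # first_feats: (n_sentences, d) matrix of first features.
    n, d = first_feats.shape
    third = np.empty_like(first_feats)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Interaction information of node i with every other node,
        # modeled here as an element-wise product: (n-1, d).
        inter = first_feats[i] * first_feats[others]
        # Compress the pairwise interactions into one new feature.
        third[i] = inter.mean(axis=0)
    return third

x = np.random.default_rng(3).normal(size=(4, 5))
third = inter_sentence_interaction(x)
print(third.shape)  # (4, 5)
```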
8. The method of claim 1, wherein the normalizing the text based on the global features of the text to obtain the normalized text comprises:
inputting the global features of the text into a text normalization model, and encoding and decoding the global features of the text by the text normalization model to obtain the normalized text;
wherein the text normalization model is trained by taking the global features of the training text before normalization as training samples, with the training target that the text output by the model approaches the training text after normalization.
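The training setup of claim 8 can be sketched with a toy encoder-decoder: a single linear encoder and decoder stand in for the real network, the pre-normalization global features are the inputs, and a mean-squared-error loss drives the decoded output toward a stand-in post-normalization target representation. All dimensions, learning rate, and the linear architecture are illustrative assumptions, not the patent's model.

```python
import numpy as np

def train_normalization_model(global_feats, target_feats, steps=300, lr=0.05):
    # global_feats: (n, d_in) pre-normalization global features.
    # target_feats: (n, d_out) representations of the normalized text.
    rng = np.random.default_rng(0)
    d_in, d_out = global_feats.shape[1], target_feats.shape[1]
    W_enc = rng.normal(scale=0.1, size=(d_in, d_in))
    W_dec = rng.normal(scale=0.1, size=(d_in, d_out))
    n = len(global_feats)
    for _ in range(steps):
        h = global_feats @ W_enc        # encode
        pred = h @ W_dec                # decode
        err = pred - target_feats       # gap to the normalized target
        g_dec = h.T @ err / n           # MSE gradient w.r.t. decoder
        g_enc = global_feats.T @ (err @ W_dec.T) / n  # ... and encoder
        W_dec -= lr * g_dec
        W_enc -= lr * g_enc
    return W_enc, W_dec

rng = np.random.default_rng(2)
X = rng.normal(size=(16, 4))            # pre-normalization features
Y = X @ rng.normal(size=(4, 4))         # stand-in normalized targets
W_enc, W_dec = train_normalization_model(X, Y)
loss = ((X @ W_enc @ W_dec - Y) ** 2).mean()
```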
9. A text normalization apparatus, the apparatus comprising:
an acquiring unit, configured to acquire a text to be normalized;
a determining unit, configured to determine global features of the text, the global features characterizing the association relationships among the sentences in the text;
and a normalization unit, configured to normalize the text based on the global features of the text to obtain a normalized text.
10. A text normalization apparatus comprising a memory and a processor;
the memory is used for storing programs;
the processor is configured to execute the program to implement the steps of the text normalization method according to any one of claims 1 to 8.
11. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the text normalization method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111108530.4A CN113822019B (en) | 2021-09-22 | 2021-09-22 | Text normalization method, related device and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113822019A true CN113822019A (en) | 2021-12-21 |
CN113822019B CN113822019B (en) | 2024-07-12 |
Family
ID=78915104
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111108530.4A Active CN113822019B (en) | 2021-09-22 | 2021-09-22 | Text normalization method, related device and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113822019B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107180023A (en) * | 2016-03-11 | 2017-09-19 | 科大讯飞股份有限公司 | A kind of file classification method and system |
CN108090099A (en) * | 2016-11-22 | 2018-05-29 | 科大讯飞股份有限公司 | A kind of text handling method and device |
CN110765733A (en) * | 2019-10-24 | 2020-02-07 | 科大讯飞股份有限公司 | Text normalization method, device, equipment and storage medium |
CN111046179A (en) * | 2019-12-03 | 2020-04-21 | 哈尔滨工程大学 | Text classification method for open network question in specific field |
CN111753498A (en) * | 2020-08-10 | 2020-10-09 | 腾讯科技(深圳)有限公司 | Text processing method, device, equipment and storage medium |
WO2020211275A1 (en) * | 2019-04-18 | 2020-10-22 | 五邑大学 | Pre-trained model and fine-tuning technology-based medical text relationship extraction method |
CN111832305A (en) * | 2020-07-03 | 2020-10-27 | 广州小鹏车联网科技有限公司 | User intention identification method, device, server and medium |
CN111831783A (en) * | 2020-07-07 | 2020-10-27 | 北京北大软件工程股份有限公司 | Chapter-level relation extraction method |
WO2020244066A1 (en) * | 2019-06-04 | 2020-12-10 | 平安科技(深圳)有限公司 | Text classification method, apparatus, device, and storage medium |
2021-09-22: CN application CN202111108530.4A filed; patent CN113822019B active
Non-Patent Citations (2)
Title |
---|
RANDA ZARNOUFI ET AL.: "Machine Normalization: Bringing Social Media Text from Non-Standard to Standard Form", ACM TRANS. ASIAN LOW-RESOUR. LANG. INF. PROCESS., vol. 19, no. 4, 30 April 2020 (2020-04-30), pages 49-79 * |
HE WEI ET AL.: "Topic Sentence Extraction from Web Page Text Based on a Sentence Relation Graph", Knowledge Organization and Knowledge Management, no. 3, pages 57-61 * |
Also Published As
Publication number | Publication date |
---|---|
CN113822019B (en) | 2024-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11663411B2 (en) | Ontology expansion using entity-association rules and abstract relations | |
CN107729309B (en) | Deep learning-based Chinese semantic analysis method and device | |
CN113094578B (en) | Deep learning-based content recommendation method, device, equipment and storage medium | |
CN108090038B (en) | Text sentence-breaking method and system | |
CN108416032B (en) | Text classification method, device and storage medium | |
KR102041621B1 (en) | System for providing artificial intelligence based dialogue type corpus analyze service, and building method therefor | |
US20230385549A1 (en) | Systems and methods for colearning custom syntactic expression types for suggesting next best corresponence in a communication environment | |
WO2022178969A1 (en) | Voice conversation data processing method and apparatus, and computer device and storage medium | |
CN111145718A (en) | Chinese mandarin character-voice conversion method based on self-attention mechanism | |
CN111783471B (en) | Semantic recognition method, device, equipment and storage medium for natural language | |
CN112633003A (en) | Address recognition method and device, computer equipment and storage medium | |
CN112992125B (en) | Voice recognition method and device, electronic equipment and readable storage medium | |
CN112215008A (en) | Entity recognition method and device based on semantic understanding, computer equipment and medium | |
CN111695338A (en) | Interview content refining method, device, equipment and medium based on artificial intelligence | |
KR20200119410A (en) | System and Method for Recognizing Emotions from Korean Dialogues based on Global and Local Contextual Information | |
CN112016320A (en) | English punctuation adding method, system and equipment based on data enhancement | |
CN112287106A (en) | Online comment emotion classification method based on dual-channel hybrid neural network | |
CN111159405B (en) | Irony detection method based on background knowledge | |
CN116775873A (en) | Multi-mode dialogue emotion recognition method | |
WO2022073341A1 (en) | Disease entity matching method and apparatus based on voice semantics, and computer device | |
CN111694936B (en) | Method, device, computer equipment and storage medium for identification of AI intelligent interview | |
CN113822019A (en) | Text normalization method, related equipment and readable storage medium | |
Seon et al. | Improving domain action classification in goal-oriented dialogues using a mutual retraining method | |
CN109344388A (en) | A kind of comment spam recognition methods, device and computer readable storage medium | |
CN115062136A (en) | Event disambiguation method based on graph neural network and related equipment thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||