CN110874408B - Model training method, text recognition device and computing equipment

Publication number
CN110874408B
Authority
CN
China
Prior art keywords
text
target
training
semantic
level features
Prior art date
Legal status
Active
Application number
CN201810996981.8A
Other languages
Chinese (zh)
Other versions
CN110874408A (en)
Inventor
任巨伟
赵伟朋
周伟
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd
Priority to CN201810996981.8A
Publication of CN110874408A
Application granted; publication of CN110874408B

Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02D: Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiments of the application provide a model training method, a text recognition method and apparatus, and a computing device. During model training, semantic level features of a target training text are generated based on the semantic information of the target training text, and the word level features and semantic level features of the target training text are fused to obtain its text features; a text recognition model is then trained on these text features. The trained text recognition model can recognize a text to be processed based on its text features, which are likewise obtained by fusing the semantic level features and word level features of that text. Because the embodiments of the application add semantic level features on top of the original word level features, the text recognition model can perform recognition at the semantic level, which improves its recognition accuracy.

Description

Model training method, text recognition device and computing equipment
Technical Field
The embodiment of the application relates to the technical field of computer application, in particular to a model training method, a text recognition device and computing equipment.
Background
With the development of man-machine interaction technology, man-machine dialogue has been widely applied in many scenarios: based on a user's input sentence, the system can intelligently output corresponding response content, so that the exchange resembles a conversation between the user and the device.
In current schemes for man-machine dialogue, the user's input sentence is usually matched against the <Q, A> data in a knowledge base, where Q is a knowledge point, i.e., a standard text expressed in standard terms, and A is the response content corresponding to that knowledge point. A knowledge point matching the user's input sentence can be retrieved from the knowledge base based on similarity, and the corresponding response content can then be found.
Therefore, accurately identifying the knowledge point that matches a user's input sentence is a key technology for improving the accuracy of man-machine dialogue. One existing method uses a machine learning model for recognition. When performing text recognition with a machine learning model, the text must be converted into a vector representation. A common method is to segment the text into words and then encode the words, for example with one-hot encoding, to obtain word level features, which are taken as the text's vector representation and input into the machine learning model. However, word level features often ignore the associations between words, which affects the model's recognition accuracy.
Disclosure of Invention
The embodiment of the application provides a model training method, a text recognition device and computing equipment, which are used for solving the technical problem of low model recognition accuracy in the prior art.
In a first aspect, an embodiment of the present application provides a model training method, including:
determining word level characteristics corresponding to the target training text;
determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and word level features of the target training text to obtain text features of the target training text;
and training a text recognition model by utilizing the text characteristics of the target training text.
In a second aspect, an embodiment of the present application provides a text recognition method, including:
determining word level characteristics of a text to be processed;
determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing word level features and semantic level features of the text to be processed to obtain text features of the text to be processed;
identifying the text to be processed by using a text identification model based on the text characteristics of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing word level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
In a third aspect, an embodiment of the present application provides a model training method, including:
determining character level features of a target training text based on characters of the target training text;
determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and character level features of the target training text to obtain text features of the target training text;
and training a text recognition model by utilizing the text characteristics of the target training text.
In a fourth aspect, an embodiment of the present application provides a text recognition method, including:
determining character level characteristics of a text to be processed based on characters of the text to be processed;
determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing the character level features and the semantic level features of the text to be processed to obtain text features of the text to be processed;
identifying the text to be processed by using a text identification model based on the text characteristics of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing character level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
In a fifth aspect, in an embodiment of the present application, a model training method is provided, including:
determining N-element model level characteristics of a target training text based on N-element word segmentation of the target training text;
determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and N-element model level features of the target training text to obtain text features of the target training text;
and training a text recognition model by utilizing the text characteristics of the target training text.
In a sixth aspect, an embodiment of the present application provides a text recognition method, including:
determining N-element model level characteristics of a text to be processed based on N-element word segmentation of the text to be processed;
determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing the N-element model level features and the semantic level features of the text to be processed to obtain text features of the text to be processed;
identifying the text to be processed by using a text identification model based on the text characteristics of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing N-element model level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
In a seventh aspect, in an embodiment of the present application, there is provided a model training apparatus, including:
the first training feature determining module is used for determining word level features corresponding to the target training text;
the second training feature determining module is used for determining semantic level features of the target training text based on semantic information of the target training text;
the training feature fusion module is used for fusing the semantic level features and the word level features of the target training text to obtain text features of the target training text;
and the model training module is used for training a text recognition model by utilizing the text characteristics of the target training text.
In an eighth aspect, in an embodiment of the present application, there is provided a text recognition apparatus, including:
the first text feature determining module is used for determining word level features of the text to be processed;
the second text feature determining module is used for determining semantic level features of the text to be processed based on semantic information of the text to be processed;
the text feature fusion module is used for fusing word level features and semantic level features of the text to be processed to obtain text features of the text to be processed;
the text recognition module is used for recognizing the text to be processed by using a text recognition model based on the text characteristics of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing word level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
In a ninth aspect, embodiments of the present application provide a computing device, including a processing component and a storage component;
the storage component stores one or more computer instructions; the one or more computer instructions are to be invoked for execution by the processing component;
the processing assembly is configured to:
determining word level characteristics corresponding to the target training text;
determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and word level features of the target training text to obtain text features of the target training text;
and training a text recognition model by utilizing the text characteristics of the target training text.
In a tenth aspect, embodiments of the present application provide a computing device, including a processing component and a storage component;
the storage component stores one or more computer instructions; the one or more computer instructions are to be invoked for execution by the processing component;
the processing assembly is configured to:
determining word level characteristics of a text to be processed;
determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing word level features and semantic level features of the text to be processed to obtain text features of the text to be processed;
identifying the text to be processed by using a text identification model based on the text characteristics of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing word level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
In the embodiments of the application, during model training, semantic level features of a target training text are generated based on the semantic information of the target training text; the word level features and semantic level features of the target training text are fused to obtain text features, and a text recognition model is trained on the text features of the target training text. The trained text recognition model can then recognize a text to be processed based on its text features, which are obtained by fusing the semantic level features and word level features of the text to be processed.
These and other aspects of the present application will be more readily apparent from the following description of the embodiments.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, a brief description will be given below of the drawings that are needed in the embodiments or the prior art descriptions, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 illustrates a flow chart of one embodiment of a model training method provided herein;
FIG. 2 is a schematic diagram of a network structure of a text recognition model in a practical application according to an embodiment of the present application;
FIG. 3 illustrates a flow chart of one embodiment of a text recognition method provided herein;
FIG. 4 illustrates a flow chart of yet another embodiment of a text recognition method provided herein;
FIG. 5 is a schematic diagram of one embodiment of a model training apparatus provided herein;
FIG. 6 illustrates a schematic diagram of one embodiment of a computing device provided herein;
FIG. 7 is a schematic diagram illustrating the construction of one embodiment of a text recognition device provided herein;
fig. 8 illustrates a schematic diagram of a configuration of yet another embodiment of a computing device provided herein.
Detailed Description
In order to enable those skilled in the art to better understand the present application, the following description will make clear and complete descriptions of the technical solutions in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application.
In some of the flows described in the specification, claims, and figures of this application, a number of operations appear in a particular order, but it should be understood that these operations may be executed out of the order in which they appear herein or in parallel. Operation numbers such as 101 and 102 merely distinguish different operations and do not by themselves represent any order of execution. In addition, the flows may include more or fewer operations, which may be executed sequentially or in parallel. Note that the terms "first" and "second" herein distinguish different messages, devices, modules, etc.; they do not represent a sequence, nor do they require that the "first" and "second" be of different types.
The technical scheme of the embodiment of the application can be applied to text recognition application scenes such as text matching, text classification and the like, and the text matching can be applied to scenes such as man-machine dialogue, information retrieval, problem discovery, public opinion monitoring and the like, wherein the man-machine dialogue can comprise intelligent question answering, robot customer service, chat robots and the like, and the text classification can be applied to scenes such as garbage filtering, news classification, part-of-speech labeling, intention recognition and the like.
In the above text recognition scenarios, text recognition is currently performed using a text recognition model, which is a machine learning model, for example one of various neural network models. The input of the text recognition model is a vector representation of the text. In the prior art, this text vector usually consists of the text's word level features, i.e., the words in the text encoded with a scheme such as one-hot. However, word level features are generated from the words of the text under a bag-of-words model: the words are treated as independent of one another, and no relationship between them is considered. As a result, word level features are discrete and sparse, whereas the words of a text in practical applications generally influence one another, so model recognition accuracy suffers.
The inventor found that, because word level features are discrete and sparse and treat words as mutually independent, they lose context information and therefore cannot accurately express the semantic information of a text; core information of the text may be lost, causing inaccurate model recognition. To improve model recognition accuracy, the inventor arrived at the technical solution of this application through a series of studies. In the embodiments of the application, during model training, semantic level features of a target training text are generated based on its semantic information, the word level features and semantic level features of the target training text are fused to obtain text features, and a text recognition model is trained on those text features. The trained text recognition model can then recognize a text to be processed based on its text features, which are obtained by fusing the semantic level features and word level features of the text to be processed. Because the embodiments of the application add semantic level features on top of the original word level features, the text features express the semantic information of the text more accurately, and the recognition accuracy of the text recognition model can therefore be improved.
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
Fig. 1 is a flowchart of one embodiment of a model training method provided in the embodiments of the present application, where the method may include the following steps:
101: and determining word level characteristics corresponding to the target training text.
The training of the text recognition model usually requires a large number of training texts, and the target training text may refer to any training text in the large number of training texts participating in the training text recognition model, that is, each training text needs to extract its text features according to the technical scheme described in the embodiment, and then participates in model training.
The word level features are obtained by encoding the words of the target training text; optionally, the words of the target training text are treated as discrete features and encoded to obtain the word level features. In practice, one-hot encoding can be used, though the specific encoding scheme is not limited in the embodiments of the application: any method that derives a vector representation of the target training text from its words may be adopted.
The words of the target training text can be obtained by segmenting it with a word segmentation technique; the specific segmentation method is the same as in the prior art and is not detailed in the embodiments of the application.
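For illustration only (not part of the patent), the following is a minimal Python sketch of word level feature extraction by one-hot style encoding; the vocabulary, the tokenized input, and the multi-hot bag-of-words form are assumptions.

```python
# A minimal sketch of word-level feature extraction via one-hot style
# encoding (here: a multi-hot bag-of-words vector). The vocabulary and
# tokens below are illustrative placeholders, not the patent's data.
def word_level_features(tokens, vocab):
    """Encode a segmented text as a |vocab|-dimensional 0/1 vector."""
    index = {w: i for i, w in enumerate(vocab)}
    vec = [0.0] * len(vocab)
    for t in tokens:
        if t in index:
            vec[index[t]] = 1.0
    return vec

vocab = ["today", "holiday", "are", "you", "on"]
print(word_level_features(["are", "you", "on", "holiday", "today"], vocab))
# -> [1.0, 1.0, 1.0, 1.0, 1.0]
```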
102: and determining semantic level features of the target training text based on the semantic information of the target training text.
In the embodiments of the application, the semantic information of the target training text is determined, and the semantic level features of the target training text are obtained by encoding based on that semantic information. The semantic level features represent the semantic information of the text. The semantic information expresses the core meaning or concept of the target training text, which may be formed from its core keywords or core phrases and embodies the associations between words; semantic level features determined from this semantic information can therefore express the associations between words.
103: and fusing the semantic level features and the word level features of the target training text to obtain the text features of the target training text.
104: and training a text recognition model by utilizing the text characteristics of the target training text.
In the embodiments of the application, the semantic level features and the word level features are fused to obtain the text features. Because the semantic level features can express the semantic information of the target training text, the text features contain the semantic information established among words rather than features of isolated words. The model obtained by training is therefore more accurate and can perform text recognition at the semantic level, which improves model recognition accuracy.
Optionally, the semantic level features and the word level features may be spliced to obtain the text features of the target training text.
Since both the semantic level features and the word level features are vector representations, they can be concatenated along the feature dimension to obtain the text features.
Of course, the embodiments of the application do not limit the fusion manner: besides splicing, any other feasible fusion method, such as vector addition or multiplication, falls within the scope of the embodiments of the application.
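As an illustrative sketch of the splicing-based fusion described above (the function name and the example vectors are assumptions, not the patent's implementation):

```python
# Sketch of the fusion step: concatenating (splicing) the per-level
# feature vectors along their dimension, as the text describes.
# Vector addition would instead require equal vector lengths.
def fuse_by_concat(*feature_vectors):
    fused = []
    for v in feature_vectors:
        fused.extend(v)
    return fused

semantic = [0.2, 0.8]          # e.g. topic proportions
word_level = [1.0, 0.0, 1.0]   # e.g. multi-hot word vector
print(fuse_by_concat(semantic, word_level))  # [0.2, 0.8, 1.0, 0.0, 1.0]
```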
In a text classification scenario, the text recognition model may be a classifier that recognizes the category or domain to which the text to be processed belongs. The text recognition model may be implemented in a variety of ways, such as with a neural network framework or an SVM (Support Vector Machine).
In a text matching scenario, the text recognition model is used to recognize a target text matching the text to be processed, for example, in a man-machine dialogue scenario, to recognize knowledge points matching the text to be processed. The text recognition model may be a deep learning model implemented with a neural network framework; for example, it may be a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), or a DSSM (Deep Structured Semantic Model).
As a possible implementation manner, the determining, based on the semantic information of the target training text, the semantic level feature of the target training text may include:
determining at least one semantic unit in the target training text;
determining semantic level features of the target training text based on the at least one semantic unit.
The semantic units in the target training text are composed of word groups or phrases in the target training text that can express its core information or concepts.
Optionally, the determining at least one semantic unit in the target training text may include:
determining a plurality of training texts corresponding to the target category to which the target training text belongs;
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one frequent term set corresponding to the target category;
determining at least one target frequent item set for the target training text hit from the at least one frequent item set;
combining the items in each target frequent item set into one semantic unit, so as to obtain at least one semantic unit representing the semantic information of the target training text.
The target class may refer to a type, a field, or other classification characteristics of the target training text in the text classification application scenario.
In a text matching application scenario, the target category may refer to the category corresponding to the standard text matched by the target training text, and the plurality of training texts corresponding to that standard text are texts that match it. In practice, the same meaning usually has multiple expressions: the plurality of training texts and the standard text are descriptive sentences expressing the same semantic information, where the standard text uses professional terminology, such as a knowledge point in a knowledge base in a man-machine dialogue scenario, and the training texts may be colloquial expressions of the meaning conveyed by the standard text. In a man-machine dialogue scenario, the text recognition model of the embodiments of the application can thus recognize sentences input by users in colloquial form, find the knowledge points matching those sentences, and then find the corresponding response content.
In this possible implementation, frequent item set mining is applied to the plurality of training texts. Because the training texts belong to the same category and may contain the same semantic information, word groups or phrases that occur frequently across them can represent their shared semantic information, and at least one frequent item set can be obtained through the mining.
When frequent item set mining is carried out, specifically, each training text is taken as a transaction, and words in the training text are taken as items in the transaction for mining.
Specifically, the FP-Growth (Frequent Pattern Growth) algorithm may be used to mine frequent item sets over the plurality of training texts. FP-Growth is a frequent item set mining algorithm that adopts a divide-and-conquer strategy and mines frequent item sets by constructing an FP-tree (frequent pattern tree) structure.
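As a rough, illustrative stand-in for this mining step (a brute-force enumeration rather than an actual FP-tree implementation; the thresholds and toy data are assumptions):

```python
# Illustrative frequent-item-set mining over training texts treated as
# transactions of words. This brute-force enumeration stands in for the
# FP-Growth algorithm the text names; thresholds here are made up.
from itertools import combinations
from collections import Counter

def frequent_item_sets(transactions, min_count=2, max_size=3):
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in range(2, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {s: c for s, c in counts.items() if c >= min_count}

texts = [["leave", "apply", "today"],
         ["apply", "leave", "how"],
         ["leave", "today", "apply"]]
print(frequent_item_sets(texts))  # e.g. {('apply', 'leave'): 3, ...}
```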
In an optional manner, the mining the frequent item set of the plurality of training texts with the words in the plurality of training texts as terms to obtain at least one frequent item set corresponding to the target category may include:
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one alternative frequent term set;
screening, from the at least one alternative frequent item set, at least one frequent item set whose mining support is greater than a first preset value and whose mining frequency is greater than a second preset value.
The mining frequency may refer to the number of co-occurrences of a plurality of items in the frequent item set in a plurality of training texts, and the second preset value may be 50, for example.
The mining support may refer to the ratio of the number of co-occurrences of the items in a frequent item set to the product of the occurrence counts of each individual item in the plurality of training texts. For example, suppose a frequent item set contains two items A and B, where A occurs a times in the training texts, B occurs b times, and A and B co-occur c times; the mining support is then c/(a×b).
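A small sketch of this support measure, under the assumption of a two-item set {A, B} (names are illustrative):

```python
# Sketch of the support measure described above for a two-item set {A, B}:
# support = c / (a * b), where a and b are the individual occurrence
# counts of A and B across the texts and c is their co-occurrence count.
def mining_support(texts, item_a, item_b):
    a = sum(item_a in t for t in texts)
    b = sum(item_b in t for t in texts)
    c = sum(item_a in t and item_b in t for t in texts)
    return c / (a * b) if a and b else 0.0
```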
In yet another alternative, mining the frequent item set of the plurality of training texts with the words in the plurality of training texts as items to obtain at least one frequent item set corresponding to the target category may include:
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one alternative frequent term set;
screening, from the at least one alternative frequent item set, at least one frequent item set whose number of items is greater than a specific threshold.
Wherein, optionally, the method may be to screen at least one frequent item set with the mining support degree greater than a first preset value, the mining frequency greater than a second preset value, and the number of items greater than a specific threshold from the at least one alternative frequent item set.
To avoid frequent item sets of individual words, the specific threshold may be, for example, 1.
In yet another alternative, mining the frequent item set of the plurality of training texts with the words in the plurality of training texts as items to obtain at least one frequent item set corresponding to the target category may include:
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one alternative frequent term set corresponding to the target category;
determining an Information Gain (IG) of each candidate frequent item set relative to the at least one candidate frequent item set;
at least one frequent item set is obtained by screening from the at least one alternative frequent item set based on the information gain of each alternative frequent item set.
According to the technical solution of the embodiments of the application, at least one alternative frequent item set can be mined from the plurality of training texts under each category, and the same alternative frequent item set may correspond to one or more categories. To distinguish different categories, the embodiments of the application may compute, for each alternative frequent item set, its information gain relative to the at least one alternative frequent item set corresponding to the target category: a larger information gain indicates that the alternative frequent item set is more discriminative for the target category. At least one frequent item set can therefore be selected from the alternative frequent item sets in descending order of information gain.
Specifically, the top-k frequent item sets may be selected, where k is a positive integer greater than 0.
The information gain can be calculated specifically according to the following formula:
IG(T)=H(C)-H(C|T)
where T denotes an alternative frequent item set, H(C) denotes the information entropy of the at least one alternative frequent item set without considering T, and H(C|T) denotes the conditional entropy given T; the difference between the two represents the information gain that T brings to the at least one alternative frequent item set.
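For illustration, a sketch of IG(T) = H(C) - H(C|T) in its standard feature-selection reading, where C is the category label and T is "the text hits the candidate frequent item set"; the data layout below is an assumption, not the patent's:

```python
# Sketch of IG(T) = H(C) - H(C|T): entropy of the category labels minus
# the conditional entropy after splitting texts by whether they hit T.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, contains_t):
    """labels[i]: category of text i; contains_t[i]: True if text i hits T."""
    n = len(labels)
    with_t = [l for l, h in zip(labels, contains_t) if h]
    without_t = [l for l, h in zip(labels, contains_t) if not h]
    h_cond = sum(len(part) / n * entropy(part)
                 for part in (with_t, without_t) if part)
    return entropy(labels) - h_cond

labels = ["leave", "leave", "password"]
hits = [True, True, False]
print(information_gain(labels, hits))  # ~0.918 bits on this toy data
```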
Additionally, in some embodiments, because word level features not only fail to express the contextual relationships between words but also generally lose word order information, optionally, the combining of the items in each target frequent item set into one semantic unit to obtain at least one semantic unit representing the semantic information of the target training text includes:
combining the items in each target frequent item set according to their order of appearance in the target training text, so as to obtain at least one semantic unit representing the semantic information of the target training text.
Further, in some embodiments, determining at least one target frequent item set for the target training text hit from the at least one frequent item set may include:
and selecting the frequent item set, each item of which is contained in the target training text, from the at least one frequent item set as a target frequent item set hit by the target training text so as to obtain at least one target frequent item set.
That is, every item in a target frequent item set hit by the target training text appears in the target training text, each item being a word (term).
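A minimal sketch of the hit-selection and ordered-combination steps together (function and variable names are assumptions):

```python
# Illustrative sketch: select the frequent item sets fully contained in
# the target training text, then combine each hit set's items in their
# order of appearance to form a semantic unit.
def semantic_units(tokens, frequent_sets):
    units = []
    for item_set in frequent_sets:
        if set(item_set) <= set(tokens):                  # every item appears
            ordered = sorted(item_set, key=tokens.index)  # keep word order
            units.append(" ".join(ordered))
    return units

print(semantic_units(["how", "to", "apply", "for", "annual", "leave"],
                     [("apply", "leave"), ("reset", "password")]))
# -> ['apply leave']
```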
In this embodiment of the present application, besides determining the semantic unit of the target training text by using a frequent item set manner, other manners, such as semantic mining, may be used to identify, as the semantic unit, a keyword group or phrase that expresses core information of the target training text.
The target training text may correspond to one or more semantic units, each of which may be composed of a phrase or phrase.
In some embodiments, determining semantic level features of the target training text based on the at least one semantic unit may include:
and taking each semantic unit as a discrete feature and encoding it to obtain the semantic level features of the target training text. For example, one-hot encoding can be used.
The corpus corresponding to the one-hot encoding can be composed of characters, words, and/or phrases, so that it can encode both the words of the target training text and its semantic units.
Furthermore, besides extracting semantic units that represent the semantic information of the target training text, other ways of mining its semantic expression may be adopted. As a further possible implementation, determining the semantic level features corresponding to the semantic information of the target training text may include:
identifying the topic distribution probability corresponding to the target training text by using a topic model;
taking the topic distribution probability as the semantic level features of the target training text.
Semantic association between words can be mined through the topic model, and topic distribution probability can represent semantic information of the target training text.
A topic model can automatically analyze a text: it counts the words in the text and, based on those statistics, determines which topics the text contains and each topic's proportion. The topic distribution probability consists of the proportion value corresponding to each topic.
Since the topic distribution probability includes a proportion value for each topic, it can be expressed as a vector in which each component is the proportion of the target training text attributed to the corresponding topic. The topic distribution probability can therefore be used directly as the semantic level features and fused with the word level features to obtain the text features.
Optionally, the topic model may be implemented using an LDA (Latent Dirichlet Allocation) model.
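For illustration, one possible way to obtain the topic distribution probability is scikit-learn's LDA implementation; the corpus and topic count below are assumptions:

```python
# Sketch of deriving semantic-level features from a topic model, using
# scikit-learn's LDA as one possible implementation of the LDA model
# mentioned above. The corpus and n_components are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = ["how do i apply for leave today",
          "is today a public holiday",
          "how do i reset my password"]
counts = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
topic_distribution = lda.transform(counts)  # one probability vector per text
print(topic_distribution[0])  # usable directly as the semantic-level feature
```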
Additionally, to further improve model recognition accuracy, optionally, in some embodiments, the method may further include:
determining character level features of the target training text based on characters of the target training text;
if the target training text is Chinese, the character may refer to a single word in the target training text. If the target training text is a language consisting of letters, such as english, the character may refer to a single letter or a combination of letters that cut the word, etc.
The fusing the semantic level features and the word level features of the target training text to obtain text features of the target training text may include:
and fusing the semantic level features, the character level features and the word level features of the target training text to obtain the text features of the target training text.
In certain embodiments, the method may further comprise:
determining N-gram level features of the target training text based on N-gram segmentation of the target training text;
Here, N-Gram is a language model that assumes the occurrence of the N-th word in a text depends only on the preceding N-1 words and on no other words. N may be an integer of 2 or more; binary Bi-Grams and ternary Tri-Grams are typical.
Using the N-gram model, the target training text can be segmented into N-gram segments.
For example, taking the binary Bi-Gram, the text "你今天放假吗" ("are you on holiday today") is segmented in turn into the bigrams "你今", "今天", "天放", "放假", "假吗".
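A one-line sketch of this sliding-window segmentation (illustrative, not the patent's implementation):

```python
# Sketch of N-gram segmentation as sliding windows of N characters,
# matching the Bi-Gram example above.
def char_ngrams(text, n=2):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(char_ngrams("你今天放假吗", 2))
# ['你今', '今天', '天放', '放假', '假吗']
```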
The fusing the semantic level features and the word level features of the target training text to obtain text features of the target training text may include:
fusing the semantic level features, N-gram level features, and word level features of the target training text to obtain the text features of the target training text.
In certain embodiments, the method may further comprise:
determining character level features of the target training text based on characters of the target training text;
determining N-gram level characteristics of the target training text based on N-gram word segmentation of the target training text;
the step of fusing the semantic level features and the word level features of the target training text to obtain the text features of the target training text comprises the following steps:
and fusing the semantic level features, the character level features, the N-gram level features and the word level features of the target training text to obtain the text features of the target training text.
By adding character level features and N-gram level features, the accuracy of text features can be further ensured, so that the text can be more accurately represented.
In addition, based on the character level features, the embodiment of the application also provides a model training method, which can include:
determining character level features of a target training text based on characters of the target training text;
Determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and character level features of the target training text to obtain text features of the target training text;
and training a text recognition model by utilizing the text characteristics of the target training text.
That is, for the target training text, its character level features and semantic level features can be extracted and fused to obtain the text features. In particular, when the target training text is Chinese, the characters are single Chinese characters; because Chinese word segmentation can be performed in many ways and its results are not fully controllable, character level features obtained from single-character encoding can be used.
In addition, based on the N-gram level features, the embodiment of the application also provides a model training method, which can comprise the following steps:
determining N-gram level characteristics of a target training text based on N-gram word segmentation of the target training text;
determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and N-gram level features of the target training text to obtain text features of the target training text;
And training a text recognition model by utilizing the text characteristics of the target training text.
Namely, aiming at a target training text, the N-gram level features and the semantic level features of the target training text can be extracted, and the N-gram level features and the semantic level features are fused to obtain text features.
Of course, as another embodiment, the semantic level feature, the character level feature and the N-gram level feature of the target training text may be fused to obtain the text feature of the target training text, and then the text feature of the target training text may be utilized to train the text recognition model.
That is, in the embodiment of the present application, the semantic level feature of the target training text may be fused with one or more of the word level feature, the character level feature, and the N-gram level feature of the target training text, so as to obtain the text feature of the target training text, and further, the text recognition model may be trained by using the text feature of the target training text. The semantic level features are determined in the same way, see in particular above.
Alternatively, the determining the word level feature of the target training text may refer to:
taking words of the target training text as discrete features, and obtaining word level features by encoding;
The determining the character level feature of the target training text based on the character of the target training text may refer to:
taking the characters of the target training text as discrete features, and encoding to obtain character level features;
the determining the N-gram level feature of the target training text based on the N-gram word segmentation of the target training text may refer to:
and taking the N-element segmentation of the target training text as discrete features, and encoding to obtain N-gram level features.
The encoding method may specifically be one-hot encoding or the like.
In an actual application, the technical scheme of the embodiment of the application can be applied to a text matching scene, such as searching a knowledge point matched with a sentence input by a user in a man-machine conversation scene.
In a text matching scenario, a text recognition model is used to find target text from a text library that matches the text to be processed.
The text library stores a large number of standard texts. The text recognition model can be implemented with a DSSM model or the like, and during model training, to improve training accuracy, the text features of the standard texts can also be used as training samples for the text recognition model.
Thus, in certain embodiments, the method may further comprise:
determining a predetermined number of standard texts in a text library;
determining word level features of each standard text;
determining semantic level features of each standard text based on the semantic information of each standard text;
fusing the semantic level features and word level features of each standard text to obtain text features of each standard text;
Training the text recognition model using the text features of the target training text then includes:
taking the text features of the target training text and the text features of the predetermined number of standard texts as input samples, and the matching probabilities between the target training text and the predetermined number of standard texts as training targets, and training to obtain the text recognition model;
the text recognition model is used for recognizing target text matched with the text to be processed from the text library.
The text library can comprise a large amount of standard texts, and in a man-machine conversation scene, the text library can be a knowledge base, namely knowledge points in the knowledge base, and each knowledge point corresponds to one response content.
The target training text is a training text that semantically matches one of the predetermined number of standard texts. During model training, the matching probability between the target training text and its semantically matched standard text can be set to 1, and its matching probability with the other standard texts set to 0.
That is, suppose the predetermined number of standard texts is n, denoted D1, D2, ..., Dn, and each standard text corresponds to a plurality of training texts. If the target training text M matches the standard text D2, its matching probability with D2 is 1 and with the other standard texts is 0, which can be expressed as M → [0, 1, 0, 0, 0, 0, 0, 0, 0].
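A trivial sketch of constructing this training target (names are assumptions):

```python
# Sketch of building the training target: a one-hot matching-probability
# vector over the candidate standard texts, 1 at the matched text's index.
def matching_probabilities(num_candidates, matched_index):
    probs = [0.0] * num_candidates
    probs[matched_index] = 1.0
    return probs

print(matching_probabilities(9, 1))  # M matches D2 -> [0, 1, 0, ..., 0]
```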
Wherein the predetermined number of standard texts in the text library may refer to all standard texts in the text library.
Of course, to improve model training performance, in some embodiments, the determining the predetermined number of standard texts in the text library may include:
and screening a preset number of standard texts with similarity meeting the similarity condition with the target training text from a text library.
The similarity condition may, for example, mean that the similarity is greater than a similarity threshold.
Alternatively, a predetermined number of standard texts may be screened, in particular, in order of high similarity to the target training text.
The similarity between the target training text and a standard text may be computed using measures such as the edit distance between words, cosine distance, or Euclidean distance; these are the same as in the prior art and are not described further here.
Further, in some implementations, the predetermined number may be 1, in which case the predetermined number of standard texts is simply the one standard text that the target training text matches, so determining the predetermined number of standard texts in the text library may include:
a standard text in the text library is determined for which the target training text matches.
Further, as an alternative, determining the semantic level features of each standard text based on the semantic information of each standard text may include:
determining, from the at least one frequent item set corresponding to each standard text, at least one target frequent item set hit by that standard text;
combining the items in each target frequent item set hit by each standard text into one semantic unit, so as to obtain at least one semantic unit representing the semantic information of each standard text;
determining the semantic level features of each standard text based on the at least one semantic unit corresponding to it.
The at least one frequent item set corresponding to each standard text refers to at least one frequent item set corresponding to a category to which each standard text belongs.
The at least one frequent item set corresponding to the category to which each standard text belongs is obtained by mining frequent item sets over the plurality of training texts corresponding to that category, i.e., by taking the words in those training texts as items and performing frequent item set mining on them.
As another alternative, determining semantic level features for each standard text based on semantic information for each standard text may include:
identifying topic distribution probability corresponding to each standard text by using a topic model;
and taking the topic distribution probability corresponding to each standard text as the semantic level characteristic of each standard text.
Further, optionally, character level features and/or N-gram level features, etc., of each standard text may also be determined.
Thus, the semantic level feature, the character level feature, the N-gram level feature, and the word level feature of each standard text may be fused to obtain the text feature of each standard text.
For ease of understanding, the model training process is described below using a human-machine dialogue scenario as an example and using a text recognition model as a DSSM model as an example.
Fig. 2 is a schematic diagram of a network structure of a DSSM model, where the DSSM model is composed of an input layer, one or more hidden layers, and an output layer, and in an alternative manner, two hidden layers may be constructed, such as the input layer x, the hidden layers L1 and L2, and the output layer y shown in fig. 2.
The knowledge base stores <Q, A> data, where Q is a knowledge point; in an intelligent question-answering scenario, Q is a standard user question.
When model training is performed, a plurality of training texts corresponding to each knowledge point are needed to be prepared first, and the plurality of training texts corresponding to each knowledge point are a plurality of expression sentences with the same meaning but different description modes. Thus, each training text is assigned to one standard text, and multiple training texts assigned to the same standard text are assigned to the same category.
Because the training texts are used one by one during model training, the currently input training text is called the target training text for ease of description.
The samples participating in the DSSM model training include the target training text and a predetermined number of standard texts corresponding to the target training text.
The predetermined number of standard texts at least comprises one standard text matched with the target training text, and the predetermined number of standard texts can be screened from the knowledge base according to the order of the similarity with the target training text from high to low.
In this practical application, as shown in fig. 2, suppose the predetermined number of standard texts is 4, denoted D1, D2, D3, and D4, and the standard text matched by the target training text M is D4. During model training, the matching probability of the target training text M with the standard text D4 is set to 1, and its matching probability with the standard texts D1, D2, and D3 may be set to 0.
First, the semantic level feature A1 corresponding to the semantic information of the target training text is determined. Using frequent item set mining, the plurality of training texts corresponding to the target category to which the target training text belongs is determined; the words in those training texts are taken as items, and frequent item set mining over them yields at least one frequent item set corresponding to the target category. The target category is the category corresponding to the standard text matched by the target training text, and the plurality of training texts corresponding to the target category includes the target training text.
Then, from the at least one frequent item set, at least one target frequent item set hit by the target training text can be determined, and the items in each target frequent item set are combined into one semantic unit, yielding at least one semantic unit representing the semantic information of the target training text.
The semantic units of the target training text are then taken as discrete features and encoded, for example in one-hot fashion, to obtain the semantic level feature A1.
Next, the word level feature A2, character level feature A3, and N-gram level feature A4 of the target training text are determined and fused with the semantic level feature to obtain the text feature Q[mul-gram], which is input to the input layer x as an input sample.
Specifically, the features at all levels are vector-spliced, the text feature obtained by splicing is vector-compressed, and the compressed text feature is input to the input layer x.
For each standard text, the corresponding semantic level, word level, character level, and N-gram level features are determined in the same way and fused to obtain the text feature of that standard text, such as D1[mul-gram], D2[mul-gram], D3[mul-gram], and D4[mul-gram] in FIG. 2, which are input to the input layer x as input samples.
During model training, the matching probabilities P(Di|M), where i = 1, 2, 3, 4, serve as the training targets, and the DSSM model is trained to obtain its model coefficients.
The model coefficients include, for example, the (Wj, bj) shown in FIG. 2, where Wj denotes the weight matrix from the j-th network layer to the (j+1)-th network layer and bj denotes the corresponding bias, with j = 1, 2, 3. The 1st network layer is the input layer, the 2nd and 3rd network layers are the hidden layers, and the 4th network layer is the output layer.
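For illustration, a minimal numpy sketch of a DSSM-style forward pass consistent with the structure described above: each fused text feature passes through shared layers (Wj, bj), and P(Di|M) is computed as a softmax over cosine similarities of the resulting semantic vectors. The dimensions, tanh activation, and random data are assumptions, not the patent's specification.

```python
# Minimal DSSM-style forward pass: shared layers map each text feature
# to a semantic vector; matching probabilities come from a softmax over
# cosine similarities between the query and each candidate.
import numpy as np

rng = np.random.default_rng(0)
dims = [32, 16, 8]  # input layer -> hidden L1 -> hidden L2 (semantic vector)
weights = [rng.standard_normal((dims[j], dims[j + 1])) * 0.1 for j in range(2)]
biases = [np.zeros(dims[j + 1]) for j in range(2)]

def semantic_vector(x):
    for W, b in zip(weights, biases):
        x = np.tanh(x @ W + b)
    return x

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

M = rng.random(32)                      # fused text features of query M
D = [rng.random(32) for _ in range(4)]  # fused features of D1..D4
q = semantic_vector(M)
sims = np.array([cosine(q, semantic_vector(d)) for d in D])
probs = np.exp(sims) / np.exp(sims).sum()  # P(Di | M), i = 1..4
print(probs)
```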
The text recognition model obtained with the above model training scheme is deployed on a server and can then perform text recognition. As shown in fig. 3, the embodiment of the present application further provides a text recognition method, which may include the following steps:
301: word level features of the text to be processed are determined.
302: and determining semantic level features of the text to be processed based on the semantic information of the text to be processed.
303: and fusing the word level features and the semantic level features of the text to be processed to obtain the text features of the text to be processed.
304: and identifying the text to be processed by using a text identification model based on the text characteristics of the text to be processed.
The text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing word level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
The specific training manner of the text recognition model may be referred to in the above embodiments, and will not be described herein.
In this embodiment, because the text recognition model is trained on the text features of training texts, and those text features fuse in the semantic level features corresponding to the training texts' semantic information, the model can recognize the text to be processed at the semantic level rather than considering word level features alone, which improves the model's recognition accuracy.
Further, as yet another embodiment, the text recognition model is obtained based on text feature training of the training text; when the text features of the training text are obtained by fusion of the character level features and the semantic level features of the training text, the embodiment of the application also provides a text recognition method, which comprises the following steps:
determining character level characteristics of a text to be processed based on characters of the text to be processed;
Determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing the character level features and the semantic level features of the text to be processed to obtain text features of the text to be processed;
and recognizing the text to be processed by using a text recognition model based on the text features of the text to be processed.
This embodiment differs from the embodiment in which the text features are obtained by fusing word level features and semantic level features only in that the text features here are obtained by fusing character level features and semantic level features; other identical or corresponding steps will not be repeated.
Further, as yet another embodiment, the text recognition model is obtained based on text feature training of the training text; the text features of the training text are obtained by fusing N-element model level features and semantic level features of the training text; when the semantic level features of the training text are obtained based on the semantic information of the training text, the embodiment of the application also provides a text recognition method, which comprises the following steps:
determining N-element model level characteristics of a text to be processed based on N-element word segmentation of the text to be processed;
Determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing the N-element model level features and the semantic level features of the text to be processed to obtain text features of the text to be processed;
and recognizing the text to be processed by using a text recognition model based on the text features of the text to be processed.
This embodiment differs from the embodiment in which the text features are obtained by fusing word level features and semantic level features only in that the text features here are obtained by fusing N-gram level features and semantic level features; other identical or corresponding steps will not be repeated.
In addition, if the text feature of the training text is obtained by fusing the semantic level feature with one or more of the word level feature, the character level feature and the N-gram level feature, then one or more of the word level feature, the character level feature and the N-gram level feature of the text to be processed can be determined correspondingly and fused with the semantic level feature of the text to be processed to obtain the text feature of the text to be processed. Further, based on the text feature of the text to be processed, the text to be processed can be recognized by using the text recognition model.
As an alternative, the determining, based on the semantic information of the text to be processed, the semantic level feature of the text to be processed may include:
determining at least one candidate frequent item set hit by the text to be processed from at least one frequent item set respectively corresponding to different categories;
combining the items in each candidate frequent item set hit by the text to be processed to obtain one semantic unit, so as to obtain at least one semantic unit representing the semantic information of the text to be processed;
determining semantic level features of the text to be processed based on at least one semantic unit of the text to be processed.
Wherein, in the text matching scenario, each category corresponds to one standard text.
Alternatively, combining the items in each candidate frequent item set hit by the text to be processed to obtain one semantic unit may specifically be:
combining the items in each candidate frequent item set hit by the text to be processed according to their order of appearance in the text to be processed to obtain one semantic unit, so as to obtain at least one semantic unit representing the semantic information of the text to be processed.
Optionally, determining the at least one candidate frequent item set hit by the text to be processed from the at least one frequent item set respectively corresponding to different categories may include:
and selecting, from the at least one frequent item set corresponding to each different category, the frequent item sets whose every item is contained in the text to be processed as the candidate frequent item sets hit by the text to be processed, so as to obtain at least one candidate frequent item set.
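A brief sketch of the hit test and the order-preserving combination (the function names are illustrative, not the patent's):

```python
def candidate_hits(tokens, frequent_item_sets):
    """Keep the frequent item sets whose every item occurs in the text."""
    token_set = set(tokens)
    return [items for items in frequent_item_sets if set(items) <= token_set]

def to_semantic_unit(tokens, items):
    """Combine the items of one hit item set in their order of first
    appearance in the text, yielding a single semantic unit."""
    return " ".join(sorted(items, key=tokens.index))

tokens = ["cancel", "my", "order", "please"]
hits = candidate_hits(tokens, [{"cancel", "order"}, {"refund", "order"}])
units = [to_semantic_unit(tokens, h) for h in hits]  # ['cancel order']
```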
As another alternative, the determining, based on the semantic information of the text to be processed, the semantic level feature of the text to be processed may include:
identifying topic distribution probability corresponding to the text to be processed by using a topic model;
and taking the topic distribution probability corresponding to the text to be processed as the semantic level characteristic of the text to be processed.
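Where the topic-model alternative is used, the topic distribution probability can be produced, for example, with an LDA implementation; the corpus and number of topics below are purely illustrative:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Illustrative corpus; in the patent's setting these would be training texts.
corpus = ["how do i cancel my order",
          "cancel the order i placed yesterday",
          "where is my refund for the returned item"]
X = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
topic_dist = lda.transform(X)  # per-text topic distribution probabilities
print(topic_dist[0])           # used directly as the semantic level feature
```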
Wherein the text recognition model may be used to recognize categories, tags or other characteristics of the text to be processed, etc.
Of course, in a text matching scenario, the text recognition model is used to recognize target text that matches the text to be processed.
The text recognition model is specifically obtained by training based on text features of training texts and text features of standard texts matched with the training texts.
Thus, in some embodiments, the identifying the text to be processed using a text recognition model based on the text characteristics of the text to be processed may include:
Determining a predetermined number of standard texts in a text library;
determining word level features of each standard text;
determining semantic level features of each standard text based on the semantic information of each standard text;
fusing the semantic level features and word level features of each standard text to obtain text features of each standard text;
and inputting the text characteristics of the text to be processed and the text characteristics of each of the predetermined number of standard texts into the text recognition model to determine target texts matched with the text to be processed.
As an alternative, the predetermined number of standard texts in the text library may refer to all standard texts.
Of course, in order to optimize performance, as another alternative, the determining the predetermined number of standard texts in the text library includes:
and determining a predetermined number of standard texts in a text library, wherein the similarity between the standard texts and the text to be processed meets the similarity condition.
The similarity condition may, for example, mean that the similarity is greater than a similarity threshold.
Alternatively, a predetermined number of standard texts may be screened in order of the similarity with the text to be processed from high to low.
The similarity between a standard text and the text to be processed may be calculated, for example, by an edit distance between words, a cosine distance, a Euclidean distance, and the like, in the same manner as in the prior art, and will not be described herein.
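A sketch of this screening step using cosine similarity (the vectorization of the texts, the candidate count and the threshold value are assumptions):

```python
import numpy as np

def screen_standard_texts(query_vec, standard_vecs, top_n=4, threshold=0.3):
    """Return the indices of at most top_n standard texts whose cosine
    similarity to the text to be processed exceeds the threshold,
    in descending order of similarity."""
    sims = standard_vecs @ query_vec / (
        np.linalg.norm(standard_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    ranked = np.argsort(-sims)[:top_n]
    return [int(i) for i in ranked if sims[i] > threshold]
```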
As an alternative, determining the semantic level features of each standard text based on the semantic information of each standard text may include:
determining at least one target frequent item set hit by each standard text from at least one frequent item set corresponding to each standard text;
combining the items in each target frequent item set of each standard text hit to obtain a semantic unit so as to obtain at least one semantic unit representing semantic information of each standard text;
the semantic level features of each standard text are determined based on at least one semantic unit corresponding to each standard text.
The at least one frequent item set corresponding to each standard text refers to at least one frequent item set corresponding to a category to which each standard text belongs.
At least one frequent item set corresponding to the category to which each standard text belongs is obtained by performing frequent item set mining on a plurality of training texts corresponding to the category to which each standard text belongs, namely, words in the plurality of training texts are used as items, and frequent item set mining is performed on the plurality of training texts.
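For illustration only, a brute-force version of this per-category mining is sketched below; a real system would use a pruned Apriori or FP-growth pass, and the support threshold is an assumption:

```python
from itertools import combinations
from collections import Counter

def mine_frequent_item_sets(category_texts, min_support=2, max_items=3):
    """Treat the words of each training text in the category as items and
    keep every item set occurring in at least min_support training texts."""
    counts = Counter()
    for text in category_texts:
        words = sorted(set(text.split()))
        for k in range(1, max_items + 1):
            for combo in combinations(words, k):
                counts[combo] += 1
    return [set(c) for c, n in counts.items() if n >= min_support]

sets_ = mine_frequent_item_sets(
    ["cancel my order", "please cancel the order", "order arrived broken"])
```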
As another alternative, determining semantic level features for each standard text based on semantic information for each standard text may include:
Identifying topic distribution probability corresponding to each standard text by using a topic model;
and taking the topic distribution probability corresponding to each standard text as the semantic level characteristic of each standard text.
Further, optionally, character level features and/or N-gram level features, etc., of each standard text may also be determined.
Thus, the semantic level feature, the character level feature, the N-gram level feature, and the word level feature of each standard text may be fused to obtain the text feature of each standard text.
In some embodiments, determining the at least one candidate frequent item set hit by the text to be processed from the at least one frequent item set respectively corresponding to different categories may include:
and determining at least one candidate frequent item set hit by the text to be processed from the at least one frequent item set respectively corresponding to the predetermined number of standard texts.
Each standard text and the plurality of training texts corresponding to it belong to the same category, and each category corresponds to one standard text. Performing frequent item set mining on the plurality of training texts corresponding to the same category is therefore performing frequent item set mining on the plurality of training texts corresponding to the same standard text, so the at least one frequent item set corresponding to each category is also the at least one frequent item set corresponding to each standard text.
In addition, to further improve recognition accuracy, in some embodiments, the method may further include:
determining character level characteristics of the text to be processed based on the characters of the text to be processed;
determining N-gram level characteristics of the text to be processed based on N-gram word segmentation of the text to be processed;
the fusing the semantic level features and the word level features of the text to be processed to obtain text features of the text to be processed may include:
and fusing the semantic level features, the character level features, the N-gram level features and the word level features of the text to be processed to obtain the text features of the text to be processed.
Wherein determining the character level feature of the text to be processed based on the character of the text to be processed may include:
taking characters of a text to be processed as discrete features, and encoding to obtain character level features;
determining the N-gram level feature of the text to be processed based on the N-gram word segmentation of the text to be processed may include:
taking the N-gram segmentation of the text to be processed as discrete features and encoding to obtain the N-gram level features;
the determining word level characteristics of the text to be processed may include:
And taking the words of the text to be processed as discrete features, and encoding to obtain the word level features.
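The patent only requires that words, characters and N-gram segments be encoded as discrete features; one common realization, shown here as an assumed sketch, is the hashing trick, which avoids storing an explicit vocabulary:

```python
import numpy as np

def hashed_discrete_features(units, dim=2**16):
    """Multi-hot encode discrete units (words, characters, or N-gram
    segments) into a fixed-size vector via the hashing trick.
    Note: Python's built-in hash is salted per process; a stable hash
    (e.g. from hashlib) would be used in a real deployment."""
    vec = np.zeros(dim, dtype=np.float32)
    for u in units:
        vec[hash(u) % dim] = 1.0
    return vec

text = "cancel my order"
word_feats = hashed_discrete_features(text.split())
char_feats = hashed_discrete_features(list(text.replace(" ", "")))
ngram_feats = hashed_discrete_features(
    [text[i:i + 2] for i in range(len(text) - 1)])  # 2-grams as one choice of N
```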
In practical application, the text recognition model may specifically be the DSSM model shown in FIG. 2. Based on the DSSM model shown in FIG. 2, in a man-machine conversation scenario the text to be processed is a user input sentence; as shown in FIG. 4, the text recognition method may include the following steps:
401: a user input statement is received.
402: word level features, character level features, and N-gram level features of the user input sentence are determined.
Alternatively, the words of the user input sentence may be taken as discrete features and encoded to obtain the word level features.
Alternatively, the characters of the user input sentence may be taken as discrete features and encoded to obtain the character level features.
Alternatively, the N-gram segmentation of the user input sentence may be taken as discrete features and encoded to obtain the N-gram level features.
403: based on semantic information of the user input sentence, semantic level features of the user input sentence are determined.
Optionally, at least one semantic unit corresponding to the user input sentence may be determined; and taking the semantic units of the user input sentences as discrete features, and encoding to obtain semantic level features.
404: and fusing word level features, character level features, N-gram level features and semantic level features of the user input sentence to obtain text features of the user input sentence.
405: a predetermined number of knowledge points from the knowledge base that satisfy a similar condition as the user input statement are screened.
Alternatively, a predetermined number of knowledge points may be selected in order of high similarity to the user input sentence.
406: for each knowledge point, word level features, character level features, and N-gram level features for each knowledge point are determined.
The word level feature, the character level feature and the N-gram level feature of each knowledge point are determined in the same manner as the word level feature, the character level feature and the N-gram level feature of the sentence input by the user, and the difference is only that the text content is different, which is not described herein.
407: based on the semantic information of each knowledge point, a semantic level feature of each knowledge point is determined.
The semantic level feature of each knowledge point is determined in the same manner as the semantic level feature of the user input sentence; the only difference is the text content, and the description is omitted here.
408: and fusing the semantic level features, the word level features, the character level features and the N-gram level features of each knowledge point to obtain text features.
409: the text features of the user input sentence, and the text features of each of the predetermined number of knowledge points are input into the text recognition model to determine a target knowledge point that matches the user input sentence.
Optionally, when the text recognition model is a DSSM model, the matching probabilities between the user input sentence and the predetermined number of knowledge points can be calculated by the text recognition model, and the knowledge point with the largest matching probability is taken as the target knowledge point.
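A sketch of this selection step, using the softmax over smoothed cosine similarities that is conventional for DSSM (the smoothing factor gamma is an assumption):

```python
import numpy as np

def pick_target_knowledge_point(query_vec, kp_vecs, gamma=10.0):
    """Compute DSSM-style matching probabilities over the candidate
    knowledge points and return the index of the best match."""
    sims = np.array([
        query_vec @ d / (np.linalg.norm(query_vec) * np.linalg.norm(d) + 1e-9)
        for d in kp_vecs])
    probs = np.exp(gamma * sims)
    probs /= probs.sum()                 # matching probabilities P(D_i|Q)
    return int(np.argmax(probs)), probs  # target knowledge point and P values
```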
410: and feeding back the response content corresponding to the target knowledge point to the user.
In the man-machine conversation scene, the model identification accuracy can be improved, so that the man-machine conversation accuracy can be improved, the man-machine conversation effect can be ensured, and the user experience is improved.
Fig. 5 is a schematic structural diagram of an embodiment of a model training apparatus provided in an embodiment of the present application, where the apparatus may include:
a first training feature determining module 501, configured to determine word level features corresponding to a target training text;
A second training feature determining module 502, configured to determine semantic level features of the target training text based on semantic information of the target training text;
a training feature fusion module 503, configured to fuse semantic level features and word level features of the target training text, so as to obtain text features of the target training text;
optionally, the training feature fusion module may specifically splice the semantic level features and the word level features of the target training text to obtain text features of the target training text.
Model training module 504 is configured to train a text recognition model using text features of the target training text.
As a possible implementation manner, the second training feature determining module may be specifically configured to:
determining a plurality of training texts corresponding to the target categories to which the target training texts belong;
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one frequent term set corresponding to the target category;
determining, from the at least one frequent item set, at least one target frequent item set hit by the target training text;
Combining the items in each target frequent item set to obtain a semantic unit so as to obtain at least one semantic unit representing the semantic information of the target training text;
determining semantic level features of the target training text based on the at least one semantic unit.
As another possible implementation manner, the second training feature determining module may be specifically configured to:
identifying topic distribution probability corresponding to the target training text by using a topic model;
and taking the topic distribution probability corresponding to the target training text as the semantic level characteristic of the target training text.
In some embodiments, the manner in which the second training feature determining module takes the words in the plurality of training texts as items and performs frequent item set mining on the plurality of training texts to obtain the at least one frequent item set corresponding to the target category may specifically be:
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one alternative frequent term set corresponding to the target category;
determining an information gain of each alternative frequent item set relative to the at least one alternative frequent item set;
At least one frequent item set is obtained by screening from the at least one alternative frequent item set based on the information gain of each alternative frequent item set.
In some embodiments, the manner in which the second training feature determining module takes the words in the plurality of training texts as items and performs frequent item set mining on the plurality of training texts to obtain the at least one frequent item set corresponding to the target category may specifically be:
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one alternative frequent term set corresponding to the target category;
and screening at least one frequent item set with the number of items greater than a specific threshold from the at least one alternative frequent item set.
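Sketches of the two screening strategies described above (the gain table, the item-count threshold and top_k are illustrative assumptions):

```python
def screen_by_item_count(candidate_sets, min_items=2):
    """Keep alternative frequent item sets with more than min_items items."""
    return [s for s in candidate_sets if len(s) > min_items]

def screen_by_information_gain(candidate_sets, info_gain, top_k=100):
    """Keep the alternative sets with the highest information gain relative
    to the whole candidate pool; info_gain maps a frozenset to its gain."""
    ranked = sorted(candidate_sets,
                    key=lambda s: info_gain[frozenset(s)], reverse=True)
    return ranked[:top_k]
```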
In some embodiments, the manner in which the second training feature determining module combines the items in each target frequent item set to obtain one semantic unit, so as to obtain at least one semantic unit representing the semantic information of the target training text, may specifically be:
combining the items in each target frequent item set according to the appearance sequence in the target training text to obtain a semantic unit so as to obtain at least one semantic unit representing the semantic information of the target training text.
In some embodiments, the manner in which the second training feature determining module determines, from the at least one frequent item set, at least one target frequent item set hit by the target training text may specifically be:
and selecting the frequent item set, each item of which is contained in the target training text, from the at least one frequent item set as a target frequent item set hit by the target training text so as to obtain at least one target frequent item set.
In some embodiments, the apparatus may further comprise:
the third training feature determining module is used for determining character level features of the target training text based on the characters of the target training text;
the fourth training feature determining module is used for determining N-gram level features of the target training text based on N-gram word segmentation of the target training text;
in some embodiments, the training feature fusion module may be specifically configured to fuse semantic level features, character level features, N-gram level features, and word level features of the target training text to obtain text features of the target training text.
Optionally, the third training feature determining module may specifically take characters of the target training text as discrete features, and encode to obtain character level features;
The fourth training feature determining module may specifically take the N-gram segmentation of the target training text as a discrete feature, and encode the N-gram level feature.
The first training feature determining module may specifically take the word of the target training text as a discrete feature, and encode to obtain a word level feature.
In some embodiments, the apparatus may further comprise:
the first standard feature extraction module is used for determining a preset number of standard texts in the text library; determining word level features of each standard text; determining semantic level features of each standard text based on the semantic information of each standard text; fusing the semantic level features and word level features of each standard text to obtain text features of each standard text;
the model training module is specifically configured to take text features of the target training text and text features of the predetermined number of standard texts as input samples, and take matching probabilities of the target training text and the predetermined number of standard texts as training results, so as to train and obtain a text recognition model;
the text recognition model is used for recognizing target text matched with the text to be processed from the text library.
Optionally, in determining the predetermined number of standard texts in the text library, the first standard feature extraction module may specifically screen, from the text library, a predetermined number of standard texts whose similarity with the target training text satisfies the similarity condition.
The model training apparatus shown in fig. 5 may perform the model training method described in the embodiment shown in fig. 1, and its implementation principle and technical effects are not repeated. The specific manner in which the respective modules and units of the model training apparatus in the above embodiment perform operations has been described in detail in the embodiments related to the method, and will not be described in detail here.
In addition, the embodiment of the application also provides a model training device, which comprises:
the third training feature determining module is used for determining character level features of the target training text based on characters of the target training text;
the second training feature determining module is used for determining semantic level features of the target training text based on semantic information of the target training text;
the training feature fusion module is used for fusing the semantic level features and the character level features of the target training text to obtain text features of the target training text;
And the model training module is used for training a text recognition model by utilizing the text features of the target training text.
In addition, the embodiment of the application also provides a model training device, which comprises:
the fourth training feature determining module is used for determining the N-element model level feature of the target training text based on N-element word segmentation of the target training text;
the second training feature determining module is used for determining semantic level features of the target training text based on semantic information of the target training text;
the training feature fusion module is used for fusing the semantic level features and the N-element model level features of the target training text to obtain text features of the target training text;
and the model training module is used for training a text recognition model by utilizing the text features of the target training text.
In one possible design, the model training apparatus of the embodiment shown in FIG. 5 may be implemented as a computing device, as shown in FIG. 6, which may include a storage component 601 and a processing component 602;
the storage component 601 stores one or more computer instructions; the one or more computer instructions are to be invoked for execution by the processing component 602;
The processing component 602 is configured to:
determining word level characteristics corresponding to the target training text;
determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and word level features of the target training text to obtain text features of the target training text;
and training a text recognition model by utilizing the text characteristics of the target training text.
Wherein the processing component 602 may include one or more processors to execute computer instructions to perform all or part of the steps of the methods described above. Of course, the processing component may also be implemented as one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic elements for executing the methods described above.
The storage component 601 is configured to store various types of data to support operations in the computing device. The storage component may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
Of course, the computing device may also include other components, such as input/output interfaces, communication components, and the like.
The embodiment of the application further provides a computer readable storage medium, and a computer program is stored, and when the computer program is executed by a computer, the model training method of the embodiment shown in fig. 1 can be realized.
Fig. 7 is a schematic structural diagram of an embodiment of a text recognition device according to an embodiment of the present application, where the device may include:
a first text feature determining module 701, configured to determine word level features of a text to be processed;
a second text feature determining module 702, configured to determine semantic level features of the text to be processed based on semantic information of the text to be processed;
a text feature fusion module 703, configured to fuse word level features and semantic level features of the text to be processed to obtain text features of the text to be processed;
a text recognition module 704, configured to recognize the text to be processed by using a text recognition model based on text features of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing word level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
In some embodiments, the second text feature determination module may be specifically configured to:
determining at least one candidate frequent item set hit by the text to be processed from at least one frequent item set respectively corresponding to different categories;
combining the items in each candidate frequent item set hit by the text to be processed to obtain one semantic unit, so as to obtain at least one semantic unit representing the semantic information of the text to be processed;
determining semantic level features of the text to be processed based on at least one semantic unit of the text to be processed.
The at least one frequent item set corresponding to each category is obtained by performing frequent item set mining on the plurality of training texts corresponding to that category.
In some embodiments, the second text feature determination module may be specifically configured to:
identifying topic distribution probability corresponding to the text to be processed by using a topic model;
and taking the topic distribution probability corresponding to the text to be processed as the semantic level characteristic of the text to be processed.
In some embodiments, the text recognition module may be specifically configured to:
determining a predetermined number of standard texts in a text library;
Determining word level features of each standard text;
determining semantic level features of each standard text based on the semantic information of each standard text;
fusing the semantic level features and word level features of each standard text to obtain text features of each standard text;
inputting the text characteristics of the text to be processed and the text characteristics of each of the predetermined number of standard texts into the text recognition model to determine target texts matched with the text to be processed from the predetermined number of standard texts.
In some embodiments, the manner in which the text recognition module determines the predetermined number of standard texts in the text library may be:
and determining a predetermined number of standard texts in a text library, wherein the similarity between the standard texts and the text to be processed meets the similarity condition.
In addition, to further improve the recognition accuracy, the apparatus may further include:
the third text feature determining module is used for determining character level features of the text to be processed based on the characters of the text to be processed;
Alternatively, the characters of the text to be processed may be taken as discrete features and encoded to obtain the character level features.
The fourth text feature determining module is used for determining N-gram level features of the text to be processed based on N-element word segmentation of the text to be processed;
Alternatively, the N-gram segmentation of the text to be processed may be taken as discrete features and encoded to obtain the N-gram level features.
The text feature fusion module may be specifically configured to fuse the semantic level feature, the character level feature, the N-gram level feature, and the word level feature of the text to be processed to obtain a text feature of the text to be processed.
The text recognition device shown in fig. 7 may perform the text recognition method shown in the embodiment shown in fig. 3, and its implementation principle and technical effects will not be repeated. The specific manner in which the respective modules and units of the text recognition apparatus in the above embodiments perform operations has been described in detail in the embodiments related to the method, and will not be described in detail here.
In addition, the embodiment of the application also provides a text recognition device, which comprises:
the third text feature determining module is used for determining character level features of the text to be processed based on characters of the text to be processed;
the second text feature determining module is used for determining semantic level features of the text to be processed based on semantic information of the text to be processed;
the text feature fusion module is used for fusing the character level features and the semantic level features of the text to be processed to obtain text features of the text to be processed;
The text recognition module is used for recognizing the text to be processed by using a text recognition model based on the text characteristics of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing character level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
In addition, the embodiment of the application also provides a text recognition device, which comprises:
the fourth text feature determining module is used for determining the N-element model level feature of the text to be processed based on N-element word segmentation of the text to be processed;
the second text feature determining module is used for determining semantic level features of the text to be processed based on semantic information of the text to be processed;
the text feature fusion module is used for fusing the N-element model level features and the semantic level features of the text to be processed to obtain text features of the text to be processed;
the text recognition module is used for recognizing the text to be processed by using a text recognition model based on the text characteristics of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing N-element model level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
In one possible design, the text recognition apparatus of the embodiment of FIG. 7 may be implemented as a computing device, as shown in FIG. 8, that may include a storage component 801 and a processing component 802;
the storage component 801 stores one or more computer instructions; the one or more computer instructions are to be invoked for execution by the processing component 802;
the processing component 802 is configured to:
determining word level characteristics of a text to be processed;
determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing word level features and semantic level features of the text to be processed to obtain text features of the text to be processed;
recognizing the text to be processed by using a text recognition model based on the text features of the text to be processed; the text recognition model is obtained based on text feature training of training texts; the text features of the training text are obtained by fusing word level features and semantic level features of the training text; the semantic level features of the training text are obtained based on semantic information of the training text.
Wherein the processing component 802 may include one or more processors to execute computer instructions to perform all or part of the steps of the methods described above. Of course, the processing component may also be implemented as one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic elements for executing the methods described above.
The storage component 801 is configured to store various types of data to support operations in the computing device. The storage component may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
Of course, the computing device may also include other components, such as input/output interfaces, communication components, and the like.
The embodiment of the application further provides a computer readable storage medium, and a computer program is stored, and when the computer program is executed by a computer, the method for identifying text in the embodiment shown in fig. 3 can be realized.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (20)

1. A method of model training, comprising:
determining word level characteristics corresponding to the target training text;
determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and word level features of the target training text to obtain text features of the target training text;
training a text recognition model by utilizing the text characteristics of the target training text;
the determining the semantic level feature of the target training text based on the semantic information of the target training text comprises:
determining a plurality of training texts corresponding to the target categories to which the target training texts belong;
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one frequent term set corresponding to the target category;
determining, from the at least one frequent item set, at least one target frequent item set hit by the target training text;
combining the items in each target frequent item set to obtain a semantic unit so as to obtain at least one semantic unit representing the semantic information of the target training text;
Determining semantic level features of the target training text based on the at least one semantic unit;
the method further comprises the steps of:
determining character level features of the target training text based on characters of the target training text;
determining N-element model level characteristics of the target training text based on N-element word segmentation of the target training text;
the step of fusing the semantic level features and the word level features of the target training text to obtain the text features of the target training text comprises the following steps:
and fusing the semantic level features, the character level features, the N-element model level features and the word level features of the target training text to obtain the text features of the target training text.
2. The method of claim 1, wherein mining the plurality of training texts for frequent item sets using words in the plurality of training texts as items to obtain at least one frequent item set corresponding to the target category comprises:
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one alternative frequent term set corresponding to the target category;
Determining an information gain of each alternative frequent item set relative to the at least one alternative frequent item set;
at least one frequent item set is obtained by screening from the at least one alternative frequent item set based on the information gain of each alternative frequent item set.
3. The method of claim 1, wherein mining the plurality of training texts for frequent item sets using words in the plurality of training texts as items to obtain at least one frequent item set corresponding to the target category comprises:
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one alternative frequent term set corresponding to the target category;
and screening at least one frequent item set with the number of items greater than a specific threshold from the at least one alternative frequent item set.
4. The method of claim 1, wherein combining the items in each target frequent item set to obtain one semantic unit to obtain at least one semantic unit representing semantic information of the target training text comprises:
combining the items in each target frequent item set according to the appearance sequence in the target training text to obtain a semantic unit so as to obtain at least one semantic unit representing the semantic information of the target training text.
5. The method of claim 1, wherein said determining at least one target frequent item set hit by the target training text from the at least one frequent item set comprises:
and selecting the frequent item set, each item of which is contained in the target training text, from the at least one frequent item set as a target frequent item set hit by the target training text so as to obtain at least one target frequent item set.
6. The method of claim 1, wherein determining semantic level features corresponding to semantic information of the target training text comprises:
identifying topic distribution probability corresponding to the target training text by using a topic model;
and taking the topic distribution probability corresponding to the target training text as the semantic level characteristic of the target training text.
7. The method of claim 1, wherein said determining word level features of the target training text comprises:
taking words of the target training text as discrete features, and obtaining word level features by encoding;
the determining the character level feature of the target training text based on the characters of the target training text comprises:
Taking the characters of the target training text as discrete features, and encoding to obtain character level features;
the determining the N-element model level feature of the target training text based on the N-element word segmentation of the target training text comprises the following steps:
and taking the N-element segmentation of the target training text as discrete features, and encoding to obtain N-element model level features.
8. The method as recited in claim 1, further comprising:
determining a predetermined number of standard texts in a text library;
determining word level features of each standard text;
determining semantic level features of each standard text based on the semantic information of each standard text;
fusing the semantic level features and word level features of each standard text to obtain text features of each standard text;
the training a text recognition model by utilizing the text features of the target training text comprises:
taking the text characteristics of the target training text and the text characteristics of the preset number of standard texts as input samples, and taking the matching probabilities of the target training text and the preset number of standard texts as training results, and training to obtain a text recognition model;
The text recognition model is used for recognizing target text matched with the text to be processed from the text library.
9. The method of claim 8, wherein determining a predetermined number of standard texts in a text library comprises:
and screening a preset number of standard texts with similarity meeting the similarity condition with the target training text from a text library.
10. The method of claim 1, wherein fusing semantic level features of the target training text with word level features to obtain text features of the target training text comprises:
and splicing the semantic level features and the word level features of the target training text to obtain the text features of the target training text.
11. A method of text recognition, comprising:
determining word level characteristics of a text to be processed;
determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing word level features and semantic level features of the text to be processed to obtain text features of the text to be processed;
recognizing the text to be processed by using a text recognition model based on the text features of the text to be processed; the text recognition model is obtained based on text feature training of a target training text; the text features of the target training text are obtained by fusing semantic level features, character level features, N-element model level features and word level features of the target training text; the semantic level features of the target training text are determined based on at least one semantic unit representing semantic information of the target training text, wherein one semantic unit is obtained by combining the items in each target frequent item set of at least one target frequent item set hit by the target training text, the at least one target frequent item set is determined from at least one frequent item set corresponding to a target category to which the target training text belongs, and the at least one frequent item set is obtained by taking words in a plurality of training texts corresponding to the target category to which the target training text belongs as items and performing frequent item set mining on the plurality of training texts; the character level features of the target training text are determined based on the characters of the target training text; and the N-element model level features of the target training text are determined based on N-element word segmentation of the target training text.
12. The method of claim 11, wherein the determining semantic level features of the text to be processed based on semantic information of the text to be processed comprises:
determining at least one candidate frequent item set of the text hit to be processed from at least one frequent item set respectively corresponding to different categories;
combining the items in each candidate frequent item set of the text hit to be processed to obtain a semantic unit so as to obtain at least one semantic unit representing semantic information of the text to be processed;
determining semantic level features of the text to be processed based on at least one semantic unit of the text to be processed.
13. The method of claim 11, wherein the determining semantic level features of the text to be processed based on semantic information of the text to be processed comprises:
identifying topic distribution probability corresponding to the text to be processed by using a topic model;
and taking the topic distribution probability corresponding to the text to be processed as the semantic level characteristic of the text to be processed.
14. The method of claim 11, wherein the recognizing the text to be processed by using a text recognition model based on the text features of the text to be processed comprises:
Determining a predetermined number of standard texts in a text library;
determining word level features of each standard text;
determining semantic level features of each standard text based on the semantic information of each standard text;
fusing the semantic level features and word level features of each standard text to obtain text features of each standard text;
and inputting the text characteristics of the text to be processed and the text characteristics of each of the predetermined number of standard texts into the text recognition model to determine target texts matched with the text to be processed from the predetermined number of standard texts.
15. The method of claim 14, wherein determining a predetermined number of standard texts in a text library comprises:
and determining a predetermined number of standard texts in a text library, wherein the similarity between the standard texts and the text to be processed meets the similarity condition.
16. The method as recited in claim 11, further comprising:
determining character level characteristics of the text to be processed based on the characters of the text to be processed;
determining N-gram level characteristics of the text to be processed based on N-gram word segmentation of the text to be processed;
the step of fusing the semantic level features and the word level features of the text to be processed to obtain text features of the text to be processed comprises the following steps:
And fusing the semantic level features, the character level features, the N-gram level features and the word level features of the text to be processed to obtain the text features of the text to be processed.
17. A model training device, comprising:
the first training feature determining module is used for determining word level features corresponding to the target training text;
the second training feature determining module is used for determining semantic level features of the target training text based on semantic information of the target training text;
the training feature fusion module is used for fusing the semantic level features and the word level features of the target training text to obtain text features of the target training text;
the model training module is used for training a text recognition model by utilizing the text characteristics of the target training text;
the second training feature determining module is specifically configured to determine a plurality of training texts corresponding to a target category to which the target training text belongs; take words in the plurality of training texts as items, and perform frequent item set mining on the plurality of training texts to obtain at least one frequent item set corresponding to the target category; determine, from the at least one frequent item set, at least one target frequent item set hit by the target training text; combine the items in each target frequent item set to obtain one semantic unit, so as to obtain at least one semantic unit representing the semantic information of the target training text; and determine semantic level features of the target training text based on the at least one semantic unit;
The apparatus further comprises:
the third training feature determining module is used for determining character level features of the target training text based on the characters of the target training text;
the fourth training feature determining module is used for determining N-element model level features of the target training text based on N-element word segmentation of the target training text;
the training feature fusion module is specifically configured to fuse semantic level features, character level features, N-element model level features and word level features of the target training text to obtain text features of the target training text.
18. A text recognition device, comprising:
the first text feature determining module is used for determining word level features of the text to be processed;
the second text feature determining module is used for determining semantic level features of the text to be processed based on semantic information of the text to be processed;
the text feature fusion module is used for fusing word level features and semantic level features of the text to be processed to obtain text features of the text to be processed;
the text recognition module is used for recognizing the text to be processed by using a text recognition model based on the text features of the text to be processed; the text recognition model is obtained based on text feature training of a target training text; the text features of the target training text are obtained by fusing semantic level features, character level features, N-element model level features and word level features of the target training text; the semantic level features of the target training text are determined based on at least one semantic unit representing semantic information of the target training text, wherein one semantic unit is obtained by combining the items in each target frequent item set of at least one target frequent item set hit by the target training text, the at least one target frequent item set is determined from at least one frequent item set corresponding to a target category to which the target training text belongs, and the at least one frequent item set is obtained by taking words in a plurality of training texts corresponding to the target category to which the target training text belongs as items and performing frequent item set mining on the plurality of training texts; the character level features of the target training text are determined based on the characters of the target training text; and the N-element model level features of the target training text are determined based on N-element word segmentation of the target training text.
19. A computing device comprising a processing component and a storage component;
the storage component stores one or more computer instructions; the one or more computer instructions are to be invoked for execution by the processing component;
the processing assembly is configured to:
determining word level characteristics corresponding to the target training text;
determining semantic level features of the target training text based on semantic information of the target training text;
fusing the semantic level features and word level features of the target training text to obtain text features of the target training text;
training a text recognition model by utilizing the text characteristics of the target training text;
the determining the semantic level feature of the target training text based on the semantic information of the target training text comprises:
determining a plurality of training texts corresponding to the target categories to which the target training texts belong;
taking words in the training texts as terms, and carrying out frequent term set mining on the training texts to obtain at least one frequent term set corresponding to the target category;
determining, from the at least one frequent item set, at least one target frequent item set hit by the target training text;
Combining the items in each target frequent item set to obtain a semantic unit so as to obtain at least one semantic unit representing the semantic information of the target training text;
determining semantic level features of the target training text based on the at least one semantic unit;
the processing component is further configured to:
determining character level features of the target training text based on characters of the target training text;
determining N-gram level features of the target training text based on N-gram segmentation of the target training text;
the step of fusing the semantic level features and the word level features of the target training text to obtain the text features of the target training text comprises the following steps:
and fusing the semantic level features, the character level features, the N-gram level features and the word level features of the target training text to obtain the text features of the target training text.
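The semantic-level steps of claim 19 are the most algorithmic part of the claim set, so a self-contained sketch may help. A brute-force Apriori-style count stands in for the mining step, since the claims name no particular algorithm (the cited prior art uses FP-Growth); `min_support` and `max_size` are assumed knobs, and whitespace tokenization is a stand-in for real word segmentation.

```python
# Self-contained sketch of claim 19's semantic-unit pipeline: mine frequent
# item sets over one category's training texts (words as items), find the
# sets a given text hits, and join each hit set's items into a semantic unit.
from itertools import combinations
from collections import Counter

def mine_frequent_itemsets(texts, min_support=2, max_size=3):
    """Item sets (frozensets of words) occurring in >= min_support texts."""
    docs = [set(t.split()) for t in texts]
    frequent = []
    for size in range(1, max_size + 1):
        counts = Counter(combo for d in docs
                         for combo in combinations(sorted(d), size))
        frequent += [frozenset(c) for c, n in counts.items() if n >= min_support]
    return frequent

def semantic_units(text, itemsets):
    """One unit per frequent item set the text hits (contains every item of)."""
    words = set(text.split())
    return ["_".join(sorted(s)) for s in itemsets if s <= words]

# Toy category corpus: 'screen' and 'broken' co-occur in all three texts,
# so the pair becomes a frequent item set and yields the unit 'broken_screen'.
category_texts = ["screen is broken", "screen broken again", "broken screen flickers"]
itemsets = mine_frequent_itemsets(category_texts, min_support=2, max_size=2)
print(semantic_units("my screen is broken", itemsets))
# -> ['broken', 'screen', 'broken_screen']
```

The resulting units can then be vectorized, for example multi-hot over the category's unit vocabulary, to give the semantic level features consumed by the fusion step.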
20. A computing device comprising a processing component and a storage component;
the storage component stores one or more computer instructions; the one or more computer instructions are to be invoked and executed by the processing component;
the processing component is configured to:
determining word level features of a text to be processed;
determining semantic level features of the text to be processed based on semantic information of the text to be processed;
fusing word level features and semantic level features of the text to be processed to obtain text features of the text to be processed;
identifying the text to be processed by using a text recognition model based on the text features of the text to be processed; the text recognition model is obtained by training based on text features of a target training text; the text features of the target training text are obtained by fusing semantic level features, character level features, N-gram level features and word level features of the target training text; the semantic level features of the target training text are determined based on at least one semantic unit representing semantic information of the target training text, wherein one semantic unit is obtained by combining the items in each target frequent item set of at least one target frequent item set hit by the target training text, the at least one target frequent item set is determined from at least one frequent item set corresponding to a target category to which the target training text belongs, and the at least one frequent item set is obtained by performing frequent item set mining on a plurality of training texts corresponding to the target category, with words in the plurality of training texts taken as items; the character level features of the target training text are determined based on characters of the target training text; and the N-gram level features of the target training text are determined based on N-gram segmentation of the target training text.
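Claim 20 additionally relies on character level and N-gram level views of the text. A minimal sketch follows, assuming bag-of-characters and bag-of-word-N-grams counts; the claims fix neither the value of N nor a vectorization scheme.

```python
# Hedged sketch of the character-level and N-gram-level views in claim 20;
# bag-of-characters and bag-of-word-bigrams counts are assumptions, as the
# claims prescribe neither N nor how the counts are vectorized.
from collections import Counter

def char_features(text):
    """Bag-of-characters counts, ignoring whitespace."""
    return Counter(c for c in text if not c.isspace())

def ngram_features(text, n=2):
    """Bag of word N-grams taken from a sliding window of size n."""
    words = text.split()
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))

print(ngram_features("my screen is broken"))
# -> Counter({'my screen': 1, 'screen is': 1, 'is broken': 1})
```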
CN201810996981.8A 2018-08-29 2018-08-29 Model training method, text recognition device and computing equipment Active CN110874408B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810996981.8A CN110874408B (en) 2018-08-29 2018-08-29 Model training method, text recognition device and computing equipment

Publications (2)

Publication Number Publication Date
CN110874408A CN110874408A (en) 2020-03-10
CN110874408B (en) 2023-05-26

Family

ID=69714640

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810996981.8A Active CN110874408B (en) 2018-08-29 2018-08-29 Model training method, text recognition device and computing equipment

Country Status (1)

Country Link
CN (1) CN110874408B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191011B (en) * 2020-04-17 2024-02-23 郑州工程技术学院 Text label searching and matching method, device, equipment and storage medium
CN112417147A (en) * 2020-11-05 2021-02-26 腾讯科技(深圳)有限公司 Method and device for selecting training samples

Citations (5)

Publication number Priority date Publication date Assignee Title
CN104375989A (en) * 2014-12-01 2015-02-25 State Grid Corporation of China Natural language text keyword association network construction system
CN106502989A (en) * 2016-10-31 2017-03-15 Neusoft Corporation Sentiment analysis method and device
WO2017157198A1 (en) * 2016-03-17 2017-09-21 Alibaba Group Holding Limited Attribute acquisition method and device
CN107247702A (en) * 2017-05-05 2017-10-13 Guilin University of Electronic Technology Text sentiment analysis and processing method and system
CN108319584A (en) * 2018-01-22 2018-07-24 Beijing University of Technology New word discovery method for microblog-type short texts based on an improved FP-Growth algorithm

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US10073834B2 (en) * 2016-02-09 2018-09-11 International Business Machines Corporation Systems and methods for language feature generation over multi-layered word representation
US11023680B2 (en) * 2016-02-17 2021-06-01 The King Abdulaziz City For Science And Technology (Kacst) Method and system for detecting semantic errors in a text using artificial neural networks

Non-Patent Citations (1)

Title
Jiang Zhenchao; Li Lishuang; Huang Degen. A Word Vector Model Based on Word Relations. Journal of Chinese Information Processing, 2017, (03), 30-36. *


Similar Documents

Publication Title
CN110427623B (en) Semi-structured document knowledge extraction method and device, electronic equipment and storage medium
CN109145153B (en) Intention category identification method and device
CN106649818B (en) Application search intention identification method and device, application search method and server
CN108304468B (en) Text classification method and text classification device
CN111291195B (en) Data processing method, device, terminal and readable storage medium
CN108595708A Abnormal-information text classification method based on a knowledge graph
CN111241237B (en) Intelligent question-answer data processing method and device based on operation and maintenance service
CN108304372A (en) Entity extraction method and apparatus, computer equipment and storage medium
US20130060769A1 (en) System and method for identifying social media interactions
CN110263325B (en) Chinese word segmentation system
CN108628828A Joint extraction method for opinions and their holders based on self-attention
CN108304375A Information identification method and device, storage medium, and terminal
CN111444330A (en) Method, device and equipment for extracting short text keywords and storage medium
CN106570180A (en) Artificial intelligence based voice searching method and device
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN112395421B (en) Course label generation method and device, computer equipment and medium
CN114154487A (en) Text automatic error correction method and device, electronic equipment and storage medium
CN110874408B (en) Model training method, text recognition device and computing equipment
CN113486143A (en) User portrait generation method based on multi-level text representation and model fusion
CN107729509B Discourse similarity determination method based on implicit high-dimensional distributed feature representation
CN110705285A Government affairs text topic lexicon construction method, device, server and readable storage medium
CN116151258A (en) Text disambiguation method, electronic device and storage medium
CN110162615A Intelligent question-answering method and apparatus, electronic device, and storage medium
CN114722832A (en) Abstract extraction method, device, equipment and storage medium
CN114328820A (en) Information searching method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant