CN113761190A - Text recognition method and device, computer readable medium and electronic equipment - Google Patents

Text recognition method and device, computer readable medium and electronic equipment

Info

Publication number
CN113761190A
Authority
CN
China
Prior art keywords
text
classification
label
entity
sample text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110492354.2A
Other languages
Chinese (zh)
Inventor
铁瑞雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110492354.2A priority Critical patent/CN113761190A/en
Publication of CN113761190A publication Critical patent/CN113761190A/en
Pending legal-status Critical Current

Classifications

    • G06F 16/35 Clustering; Classification (information retrieval of unstructured textual data)
    • G06F 40/247 Thesauruses; Synonyms (lexical tools for natural language analysis)
    • G06F 40/295 Named entity recognition (natural language analysis; recognition of textual entities)
    • G06F 40/30 Semantic analysis (handling natural language data)
    • G06N 20/00 Machine learning
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks (neural network architectures)
    • G06N 3/08 Learning methods (neural networks)


Abstract

The embodiment of the application provides a text recognition method and device, a computer readable medium and electronic equipment. The text recognition method comprises the following steps: adding a first classification mark to a text to be recognized to generate an input object corresponding to the text to be recognized; inputting the input object into a text recognition model, wherein the text recognition model is obtained by training on target sample texts carrying labeled entity labels and labeled classification labels; acquiring, from the output of the text recognition model, a predicted entity label corresponding to each character in the text to be recognized and a predicted classification label corresponding to the first classification mark; and generating an entity recognition result for the text to be recognized according to the predicted entity labels, and generating a classification result for the text to be recognized according to the predicted classification label. According to the technical scheme, entities in the text can be extracted while the text is classified, and the accuracy of text recognition can be improved.

Description

Text recognition method and device, computer readable medium and electronic equipment
Technical Field
The present application relates to the field of computer and communication technologies, and in particular, to a text recognition method, an apparatus, a computer-readable medium, and an electronic device.
Background
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. Text recognition, an important application in natural language processing, is widely used for detecting text content.
In order to identify negative information in web text, related technologies typically rely on manually constructed features or on models trained with large-scale corpora. However, manually constructing features is costly and the resulting features are prone to sparsity, while the quality of large-scale corpus labeling strongly affects the training effect; both factors easily lead to poor performance and low accuracy.
Disclosure of Invention
Embodiments of the present application provide a text recognition method, an apparatus, a computer-readable medium, and an electronic device, so that, at least to a certain extent, entities in a text can be extracted while the text is classified, and the accuracy of text recognition can be improved.
Other features and advantages of the present application will be apparent from the following detailed description, or may be learned by practice of the application.
According to an aspect of an embodiment of the present application, there is provided a text recognition method including: adding a first classification mark to a text to be recognized to generate an input object corresponding to the text to be recognized; inputting the input object into a text recognition model, wherein the text recognition model is obtained by training on a target sample text carrying labeled entity labels and a labeled classification label, the labeled entity labels being the entity labels corresponding to each character in the target sample text, and the labeled classification label being the classification label corresponding to a second classification mark added to the target sample text; acquiring a predicted entity label corresponding to each character in the text to be recognized output by the text recognition model and a predicted classification label corresponding to the first classification mark; and generating an entity recognition result for the text to be recognized according to the predicted entity labels, and generating a classification result for the text to be recognized according to the predicted classification label.
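The steps above can be sketched in a few lines of Python. The "[CLS]" mark string, the label names, and the stand-in model below are all illustrative assumptions for the sketch; the patent does not fix a concrete model architecture or tag scheme.

```python
# Minimal sketch of the claimed inference flow. CLS_MARK, the label names, and
# mock_text_recognition_model are hypothetical stand-ins, not part of the patent.

CLS_MARK = "[CLS]"  # hypothetical first classification mark


def build_input_object(text):
    """Split the text into characters and prepend the classification mark."""
    return [CLS_MARK] + list(text)


def mock_text_recognition_model(tokens):
    """Stand-in for the trained model: one classification label for the mark,
    one entity tag (a BIO scheme is assumed) for each character."""
    cls_label = "negative"
    entity_tags = ["B-ORG" if i == 0 else "O" for i in range(len(tokens) - 1)]
    return cls_label, entity_tags


def recognize(text):
    tokens = build_input_object(text)
    cls_label, entity_tags = mock_text_recognition_model(tokens)
    # pair each character of the original text with its predicted entity tag
    return cls_label, list(zip(tokens[1:], entity_tags))
```

In a real system the stand-in function would be replaced by a trained encoder that produces the two kinds of predictions jointly.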
According to an aspect of an embodiment of the present application, there is provided a text recognition apparatus including: a first adding unit configured to add a first classification mark to a text to be recognized to generate an input object corresponding to the text to be recognized; a first input unit configured to input the input object into a text recognition model, wherein the text recognition model is obtained by training on a target sample text carrying labeled entity labels and a labeled classification label, the labeled entity labels being the entity labels corresponding to each character in the target sample text, and the labeled classification label being the classification label corresponding to a second classification mark added to the target sample text; an obtaining unit configured to obtain a predicted entity label corresponding to each character in the text to be recognized output by the text recognition model and a predicted classification label corresponding to the first classification mark; and a generating unit configured to generate an entity recognition result for the text to be recognized according to the predicted entity labels and a classification result for the text to be recognized according to the predicted classification label.
In some embodiments of the present application, based on the foregoing solution, the first adding unit is configured to: dividing the text to be recognized by taking characters as units to generate a character sequence corresponding to the text to be recognized; and adding the first classification mark in the character sequence to obtain a new character sequence, and taking the new character sequence as an input object corresponding to the text to be recognized.
In some embodiments of the present application, based on the foregoing scheme, the generating unit is configured to: identify positionally consecutive characters in the text to be recognized whose predicted entity labels indicate the same entity as a single entity, to obtain an entity recognition result for the text to be recognized; and take the classification category indicated by the predicted classification label as the classification result for the text to be recognized.
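The merging step performed by the generating unit amounts to decoding a per-character tag sequence into entity spans. The sketch below assumes a BIO tag scheme, which the patent does not mandate:

```python
# Hypothetical BIO decoder for the generating unit's merging step: consecutive
# characters whose tags indicate the same entity are joined into one span.

def decode_entities(chars, tags):
    """Return (entity_text, entity_type) spans from parallel character/tag lists."""
    entities, current, cur_type = [], [], None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):                       # a new entity begins
            if current:
                entities.append(("".join(current), cur_type))
            current, cur_type = [ch], tag[2:]
        elif tag.startswith("I-") and cur_type == tag[2:]:
            current.append(ch)                         # continue the same entity
        else:                                          # "O" or inconsistent tag
            if current:
                entities.append(("".join(current), cur_type))
            current, cur_type = [], None
    if current:
        entities.append(("".join(current), cur_type))
    return entities
```

Any tagging scheme with an equivalent notion of span start and continuation (e.g. BIOES) would decode analogously.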
In some embodiments of the present application, based on the foregoing solution, the apparatus further includes: a second adding unit configured to add the second classification mark to the target sample text to generate a sample input object corresponding to the target sample text; a second input unit configured to input the sample input object into a model to be trained and obtain the prediction scores, output by the model to be trained, of each character in the target sample text for each entity label and of the second classification mark for each classification label; and a determining unit configured to determine a loss function according to the labeled entity labels, the labeled classification label and the prediction scores, and to adjust parameters of the model to be trained according to the loss function to obtain the text recognition model.
In some embodiments of the present application, based on the foregoing scheme, the determining unit includes: a prediction score determining subunit configured to determine, according to the labeled entity tag and the labeled classification tag, a target prediction score corresponding to an entity tag that is the same as the labeled entity tag and a classification tag that is the same as the labeled classification tag from the prediction scores; a loss function determination subunit configured to determine the loss function according to the target prediction score.
In some embodiments of the present application, based on the foregoing scheme, the loss function determination subunit is configured to: calculating a ratio between the target prediction score and a sum of the prediction scores; and carrying out logarithmic operation on the ratio to obtain an operation result, and determining the loss function according to the operation result.
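Taking the ratio of the target prediction score to the sum of all prediction scores and then applying a logarithm is the familiar cross-entropy over normalized scores. A minimal sketch, assuming the scores are non-negative (e.g. exponentiated logits); the negative sign is the standard convention and is an assumption here, since the patent only names the ratio and logarithm steps:

```python
import math

# Hypothetical per-position loss and joint loss for the two tasks; the way the
# two task losses are combined (a plain sum) is an assumption for this sketch.

def position_loss(scores, target_label):
    """-log(target score / sum of all scores) for one position."""
    ratio = scores[target_label] / sum(scores.values())
    return -math.log(ratio)


def joint_loss(char_scores, entity_labels, cls_scores, cls_label):
    """Classification loss for the mark plus entity-tag losses for each character."""
    loss = position_loss(cls_scores, cls_label)
    for scores, label in zip(char_scores, entity_labels):
        loss += position_loss(scores, label)
    return loss
```

With a perfect prediction the ratio approaches 1 and the loss approaches 0, which matches minimizing the loss function during parameter adjustment.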
In some embodiments of the present application, based on the foregoing scheme, the second adding unit is configured to: dividing the target sample text by taking characters as units to generate a character sequence corresponding to the target sample text; and adding the second classification mark in the character sequence to obtain a new character sequence, and taking the new character sequence as a sample input object corresponding to the target sample text.
In some embodiments of the present application, based on the foregoing solution, the apparatus further includes: a keyword acquisition unit configured to acquire keywords from a pre-established keyword library; a sample text acquisition unit configured to acquire, according to the keywords, sample texts containing the keywords; a labeling unit configured to perform entity label labeling and classification label labeling on a part of the obtained sample texts to generate initial sample texts; and a processing unit configured to perform data enhancement processing on the initial sample texts to generate the target sample text.
In some embodiments of the present application, based on the foregoing solution, the processing unit is configured to: copying the initial sample text to obtain a copy of the initial sample text; and generating the target sample text according to the initial sample text and the copy of the initial sample text.
In some embodiments of the present application, based on the foregoing solution, the processing unit is configured to: synonym replacement is carried out on the target keywords contained in the initial sample text, so that a sample text containing synonyms of the target keywords is generated; and generating the target sample text according to the initial sample text and the sample text containing the synonyms of the target keywords.
In some embodiments of the present application, based on the foregoing solution, the processing unit is configured to: deleting a part of characters contained in the initial sample text to obtain a processed initial sample text; and generating the target sample text according to the initial sample text and the processed initial sample text.
In some embodiments of the present application, based on the foregoing solution, the processing unit is configured to: randomly inserting characters into the initial sample text to obtain a new initial sample text; and generating the target sample text according to the initial sample text and the new initial sample text.
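The four data-enhancement operations described by the processing unit (copying, synonym replacement of keywords, deleting part of the characters, and random insertion) can be sketched as below. The synonym table and character pool are illustrative assumptions, and the sketch omits the realignment of entity labels that deletion and insertion would require in a real training pipeline.

```python
import random

# Hypothetical data-enhancement sketch; SYNONYMS and the inserted-character
# alphabet are made-up examples, not taken from the patent.
SYNONYMS = {"bankrupt": "insolvent"}


def augment(sample, rng):
    variants = [sample]                               # keep the original
    variants.append(sample)                           # (1) plain copy
    for kw, syn in SYNONYMS.items():                  # (2) synonym replacement
        if kw in sample:
            variants.append(sample.replace(kw, syn))
    if len(sample) > 1:                               # (3) delete one character
        i = rng.randrange(len(sample))
        variants.append(sample[:i] + sample[i + 1:])
    i = rng.randrange(len(sample) + 1)                # (4) insert one character
    variants.append(sample[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + sample[i:])
    return variants
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible across training runs.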
According to an aspect of embodiments of the present application, there is provided a computer-readable medium on which a computer program is stored, which computer program, when executed by a processor, implements a text recognition method as described in the above embodiments.
According to an aspect of an embodiment of the present application, there is provided an electronic device including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the text recognition method as described in the above embodiments.
According to an aspect of embodiments herein, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the text recognition method provided in the various alternative embodiments described above.
In the technical solutions provided in some embodiments of the present application, a first classification mark is added to a text to be recognized to generate an input object corresponding to the text to be recognized, and the input object is then input into a text recognition model, the text recognition model being obtained by training on a target sample text carrying labeled entity labels and a labeled classification label, where the labeled entity labels are the entity labels corresponding to each character in the target sample text and the labeled classification label is the classification label corresponding to a second classification mark added to the target sample text. A predicted entity label corresponding to each character in the text to be recognized and a predicted classification label corresponding to the first classification mark, both output by the text recognition model, can then be obtained; finally, an entity recognition result for the text to be recognized can be generated according to the predicted entity labels, and a classification result for the text to be recognized can be generated according to the predicted classification label. With this technical scheme, the text to be recognized is recognized by the text recognition model: features do not need to be constructed manually, semantic knowledge can be extracted automatically, labor cost is greatly reduced, and the efficiency of entity recognition and text classification is improved. Because the text recognition model extracts the entities in the text while classifying it, and the entity recognition task and the text classification task are trained jointly, the two tasks share semantic knowledge, the model's attention to the entities is increased, and the accuracy of both entity recognition and text classification is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
FIG. 1 is a schematic diagram of an application environment of a text recognition method according to an embodiment of the present application;
FIG. 2 is a flow diagram of a text recognition method provided by an embodiment of the present application;
FIG. 3 illustrates a block diagram of a text recognition model;
FIG. 4 is a flow diagram of a method for training a text recognition model according to an embodiment of the present application;
FIG. 5 is a flow chart of a loss function determination method provided by an embodiment of the present application;
FIG. 6 is a flow diagram of a target sample text generation method provided by an embodiment of the present application;
FIG. 7 is a block diagram of a text recognition apparatus provided by an embodiment of the present application;
FIG. 8 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the subject matter of the present application can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the application.
It is to be noted that the terms used in the specification and claims of the present application and the above-described drawings are only for describing the embodiments and are not intended to limit the scope of the present application. It will be understood that the terms "comprises," "comprising," "includes," "including," "has," "having," and the like, when used herein, specify the presence of stated features, integers, steps, operations, elements, components, and/or groups thereof, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It will be further understood that, although the terms first, second, third, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element without departing from the scope of the present application. Similarly, a second element may be termed a first element. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. The basic artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, and machine learning/deep learning.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Research in this field therefore involves natural language, i.e. the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.
Machine Learning (ML) is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how computers simulate or implement human learning behavior to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
The scheme provided by the embodiment of the application relates to technologies such as artificial intelligence natural language processing and machine learning, and is specifically explained by the following embodiments:
according to an aspect of the embodiments of the present application, a text recognition method is provided, and optionally, as an optional implementation manner, the text recognition method may be applied to, but is not limited to, an environment as shown in fig. 1. The application environment includes a terminal device 101 and a server 102. The terminal device 101 and the server 102 perform data communication through a communication network, optionally, the communication network may be a wired network or a wireless network, and the communication network may be at least one of a local area network, a metropolitan area network, and a wide area network.
The terminal apparatus 101 is an electronic apparatus having a function of realizing a search through a network. The electronic device may be a mobile terminal such as a smart phone, a tablet computer, a laptop portable notebook computer, or the like, or a terminal such as a desktop computer, a projection computer, or the like, which is not limited in this embodiment of the present application.
The server 102 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.
In an embodiment of the present application, the server 102 is provided with a text recognition model, and the text recognition model is obtained by training according to a target sample text carrying a labeled entity label and a labeled classification label.
After the terminal device 101 sends a text recognition request to the server 102, the server 102 adds a first classification mark to the text to be recognized, generates an input object corresponding to the text to be recognized, uses the input object as an input of a text recognition model, performs text recognition by the text recognition model, and outputs a prediction entity label corresponding to each character in the text to be recognized and a prediction classification label corresponding to the first classification mark.
In other possible embodiments, the text recognition model may also be issued to the terminal device 101 by the server 102, and the terminal device 101 recognizes the text to be recognized by using the text recognition model, which is not limited in this embodiment.
The text recognition model provided by the application can be widely applied to various fields. For example, a text recognition model is applied in the field of recognition of web texts, which may include, for example, public opinion texts, and in particular, a text recognition model is provided in a background server of an application/portal, and is used for recognizing classification (negative or non-negative) of web texts and entities in web texts. After the application program/portal website is started, a user can publish the web text according to the actual situation of the user, the server can acquire the web text published by the user from an open interface of the application program or the portal website, and then the server identifies the web text through the text identification model, extracts entities and relations in the web text while judging whether the web text is negative, so that risks can be found in time.
Identifying negative information about an enterprise is important for the enterprise's risk-control work. In practice, however, when external risks arise, it is difficult for the enterprise to perceive them in time; moreover, actively searching for negative information by hand usually takes a great deal of time.
Current web text recognition schemes can be broadly divided into three categories: (1) rule/template methods, which trigger a judgment through negative keywords and judge a text to be negative when it matches a corresponding rule; (2) traditional machine learning models, which are trained for classification on labeled corpora using a large number of manually constructed features; and (3) deep learning models, which require no manual feature construction and instead vectorize the text to extract semantic features before training the classification model.
However, (1) rule/template methods easily cause misjudgment, the rule template library must be continuously supplemented and maintained, and, relying as they do on historical experience, they cannot actively discover new expressions of negative patterns. (2) Traditional machine learning models incur a high cost for manually constructed features, which are also prone to sparsity. (3) Deep learning models depend on large-scale training corpora whose labeling quality greatly influences the training effect; web text is usually complex, and a classification model used alone often performs poorly.
Based on the above, the text recognition method of the present application can be applied to the recognition of web texts. A web text is recognized by the text recognition model, which can judge whether the web text is negative while extracting the entities and relations it contains. By jointly training the named entity recognition task and the classification task, the model's attention to important entities is increased implicitly, the two tasks share semantic knowledge, the problem of risk ambiguity is better resolved, and task accuracy is improved.
Of course, the text recognition method in the present application can also be applied to other various fields, which are not illustrated here. In addition, in practical applications, a plurality of text recognition models may be used to recognize the text to be recognized at the same time, for example, a first text recognition model is used to determine the domain to which the text to be recognized belongs, and a text recognition model corresponding to the domain is selected based on the domain to classify and identify the entity of the text to be recognized.
For convenience of description, in the following method embodiments the execution subject of each step is described as a computer device, which may be any electronic device with computing and storage capabilities, for example the server 102 or the terminal device 101. It should be noted that in the embodiments of the present application, the steps may all be executed by the same computer device, or by a plurality of different computer devices cooperating interactively, which is not limited herein. Likewise, the execution subject of the text recognition method below and that of the training method of the text recognition model below may be the same computer device or different computer devices; the embodiments of the present application do not limit this.
The implementation details of the technical solution of the embodiment of the present application are set forth in detail below:
fig. 2 shows a flow chart of a text recognition method according to an embodiment of the present application, and referring to fig. 2, the text recognition method includes:
step S210, adding a first classification mark in the text to be recognized to generate an input object corresponding to the text to be recognized;
step S220, inputting an input object into a text recognition model, wherein the text recognition model is obtained by training a target sample text carrying a labeled entity label and a labeled classification label, the labeled entity label is an entity label corresponding to each character in the target sample text, and the labeled classification label is a classification label corresponding to a second classification label added in the target sample text;
step S230, acquiring a prediction entity label corresponding to each character in a text to be recognized output by the text recognition model and a prediction classification label corresponding to the first classification mark;
step S240, generating an entity recognition result for the text to be recognized according to the predicted entity labels, and generating a classification result for the text to be recognized according to the predicted classification label.
These steps are described in detail below.
In step S210, a first classification mark is added to the text to be recognized to generate an input object corresponding to the text to be recognized.
The text to be recognized refers to the text of an unknown category, i.e., the unclassified text. The text to be recognized may include a plurality of text characters, for example, may include a plurality of words, and may also include a character group composed of words, numbers, letters, or punctuation marks. In the embodiment of the application, the computer equipment acquires the text to be recognized before performing text recognition. Alternatively, the text to be recognized may be text acquired in real time, or may be text acquired previously and stored in the computer device.
In one possible embodiment, the text to be recognized is provided to the computer device actively by the user. Optionally, the user determines the text to be recognized according to the actual situation, and inputs the text to be recognized to the computer device or the associated device of the computer device, and further, the computer device obtains the text to be recognized. The input mode of the text to be recognized may be character input, voice input, image input, gesture input, or the like, which is not limited in the embodiment of the present application.
In another possible embodiment, the text to be recognized is actively acquired by the computer device. Optionally, the computer device may obtain the text to be recognized from the network environment at certain time intervals, in this case, after the computer device classifies the text to be recognized, the computer device may store the text to be recognized to a suitable location, such as a classification database, according to the category of the text to be recognized. Wherein, the time interval can be 1s, 1h, 1 day, 1 week, etc.
After acquiring the text to be recognized, the computer device may add a first classification mark to it to generate the corresponding input object. Compared with the text to be recognized itself, the input object additionally carries the first classification mark; it should be noted that this mark merely represents "classification" and is not itself a specific classification category of the text. The first classification mark may be added at the start, middle, or end of the text to be recognized. As an example, the first classification mark may be "RN"; it is not limited to English letters and may also include, but is not limited to, at least one of the following forms: numbers, words, symbols, and the like.
In an illustrative example, the text to be recognized is "how beautiful she is", and the computer device adds the first classification mark RN at the start position of the text, generating the input object "RN how beautiful she is" corresponding to the text to be recognized.
In an embodiment of the present application, the computer device may adopt character-level segmentation: the text to be recognized is divided character by character into a character sequence, the first classification mark is added to that sequence to obtain a new character sequence, and the new character sequence is used as the input object corresponding to the text to be recognized.
In an illustrative example, the text to be recognized is "I want to listen to Rice Fragrance by XXX". The computer device divides it into the character sequence "I/want/to/listen/X/X/X/rice/fragrance", then appends the first classification mark RN, generating the new character sequence "I/want/to/listen/X/X/X/rice/fragrance/RN".
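The segmentation-plus-mark step above can be sketched as follows. This is an illustrative Python sketch, not part of the embodiment; the function name and the choice of appending the mark at the end are assumptions, since the mark may equally be placed at the start or in the middle.

```python
def build_input_object(text, cls_mark="RN"):
    """Divide the text to be recognized character by character and
    append the first classification mark, producing the input object."""
    chars = list(text)      # character-level segmentation
    chars.append(cls_mark)  # classification mark at the end position
    return chars
```

For example, `build_input_object("abc")` returns `["a", "b", "c", "RN"]`.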
In step S220, the input object is input into a text recognition model, where the text recognition model is obtained by training a target sample text carrying labeled entity labels and labeled classification labels, the labeled entity labels are entity labels corresponding to characters in the target sample text, and the labeled classification labels are classification labels corresponding to second classification labels added to the target sample text.
The text recognition model may be a model obtained by training a model to be trained by using a preset target sample text in advance. The trained text recognition model can have both entity recognition capability and text classification capability.
In one embodiment, the model to be trained may be a combined model, specifically: (1) a model combining Bidirectional Encoder Representations from Transformers (BERT), a Long Short-Term Memory network (LSTM), and a Conditional Random Field (CRF); (2) a model combining a Bi-directional Long Short-Term Memory network (Bi-LSTM) and a CRF; or (3) a model combining BERT and a CRF.
The target sample text is a specific data set related to text recognition, and the target sample text contains labeled entity labels and labeled classification labels, and the entity labels and the classification labels can be labeled manually.
The labeled entity label is the entity label corresponding to each character in the target sample text; optionally, the labeling scheme for entity labels can be designed according to actual needs. As an example, suppose the entity labeling scheme for sample texts introduces roles such as subject company, subject person, associated company, associated product, risk word, and tense word. If the target sample text is "联创公司否认暴力催收" ("Lianchuang Company denies violent debt collection"), then after entity labeling each character receives a <character, entity label> pair: <联, subject company>, <创, subject company>, <公, subject company>, <司, subject company>, <否, tense word>, <认, tense word>, <暴, risk word>, <力, risk word>, <催, risk word>, <收, risk word>. It should be noted that entity labeling is not limited to the above manner, and the labeling scheme can be designed according to actual needs.
And the labeling classification label is a classification label corresponding to the second classification label added in the target sample text. In one illustrative example, when a text recognition model is used to classify the compliance text and the violation text, the classification tags may include a compliance tag and a violation tag; when the text recognition model is used for classifying the compliance text, the bad information text or the fraud information text, the classification label can be at least one of a compliance label, a bad information label and a fraud information label; when the text recognition model is used to classify negative text and non-negative text, the classification labels may include negative labels and non-negative labels. The embodiment of the present application does not limit the specific content of the classification label.
Specifically, the computer device may input the input object corresponding to the text to be recognized into the text recognition model, which determines, according to its output probabilities, the feature matrix of the labels, and the transition matrix between labels, a prediction score for each character with respect to each entity label and for the first classification mark with respect to each classification label.
In step S230, a predicted entity tag corresponding to each character in the text to be recognized output by the text recognition model and a predicted classification tag corresponding to the first classification tag are obtained.
After the computer device inputs the input object into the text recognition model, the prediction score of each character for each entity label, and of the first classification mark for each classification label, can be determined according to the output probabilities, the feature matrix of the labels, and the transition matrix between labels.
In other words, the text recognition model may output the prediction scores of a plurality of paths, each path being a combination of one label for each position in the text to be recognized. The path with the highest prediction score is the real path; the entity label at each character position of the real path is that character's predicted entity label, and the classification label at the first classification mark's position is the predicted classification label corresponding to the first classification mark.
In step S240, an entity recognition result for the text to be recognized is generated according to the predicted entity tag, and a classification result for the text to be recognized is generated according to the predicted classification tag.
In this embodiment, after obtaining the predicted entity tag and the predicted classification tag output by the text recognition model, the computer device may generate an entity recognition result for the text to be recognized according to the predicted entity tag, and may generate a classification result for the text to be recognized according to the predicted classification tag.
As an example, the computer device may identify, as the same entity, words having consecutive positions in the text to be identified and corresponding predicted entity labels indicating the same entity, so as to obtain an entity identification result for the text to be identified.
Take the text to be recognized "Yangtze Company enters the liquidation procedure" (长江公司进入清算程序) as an example: the predicted entity labels for the characters 长, 江, 公, and 司 are all "subject company". Since these characters occupy consecutive positions and their predicted entity labels indicate the same entity, "长江公司" ("Yangtze Company") in the text to be recognized can be determined to be one entity.
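The merging rule just described — consecutive characters whose predicted labels indicate the same entity form one entity — can be sketched as below. The function and label names are illustrative assumptions, not from the embodiment.

```python
def merge_entities(chars, labels, outside="O"):
    """Group consecutive characters sharing the same predicted entity
    label into one entity; characters labeled `outside` are skipped."""
    entities = []
    i, n = 0, len(chars)
    while i < n:
        if labels[i] == outside:
            i += 1
            continue
        j = i
        while j < n and labels[j] == labels[i]:
            j += 1  # extend the span while the label stays the same
        entities.append(("".join(chars[i:j]), labels[i]))
        i = j
    return entities
```

For example, `merge_entities(list("ABCDEF"), ["COM", "COM", "O", "PER", "PER", "O"])` yields `[("AB", "COM"), ("DE", "PER")]`.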
As an example, the computer device may use the classification category indicated by the predicted classification label corresponding to the first classification label as the classification result for the text to be recognized, for example, the classification category indicated by the predicted classification label corresponding to the first classification label added to the text to be recognized is "negative", and then may determine that the text to be recognized is negative text.
With the technical solution of this embodiment, the text to be recognized is recognized by the text recognition model, so no features need to be constructed manually and semantic knowledge can be extracted automatically, greatly reducing labor cost and improving the efficiency of entity recognition and text classification. Moreover, the text recognition model extracts the entities in the text to be recognized while classifying it; by jointly training the entity recognition task and the text classification task, the two tasks share semantic knowledge, the model's attention to entities is increased, and the accuracy of both entity recognition and text classification is improved.
The above is a detailed description of the text recognition method, in which the entity recognition task and the classification task are both handled by the text recognition model.
Exemplarily, referring to fig. 3, which shows a structure diagram of a text recognition model, the text recognition model 30 is a combined model including a BERT model 301, an LSTM model 302, and a CRF model 303. The BERT model 301 extracts lexical, syntactic, and bidirectional semantic features; a semantic feature here reflects the semantics of the corresponding character or word, that is, the meaning it expresses in the target sample text given its surrounding context. The LSTM model 302 captures information from preceding and following positions and outputs probability distributions over entity and category labels. The CRF model 303 learns the transition rules between adjacent labels and finally outputs the optimal prediction result.
In this embodiment, the process of recognizing the text to be recognized may be described as follows:
First, the text to be recognized is divided into characters and the first classification mark RL is appended, yielding the character sequence X1, X2, …, XN, which is input into the BERT model 301; X1 and X2 represent the characters at the first and second positions of the text, and XN corresponds to the first classification mark RL. After receiving the character sequence, the BERT model 301 converts each character into a vector (an embedding, denoted E), obtaining the input vector sequence E1, E2, …, EN. The BERT model 301 is provided with multiple layers of encoding networks (Trm); each layer comprises a multi-head attention sublayer and a feed-forward neural network sublayer, each followed by a summation layer and a normalization layer. Through these encoding networks, the input vector sequence E1, E2, …, EN is transformed into the output vector sequence T1, T2, …, TN.
Further, the output vector sequence T1, T2, …, TN is input into the LSTM model 302, which outputs the probability of each character for each entity label and of the first classification mark for each classification label. These probabilities serve as the input of the CRF model 303; in its output layer, the CRF model 303 determines the prediction scores of each character for each entity label and of the first classification mark for each classification label according to the probabilities output by the LSTM model 302, the feature matrix of the labels, and the transition matrix between labels.
In other words, the prediction result finally output by the CRF model 303 consists of the prediction scores of a plurality of paths, each path being a combination of one label for each position in the text to be recognized. The path with the highest prediction score is the real path; the entity label at each character position of the real path is that character's predicted entity label, and the classification label at the first classification mark's position is the predicted classification label corresponding to the first classification mark.
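The score-and-select process described above can be illustrated with a brute-force sketch. A real CRF layer uses Viterbi decoding rather than exhaustive enumeration, and the dictionary-based score representation here is an assumption made purely for clarity.

```python
from itertools import product

def path_score(emissions, transitions, path):
    """Emission score of each position's label plus the transition
    score between each pair of adjacent labels along the path."""
    score = sum(emissions[i][lab] for i, lab in enumerate(path))
    score += sum(transitions[(a, b)] for a, b in zip(path, path[1:]))
    return score

def best_path(emissions, transitions, labels):
    """Return the highest-scoring label path (the 'real path')."""
    return max(product(labels, repeat=len(emissions)),
               key=lambda p: path_score(emissions, transitions, p))
```

For instance, with two positions, emission scores `[{"A": 2.0, "B": 0.0}, {"A": 0.0, "B": 2.0}]`, and a transition bonus of 1.0 for A→B, the real path is ("A", "B") with score 5.0.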
Illustratively, assume the representation of the real path A is (B-PER, E-PER, O, B-COM, I-COM, E-COM, TRUE). In this representation, B-PER denotes the starting character of a person name, E-PER the ending character of a person name, O an independent character, B-COM the starting character of an enterprise name, I-COM an intermediate character of an enterprise name, E-COM the ending character of an enterprise name, and TRUE a category label. It can thus be determined that the text to be recognized contains a person-name entity (B-PER, E-PER) composed of its first and second characters, and an enterprise entity (B-COM, I-COM, E-COM) composed of its fifth, sixth, and seventh characters, and that the classification category of the text to be recognized is the one indicated by the classification label TRUE.
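Decoding such a path into entity spans and a classification label can be sketched as follows, using the B-/I-/E-/O conventions of the example above; the function itself is an illustrative assumption, not part of the embodiment.

```python
def decode_path(path):
    """Split a predicted path such as (B-PER, E-PER, O, B-COM, I-COM,
    E-COM, TRUE) into entity spans and the final classification label."""
    *entity_tags, cls_label = path  # last position carries the category
    entities, start = [], None
    for i, tag in enumerate(entity_tags):
        if tag.startswith("B-"):
            start = i                              # entity begins
        elif tag.startswith("E-") and start is not None:
            entities.append((start, i, tag[2:]))   # (begin, end, type)
            start = None
        elif tag == "O":
            start = None                           # independent character
    return entities, cls_label
```

Applied to the real path A above, this returns `([(0, 1, "PER"), (3, 5, "COM")], "TRUE")`.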
Referring to fig. 4, a flowchart of a training method of a text recognition model according to an embodiment of the present application is shown. The method may include steps S410-S430, as detailed below:
in step S410, a second classification mark is added to the target sample text to generate a sample input object corresponding to the target sample text.
The target sample text is sample data for training the text recognition model; the computer device may pull it directly from the Internet. After the target sample text is pulled, a second classification mark can be added to it to obtain the corresponding sample input object. Compared with the target sample text itself, the sample input object additionally carries the second classification mark; it should be noted that this mark merely represents "classification" and is not a specific classification category. It may be added at the start, middle, or end of the target sample text. As an example, the second classification mark may be "RN"; of course, it is not limited to English letters and may also include, but is not limited to, at least one of the following forms: numbers, words, symbols, and the like.
In one embodiment, the computer device may adopt a word segmentation method to perform division processing on the target sample text by taking a word as a unit to obtain a word sequence corresponding to the target sample text, add a second classification mark to the word sequence corresponding to the target sample text to obtain a new word sequence, and then use the new word sequence as a sample input object corresponding to the target sample text.
In step S420, the sample input object is input into the model to be trained, and the prediction scores of each word in the target sample text output by the model to be trained for each entity label and each classification label of the second classification label are obtained.
In the training process, the computer device may input the sample input objects into the model to be trained so as to train it. The model to be trained may be a combined model, specifically: (1) a model combining Bidirectional Encoder Representations from Transformers (BERT), a Long Short-Term Memory network (LSTM), and a Conditional Random Field (CRF); (2) a model combining a Bi-directional Long Short-Term Memory network (Bi-LSTM) and a CRF; or (3) a model combining BERT and a CRF.
Specifically, the computer device may divide the sample input object to generate the character sequence corresponding to the target sample text, process each character in sequence to obtain a vector sequence, and from that vector sequence determine the output probability of each character for each entity label and of the second classification mark for each classification label. According to the output probabilities, the feature matrix of the labels, and the transition matrix between labels, the prediction scores of each character for each entity label and of the second classification mark for each classification label are then determined. That is, after the sample input object is input into the model to be trained, the prediction scores of a plurality of paths output by the model can be obtained, each path being a combination of one label for each position in the sequence.
For example, assume the target sample text contains two characters, w0 and w1 in order, the second classification mark is RN, the entity labels include label1 and label2, and the classification labels include class1 and class2. The model to be trained may then output prediction scores for 8 paths:
Prediction score P1 for path 1 {label1, label1, class1}
Prediction score P2 for path 2 {label1, label1, class2}
Prediction score P3 for path 3 {label1, label2, class1}
Prediction score P4 for path 4 {label1, label2, class2}
Prediction score P5 for path 5 {label2, label1, class1}
Prediction score P6 for path 6 {label2, label1, class2}
Prediction score P7 for path 7 {label2, label2, class1}
Prediction score P8 for path 8 {label2, label2, class2}
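The candidate paths can be enumerated mechanically; the sketch below only illustrates the combinatorics (two entity labels per character times two classification labels) and is not part of the embodiment.

```python
from itertools import product

entity_labels = ["label1", "label2"]
class_labels = ["class1", "class2"]

# each path assigns one entity label to w0, one to w1, and one
# classification label to the RN mark: 2 * 2 * 2 = 8 paths in total
paths = [(e0, e1, c)
         for e0, e1 in product(entity_labels, repeat=2)
         for c in class_labels]
```

In this enumeration `len(paths)` is 8, and `paths[2]` is ("label1", "label2", "class1"), i.e. path 3 above.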
In step S430, a loss function is determined according to the labeled entity label, the labeled classification label, and the prediction score, and parameters of the model to be trained are adjusted according to the loss function, so as to obtain a text recognition model.
Because the labeled entity labels and labeled classification label may be manually annotated and thus represent the correct labels, the computer device can determine a loss function from the prediction scores output by the model to be trained together with the labeled entity labels and labeled classification label, adjust the model parameters in the direction that minimizes the loss function, and continuously optimize them by updating the parameters so as to reduce the loss. Following the minimization principle, the model parameters that minimize the loss function are determined, yielding the text recognition model.
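The parameter-adjustment loop — update parameters in the direction that decreases the loss — can be sketched with plain gradient descent on a toy loss; the learning rate, step count, and the toy loss itself are arbitrary assumptions for illustration.

```python
def minimize(loss_grad, w, lr=0.1, steps=100):
    """Repeatedly move the parameter against the loss gradient,
    following the minimization principle described above."""
    for _ in range(steps):
        w -= lr * loss_grad(w)
    return w

# toy loss (w - 3)^2, whose gradient is 2 * (w - 3)
w_star = minimize(lambda w: 2 * (w - 3), w=0.0)
```

Here `w_star` converges to approximately 3.0, the minimizer of the toy loss.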
In one embodiment, as shown in fig. 5, the method for determining the loss function may specifically include steps S510 to S520, which are described in detail as follows:
step S510, according to the labeled entity label and the labeled classification label, determining a target prediction score corresponding to the entity label which is the same as the labeled entity label and the classification label which is the same as the labeled classification label from the prediction scores.
After the sample input object is input into the model to be trained, the prediction scores of each word in the target sample text output by the model to be trained for each entity label and each classification label of the second classification label can be obtained.
Because the target sample text contains the corresponding labeled entity label and labeled classification label, which can be manually labeled and are correct and real labels, when determining the loss function, the target prediction score corresponding to the entity label which is the same as the labeled entity label and the classification label which is the same as the labeled classification label can be determined from the prediction scores according to the labeled entity label and the labeled classification label.
Continuing with the example in step S420, assuming that the labeled entity label of w0 is label1, the labeled entity label of w1 is label2, and the labeled classification label of the second classification label is class1, the target prediction score corresponding to the same entity label as the labeled entity label and the same classification label as the labeled classification label is determined to be P3 from the prediction scores.
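Selecting the target prediction score then amounts to looking up the path whose labels match the annotated labels; a minimal sketch follows (all names are illustrative assumptions).

```python
def target_score(paths, scores, gold_entity_labels, gold_class_label):
    """Return the prediction score of the path whose entity labels and
    classification label equal the annotated (gold) labels."""
    gold_path = tuple(gold_entity_labels) + (gold_class_label,)
    return scores[paths.index(gold_path)]
```

With the 8 paths of step S420 and gold labels label1 for w0, label2 for w1, and class1 for RN, this lookup returns P3.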
And step S520, determining a loss function according to the target prediction score.
The target prediction score, determined from the labeled entity labels and the labeled classification label, is the prediction score of the real path; that is, it is the highest among the prediction scores of all paths. The loss function can therefore be determined from the target prediction score.
In an embodiment of the present application, determining the loss function according to the target prediction score may specifically include: calculating the ratio between the target prediction score and the sum of all prediction scores, performing a logarithmic operation on that ratio, and determining the loss function from the result. Illustratively, the loss function may be defined as the following equation (1):
LossFunction = -log( P_realPath / (P1 + P2 + … + PN) )    (1)
wherein P_realPath is the target prediction score and P1, P2, …, PN are the prediction scores of all paths.
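Equation (1) can be evaluated directly once the path scores are available. The sketch below assumes the scores are already exponentiated, i.e., behave like unnormalized probabilities, which is an assumption made for illustration.

```python
import math

def loss_function(real_path_score, all_path_scores):
    """LossFunction = -log(P_realPath / (P1 + P2 + ... + PN))."""
    return -math.log(real_path_score / sum(all_path_scores))
```

If the real path holds all of the score mass the loss is 0; it grows as the real path's share of the total shrinks.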
In an embodiment of the present application, the target sample text may be a training text directly acquired from the internet by a computer device, or may also be a sample text obtained by processing a directly acquired text, as shown in fig. 6, in this embodiment, the text recognition method may further specifically include steps S610 to S640, which are specifically described as follows:
step S610, obtaining keywords from a keyword library established in advance.
The pre-established keyword library is a preset word library consisting of words which can be selected as keywords, and the keyword library comprises a plurality of keywords.
Optionally, different keyword libraries are correspondingly set for texts of different classifications. The keyword library corresponding to each classified text is a preset word library composed of words which can be selected as keywords of the text. For example, the keyword library corresponding to the text of the entertainment category includes words related to entertainment, such as names of entertainment stars, names of movies and TV shows, names of art programs, and the like; for another example, the keyword library corresponding to the text of the sports classification includes words related to sports, such as names of sports stars, names of sports projects, names of teams, and the like; for example, the keyword library corresponding to the text of the public sentiment classification includes words related to public sentiment risks, such as announcements, false words, and the like.
Therefore, in order to obtain the target sample text for model training, keywords may be obtained from a keyword library in advance, and then the training text may be obtained by means of the keywords. It can be understood that the text obtained by the keywords may have more relevance to the text required for model training.
Step S620, obtaining a sample text containing the keywords according to the keywords.
After obtaining the keywords, the computer device may further obtain a sample text containing the keywords, for example, the computer device may obtain a text containing the keywords posted by a user in the social software from an open interface of the social software as the sample text.
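Obtaining sample texts containing the keywords is essentially a containment filter; a minimal sketch follows, with the text source abstracted into a list (the function name is an assumption).

```python
def collect_samples(texts, keywords):
    """Keep only the texts that contain at least one keyword."""
    return [t for t in texts if any(k in t for k in keywords)]
```

For example, `collect_samples(["x announcement y", "hello"], ["announcement"])` keeps only the first text.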
Step S630, entity label labeling and classification label labeling are carried out on part of the obtained sample texts to generate initial sample texts.
Further, after obtaining the sample text, a part of the sample text may be selected for entity label labeling and classification label labeling to generate an initial sample text.
It should be noted that the selection mode may be random selection, or may be a mode in which a part of the sample text is selected according to a preset rule.
And step S640, performing data enhancement processing on the initial sample text to generate a target sample text.
In view of the problem that training data is often insufficient in the model training process, in this embodiment, after the initial sample text labeled with the label is obtained, data enhancement processing may be further performed on the initial sample text to generate a target sample text.
In some embodiments of the present application, the data enhancement processing manner may specifically include: copying the initial sample text to obtain a copy of the initial sample text; and generating a target sample text according to the initial sample text and the copy of the initial sample text.
In some embodiments of the present application, the data enhancement processing manner may further include: synonym replacement is carried out on the target keywords contained in the initial sample text to generate a sample text containing the synonyms of the target keywords; and generating a target sample text according to the initial sample text and the sample text containing the synonyms of the target keywords.
In some embodiments of the present application, the data enhancement processing manner may further include: deleting a part of characters contained in the initial sample text to obtain a processed initial sample text; and generating a target sample text according to the initial sample text and the processed initial sample text.
In some embodiments of the present application, the data enhancement processing manner may further include: randomly inserting characters into the initial sample text to obtain a new initial sample text; and generating a target sample text according to the initial sample text and the new initial sample text.
Of course, it is understood that the data enhancement processing mode for the initial sample text may also be any combination of the above modes, for example, a combination of a copy mode and a synonym replacement mode; a combination of a copy manner and a deletion processing manner, and the like. The embodiments of the present application are not particularly limited herein.
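The four enhancement modes above — copying, synonym replacement, random character deletion, and random character insertion — can be sketched together as below. The deletion probability, the synonym dictionary, and the fixed random seed are illustrative assumptions.

```python
import random

def augment(text, synonyms=None, p_delete=0.1, seed=0):
    """Return the initial sample text plus augmented variants."""
    rng = random.Random(seed)
    variants = [text]                       # copy of the initial text
    if synonyms:                            # synonym replacement
        replaced = text
        for word, syn in synonyms.items():
            replaced = replaced.replace(word, syn)
        variants.append(replaced)
    kept = [c for c in text if rng.random() > p_delete]
    variants.append("".join(kept))          # random character deletion
    inserted = list(text)
    inserted.insert(rng.randrange(len(inserted) + 1), rng.choice(text))
    variants.append("".join(inserted))      # random character insertion
    return variants
```

For example, `augment("hello world", {"world": "earth"})` returns the original text, "hello earth", a randomly shortened variant, and a variant one character longer.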
In an embodiment of the application, after the data enhancement processing is performed on the initial sample text, posterior cleaning may further be performed on the data obtained from the data enhancement processing: if abnormal texts are found, they are removed, and the target sample text is then generated from the texts that remain.
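For illustration only, the four data enhancement manners described above (copying, synonym replacement, character deletion, and random character insertion) may be sketched as follows. The synonym table, probability, and sample texts are hypothetical placeholders, not part of the application:

```python
import random

# Hypothetical synonym table; a real system would use a curated lexicon.
SYNONYMS = {"good": ["fine", "nice"], "fast": ["quick", "rapid"]}

def copy_text(text):
    # Manner 1: duplicate the initial sample text verbatim.
    return text

def synonym_replace(text):
    # Manner 2: replace known keywords with a randomly chosen synonym.
    words = text.split()
    return " ".join(random.choice(SYNONYMS[w]) if w in SYNONYMS else w
                    for w in words)

def random_delete(text, p=0.1):
    # Manner 3: drop each character with probability p.
    kept = [c for c in text if random.random() > p]
    return "".join(kept) if kept else text

def random_insert(text, n=1):
    # Manner 4: insert n characters drawn from the text at random positions.
    chars = list(text)
    for _ in range(n):
        chars.insert(random.randrange(len(chars) + 1), random.choice(list(text)))
    return "".join(chars)

def augment(initial_samples):
    # Target sample set = initial samples plus their augmented variants.
    out = list(initial_samples)
    for t in initial_samples:
        out += [copy_text(t), synonym_replace(t), random_delete(t), random_insert(t)]
    return out
```

A posterior cleaning step, as described above, would then filter abnormal outputs (e.g. empty or label-breaking texts) before forming the target sample text.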
To analyze the effect of the text recognition model of the embodiment of the present application (denoted as the V3 model), it is compared with a keyword-triggering model (denoted as the V1 model) and with a pipeline model (denoted as the V2 model) in which the named entity recognition task and the classification task are trained separately; the comparison results are shown in Table 1 below.
            Accuracy rate    Coverage rate    F1 value
Model V1    50.00%           74.18%           59.73%
Model V2    66.67%           47.25%           55.31%
Model V3    90.06%           92.31%           91.17%

TABLE 1
The comparison indices include the accuracy rate, the coverage rate and the F1 value. For a given test set and a given category, the accuracy rate (precision) is the proportion of the classification model's predictions for that category that are correct, the coverage rate (recall) is the proportion of the positive samples of that category that the classification model correctly predicts, and the F1 value is a single measure that jointly considers the accuracy rate and the coverage rate. As can be seen from Table 1, both the accuracy rate and the coverage rate of the text recognition model in the embodiment of the present application are significantly improved.
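As a concrete illustration of how these three indices relate, using hypothetical prediction counts rather than the data behind Table 1:

```python
def precision_recall_f1(tp, fp, fn):
    # Accuracy rate (precision): correct predictions among all
    # predictions the model made for the category.
    precision = tp / (tp + fp)
    # Coverage rate (recall): positive samples of the category that
    # the model correctly predicted.
    recall = tp / (tp + fn)
    # F1 value: harmonic mean jointly considering both indices.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 9 true positives, 1 false positive, 3 false negatives.
p, r, f = precision_recall_f1(9, 1, 3)  # p = 0.9, r = 0.75
```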
Embodiments of the apparatus of the present application are described below, which may be used to perform the text recognition methods in the above-described embodiments of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the text recognition method described above in the present application.
Fig. 7 shows a block diagram of a text recognition apparatus according to an embodiment of the present application, and referring to fig. 7, a text recognition apparatus 700 according to an embodiment of the present application includes: a first adding unit 702, a first input unit 704, an obtaining unit 706, and a generating unit 708.
The first adding unit 702 is configured to add a first classification mark in a text to be recognized to generate an input object corresponding to the text to be recognized; the first input unit 704 is configured to input the input object to a text recognition model, where the text recognition model is obtained by training with a target sample text carrying a labeled entity tag and a labeled classification tag, the labeled entity tag is the entity tag corresponding to each character in the target sample text, and the labeled classification tag is the classification tag corresponding to a second classification mark added to the target sample text; the obtaining unit 706 is configured to obtain a prediction entity tag corresponding to each character in the text to be recognized output by the text recognition model and a prediction classification tag corresponding to the first classification mark; the generating unit 708 is configured to generate an entity recognition result for the text to be recognized according to the predicted entity tag, and generate a classification result for the text to be recognized according to the predicted classification tag.
In some embodiments of the present application, the first adding unit 702 is configured to: dividing the text to be recognized by taking characters as units to generate a character sequence corresponding to the text to be recognized; and adding the first classification mark in the character sequence to obtain a new character sequence, and taking the new character sequence as an input object corresponding to the text to be recognized.
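A minimal sketch of this unit's behavior follows. The mark token "[CLS]" is an assumption borrowed from BERT-style models; the application does not fix a particular symbol for the classification mark:

```python
def build_input(text, cls_mark="[CLS]"):
    # Divide the text character by character into a character sequence,
    # then add the classification mark to obtain the new sequence used
    # as the model's input object.
    chars = list(text)
    return [cls_mark] + chars
```

The same construction applies to the second adding unit on the training side, with the second classification mark prepended to the target sample text.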
In some embodiments of the present application, the generating unit 708 is configured to: identifying characters which are continuous in position in the text to be identified and correspond to the predicted entity labels and indicate the same entity as the same entity to obtain an entity identification result aiming at the text to be identified; and taking the classification category indicated by the prediction classification label as a classification result for the text to be recognized.
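The merging of positionally consecutive characters whose predicted entity labels indicate the same entity can be sketched as follows. The label names ("PER", "LOC") and the non-entity label "O" are illustrative assumptions, not fixed by the application:

```python
def decode_entities(chars, labels, non_entity="O"):
    # Group consecutive characters that share the same (non-"O")
    # predicted entity label into a single entity.
    entities = []
    current_text, current_label = "", None
    for ch, lab in zip(chars, labels):
        if lab == current_label and lab != non_entity:
            current_text += ch  # extend the running entity
        else:
            if current_label not in (None, non_entity):
                entities.append((current_text, current_label))
            current_text, current_label = ch, lab
    if current_label not in (None, non_entity):
        entities.append((current_text, current_label))
    return entities
```

For example, characters "a b x c d" labeled PER PER O LOC LOC would yield the entities ("ab", PER) and ("cd", LOC).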
In some embodiments of the present application, the apparatus further comprises: a second adding unit, configured to add the second classification mark in the target sample text to generate a sample input object corresponding to the target sample text; a second input unit, configured to input the sample input object into a model to be trained, and obtain prediction scores of each character in the target sample text for each entity label and prediction scores of the second classification mark for each classification label, both output by the model to be trained; and a determining unit, configured to determine a loss function according to the labeled entity label, the labeled classification label and the prediction scores, and adjust parameters of the model to be trained according to the loss function to obtain the text recognition model.
In some embodiments of the present application, the determining unit comprises: a prediction score determining subunit configured to determine, according to the labeled entity tag and the labeled classification tag, a target prediction score corresponding to an entity tag that is the same as the labeled entity tag and a classification tag that is the same as the labeled classification tag from the prediction scores; a loss function determination subunit configured to determine the loss function according to the target prediction score.
In some embodiments of the present application, the loss function determination subunit is configured to: calculating a ratio between the target prediction score and a sum of the prediction scores; and carrying out logarithmic operation on the ratio to obtain an operation result, and determining the loss function according to the operation result.
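Read literally, this ratio-then-logarithm computation has the shape of a (negative log-likelihood) cross-entropy loss; with exponentiated scores it is exactly softmax cross-entropy. A sketch under that reading, with hypothetical raw scores:

```python
import math

def loss_from_scores(scores, target_index):
    # Ratio between the target prediction score and the sum of all
    # prediction scores for the position.
    ratio = scores[target_index] / sum(scores)
    # Logarithmic operation on the ratio; the negative log is the
    # conventional loss form (assumption: the application does not
    # spell out the sign).
    return -math.log(ratio)
```

For instance, equal scores over two labels give a loss of -ln(0.5) ≈ 0.693, and the loss shrinks toward 0 as the target score dominates the sum.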
In some embodiments of the present application, the second adding unit is configured to: dividing the target sample text by taking characters as units to generate a character sequence corresponding to the target sample text; and adding the second classification mark in the character sequence to obtain a new character sequence, and taking the new character sequence as a sample input object corresponding to the target sample text.
In some embodiments of the present application, the apparatus further comprises: a keyword acquisition unit configured to acquire a keyword from a keyword library established in advance; the sample text acquisition unit is configured to acquire a sample text containing the keywords according to the keywords; the labeling unit is configured to perform entity label labeling and category label labeling on part of the obtained sample texts to generate initial sample texts; and the processing unit is configured to perform data enhancement processing on the initial sample text to generate the target sample text.
In some embodiments of the present application, the processing unit is configured to: copying the initial sample text to obtain a copy of the initial sample text; and generating the target sample text according to the initial sample text and the copy of the initial sample text.
In some embodiments of the present application, the processing unit is configured to: synonym replacement is carried out on the target keywords contained in the initial sample text, so that a sample text containing synonyms of the target keywords is generated; and generating the target sample text according to the initial sample text and the sample text containing the synonyms of the target keywords.
In some embodiments of the present application, the processing unit is configured to: deleting a part of characters contained in the initial sample text to obtain a processed initial sample text; and generating the target sample text according to the initial sample text and the processed initial sample text.
In some embodiments of the present application, the processing unit is configured to: randomly inserting characters into the initial sample text to obtain a new initial sample text; and generating the target sample text according to the initial sample text and the new initial sample text.
FIG. 8 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
It should be noted that the computer system 800 of the electronic device shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 8, a computer system 800 includes a Central Processing Unit (CPU)801 that can perform various appropriate actions and processes, such as performing the methods described in the above embodiments, according to a program stored in a Read-Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for system operation are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An Input/Output (I/O) interface 805 is also connected to bus 804.
The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output section 807 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage portion 808 including a hard disk and the like; and a communication section 809 including a Network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as necessary, so that a computer program read out therefrom is mounted on the storage section 808 as necessary.
In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising a computer program for performing the method illustrated by the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. When the computer program is executed by the Central Processing Unit (CPU)801, various functions defined in the system of the present application are executed.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with a computer program embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. The computer program embodied on the computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the above embodiments.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which can be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present application.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (15)

1. A method of text recognition, the method comprising:
adding a first classification mark in a text to be recognized to generate an input object corresponding to the text to be recognized;
inputting the input object into a text recognition model, wherein the text recognition model is obtained by training with a target sample text carrying a labeled entity label and a labeled classification label, the labeled entity label is an entity label corresponding to each character in the target sample text, and the labeled classification label is a classification label corresponding to a second classification mark added in the target sample text;
acquiring a prediction entity label corresponding to each character in the text to be recognized output by the text recognition model and a prediction classification label corresponding to the first classification mark;
and generating an entity identification result aiming at the text to be identified according to the predicted entity label, and generating a classification result aiming at the text to be identified according to the predicted classification label.
2. The method of claim 1, wherein adding a first classification mark to a text to be recognized to generate an input object corresponding to the text to be recognized comprises:
dividing the text to be recognized by taking characters as units to generate a character sequence corresponding to the text to be recognized;
and adding the first classification mark in the character sequence to obtain a new character sequence, and taking the new character sequence as an input object corresponding to the text to be recognized.
3. The method of claim 1, wherein generating an entity recognition result for the text to be recognized according to the predicted entity tag and generating a classification result for the text to be recognized according to the predicted classification tag comprises:
identifying characters which are continuous in position in the text to be identified and correspond to the predicted entity labels and indicate the same entity as the same entity to obtain an entity identification result aiming at the text to be identified;
and taking the classification category indicated by the prediction classification label as a classification result for the text to be recognized.
4. The method according to any one of claims 1-3, further comprising:
adding the second classification mark in the target sample text to generate a sample input object corresponding to the target sample text;
inputting the sample input object into a model to be trained to obtain prediction scores of each character in the target sample text for each entity label and prediction scores of the second classification mark for each classification label, both output by the model to be trained;
and determining a loss function according to the labeled entity label, the labeled classification label and the prediction score, and adjusting parameters of the model to be trained according to the loss function to obtain the text recognition model.
5. The method of claim 4, wherein determining a loss function based on the labeled entity label, the labeled classification label, and the prediction score comprises:
according to the labeled entity label and the labeled classification label, determining a target prediction score corresponding to an entity label which is the same as the labeled entity label and a classification label which is the same as the labeled classification label from the prediction scores;
and determining the loss function according to the target prediction score.
6. The method of claim 5, wherein determining the loss function based on the target prediction score comprises:
calculating a ratio between the target prediction score and a sum of the prediction scores;
and carrying out logarithmic operation on the ratio to obtain an operation result, and determining the loss function according to the operation result.
7. The method of claim 4, wherein adding the second classification mark to the target sample text to generate a sample input object corresponding to the target sample text comprises:
dividing the target sample text by taking characters as units to generate a character sequence corresponding to the target sample text;
and adding the second classification mark in the character sequence to obtain a new character sequence, and taking the new character sequence as a sample input object corresponding to the target sample text.
8. The method of claim 4, further comprising:
acquiring keywords from a pre-established keyword library;
acquiring a sample text containing the keywords according to the keywords;
carrying out entity label labeling and category label labeling on part of the obtained sample texts to generate initial sample texts;
and performing data enhancement processing on the initial sample text to generate the target sample text.
9. The method of claim 8, wherein performing data enhancement processing on the initial sample text to generate the target sample text comprises:
copying the initial sample text to obtain a copy of the initial sample text;
and generating the target sample text according to the initial sample text and the copy of the initial sample text.
10. The method of claim 8, wherein performing data enhancement processing on the initial sample text to generate the target sample text comprises:
performing synonym replacement on the target keywords contained in the initial sample text to generate a sample text containing synonyms of the target keywords;
and generating the target sample text according to the initial sample text and the sample text containing the synonyms of the target keywords.
11. The method of claim 8, wherein performing data enhancement processing on the initial sample text to generate the target sample text comprises:
deleting a part of characters contained in the initial sample text to obtain a processed initial sample text;
and generating the target sample text according to the initial sample text and the processed initial sample text.
12. The method of claim 8, wherein performing data enhancement processing on the initial sample text to generate the target sample text comprises:
randomly inserting characters into the initial sample text to obtain a new initial sample text;
and generating the target sample text according to the initial sample text and the new initial sample text.
13. A text recognition apparatus, characterized in that the apparatus comprises:
the first adding unit is configured to add a first classification mark in a text to be recognized so as to generate an input object corresponding to the text to be recognized;
the first input unit is configured to input the input object to a text recognition model, wherein the text recognition model is obtained by training with a target sample text carrying a labeled entity label and a labeled classification label, the labeled entity label is an entity label corresponding to each character in the target sample text, and the labeled classification label is a classification label corresponding to a second classification mark added in the target sample text;
the obtaining unit is configured to obtain a prediction entity label corresponding to each character in the text to be recognized and output by the text recognition model and a prediction classification label corresponding to the first classification mark;
and the generating unit is configured to generate an entity identification result aiming at the text to be identified according to the predicted entity label and generate a classification result aiming at the text to be identified according to the predicted classification label.
14. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out a text recognition method as claimed in any one of claims 1 to 12.
15. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the text recognition method of any one of claims 1 to 12.
CN202110492354.2A 2021-05-06 2021-05-06 Text recognition method and device, computer readable medium and electronic equipment Pending CN113761190A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110492354.2A CN113761190A (en) 2021-05-06 2021-05-06 Text recognition method and device, computer readable medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN113761190A true CN113761190A (en) 2021-12-07

Family

ID=78787098


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023116561A1 (en) * 2021-12-24 2023-06-29 中电信数智科技有限公司 Entity extraction method and apparatus, and electronic device and storage medium
CN114140673A (en) * 2022-02-07 2022-03-04 人民中科(济南)智能技术有限公司 Illegal image identification method, system and equipment
CN114140673B (en) * 2022-02-07 2022-05-20 人民中科(北京)智能技术有限公司 Method, system and equipment for identifying violation image
CN114637824A (en) * 2022-03-18 2022-06-17 马上消费金融股份有限公司 Data enhancement processing method and device
CN114637824B (en) * 2022-03-18 2023-12-01 马上消费金融股份有限公司 Data enhancement processing method and device
CN116738345A (en) * 2023-08-15 2023-09-12 腾讯科技(深圳)有限公司 Classification processing method, related device and medium
CN116738345B (en) * 2023-08-15 2024-03-01 腾讯科技(深圳)有限公司 Classification processing method, related device and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination