CN115080709A - Text recognition method and device, nonvolatile storage medium and computer equipment - Google Patents

Text recognition method and device, nonvolatile storage medium and computer equipment

Info

Publication number
CN115080709A
Authority
CN
China
Prior art keywords
event
text
recognition
model
identification
Prior art date
Legal status
Pending
Application number
CN202110276318.2A
Other languages
Chinese (zh)
Inventor
魏梦溪
张雅婷
Current Assignee
Alibaba Innovation Co
Original Assignee
Alibaba Singapore Holdings Pte Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Singapore Holdings Pte Ltd filed Critical Alibaba Singapore Holdings Pte Ltd
Priority to CN202110276318.2A
Publication of CN115080709A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 - Querying
    • G06F 16/332 - Query formulation
    • G06F 16/3329 - Natural language query formulation or dialogue systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 - Querying
    • G06F 16/3331 - Query processing
    • G06F 16/334 - Query execution
    • G06F 16/3344 - Query execution using natural language analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a text recognition method and device, a nonvolatile storage medium, and a computer device. The method includes: acquiring a text to be recognized; recognizing the text to be recognized with an event recognition model in a machine learning model to obtain a first recognition result, where the first recognition result includes the event types in the text to be recognized and the event trigger word corresponding to each event type; recognizing the text to be recognized with an entity recognition model in the machine learning model to obtain a second recognition result, where the second recognition result includes the entities and argument information corresponding to each event type; and determining the event information corresponding to each event type based on the first recognition result and the second recognition result, respectively.

Description

Text recognition method and device, nonvolatile storage medium and computer equipment
Technical Field
The present application relates to the field of machine learning, and in particular, to a text recognition method, a text recognition device, a non-volatile storage medium, and a computer apparatus.
Background
In the field of intelligent judicature, a judicial document analysis platform is a typical application of NLP in judicial scenarios: building on a base of legal knowledge, it carries the analysis of various kinds of documents in the judicial scene (adjudication documents, indictments, judgments, court trial transcripts, evidence materials, and the like). In a judicial document analysis platform, entity extraction and event extraction are essential basic natural language processing tasks: the long factual descriptions in a document are abstracted and output as structured text, so that the facts can be presented more clearly to users, and the thread of event development, the relationships between events, and so on can be sorted out. At the same time, this lays the necessary groundwork for various upstream tasks.
In a complex event description, the "giving/receiving" role of an actor may shift; in a complex event, the same actor may even have multiple identities. For example, Wang may be the victim of a property-related crime and, at the same time, the victim of a personal-injury crime. In the prior art, however, because the four basic problems of classification recognition, trigger word recognition, event element extraction, and argument discrimination are attempted to be solved all at once, the design rests on some idealized assumptions, such as: the overlapping between elements is not obvious; within an event expressed by the same sentence, the subject and object are consistent and do not change; and the relationship between the entity elements in the event and the trigger word is simple.
Statistics show that such a system can handle about 90% of the event situations in criminal cases. In real scenarios, however, complex events obviously exist, and solving the extraction problem for complex events remains a difficult problem to overcome.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the application provides a text recognition method and device, a nonvolatile storage medium and computer equipment, so as to at least solve the technical problem that structured texts cannot be formed due to the fact that multiple identities of people in complex cases cannot be processed.
According to an aspect of an embodiment of the present application, there is provided a text recognition method including: acquiring a text to be identified; identifying a text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type; identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities corresponding to the event types and argument information; event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
According to another aspect of the embodiments of the present application, there is also provided a text recognition method, including: acquiring a text to be identified; performing first identification on a text to be identified to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type; performing secondary identification on the text to be identified to obtain a second identification result, wherein the second identification result comprises entities and argument information corresponding to each event type; event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
According to another aspect of the embodiments of the present application, there is also provided a text recognition apparatus, including: the acquisition module is used for acquiring a text to be recognized; the first identification module is used for identifying the text to be identified by adopting an event identification model in the machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type; the second identification module is used for identifying the text to be identified by adopting the entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities and argument information corresponding to each event type; and the determining module is used for respectively determining the event information corresponding to each event type based on the first recognition result and the second recognition result.
According to another aspect of the embodiments of the present application, there is also provided a non-volatile storage medium including a stored program, wherein, when the program runs, a device in which the non-volatile storage medium is located is controlled to execute the text recognition method.
According to another aspect of the embodiments of the present application, there is also provided a computer device, including: a processor; and a memory coupled to the processor for providing instructions to the processor for processing the following processing steps: acquiring a text to be identified; identifying a text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type; identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities corresponding to the event types and argument information; and respectively determining the event information corresponding to each event type based on the first recognition result and the second recognition result.
In the embodiment of the application, the text to be recognized is obtained; identifying a text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type; identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities corresponding to the event types and argument information; the method for determining the event information corresponding to each event type based on the first recognition result and the second recognition result achieves the purpose of determining the event information corresponding to each event type by recognizing the event type and the entity and argument information corresponding to each event type, thereby realizing the technical effect of recognizing the complex text and further solving the technical problem that the structured text cannot be formed due to the fact that multiple identities of the actors in the complex case cannot be processed.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a block diagram of a hardware configuration of a computer terminal according to an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of a text recognition method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart diagram of another text recognition method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a text recognition apparatus according to an embodiment of the present application;
FIG. 5 is a schematic workflow diagram of a text recognition method according to an embodiment of the present application;
FIG. 6a is a diagram illustrating the classification result of an event according to an embodiment of the present application;
FIG. 6b is a schematic diagram of an entity and argument recognition result according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an algorithmic position in a judicial literature analysis of a text recognition method according to an embodiment of the application;
FIG. 8 is a schematic diagram of an interactive interface of a text recognition method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some terms appearing in the description of the embodiments of the present application are explained as follows:
Pre-trained Language Model: in recent years, models such as BERT, pre-trained with a language-model loss function on large amounts of corpus data, have obtained strong results on a series of Natural Language Processing (NLP) tasks.
BERT: a pre-trained language model that has achieved state-of-the-art performance on multiple natural language processing tasks.
Event extraction: extracting structured event information, such as event types, trigger words, and event elements, from text. It is widely applied in fields such as automatic summarization, automatic question answering, and information retrieval.
MRC: Machine Reading Comprehension. Given an article (context) and a question (query) based on the article, the machine is asked to answer the question after reading the article. The task involved in this design only requires selecting a snippet from the article that can answer the question, i.e., a "snippet selection" task.
Entity nesting: nesting exists among the entities recognized in the entity recognition task. For example, "Beijing University" is not only an organization but also a location.
Example 1
There is also provided, in accordance with an embodiment of the present application, a method embodiment for text recognition, it being noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than here.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Fig. 1 shows a hardware configuration block diagram of a computer terminal (or mobile device) for implementing the text recognition method. As shown in fig. 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, ..., 102n; the processors 102 may include, but are not limited to, processing devices such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission module 106 for communication functions. It may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in fig. 1, or have a different configuration from that shown in fig. 1.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuit may be a single stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computer terminal 10 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the text recognition method in the embodiment of the present application. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, that is, implementing the above text recognition method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet via wireless.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with the user interface of the computer terminal 10 (or mobile device).
Under the above operating environment, the present application provides a text recognition method as shown in fig. 2. Fig. 2 is a flowchart of a text recognition method according to a first embodiment of the present application, as shown in fig. 2, the method includes the following steps:
s202, acquiring a text to be identified;
s204, identifying the text to be identified by adopting an event identification model in the machine learning model to obtain a first identification result;
The first recognition result includes the event types in the text to be recognized and the event trigger word corresponding to each event type. As shown in fig. 6a, the event trigger word corresponding to the event type "property" is "snatched away", and the event trigger word corresponding to the event type "person" is "stabbed to death".
In some embodiments of the present application, the event recognition model is trained by:
respectively inputting multiple groups of first sample data into the event recognition model for training, where each group of data in the multiple groups of first sample data includes: a first sample text, a first question template used for determining the event trigger word in the first sample text, answer information corresponding to the first question template, and an event type. For example, the first sample text may be "when the car drove slowly to the A-road gas station section, a pedestrian travelling in the same direction was knocked down", the first question template may be "what is the trigger word of the event?", the corresponding answer information is "knocked down", and the corresponding event type is "society - traffic accident".
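As an illustration only, one group of first sample data might be laid out as follows; the field names and the English rendering of the example sentence are assumptions, not the patent's actual data format:

    # Hypothetical layout of one group of first sample data (field names are
    # assumptions for illustration only).
    first_sample = {
        "context": ("When the car drove slowly to the A-road gas station section, "
                    "a pedestrian travelling in the same direction was knocked down."),
        "question": "What is the trigger word of the event?",
        "answer": "knocked down",                    # event trigger word
        "event_type": "society - traffic accident",  # event type label
    }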
S206, recognizing the text to be recognized by adopting an entity recognition model in the machine learning model to obtain a second recognition result;
The second recognition result includes the entities and argument information corresponding to each event type. Specifically, in some embodiments of the present application, the entities refer to information such as the time, place, and consequence of the case, and the arguments refer to the offender and the victim in the case. As shown in fig. 6b, the consequence of the case is that the offender killed the victim, the offender of the case is "Wang", and the victim of the case is "Song".
In some embodiments of the present application, the entity recognition model is trained by:
respectively inputting multiple groups of second sample data into the entity recognition model for training, where each group of data in the multiple groups of second sample data includes: a second sample text, a second question template used for determining the entities and arguments in the second sample text, and answer information corresponding to the second question template. For example, the second sample text may be "(someone) robbed Wang's satchel and killed the victim Song", the second question templates may be "who is the victim of the property event triggered by 'robbed'?" and "who is the victim of the personal-injury event triggered by 'killed'?", and the answer information corresponding to the first of these templates is "Wang".
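Similarly, a hypothetical sketch of two groups of second sample data built from the slot-filled question templates above; the field names and wording are assumptions for illustration only:

    # Hypothetical groups of second sample data; the questions follow the slot template
    # "who/what is the <entity/argument> of the <event type> event triggered by <trigger>?".
    second_samples = [
        {
            "context": "Someone robbed Wang's satchel and killed the victim Song.",
            "question": "Who is the victim of the property event triggered by 'robbed'?",
            "answer": "Wang",
        },
        {
            "context": "Someone robbed Wang's satchel and killed the victim Song.",
            "question": "Who is the victim of the personal-injury event triggered by 'killed'?",
            "answer": "Song",
        },
    ]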
In some embodiments of the present application, the event recognition model and the entity recognition model are the same pre-trained language model, such as a bert model, and the event recognition model and the entity recognition model use the same model parameters. That is, the model parameters updated after the training of the event recognition model may also be used to update the model parameters of the entity recognition model. Similarly, the updated model parameters after the entity model is trained may also be used to update the model parameters after the event recognition model is trained.
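A minimal sketch, assuming PyTorch and the HuggingFace Transformers library, of what sharing the BERT parameters between the two recognition stages could look like; the class name, head names, and checkpoint are illustrative assumptions rather than the patent's implementation:

    import torch.nn as nn
    from transformers import BertModel

    class SharedEncoderExtractor(nn.Module):
        """Both recognition stages read the same BERT encoder, so parameter updates
        made while training either stage also update the encoder used by the other."""
        def __init__(self, model_name="bert-base-chinese", num_event_types=10):
            super().__init__()
            self.encoder = BertModel.from_pretrained(model_name)   # shared parameters
            hidden = self.encoder.config.hidden_size
            self.event_type_head = nn.Linear(hidden, num_event_types)  # stage one: event classification
            self.span_head = nn.Linear(hidden, 2)                       # stage two: per-token start/end logits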
In some embodiments of the present application, a pre-trained language model can be built into a machine reading comprehension model to better recognize the text. However, since the text to be recognized does not naturally come with reading-comprehension questions, the construction of the questions becomes critical for improving the accuracy of the model. For the reading-comprehension tasks of event recognition and entity/argument recognition, the question of each stage can be constructed by slot filling.
For phase one, the question may be set to: what is the trigger word of the event? The trigger word of the event is obtained by searching for the answer to this question in the context (the sentence to be recognized). Because the number of event types may be greater than 1, the answer to the question may consist of several words, and the words may even overlap with each other, which a reading-comprehension-style model can support. An example is as follows:
Q: What is the trigger word of the event?
C: When the vehicle was driven slowly to the Gannan Avenue gas station road section, a pedestrian travelling in the same direction was knocked down.
A: knocked down (society - traffic accident)
For phase two, the question template may be set to: who/what is the <entity name/argument name> of the <event classification> event triggered by <trigger word>? Assuming that the set of labels for all entities and arguments of the dataset is Y, then for each trigger word x predicted in phase one and for each entity label y in Y there is a question q(x, y). An example is as follows:
C: (Someone) robbed Wang's satchel and killed the victim Song.
Q: Who is the victim of the property (theft/robbery/snatch) event triggered by "robbed"?
A: Wang
Q: Who is the victim of the personal-injury event triggered by "killed"?
A: Song
In this manner, when processing complex events, the model can extract both the entities and arguments (the offenders/victims) associated with a particular event.
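A small sketch of how the phase-two questions q(x, y) described above could be generated by slot filling; the template wording, function name, and label set are illustrative assumptions, not the patent's implementation:

    def build_stage_two_questions(event_type, trigger, labels):
        """For each entity/argument label y, fill the slot template with the
        phase-one prediction (event type, trigger word x) to obtain q(x, y)."""
        template = "Who/what is the {label} of the {etype} event triggered by '{trigger}'?"
        return [template.format(label=y, etype=event_type, trigger=trigger) for y in labels]

    # Illustrative usage; the label set is an assumption:
    questions = build_stage_two_questions("personal injury", "killed",
                                          ["victim", "offender", "time", "place"])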
After the question or the question template of each stage is determined, the training data specific to the machine-reading-comprehension-style model is constructed as follows:
In the first stage, given a text sequence W of length N as the context of the model and the fixed question Q1 "what is the trigger word of the event?", multiple (Question, Context, Answer) triplets can be obtained, where each Answer corresponds to a different event/trigger class; as many triplets can be generated as there are event types. These triplets are the training examples. For each event type i, its trigger word j can be represented by Ej(start:end), where start is the beginning position of the entity and end is its ending position. If no start and end are found, the sentence has no trigger word under event i, i.e., it does not belong to that event classification. If start and end are found, the trigger word of the event can be obtained through context[start:end] (i.e., the span from the beginning position to the ending position of the event in the text).
Similarly, in the second stage, the questions can be filled in using the known trigger words, and question Q2 is set so as to delimit the event type to be processed for the input. For multiple events, multiple questions can be set, and the total number of triplets obtained is: number of event types × number of entities. Likewise, the two pointers start and end can be used to predict the position of a specific entity, and context[start:end] gives the extraction result of that entity type under the trigger word of a certain event.
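A small sketch of how the (Question, Context, Answer) triplets and the context[start:end] span decoding described above could be expressed; the function and field names are assumptions for illustration:

    def make_triples(question, context, answer_spans):
        """Build (Question, Context, Answer) training triples. A span of None means
        the answer was not found, i.e. the sentence does not belong to that class."""
        triples = []
        for label, span in answer_spans.items():
            answer = context[span[0]:span[1]] if span is not None else None
            triples.append({"question": question, "context": context,
                            "label": label, "answer": answer})
        return triples

    def decode_span(context, start, end):
        """Recover the predicted trigger word or entity text from the two pointers."""
        if start is None or end is None:
            return None
        return context[start:end]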
In the training phase of the whole model, the deep pre-trained model BERT can be used as the encoder. A sentence is given as input: (ω_1, ω_2, ..., ω_n), where n is the text length and ω represents a character. The filled-in question is: (q_1, q_2, ..., q_m), where m is the length of the whole question and q represents a character. A special [CLS] character and a [SEP] character (the two flag characters of BERT, where [CLS] is placed at the head of the question and [SEP] at the tail of the question, between the question and the context) are added before the sequence enters the network, so the overall input of the model is {[CLS], q(1), q(2), ..., q(m), [SEP], ω(1), ω(2), ..., ω(n)}. In the model, the final hidden states of [CLS] output by the layers can be concatenated and used as the vector-space representation of the whole sequence for the event classification module, while the remaining sequence vectors output by the BERT model are used directly by the reading comprehension module, as follows:

H = BERT(ω_{0:k}; Θ)

C = H_0

In the above equations, BERT is the pre-trained model, Θ represents all learnable parameters in the model, and ω_{0:k} represents the k-th token in the input. H represents the hidden-state vectors output after the tokens pass through the BERT model. Because a [CLS] token is added at the beginning of the original sentence, H_0 = C is the hidden-vector representation of [CLS], and the rest of H are the hidden vectors corresponding to the other token positions.
When recognizing start and end, the reading-comprehension-style model predicts, for each character, whether it is the beginning of an entity and whether it is the end of an entity.
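A minimal sketch, assuming the HuggingFace Transformers library and an off-the-shelf Chinese BERT checkpoint, of how the question and context could be packed into the {[CLS], q, [SEP], ω} input and how per-character start/end scores could be produced; it is illustrative only and not the patent's actual code:

    import torch
    from transformers import BertModel, BertTokenizerFast

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")   # assumed checkpoint
    encoder = BertModel.from_pretrained("bert-base-chinese")
    start_head = torch.nn.Linear(encoder.config.hidden_size, 1)          # "is this token a start?"
    end_head = torch.nn.Linear(encoder.config.hidden_size, 1)            # "is this token an end?"

    question = "What is the trigger word of the event?"
    context = "When the car drove slowly to the gas station section, a pedestrian was knocked down."

    # Passing the two texts together produces: [CLS] question [SEP] context [SEP]
    inputs = tokenizer(question, context, return_tensors="pt")
    hidden = encoder(**inputs).last_hidden_state        # (1, sequence_length, hidden_size)
    cls_vector = hidden[:, 0]                           # H_0 = C, used by the event classification module
    start_logits = start_head(hidden).squeeze(-1)       # per-character start scores
    end_logits = end_head(hidden).squeeze(-1)           # per-character end scores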
The above describes how the training data are constructed. Since the model is a pipeline system, when it is actually run the model first executes the first stage and fills the question template of the second stage with the event type and trigger word predicted in the first stage as input; the second stage is then run, and all the results are finally obtained.
And S208, respectively determining event information corresponding to each event type based on the first recognition result and the second recognition result.
In some embodiments of the application, in order to improve text recognition quality, before the event information corresponding to each event type is determined based on the first recognition result and the second recognition result, a first number of event trigger words in the text to be recognized and a second number of event types in the text to be recognized are first counted; the first number is then compared with the second number, and all the event trigger words in the text to be recognized are screened according to the comparison result to obtain the target event trigger words.
Specifically, screening all the event trigger words in the text to be recognized according to the comparison result to obtain the target event trigger words includes: when the comparison result indicates that the first number is larger than the second number, which means that some non-trigger words have been mistakenly recognized as trigger words, determining an evaluation index of each event trigger word in the text to be recognized, where the evaluation index includes the certainty factor of the event trigger word; the certainty factor refers to how strongly an event trigger word is correlated with the event it corresponds to, and can be obtained through big-data statistics on the probability that, in a large number of legal texts, the event corresponding to a given trigger word matches the actual event in the text; sorting all the event trigger words in the text to be recognized by the size of the evaluation index; and selecting the target event trigger words from the first number of event trigger words according to the sorting result. It should be noted that the number of target event trigger words selected from the first number of event trigger words is the same as the second number.
When the comparison result indicates that the first number is smaller than the second number, prompt information is generated to indicate that the recognition result of the event trigger words is wrong. It can be understood that when the first number, i.e., the number of event trigger words, is less than the second number, i.e., the number of event types, there is a misrecognition: either not all event trigger words have been recognized accurately, or some event types have been recognized incorrectly.
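A minimal sketch of the screening rule described above; the function name and the form of the certainty scores are assumptions for illustration:

    def screen_triggers(triggers, num_event_types):
        """triggers: list of (trigger_word, certainty) pairs from the first stage.
        Keep only as many triggers as there are predicted event types; if fewer
        triggers than event types were found, flag the result instead."""
        num_triggers = len(triggers)
        if num_triggers > num_event_types:
            ranked = sorted(triggers, key=lambda t: t[1], reverse=True)
            return ranked[:num_event_types], None
        if num_triggers < num_event_types:
            return [], "recognition result of the event trigger words may be wrong"
        return triggers, None

    # Illustrative usage: two triggers predicted but only one event type.
    kept, warning = screen_triggers([("robbed", 0.92), ("pushed", 0.41)], 1)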
In some embodiments of the present application, when the first number is found to be non-zero and the second number is zero, a prompt message is issued and execution of the following step is refused: respectively determining the event information corresponding to each event type based on the first recognition result and the second recognition result.

In order to better understand the working process of the text recognition method shown in fig. 2, it is further explained below in conjunction with the workflow diagram of the text recognition method shown in fig. 5, where the workflow shown in fig. 5 includes the following steps:
S502: input the facts in the adjudication document;
S504: split into paragraphs and clauses;
S506: judge whether the traversal is finished; if yes, execute step S516, and if not, execute step S508;
S508: event classification and trigger word recognition;
S510: use the number-of-events classification result to process the classification result;
S512: fill a question with each recognized trigger word, and recognize all possible entities under that classification;
S514: obtain one or more complete events, and then execute step S506;
S516: arrange the multiple facts in event order.
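A compact sketch of the workflow of steps S502 to S516, using hypothetical placeholder functions for clause splitting and the two model stages; it only illustrates the control flow and is not the patent's implementation:

    def split_into_clauses(text):
        # Placeholder for the paragraph/clause logic of S504.
        return [c for c in text.replace("。", ".").split(".") if c.strip()]

    def classify_events(clause):
        # Placeholder for stage one (S508-S510): returns (event_type, trigger) pairs.
        return []

    def extract_entities(clause, event_type, trigger):
        # Placeholder for stage two (S512): returns the entities/arguments of one event.
        return {}

    def analyze_document(fact_text):
        """Traverse every clause (S506), run both stages, and collect complete events (S514)."""
        events = []
        for clause in split_into_clauses(fact_text):
            for event_type, trigger in classify_events(clause):
                entities = extract_entities(clause, event_type, trigger)
                events.append({"type": event_type, "trigger": trigger,
                               "entities": entities, "clause": clause})
        # S516: arrange the facts in event order (here simply input order).
        return events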
As shown in fig. 5, the design aims to solve the problem that a single sentence may contain multiple events, each with its own entities and arguments, for example: "after being robbed, Wang chased the robber and knocked him to the ground"; or "Wang committed robbery on the 12th and killed someone the next day". To handle such complex events, the present application designs a pipeline system. Apart from the sentence preprocessing logic, which is consistent with the previous system, the system splits the deep learning model part into two stages: an event classification stage and an entity-and-argument recognition stage, i.e., the first stage and the second stage in fig. 5.
In the first stage, the task of the model is to perform event trigger word recognition and event-number classification on the input sentence. Recognizing trigger words of different types achieves the classification of events: for example, successfully recognizing the trigger word "robbed", which represents the "property" category of events, classifies the sentence into the "property" category. In addition, the event-number classification result can be used to correct the trigger word recognition result: when the predicted number of events is 0, a trigger word is not released to the user or to the next stage even if one is recognized; when the predicted number of events is 1 but two trigger words of different types are actually predicted, the certainty factors of the trigger words are sorted and only the one with the higher certainty factor is retained.
In the second stage, the task of the model is to sequentially construct input together with the original sentence for the n trigger words (and the event classification result) obtained in the first stage, and identify the entity (time, place, etc.) and argument (offender, victim) in the sentence under the specific event type and event trigger word. Such an input is constructed n times, and through this stage, the complete fact of multiple events occurring in a sentence is obtained.
Finally, it is worth mentioning that although the model is divided into two stages, the two stages are carried out together, i.e., the parameters of the BERT layer are shared. This largely resolves the problems of a traditional pipeline system, in which the modules are independent of each other and the data are not fully utilized.
In some embodiments of the present application, there is also provided an application for performing the above method, the application having an interactive interface, as shown in fig. 8. The upper half part of the interactive interface is a text input area, corresponding text content can be selected to be directly copied and pasted, files in common text formats such as doc, docx and PDF can also be copied, a file uploading control can also be set, and when the control is triggered, the files to be identified are constructed and uploaded to an app so as to identify the types and the number of events in the files. The lower half part is an output area which can output the identified event type, keyword, entity, argument and the like. It will be appreciated that the number of events of the output area in actual use need not be three as shown in fig. 8. Meanwhile, the output area is also used for sending prompt information to the user when the identification is found to be wrong.
In addition, in some embodiments of the present application, the text recognition method may also be applied to the following scenarios.
Application scenario 1: processing of punishment rules and complaints of the E-commerce platform;
when a customer complains on the e-commerce platform, the e-commerce platform acquires complaint contents to be identified; then, the e-commerce platform identifies the complaint content by adopting an event identification model in the machine learning model to obtain a first identification result, wherein the first identification result comprises the complaint type (namely the event type) in the text to be identified and event trigger words such as 'false publicity', 'deliberately sending wrong goods', and the like corresponding to the event type; adopting an entity identification model in the machine learning model to identify the complaint content again to obtain a second identification result, wherein the second identification result comprises entity information such as time and place corresponding to each event type and argument information such as victims and victims; event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
For example, a customer purchases commodity A in an online store on an e-commerce platform, and commodity A is advertised as supporting a 14-day refund without reason. After receiving the parcel, the customer finds that the actual efficacy of commodity A is seriously inconsistent with the efficacy advertised in the online store, and when the customer requests a refund, the online store keeps refusing the request. When the customer then complains to the e-commerce platform, the platform can use the machine learning model to perform text recognition on the customer's complaint content. During recognition, the machine learning model can determine the penalty rules involved in the complaint content according to keywords such as "advertised efficacy seriously inconsistent" and "refusing the no-reason refund", and determine that the online store infringed the customer's rights and interests in the purchase of commodity A. The e-commerce platform can then reasonably penalize the online store according to the penalty rules and the user's complaint content.
Application scenario 2: lawyers simulate court opinions;
a lawyer can prepare case information needing to be debated in advance before a court; then, identifying case information by adopting an event identification model in the machine learning model to obtain a first identification result, wherein the first identification result comprises a crime type (namely an event type) in a text to be identified and event trigger words such as 'injury', 'theft' and the like corresponding to the event type; adopting an entity recognition model in the machine learning model to recognize the case information again to obtain a second recognition result, wherein the second recognition result comprises entity information such as time and place corresponding to each event type and argument information such as victims and victims; event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
For example, when a lawyer needs to represent the victim in case A, the lawyer can analyze case A using the machine learning model described herein to determine which criminal acts are specifically involved in case A, what the relevant circumstances are, and which rights and interests of the victim were infringed by the criminal suspect. Based on the information obtained through the machine learning model, the lawyer can simulate the opinions that may arise in court before the trial and thus argue in a targeted manner.
Application scenario 3: the procuratorate forming a prosecution plan;
Before a public prosecution is initiated, the procuratorate can prepare the case content; then an event recognition model in the machine learning model is used to recognize the case content to obtain a first recognition result, where the first recognition result includes the crime types (namely the event types) in the text to be recognized and the event trigger words, such as "injury" and "theft", corresponding to the event types; an entity recognition model in the machine learning model is used to recognize the case content again to obtain a second recognition result, where the second recognition result includes entity information such as the time and place corresponding to each event type and argument information such as the offender and the victim; and the event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
When the procuratorate needs to initiate a public prosecution for a certain case, it can first use the machine learning model to analyze the case content, determining which criminal acts occurred, which laws the criminal acts violate, who the criminal suspects and victims involved are, the relevant circumstances of the crimes, and the like, so as to form a prosecution plan before the public prosecution is initiated and avoid missing criminal acts or making mistakes.
Application scenario 4: corporate law;
Before a company wants to launch a certain new business, the company's legal affairs department can prepare the business content of the new business; then an event recognition model in the machine learning model is used to recognize the business content to obtain a first recognition result, where the first recognition result includes the business types (namely the event types) in the text to be recognized and the event trigger words, such as "merger and acquisition" and "consolidation", corresponding to the event types; an entity recognition model in the machine learning model is used to recognize the business content again to obtain a second recognition result, where the second recognition result includes entity information such as the time and place corresponding to each event type and argument information such as the acquiring party and the acquired party; and the event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
For example, when a company wants to develop a new service, the legal affairs of the company may analyze the content of the new service using the machine learning model, and determine legal risks existing in the new service, that is, which laws and regulations the new service may violate, thereby avoiding in advance.
Application scenario 5: mediation and arbitration by related departments such as a residents' committee or a civil affairs bureau;
Before the residents' committee mediates a conflict between neighbours, the dispute content to be mediated can be prepared; then an event recognition model in the machine learning model is used to recognize the dispute content to obtain a first recognition result, where the first recognition result includes the dispute types (namely the event types) in the text to be recognized and the event trigger words, such as "disturbing residents" and "occupying parking spaces", corresponding to the event types; an entity recognition model in the machine learning model is used to recognize the dispute content again to obtain a second recognition result, where the second recognition result includes entity information such as the time and place corresponding to each event type and argument information such as the infringing party and the infringed party; and the event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
For example, when a residence committee mediates a dispute between neighborhoods, the residence committee can analyze the dispute content using the machine learning model to determine which events the parties specifically dispute due to, and who the interested party and the interested party in the events are respectively, thereby better mediating the relationships.
Application scenario 6: checking whether the dispute focus is changed;
During the dispute between the defendant and the plaintiff in court, court staff can record the dispute content in real time; then an event recognition model in the machine learning model is used to recognize the dispute content to obtain a first recognition result, where the first recognition result includes the disputed issues (namely the event types) in the text to be recognized and the event trigger words, such as "injury" and "theft", corresponding to the disputed issues; an entity recognition model in the machine learning model is used to recognize the dispute content again to obtain a second recognition result, where the second recognition result includes entity information such as the time and place corresponding to each event type and argument information such as the offender and the victim; and the event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
For example, in a certain court trial, a judge can judge whether lawyer of a certain party blurs the dispute focus in the dispute process according to the analysis result of the machine learning model, so that the court trial efficiency is improved.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present application.
Example 2
According to an embodiment of the present application, there is also provided a text recognition method, as shown in fig. 3, the method including the steps of:
s302, acquiring a text to be recognized;
s304, carrying out first recognition on the text to be recognized to obtain a first recognition result;
The first recognition result includes the event types in the text to be recognized and the event trigger word corresponding to each event type. As shown in fig. 6a, the event trigger word corresponding to the event type "property" is "snatched away", and the event trigger word corresponding to the event type "person" is "stabbed to death".
S306, carrying out secondary recognition on the text to be recognized to obtain a second recognition result;
The second recognition result includes the entities and argument information corresponding to each event type. Specifically, in some embodiments of the present application, the entities refer to information such as the time, place, and consequence of the case, and the arguments refer to the offender and the victim in the case. As shown in fig. 6b, the consequence of the case is that the offender killed the victim, the offender of the case is "Wang", and the victim of the case is "Song".
And S308, respectively determining event information corresponding to each event type based on the first recognition result and the second recognition result.
In some embodiments of the application, in order to improve text recognition quality, before the event information corresponding to each event type is determined based on the first recognition result and the second recognition result, a first number of event trigger words in the text to be recognized and a second number of event types in the text to be recognized are first counted; the first number is then compared with the second number, and all the event trigger words in the text to be recognized are screened according to the comparison result to obtain the target event trigger words.
Specifically, screening all the event trigger words in the text to be recognized according to the comparison result to obtain the target event trigger words includes: when the comparison result indicates that the first number is larger than the second number, which means that some non-trigger words have been mistakenly recognized as trigger words, determining an evaluation index of each event trigger word in the text to be recognized, where the evaluation index includes the certainty factor of the event trigger word; the certainty factor refers to how strongly an event trigger word is correlated with the event it corresponds to, and can be obtained through big-data statistics on the probability that, in a large number of legal texts, the event corresponding to a given trigger word matches the actual event in the text; sorting all the event trigger words in the text to be recognized by the size of the evaluation index; and selecting the target event trigger words from the first number of event trigger words according to the sorting result. It should be noted that the number of target event trigger words selected from the first number of event trigger words is the same as the second number.
When the comparison result indicates that the first number is smaller than the second number, prompt information is generated to indicate that the recognition result of the event trigger words is wrong. It can be understood that when the first number, i.e., the number of event trigger words, is less than the second number, i.e., the number of event types, there is a misrecognition: either not all event trigger words have been recognized accurately, or some event types have been recognized incorrectly.
In some embodiments of the present application, when the first number is found to be non-zero and the second number is zero, a prompt message is issued and execution of the following step is refused: respectively determining the event information corresponding to each event type based on the first recognition result and the second recognition result.
In order to better understand the working process of the text recognition method shown in fig. 3, the following is further explained in conjunction with the working flow diagram of the text recognition method shown in fig. 5, wherein the working flow diagram shown in fig. 5 comprises the following steps:
S502: input the facts in the adjudication document;
S504: split into paragraphs and clauses;
S506: judge whether the traversal is finished; if yes, execute step S516, and if not, execute step S508;
S508: event classification and trigger word recognition;
S510: use the number-of-events classification result to process the classification result;
S512: fill a question with each recognized trigger word, and recognize all possible entities under that classification;
S514: obtain one or more complete events, and then execute step S506;
S516: arrange the facts in event order.
As shown in fig. 5, the design aims to solve the problem that a single sentence may contain multiple events, each with its own entities and arguments, for example: "after being robbed, Wang chased the robber and knocked him to the ground"; or "Wang committed robbery on the 12th and killed someone the next day". To handle such complex events, the present application designs a pipeline system. Apart from the sentence preprocessing logic, which is consistent with the previous system, the system splits the deep learning model part into two stages: an event classification stage and an entity-and-argument recognition stage, i.e., the first stage and the second stage in fig. 5.
In the first stage, the task of the model is to perform event trigger word recognition and event-number classification on the input sentence. Recognizing trigger words of different types achieves the classification of events: for example, successfully recognizing the trigger word "robbed", which represents the "property" category of events, classifies the sentence into the "property" category. In addition, the event-number classification result can be used to correct the trigger word recognition result: when the predicted number of events is 0, a trigger word is not revealed to the user or to the next stage even if one is recognized; when the predicted number of events is 1 but two trigger words of different types are actually predicted, the certainty factors of the trigger words are sorted and only the one with the higher certainty factor is retained.
In the second stage, the task of the model is to sequentially construct input together with the original sentence for the n trigger words (and the event classification result) obtained in the first stage, and identify the entity (time, place, etc.) and argument (offender, victim) in the sentence under the specific event type and event trigger word. Such an input is constructed n times, and through this stage, the complete fact of multiple events occurring in a sentence is obtained.
In some embodiments of the present application, an interactive interface is also provided, as shown in fig. 8. The upper half part of the interactive interface is a text input area, wherein corresponding text content can be selected to be directly copied and pasted, and files with common text formats such as doc, docx, PDF and the like can also be copied. The lower half part is an output area which can output the identified event type, keyword, entity, argument and the like. It will be appreciated that the number of events of the output area in actual use need not be three as shown in fig. 8. Meanwhile, the output area is also used for sending prompt information to the user when the identification is found to be wrong.
Example 3
According to an embodiment of the present application, there is also provided an apparatus for implementing the above text recognition method; as shown in fig. 4, the apparatus includes:
the obtaining module 40 is configured to obtain a text to be recognized; the first identification module 42 is configured to identify the text to be identified by using an event identification model in a machine learning model to obtain a first identification result, where the first identification result includes an event type in the text to be identified and an event trigger word corresponding to the event type; a second identification module 44, configured to identify the text to be identified by using an entity identification model in the machine learning model to obtain a second identification result, where the second identification result includes entities and argument information corresponding to each event type; a determining module 46, configured to determine event information corresponding to the respective event types based on the first recognition result and the second recognition result, respectively.
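A minimal sketch of how these four modules might be wired together is given below, assuming the event recognition model and the entity recognition model are supplied as callables; module numbers follow fig. 4, and the return conventions are assumptions for illustration.

```python
class TextRecognitionApparatus:
    """Sketch of the apparatus in fig. 4; not a definitive implementation."""

    def __init__(self, event_model, entity_model):
        self.event_model = event_model      # used by the first identification module 42
        self.entity_model = entity_model    # used by the second identification module 44

    def obtain(self, source) -> str:
        # obtaining module 40: read the text to be recognized
        return source.read() if hasattr(source, "read") else str(source)

    def first_identification(self, text: str):
        # first identification module 42: event types and their trigger words
        return self.event_model(text)

    def second_identification(self, text: str, first_result):
        # second identification module 44: entities and arguments per event type
        return self.entity_model(text, first_result)

    def determine(self, first_result, second_result) -> list:
        # determining module 46: merge both results into per-event information
        return [{"type": ev["type"], "trigger": ev["trigger"],
                 "arguments": second_result.get(ev["type"], {})}
                for ev in first_result]
```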
In some embodiments of the application, before determining the event information corresponding to each event type based on the first recognition result and the second recognition result, the determining module 46 needs to count a first number of event trigger words in the text to be recognized and a second number of event types in the text to be recognized, compare the first number with the second number, and screen all event trigger words in the text to be recognized according to the comparison result to obtain the target event trigger word.
Specifically, screening all event trigger words in the text to be recognized according to the comparison result to obtain the target event trigger word includes: when the comparison result indicates that the first number is larger than the second number, which indicates that some non-trigger words have been mistakenly recognized as target trigger words, determining evaluation indexes of all event trigger words in the text to be recognized, where the evaluation indexes include the certainty factor of each event trigger word; the certainty factor refers to the degree of correlation between an event trigger word and the event it corresponds to, and can be obtained through big data by counting, over a large number of legal texts, the probability that the event corresponding to a given trigger word matches the actual event in the text; ranking all event trigger words in the text to be recognized according to the magnitude of the evaluation index; and selecting the target event trigger words from the first number of event trigger words according to the ranking result. It is to be noted that the number of target event trigger words selected from the first number of event trigger words is the same as the second number.
When the comparison result indicates that the first number is smaller than the second number, prompt information is generated for prompting that the recognition result of the event trigger words is wrong. It can be understood that when the first number, i.e., the number of event trigger words, is smaller than the second number, i.e., the number of event types, a misrecognition has occurred: either not all event trigger words have been recognized accurately, or some event types have been recognized incorrectly. When the first number is non-zero and the second number is zero, prompt information is issued and the following step is not executed: determining the event information corresponding to each event type based on the first recognition result and the second recognition result, respectively.
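The screening and prompting logic of the determining module can be sketched as follows; the return convention (selected triggers plus an optional prompt message) and the certainty field are assumptions for illustration.

```python
def screen_triggers(triggers: list, event_type_count: int):
    """Screen trigger words by comparing their number with the number of event types."""
    first_number = len(triggers)          # number of recognized event trigger words
    second_number = event_type_count      # number of recognized event types
    if first_number > 0 and second_number == 0:
        # refuse to determine event information and prompt the user
        return [], "prompt: triggers recognized but no event type predicted"
    if first_number < second_number:
        # misrecognition: not all triggers found, or some event types are wrong
        return triggers, "prompt: trigger word recognition result may be wrong"
    if first_number > second_number:
        # rank by certainty and keep as many triggers as there are event types
        ranked = sorted(triggers, key=lambda t: t["certainty"], reverse=True)
        return ranked[:second_number], None
    return triggers, None
```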
It should be noted here that the above modules as a part of the apparatus may be operated in the computer terminal 10 provided in the first embodiment.
Example 4
The embodiment of the application can provide a computer terminal, and the computer terminal can be any one computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer terminal may execute program codes of the following steps in the text recognition method: acquiring a text to be identified; identifying a text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type; identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities corresponding to the event types and argument information; event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
By adopting the embodiments of the present application, a scheme for text recognition is provided: obtaining a text to be identified; identifying the text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, where the first identification result includes the event types in the text to be identified and the event trigger words corresponding to the event types; identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, where the second identification result includes the entities and argument information corresponding to each event type; and determining the event information corresponding to each event type based on the first identification result and the second identification result, respectively. This achieves the purpose of determining the event information corresponding to each event type, solves the technical problem that a structured text cannot be formed because the multiple identities of persons in a complex case cannot be processed, and enables better analysis of judicial documents, as shown in fig. 7.
It should be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration, and the computer terminal may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 1 does not limit the structure of the above electronic device. For example, the computer terminal 10 may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in fig. 1, or have a different configuration from that shown in fig. 1.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Embodiments of the present application also provide a storage medium. Optionally, in this embodiment, the storage medium may be configured to store a program code executed by the text recognition method provided in the first embodiment.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring a text to be identified; identifying a text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type; identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities corresponding to the event types and argument information; event information corresponding to each event type is determined based on the first recognition result and the second recognition result, respectively.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technical content can be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The foregoing is only a preferred embodiment of the present application and it should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims (13)

1. A text recognition method, comprising:
acquiring a text to be recognized;
identifying the text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type;
identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities corresponding to various event types and argument information;
and respectively determining event information corresponding to each event type based on the first identification result and the second identification result.
2. The method according to claim 1, wherein before determining the event information corresponding to the respective event types based on the first recognition result and the second recognition result, respectively, the method further comprises:
counting a first number of event trigger words in the text to be recognized and a second number of event types in the text to be recognized;
comparing the first number with the second number;
and determining all event trigger words in the text to be recognized according to the comparison result, and screening to obtain target event trigger words.
3. The method according to claim 2, wherein determining all event trigger words in the text to be recognized according to the comparison result to perform screening to obtain a target event trigger word comprises:
when the comparison result indicates that the first number is larger than the second number, determining evaluation indexes of all event trigger words in the text to be recognized;
sequencing all event trigger words in the text to be recognized according to the size of an evaluation index; and selecting the target event trigger words from the first number of event trigger words according to the sorting result.
4. The method of claim 3, wherein the number of target event-triggered words selected from the first number of event-triggered words is the same as the second number.
5. The method of claim 3, further comprising: and when the comparison result indicates that the first number is smaller than the second number, generating prompt information, wherein the prompt information is used for prompting that the recognition result of the event trigger word is wrong.
6. The method of claim 2, further comprising: when the first number is a non-zero value and the second number is a zero value, refusing to perform the following steps: and respectively determining event information corresponding to each event type based on the first identification result and the second identification result.
7. The method of claim 1, wherein the event recognition model is trained by:
respectively inputting multiple groups of first sample data into the event recognition model for training, wherein each group of data in the multiple groups of first sample data comprises: the system comprises a first sample text, a first question template used for determining event trigger words in the first sample text, answer information corresponding to the first question template and an event type.
8. The method of claim 1, wherein the entity recognition model is trained by:
respectively inputting multiple groups of second sample data into the event recognition model for training, wherein each group of data in the multiple groups of second sample data comprises: the answer-oriented question generating method comprises a second sample text, a second question template used for determining entities and arguments in the second sample text, and answer information corresponding to the second question template.
9. The method according to any one of claims 1 to 8, wherein the event recognition model and the entity recognition model are the same pre-trained language model, and the same model parameters are used for the event recognition model and the entity recognition model.
10. A text recognition method, comprising:
acquiring a text to be identified;
performing first recognition on the text to be recognized to obtain a first recognition result, wherein the first recognition result comprises an event type in the text to be recognized and an event trigger word corresponding to the event type;
performing second recognition on the text to be recognized to obtain a second recognition result, wherein the second recognition result comprises entities and argument information corresponding to each event type;
and respectively determining event information corresponding to each event type based on the first identification result and the second identification result.
11. A text recognition apparatus, comprising:
the acquisition module is used for acquiring a text to be recognized;
the first identification module is used for identifying the text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type;
the second identification module is used for identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities and argument information corresponding to each event type;
and the determining module is used for respectively determining the event information corresponding to each event type based on the first recognition result and the second recognition result.
12. A non-volatile storage medium, comprising a stored program, wherein a device in which the non-volatile storage medium is located is controlled to perform the text recognition method of any one of claims 1 to 9 when the program is run.
13. A computer device, comprising:
a processor; and
a memory coupled to the processor for providing instructions to the processor for processing the following processing steps:
acquiring a text to be identified;
identifying the text to be identified by adopting an event identification model in a machine learning model to obtain a first identification result, wherein the first identification result comprises an event type in the text to be identified and an event trigger word corresponding to the event type;
identifying the text to be identified by adopting an entity identification model in the machine learning model to obtain a second identification result, wherein the second identification result comprises entities corresponding to various event types and argument information;
and respectively determining event information corresponding to each event type based on the first identification result and the second identification result.
CN202110276318.2A 2021-03-15 2021-03-15 Text recognition method and device, nonvolatile storage medium and computer equipment Pending CN115080709A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110276318.2A CN115080709A (en) 2021-03-15 2021-03-15 Text recognition method and device, nonvolatile storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110276318.2A CN115080709A (en) 2021-03-15 2021-03-15 Text recognition method and device, nonvolatile storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN115080709A true CN115080709A (en) 2022-09-20

Family

ID=83241563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110276318.2A Pending CN115080709A (en) 2021-03-15 2021-03-15 Text recognition method and device, nonvolatile storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN115080709A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115905456A (en) * 2023-01-06 2023-04-04 浪潮电子信息产业股份有限公司 Data identification method, system, equipment and computer readable storage medium
CN116757159A (en) * 2023-08-15 2023-09-15 昆明理工大学 End-to-end multitasking joint chapter level event extraction method and system
CN116757159B (en) * 2023-08-15 2023-10-13 昆明理工大学 End-to-end multitasking joint chapter level event extraction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240308

Address after: #03-06, Lazada One, 51 Bras Basah Road, Singapore

Applicant after: Alibaba Innovation Co.

Country or region after: Singapore

Address before: Room 01, 45th Floor, AXA Tower, 8 Shenton Way, Singapore

Applicant before: Alibaba Singapore Holdings Ltd.

Country or region before: Singapore