CN111597224B - Method and device for generating structured information, electronic equipment and storage medium - Google Patents

Method and device for generating structured information, electronic equipment and storage medium

Info

Publication number
CN111597224B
Authority
CN
China
Prior art keywords
structured information
behavior
model
sequence
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010305158.5A
Other languages
Chinese (zh)
Other versions
CN111597224A (en)
Inventor
李旭
刘桂良
孙明明
李平
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010305158.5A
Publication of CN111597224A
Application granted
Publication of CN111597224B
Legal status: Active


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/258Data format conversion from or to a database
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/288Entity relationship models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a method and a device for generating structured information, an electronic device and a storage medium, and relates to the field of information processing within natural language processing. The specific implementation scheme is as follows: a source text sequence is acquired and input into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model. Therefore, corresponding structured information is generated directly from the input source text sequence by the behavior-evaluation model, achieving end-to-end information extraction, solving the technical problem of the high labeling cost of training models for open-domain information extraction, and effectively improving the efficiency and accuracy of structured information extraction.

Description

Method and device for generating structured information, electronic equipment and storage medium
Technical Field
The present application relates to the field of information processing in the field of natural language processing, and in particular, to a method and apparatus for generating structured information, an electronic device, and a storage medium.
Background
In various industries, text recorded in natural language is common, and such text is often called unstructured text, for example financial statements, news, and medical records. At present, application scenarios such as public opinion analysis, propagation analysis and data platform services need structured information extraction, that is, extracting required structured fields from unstructured text, such as extracting company names from financial reports, extracting the locations of attack events from news, or extracting patient information from medical records. Information extraction includes vertical-domain information extraction and open-domain information extraction.
The existing structured information extraction methods mainly perform extraction for the vertical domain, and a training sample set needs to be labeled when modeling a vertical domain. However, since open-domain data is far more abundant and varied, the cost of labeling open-domain samples is high and the model cannot achieve a good effect, so the model suffers from low accuracy when extracting structured information from open-domain text.
Disclosure of Invention
The application provides a method and a device for generating structured information, an electronic device and a storage medium.
An embodiment of a first aspect of the present application provides a method for generating structured information, including:
acquiring a source text sequence;
and inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model.
An embodiment of a second aspect of the present application provides a device for generating structured information, including:
the acquisition module is used for acquiring a source text sequence;
the generation module is used for inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model.
An embodiment of a third aspect of the present application provides an electronic device, including:
At least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of generating structured information of an embodiment of the first aspect.
An embodiment of a fourth aspect of the present application provides a non-transitory computer readable storage medium storing computer instructions for causing a computer to execute the method for generating structured information of the embodiment of the first aspect.
One embodiment of the above application has the following advantages or benefits: a source text sequence is obtained and input into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model. Therefore, corresponding structured information is generated directly from the input source text sequence by the behavior-evaluation model, achieving end-to-end information extraction, solving the technical problem of the high labeling cost of training models for open-domain information extraction, and effectively improving the efficiency and accuracy of structured information extraction.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application. Wherein:
fig. 1 is a flow chart of a method for generating structured information according to an embodiment of the present application;
fig. 2 is a flow chart of a method for generating structured information according to a second embodiment of the present application;
FIG. 3 is an exemplary diagram of a method for generating structured information according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a device for generating structured information according to a third embodiment of the present application;
fig. 5 is a block diagram of an electronic device for implementing a method of generating structured information according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The information extraction task is to extract structured information, such as entities and relationships between entities, from unstructured natural language sentences. Information extraction comprises vertical-domain and open-domain information extraction tasks. The vertical-domain task performs extraction within a predefined schema, often adopting supervised or weakly supervised learning methods to mine relations among entities in a predefined entity set from text. Open-domain information extraction has no such predefined schema; it usually focuses on the knowledge contained in natural language sentences and the way that knowledge is expressed, extracting entities and the relations between them from open-domain natural sentences, which can be called the facts contained in the natural language. These facts are very valuable in many tasks, such as text summarization, reading comprehension, word similarity, and knowledge-based question answering.
However, information extraction models in the related art adopt pattern-matching methods, such as manually defined patterns and heuristic learning, which are not suitable for the open domain, hinder model extensibility, and require a lot of manual intervention, incurring high labor cost. In addition, existing information extraction models are mostly used for vertical-domain extraction and suffer from low accuracy when extracting structured information from open-domain text.
Aiming at the problems in the prior art, the application provides a method for generating structured information: acquiring a source text sequence, and inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model. Therefore, corresponding structured information is generated directly from the input source text sequence by the behavior-evaluation model, achieving end-to-end information extraction, solving the technical problem of the high labeling cost of training models for open-domain information extraction, and effectively improving the efficiency and accuracy of structured information extraction.
The method, the device, the electronic equipment and the storage medium for generating the structured information according to the embodiment of the application are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for generating structured information according to an embodiment of the present application.
The embodiment of the application is described for the case in which the method for generating structured information is configured in a device for generating structured information, and the device can be applied to any electronic equipment so that the electronic equipment can execute the function of generating structured information.
The electronic device may be a personal computer (Personal Computer, abbreviated as PC), a cloud device, a mobile device, etc., and the mobile device may be a hardware device with various operating systems, such as a mobile phone, a tablet computer, a personal digital assistant, a wearable device, a vehicle-mounted device, etc.
As shown in fig. 1, the method for generating the structured information may include the following steps:
step S101, a source text sequence is acquired.
Where a source text sequence refers to unstructured text recorded in natural language. Such as personal resume, patient history, news, etc.
The source text sequence may be a text sequence input by a user, for example, an unstructured source text sequence input manually by the user, an unstructured source text sequence input by voice, or the like, and the manner in which the source text sequence is input by the user is not limited in the embodiment of the present application.
In another possible scenario, the source text sequence may also be text downloaded from the server side. For example, a patient's medical record is downloaded from a medical record management system in a hospital.
It should be noted that the source text sequence contains a large amount of information, but the amount of data of the source text sequence as unstructured text is large, so that some structured information needs to be extracted from unstructured text, for example, company names are extracted from financial reports, cancer stage conditions of patients are extracted from medical records, skills of users are extracted from personal resume, and the like.
Step S102, inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence.
The behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, wherein the behavior sub-model is used for generating structural information corresponding to a source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model.
It should be noted that the behavior-evaluation model does not need to be trained by a large number of training samples labeled with structured information, and the structured information output by the model is also represented in a sequence form.
In the related art, deep learning models are also adopted when extracting structured information from unstructured text sequences, but existing deep learning models are limited by the labeled training set: labeling open-domain data is costly and not all samples can be labeled, so existing deep learning models suffer from low accuracy when extracting structured information.
However, in the application, aiming at the data in the open field, after the acquired source text sequence is input into the behavior-evaluation model, the model outputs the structured information corresponding to the source text sequence, so that the accuracy of structured information extraction is improved.
As one possible implementation manner of the application, after the obtained source text sequence is input into the behavior-evaluation model, the behavior submodel performs word-embedding encoding on the input source text sequence; the word vectors obtained after word-embedding encoding are input into a bidirectional gated recurrent unit (Gated Recurrent Unit, GRU for short) encoder for encoding, and then the result output by the bidirectional GRU encoder is input into a bidirectional GRU decoder for decoding to obtain the generated structured information.
Word embedding is a method of converting words in a text sequence into numerical vectors; text needs to be represented in numerical form in order to be analyzed with standard machine learning algorithms. The word-embedding process embeds a high-dimensional space, with one dimension per word, into a continuous vector space of much lower dimension; each word or phrase is mapped to a vector over the real numbers, and the word vector is the result of word embedding.
It should be noted that bidirectional GRUs are adopted for encoding and decoding in the present application because a GRU encodes and decodes the input source text sequence quickly, and a bidirectional GRU encodes and decodes better than a unidirectional GRU.
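The embed-encode pipeline described above can be sketched minimally as follows. This is an illustrative numpy sketch, not the patent's implementation: the dimensions, random initialization, and all helper names (`gru_step`, `make_params`, `bidirectional_gru`) are assumptions, and a real system would use a trained GRU from a deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, P):
    # One GRU cell update: z = update gate, r = reset gate, n = candidate state.
    z = sigmoid(x @ P["Wz"] + h @ P["Uz"])
    r = sigmoid(x @ P["Wr"] + h @ P["Ur"])
    n = np.tanh(x @ P["Wn"] + (r * h) @ P["Un"])
    return (1.0 - z) * h + z * n

def make_params(d_in, d_h):
    # Random (untrained) weights for the three gates, illustrative only.
    return {k: rng.normal(scale=0.1, size=(d_in if k[0] == "W" else d_h, d_h))
            for k in ["Wz", "Uz", "Wr", "Ur", "Wn", "Un"]}

def bidirectional_gru(embedded, fwd, bwd, d_h):
    # Run the sequence forward and backward and concatenate the hidden
    # states, as in the bidirectional GRU encoder described in the text.
    M = embedded.shape[0]
    hf, hb = np.zeros(d_h), np.zeros(d_h)
    outs_f, outs_b = [], []
    for t in range(M):
        hf = gru_step(embedded[t], hf, fwd)
        outs_f.append(hf)
    for t in reversed(range(M)):
        hb = gru_step(embedded[t], hb, bwd)
        outs_b.append(hb)
    outs_b.reverse()
    return np.concatenate([np.stack(outs_f), np.stack(outs_b)], axis=1)

# Toy pipeline: word ids -> embedding lookup -> bidirectional GRU encoding.
vocab_size, d_emb, d_h = 50, 8, 6
embedding = rng.normal(size=(vocab_size, d_emb))   # word-embedding table
source_ids = np.array([3, 17, 42, 8])              # a 4-word source sequence
encoded = bidirectional_gru(embedding[source_ids],
                            make_params(d_emb, d_h),
                            make_params(d_emb, d_h), d_h)
print(encoded.shape)  # (4, 12): one 2*d_h vector per source word
```

Each source word ends up represented by the concatenation of a forward and a backward hidden state, which is what gives the bidirectional encoder its edge over a unidirectional one.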
In the embodiment of the application, after the obtained unstructured source text sequence is input into a behavior-evaluation model, the model outputs structured information corresponding to the source text sequence.
For example, assume that the source text sequence is X_{1…M} = {x_1, x_2, …, x_M}, where M is the number of words in the source text sequence. After the source text sequence is input into the behavior-evaluation model, the corresponding structured information Ŷ_{1…T} = {ŷ_1, ŷ_2, …, ŷ_T} is output, where the output structured information includes T words.
For another example, assume an unstructured source text sequence "Yellowstone National Park is the first national park, located in Wyoming, and known for its rich wildlife species and geothermal resources." The source text sequence is input into the behavior-evaluation model, and the structured information that can be output is "(Yellowstone National Park; is; the first national park) (Yellowstone National Park; location; Wyoming) (Yellowstone National Park; known for; wildlife and geothermal resources)".
As a possible scenario, the behavior-evaluation model may be modeled as a Markov decision process. The open-domain structured information extraction task is essentially a sequence generation task, in which the newly generated word is determined by the input environmental state (here, the source text sequence) and the previously generated word sequence; this naturally constitutes a Markov decision process.
In the application, the state at step t of the Markov decision process (i.e., when the t-th word is generated) can be defined as s_t = (X, Ŷ_{1…t-1}), where X represents the input source text sequence and Ŷ_{1…t-1} represents the partial sequence generated in steps 1…t-1. The next word to be generated is defined as the action in the Markov process: a_t = ŷ_t.
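Under these state and action definitions, generation can be unrolled as a sequence of (state, action) pairs. A toy sketch follows; the `mdp_rollout` and `copy_policy` names and the `<eos>` stop convention are illustrative, not from the patent:

```python
from typing import List, Tuple

def mdp_rollout(source: List[str], policy) -> List[Tuple[tuple, str]]:
    """Unroll generation as a Markov decision process, recording (state, action)."""
    generated: List[str] = []
    trajectory = []
    while True:
        state = (tuple(source), tuple(generated))  # s_t = (X, Y_{1..t-1})
        action = policy(state)                     # a_t = next word
        trajectory.append((state, action))
        if action == "<eos>":
            break
        generated.append(action)
    return trajectory

# A toy deterministic "policy" that copies the source and then stops.
def copy_policy(state):
    source, generated = state
    return source[len(generated)] if len(generated) < len(source) else "<eos>"

traj = mdp_rollout(["Yellowstone", "is", "a", "park"], copy_policy)
print(len(traj))      # 5 steps: 4 generated words plus <eos>
print(traj[2][0][1])  # the state at step 3 holds the first two generated words
```

The point of the rollout view is that every generation step exposes a well-defined state for the evaluation submodel to score, which is what makes per-step training signals possible.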
The method for generating structured information comprises: acquiring a source text sequence; and inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model. Therefore, corresponding structured information is generated directly from the input source text sequence by the behavior-evaluation model, achieving end-to-end information extraction, solving the technical problem of the high labeling cost of training models for open-domain information extraction, and effectively improving the efficiency and accuracy of structured information extraction.
Based on the above embodiment, the present application proposes another method for generating structured information, specifically, see embodiment two.
Fig. 2 is a flow chart of a method for generating structured information according to a second embodiment of the present application.
As shown in fig. 2, the method for generating the structured information may include the following steps:
step S201, a plurality of training samples are acquired.
Each training sample comprises a sample text sequence and sample structural information corresponding to the sample text sequence.
It is understood that the training samples are also unstructured text sequences, and when training the behavior-evaluation model by using the training samples, each training sample includes a sample text sequence and sample structural information corresponding to the sample text sequence.
The training sample may be a text sequence downloaded from the server, or may be a text sequence manually input by the user, which is not limited herein.
Step S202, inputting a plurality of training samples into the behavior submodel to generate predictive structural information.
In the embodiment of the application, after a plurality of training samples are obtained, each training sample is input into a behavior submodel to generate prediction structural information corresponding to each sample sequence.
Specifically, after each sample text sequence is obtained by the behavior submodel, word embedding encoding is carried out on each input sample text sequence, corresponding word vectors obtained after the word embedding encoding are input into a bidirectional GRU encoder for encoding, and then the result output by the bidirectional GRU encoder is input into a bidirectional GRU decoder for decoding, so that prediction structural information corresponding to each sample text sequence is obtained.
It should be noted that a coverage mechanism may be added to the decoder to mitigate the problems of repeated generation or missing generation in the sequence generation process.
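One common way to realize such a coverage mechanism is to penalize attention that revisits already-attended source positions; the cumulative-attention formulation below is a widely used variant, and the patent does not spell out its exact form:

```python
import numpy as np

def coverage_penalty(attentions):
    """Coverage loss over decoding steps: sum_t sum_i min(a_t[i], c_t[i]),
    where c_t is the cumulative attention over steps before t. Attending to
    the same source position repeatedly is penalized; spreading attention
    across the source is not."""
    coverage = np.zeros_like(attentions[0])
    penalty = 0.0
    for a in attentions:
        penalty += np.minimum(a, coverage).sum()
        coverage += a
    return penalty

# Two decoding runs over a 2-word source: one keeps re-reading position 0
# (prone to repeated generation), one covers both positions.
repeat = [np.array([1.0, 0.0]), np.array([1.0, 0.0])]
spread = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(coverage_penalty(repeat))  # 1.0
print(coverage_penalty(spread))  # 0.0
```

Adding this penalty to the decoder's training loss nudges it to visit every source position once, which is exactly the repeated/missing-generation trade-off the text describes.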
Step S203, the prediction structural information and the sample structural information are input into the evaluation submodel to generate an evaluation value.
Wherein the evaluation submodel is an encoding-decoding model based on an attention-based mechanism, and comprises an encoder and a decoder.
In the embodiment of the application, after each training sample is input into a behavior sub-model to generate prediction structural information, the corresponding sample structural information is input into an encoder, and the corresponding prediction structural information is input into a decoder to generate a prediction quality score.
In the embodiment of the application, after each training sample is input into the behavior sub-model to generate prediction structural information, a similarity score for each sequence position in the prediction structural information is generated according to the sample structural information and the prediction structural information. Further, a reward score of the predicted structured information is generated based on the similarity score for each sequence position, wherein the reward score for a sequence position is the difference between the similarity score of the current sequence position and the similarity score of the previous sequence position. Therefore, the evaluation sub-model can be trained according to the generated reward scores, so that the trained behavior-evaluation model outputs structured information more accurately, improving the accuracy of the model's structured information extraction.
As an example, assume that n_p pieces of predicted structured information f̂_1, …, f̂_{n_p} are generated and that there are n_G pieces of sample structured information F_1, …, F_{n_G}. The word generated in the current step can be evaluated based on a similarity score between the predicted structured information and the sample structured information. The similarity score is calculated to score the predicted structured information generated by the model: the closer the generated predicted structured information is to the sample structured information, the higher the score.

For example, the similarity score between the predicted structured information and the sample structured information may be calculated by the following formula:

S(Ŷ_{1…t}, Y) = g(Ŷ_{1…t}, Y)

where S(Ŷ_{1…t}, Y) is the similarity between the sequence output by the model and the sequence formed by the sample structured information; Ŷ_{1…t} is the sequence output by the model; Y is the sequence formed by the pieces of sample structured information; and g is the format matching function, which calculates the similarity between sequences:

g(Ŷ, Y) = (1/|Y|) · Σ_{i=1…|Y|} max_{f ∈ Ŷ} simstr(f, F_i)

where f is a piece of predicted structured information; F_i is a piece of sample structured information; simstr denotes string similarity; |Y| is the number of pieces of sample structured information in Y; and i is a positive integer. For example, in the Yellowstone National Park example above, Y includes 3 facts.
After calculating the similarity score at each generated position using the above formula, the reward score for each sequence position is calculated as the difference between the similarity score of the current position and the similarity score of the previous position:

r_t = S(Ŷ_{1…t}, Y) − S(Ŷ_{1…t-1}, Y)

where r_t is the reward score for the sequence generated at step t; S(Ŷ_{1…t}, Y) is the similarity score between the predicted structured information and the sample structured information at step t; and S(Ŷ_{1…t-1}, Y) is the similarity score between the predicted structured information and the sample structured information at step t−1.
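This reward construction, score each growing prediction against the sample facts and reward each step by the improvement, can be sketched as follows. Here `difflib.SequenceMatcher` stands in for the patent's string similarity `simstr`, and the best-match averaging inside `similarity_score` is one plausible reading of the format matching function, not the patent's confirmed definition:

```python
from difflib import SequenceMatcher

def simstr(a: str, b: str) -> float:
    """String similarity in [0, 1] (illustrative stand-in for simstr)."""
    return SequenceMatcher(None, a, b).ratio()

def similarity_score(pred_facts, gold_facts):
    """Format-matching score g: average, over the sample facts, of the best
    string match among the predicted facts (assumed aggregation)."""
    if not gold_facts:
        return 0.0
    return sum(max((simstr(f, g) for f in pred_facts), default=0.0)
               for g in gold_facts) / len(gold_facts)

def step_rewards(prefix_scores):
    """r_t = S_t - S_{t-1}: reward each step by how much it improved the score."""
    return [s - p for p, s in zip([0.0] + prefix_scores[:-1], prefix_scores)]

gold = ["(Yellowstone; location; Wyoming)"]
# The growing prediction prefix after each generation step (toy data).
prefixes = [["(Yellow"],
            ["(Yellowstone; loc"],
            ["(Yellowstone; location; Wyoming)"]]
scores = [similarity_score(p, gold) for p in prefixes]
rewards = step_rewards(scores)
# The per-step rewards telescope back to the final similarity score.
print(abs(sum(rewards) - scores[-1]) < 1e-9)  # True
```

Because the rewards are differences of consecutive scores, they sum to the final similarity, so the evaluation submodel distributes one sequence-level score over the individual generation steps.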
In the embodiment of the application, after the predicted quality score and the reward score are obtained according to the predicted structural information and the sample structural information, an evaluation value is generated according to the predicted quality score and the reward score.
And step S204, training the evaluation submodel according to the evaluation value.
In the embodiment of the application, after the evaluation value is calculated, the evaluation sub-model can be trained according to the evaluation value, so that the predicted structural information output by the behavior-evaluation model is as close as possible to the sample structural information.
As one possible scenario, the loss function of the evaluation submodel is the squared temporal-difference error:

L_c = (1/N) · Σ_{n=1…N} Σ_{t=1…T} ( Q(Ŷ_{1…t-1}, ŷ_t; Y_n) − ( r_t + γ · Q(Ŷ_{1…t}, ŷ_{t+1}; Y_n) ) )²

where Q(Ŷ_{1…t-1}, ŷ_t; Y_n) is the predicted quality score; r_t is the reward score; T is the sequence length corresponding to the sample structured information; N is the number of samples; γ is the discount factor; ŷ_t and ŷ_{t+1} are the words at the t-th and (t+1)-th sequence positions, respectively; Ŷ_{1…t-1} is the words at the 1st to (t−1)-th sequence positions; Y_n is the sample structured information; and n and t are positive integers.
In the application, the loss function of the evaluation sub-model estimates the degree of inconsistency between the model's predicted structural information and the sample structural information; training against this loss function improves the accuracy of the model output, and the smaller the loss, the better the robustness of the model.
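A squared temporal-difference critic loss of this kind can be sketched minimally. The patent's exact loss is only partially legible, so the shape below is a standard formulation with illustrative names:

```python
def critic_loss(q_values, rewards, gamma=0.9):
    """Mean squared temporal-difference error for the evaluation submodel:
    (Q_t - (r_t + gamma * Q_{t+1}))^2 averaged over steps. At the terminal
    step the target is just the final reward."""
    T = len(q_values)
    loss = 0.0
    for t in range(T):
        target = rewards[t] + (gamma * q_values[t + 1] if t + 1 < T else 0.0)
        loss += (q_values[t] - target) ** 2
    return loss / T

# A critic whose scores exactly match the TD targets incurs zero loss.
rewards = [0.2, 0.3, 0.5]
gamma = 0.9
q = [0.0, 0.0, 0.5]
q[1] = rewards[1] + gamma * q[2]   # 0.75
q[0] = rewards[0] + gamma * q[1]   # 0.875
print(critic_loss(q, rewards, gamma))  # 0.0
```

The fixed point of this loss is a critic whose score at each step equals the discounted sum of the remaining rewards, which is what lets it judge individual generation steps rather than only whole sequences.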
In the embodiment of the application, when the behavior-evaluation model is trained, the parameters of the behavior sub-model can be subjected to gradient update according to the predicted quality score.
As one possible implementation, the parameters of the behavior submodel may be gradient-updated according to the following formula:

∇_θ J(θ) = Σ_{n=1…N} Σ_{t=1…T} ∇_θ log π(ŷ_t | Ŷ_{1…t-1}, Y_n; θ) · A(Ŷ_{1…t-1}, ŷ_t)

where A is the advantage function:

A(Ŷ_{1…t-1}, a) = Q(Ŷ_{1…t-1}, a; Y_n) − Σ_{b ∈ V} π(b | Ŷ_{1…t-1}, Y_n; θ) · Q(Ŷ_{1…t-1}, b; Y_n)

where Q is the predicted quality score; θ is the parameter to be trained; V is the word list; a and b are words in the word list; Ŷ_{1…t-1} is the words at the 1st to (t−1)-th sequence positions; Y_n is the sample structured information; π is the behavior submodel; and n and t are positive integers.
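The advantage-weighted update can be illustrated with toy numbers. The key design property of such a baseline, standard actor-critic reasoning rather than anything specific to this patent, is that the policy-weighted advantage sums to zero, so subtracting the baseline reduces gradient variance without biasing the update; all shapes and values here are illustrative:

```python
import numpy as np

def advantage(q_row, policy_row):
    """A(a) = Q(s, a) - sum_b pi(b|s) * Q(s, b): how much better word a is
    than the policy's average word at this state."""
    baseline = float(np.dot(policy_row, q_row))
    return q_row - baseline

# Toy state: 4-word vocabulary, critic scores Q and policy probabilities pi.
q = np.array([0.1, 0.9, 0.4, 0.2])
pi = np.array([0.25, 0.25, 0.25, 0.25])
adv = advantage(q, pi)
# Words scoring above the policy's average get positive advantage (their
# log-probabilities are pushed up), the rest get negative advantage.
print(abs(float(np.dot(pi, adv))) < 1e-12)  # True: zero-mean under pi
```

In the full update, each generated word's log-probability gradient is scaled by its advantage, so the behavior submodel is steered toward words the evaluation submodel scores above average.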
It should be noted that existing information extraction models can only be evaluated after the entire prediction sequence has been generated during optimization, making it difficult to see which step in the generation process led to a change in the result. In contrast, the behavior-evaluation model in the application can give a score for each step according to the evaluation value through the evaluation sub-model, reducing the error of reinforcement learning training.
And when the parameters of the behavior submodel are updated, the evaluation submodel can be synchronously updated through the following formula:

$$Q(y_{t}\mid y_{1:t-1},Y_{n})\leftarrow r(y_{t}\mid y_{1:t-1},Y_{n})+\sum_{a\in V}\pi_{\theta}(a\mid y_{1:t},Y_{n})\,Q(a\mid y_{1:t},Y_{n})$$

wherein $Q$ is the predicted quality score, updated by temporal-difference learning; $r$ is the reward score; $y_{1:t}$ are the words at the 1st to t-th sequence positions; $Y_{n}$ is the sample structured information; $V$ is the word list; $a$ is a word in the word list; and $n$ and $t$ are positive integers.
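The synchronous update can be read as an expected temporal-difference target: the reward score at the current position plus the policy-weighted quality score of the next position. A sketch under that reading (function and argument names are assumptions):

```python
def td_target(reward, next_policy, next_q):
    """Temporal-difference target for the evaluation submodel: the reward
    score at the current sequence position plus the behavior submodel's
    expected quality score over the word list at the next position.
    Illustrative sketch; next_policy and next_q map words to floats."""
    return reward + sum(next_policy[a] * next_q[a] for a in next_q)
```

The critic's score at each position is regressed toward this target, so per-step credit assignment does not have to wait for the full sequence.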
Existing information extraction models fit the known information (the labeled data set) and rarely explore the sequence space. In the present application, however, a confidence-based exploration mechanism is added to the training process of the behavior-evaluation model; this mechanism helps find better generated sequences and improves the accuracy of the structured information generated by the model.
As one possible implementation, the probability of each sequence position in the predicted structured information is obtained; when a sequence position whose probability is smaller than a preset probability threshold is determined, expansion is performed at that sequence position according to the prediction result corresponding to it; prediction is then performed from that sequence position using the behavior submodel to regenerate the predicted structured information; and the reward score is updated according to the regenerated predicted structured information.
In the present application, the probability of each sequence position in the predicted structured information can be determined by a copy mechanism. Here, the copy mechanism refers to directly copying a word at a certain position (e.g., perhaps only 50 positions) in the input text sequence into the generated sequence, rather than selecting from the whole word list while generating the structured information. This reduces the selection space and thereby the difficulty of model training.
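A copy mechanism of this kind can be sketched as pooling attention weights over the source positions that hold the same word, so that the output distribution ranges only over words actually present in the input (the function and variable names are illustrative assumptions):

```python
def copy_distribution(source_tokens, attention):
    """Turn per-position attention weights into a probability distribution
    over the distinct words of the source text: positions holding the same
    word pool their mass. The generated word is then chosen from this small
    set instead of the full word list. Illustrative sketch."""
    dist = {}
    for token, weight in zip(source_tokens, attention):
        dist[token] = dist.get(token, 0.0) + weight
    return dist
```

The probability assigned to the word actually emitted at each step is what the exploration mechanism below inspects for low-confidence positions.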
It should be noted that the sequence position with the smallest probability is the one where the model has the least confidence in the generated word; the word at that position may therefore be replaced by another word, possibly yielding a better sequence.
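Under that reading, confidence-based exploration can be sketched as: locate the least-confident position, substitute an alternative word there, and let the behavior submodel re-decode from that point. The `redecode` callback stands in for the behavior submodel and is an assumption for illustration.

```python
import random

def explore(tokens, probs, word_list, redecode):
    """Confidence-based exploration sketch: find the sequence position with
    the lowest generation probability, replace its word with another word
    from the word list, and re-decode the remainder of the sequence with
    the behavior submodel (represented here by the redecode callback)."""
    t = min(range(len(probs)), key=probs.__getitem__)  # least-confident step
    alternatives = [w for w in word_list if w != tokens[t]]
    return redecode(tokens[:t] + [random.choice(alternatives)])
```

The regenerated sequence is then scored again so the reward reflects the explored alternative rather than only the greedy decode.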
Step S205, inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence.
In the embodiment of the present application, the implementation process of step S205 may refer to the implementation process of step S102 in the above embodiment, which is not described herein.
According to the structured information generation method, a plurality of training samples are obtained, the training samples are input into a behavior sub-model to generate predicted structured information, the predicted structured information and the sample structured information are input into an evaluation sub-model to generate an evaluation value, the evaluation sub-model is trained according to the evaluation value, and a source text sequence is input into a behavior-evaluation model to generate structured information corresponding to the source text sequence. Therefore, the evaluation sub-model is trained by generating the evaluation value according to the predicted structural information and the sample structural information generated by the behavior sub-model, so that the structural information corresponding to the source text sequence is output according to the trained behavior-evaluation model, and the accuracy and the efficiency of the model extraction of the structural information are improved.
As an example, fig. 3 is an exemplary diagram of a method for generating structured information according to an embodiment of the present application. As can be seen from fig. 3, the training process of the behavior-evaluation model is that after a sample text sequence is input into a behavior sub-model, prediction structural information is output, the prediction structural information output by the behavior sub-model is expanded according to a confidence exploration mechanism to obtain regenerated prediction structural information, and the regenerated prediction structural information is evaluated and stored so as to update model parameters of the evaluation sub-model and the behavior sub-model by using the stored prediction structural information. Therefore, when the updated behavior submodel and the updated evaluation submodel are used for extracting the structural information of the source text sequence, the accuracy of the structural information extraction is improved.
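The flow of fig. 3 can be summarized, under assumed interfaces for the two submodels, as a single training step; every method name here (`decode`, `update`, `advantages`) is a hypothetical stand-in, not the patent's API.

```python
def train_step(sample, actor, critic, reward_fn, explore_fn):
    """One illustrative behavior-evaluation training step: the behavior
    submodel (actor) decodes a prediction, confidence-based exploration
    perturbs it, per-step reward scores are computed against the sample
    structured information, and the evaluation submodel (critic) and then
    the behavior submodel are updated from the stored trajectory."""
    predicted, probs = actor.decode(sample["text"])
    predicted = explore_fn(predicted, probs)          # exploration mechanism
    rewards = reward_fn(predicted, sample["target"])  # per-step reward scores
    critic.update(predicted, rewards)                 # temporal-difference update
    actor.update(predicted, critic.advantages(predicted))
    return predicted
```

At inference time only `actor.decode` is needed, which is why the trained behavior submodel alone produces the structured information for a source text sequence.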
In order to achieve the above embodiment, the present application proposes a device for generating structured information.
Fig. 4 is a schematic structural diagram of a device for generating structured information according to a third embodiment of the present application.
As shown in fig. 4, the apparatus 300 for generating structured information may include: an acquisition module 310 and a generation module 320.
Wherein, the obtaining module 310 is configured to obtain a source text sequence.
The generating module 320 is configured to input the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, where the behavior-evaluation model includes a behavior sub-model and an evaluation sub-model, the behavior sub-model is configured to generate the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is configured to train the behavior sub-model.
As a possible case, the generating device 300 of structured information may further include:
the sample acquisition module is used for acquiring a plurality of training samples; each training sample is marked with sample structural information corresponding to the sample text sequence.
A first input module for inputting a plurality of training samples into the behavior submodel to generate predictive structural information.
And the second input module is used for inputting the prediction structural information and the sample structural information into the evaluation submodel to generate an evaluation value. And
and the training module is used for training the evaluation submodel according to the evaluation value.
As another possible scenario, the evaluation submodel includes an encoder and a decoder; a second input module comprising:
a first generation unit for inputting the sample structured information into the encoder and inputting the prediction structured information into the decoder to generate a prediction quality score;
A second generation unit for generating a bonus score based on the predicted structural information and the sample structural information;
and a third generation unit for generating an evaluation value according to the bonus score and the predicted quality score.
As another possible case, the second generating unit is further configured to:
generating a similarity score of each sequence position in the predicted structured information according to the sample structured information and the predicted structured information;
and generating a reward score of the predicted structural information according to the similarity score of each sequence position, wherein the reward score of the sequence position is the difference between the similarity score of the current sequence position and the similarity score of the last sequence position.
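The per-position reward described above can be sketched as the increment in similarity, so that the rewards telescope back to the final similarity score. The similarity metric itself (e.g., some overlap score against the sample structured information) is left abstract, and the function name is an assumption.

```python
def reward_scores(similarity):
    """Reward at each sequence position = similarity score at that position
    minus the similarity score at the previous position (taken as 0 before
    the first position). The rewards therefore sum to the final similarity
    score of the whole sequence. Illustrative sketch."""
    return [s - (similarity[i - 1] if i else 0.0)
            for i, s in enumerate(similarity)]
```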
As another possible scenario, the loss function of the evaluation submodel is:

$$L_{c}=\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T}\left(Q(y_{t}\mid y_{1:t-1},Y_{n})-\sum_{\gamma=t}^{T}r(y_{\gamma}\mid y_{1:\gamma-1},Y_{n})\right)^{2}$$

wherein $Q$ is the predicted quality score and $r$ is the reward score; $T$ is the sequence length corresponding to the sample structured information, $N$ is the number of samples, $y_{t}$ and $y_{\gamma}$ are the word at the t-th sequence position and the word at the γ-th sequence position respectively, $y_{1:t-1}$ are the words at the 1st to (t-1)-th sequence positions, $Y_{n}$ is the sample structured information, and n, γ and t are all positive integers.
As another possible case, the generating device 300 of structured information may further include:
And the updating module is used for carrying out gradient updating on the parameters of the behavior sub-model according to the predicted quality score.
As another possibility, the parameters of the behavior submodel are gradient-updated by the following formula:

$$\nabla_{\theta}J(\theta)=\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T}A(y_{t})\,\nabla_{\theta}\log\pi_{\theta}(y_{t}\mid y_{1:t-1},Y_{n})$$

wherein $A$ is the advantage equation:

$$A(a)=Q(a\mid y_{1:t-1},Y_{n})-\sum_{b\in V}\pi_{\theta}(b\mid y_{1:t-1},Y_{n})\,Q(b\mid y_{1:t-1},Y_{n})$$

wherein $Q$ is the predicted quality score, $\theta$ is the parameter to be trained, $V$ is the word list, $a$ and $b$ are words in the word list, $T$ is the sequence length corresponding to the sample structured information, $N$ is the number of samples, $y_{1:t-1}$ are the words at the 1st to (t-1)-th sequence positions, $Y_{n}$ is the sample structured information, $\pi$ is the behavior submodel, and $n$ and $t$ are positive integers.
As another possible case, the second input module further includes:
an acquisition unit configured to acquire a probability of predicting each sequence position in the structured information;
the expanding unit is used for expanding at the sequence position with the probability smaller than the preset probability threshold according to the corresponding prediction result of the sequence position when the sequence position with the probability smaller than the preset probability threshold is determined;
a fourth generation unit for predicting from the sequence position using the behavior submodel to regenerate the predicted structural information; and
and the updating unit is used for updating the rewarding score according to the regenerated prediction structural information.
It should be noted that the foregoing explanation of the embodiments of the method for generating structured information is also applicable to the device for generating structured information, and is not repeated here.
The device for generating the structured information in the embodiment of the application obtains the source text sequence; and inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model. Therefore, corresponding structured information is directly generated according to the input source text sequence through the behavior-evaluation model, end-to-end information extraction is achieved, the technical problem of high cost of training model labeling during open field information extraction is solved, and the efficiency and accuracy of structured information extraction are effectively improved.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
As shown in fig. 5, a block diagram of an electronic device is provided for a method of generating structured information according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the applications described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 501, memory 502, and interfaces for connecting components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executing within the electronic device, including instructions stored in or on memory to display graphical information of the GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 501 is illustrated in fig. 5.
Memory 502 is a non-transitory computer readable storage medium provided by the present application. Wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method of generating structured information provided by the present application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method of generating structured information provided by the present application.
The memory 502 is used as a non-transitory computer readable storage medium for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (e.g., the acquisition module 310 and the generation module 320 shown in fig. 4) corresponding to the method of generating structured information in an embodiment of the present application. The processor 501 executes various functional applications of the server and data processing, i.e., a method of implementing the generation of structured information in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 502.
Memory 502 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created from the use of the electronic device according to the generation of the structured information, and the like. In addition, memory 502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 502 may optionally include memory located remotely from processor 501, which may be connected to the electronic device that generates the structured information via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the method for generating structured information may further include: an input device 503 and an output device 504. The processor 501, memory 502, input devices 503 and output devices 504 may be connected by a bus or otherwise, for example in fig. 5.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device that generate the structured information, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointer stick, one or more mouse buttons, a track ball, a joystick, and the like. The output devices 504 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific ASIC (application specific integrated circuit), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme of the embodiment of the application, the source text sequence is acquired; and inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model. Therefore, corresponding structured information is directly generated according to the input source text sequence through the behavior-evaluation model, end-to-end information extraction is achieved, the technical problem of high cost of training model labeling during open field information extraction is solved, and the efficiency and accuracy of structured information extraction are effectively improved.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed embodiments are achieved, and are not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (14)

1. A method for generating structured information, applied to extracting structured information from open field information, characterized by comprising the following steps:
acquiring a source text sequence;
inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model;
Inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence comprises:
inputting the acquired source text sequence into a behavior-evaluation model;
word embedding coding is carried out on the input source text sequence through the behavior submodel, and word vectors obtained after the word embedding coding are input into a bidirectional gating circulation unit coder for coding;
inputting the result output by the bidirectional gating cycle unit encoder into a bidirectional gating cycle unit decoder for decoding to obtain the generated structured information;
before the source text sequence is input into the behavior-evaluation model to generate the structured information corresponding to the source text sequence, the method further comprises:
acquiring a plurality of training samples; each training sample comprises a sample text sequence and sample structural information corresponding to the sample text sequence;
inputting the plurality of training samples into the behavior sub-model to generate predictive structural information;
inputting the predictive structural information and the sample structural information into the evaluation submodel to generate an evaluation value; and
training the evaluation sub-model according to the evaluation value;
The evaluation submodel includes an encoder and a decoder; the inputting the predictive structural information and the sample structural information into the evaluation submodel to generate an evaluation value includes:
inputting the sample structured information to the encoder and the prediction structured information to the decoder to generate a prediction quality score;
generating a bonus score based on the predictive structural information and the sample structural information;
and generating the evaluation value according to the predicted quality score and the rewards score.
2. The method of generating structured information of claim 1, wherein said generating a bonus score based on said predicted structured information and said sample structured information comprises:
generating a similarity score for each sequence position in the predicted structured information according to the sample structured information and the predicted structured information;
and generating a reward score of the predicted structural information according to the similarity score of each sequence position, wherein the reward score of the sequence position is the difference between the similarity score of the current sequence position and the similarity score of the last sequence position.
3. The method for generating structured information of claim 1, wherein the loss function of the evaluation submodel is:

$$L_{c}=\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T}\left(Q(y_{t}\mid y_{1:t-1},Y_{n})-\sum_{\gamma=t}^{T}r(y_{\gamma}\mid y_{1:\gamma-1},Y_{n})\right)^{2}$$

wherein $Q$ is the predicted quality score, $r$ is the reward score, T is the sequence length corresponding to the sample structured information, N is the number of samples, $y_{t}$ and $y_{\gamma}$ are the words at the t-th sequence position and the γ-th sequence position respectively, $y_{1:t-1}$ are the words at the 1st to (t-1)-th sequence positions, $Y_{n}$ is the sample structured information, and n, γ and t are positive integers.
4. The method for generating structured information of claim 1, wherein the method further comprises:
and carrying out gradient updating on the parameters of the behavior submodel according to the predicted quality score.
5. The method for generating structured information of claim 4, wherein the parameters of the behavior submodel are gradient-updated by the following formula:

$$\nabla_{\theta}J(\theta)=\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T}A(y_{t})\,\nabla_{\theta}\log\pi_{\theta}(y_{t}\mid y_{1:t-1},Y_{n})$$

wherein $A$ is the advantage equation:

$$A(a)=Q(a\mid y_{1:t-1},Y_{n})-\sum_{b\in V}\pi_{\theta}(b\mid y_{1:t-1},Y_{n})\,Q(b\mid y_{1:t-1},Y_{n})$$

wherein $Q$ is the predicted quality score, $\theta$ is the parameter to be trained, $V$ is the word list, a and b are words in the word list, T is the sequence length corresponding to the sample structured information, N is the number of samples, $y_{1:t-1}$ are the words at the 1st to (t-1)-th sequence positions, $Y_{n}$ is the sample structured information, $\pi$ is the behavior submodel, and n and t are both positive integers.
6. The method for generating structured information of claim 1, wherein after said inputting said plurality of training samples into said behavior submodel to generate predicted structured information, further comprising:
acquiring the probability of each sequence position in the predictive structural information;
when the sequence position with the probability smaller than the preset probability threshold is determined, expanding at the sequence position with the probability smaller than the preset probability threshold according to the corresponding prediction result of the sequence position;
predicting using the behavior submodel from the sequence position to regenerate predicted structured information; and
updating the bonus points according to the regenerated prediction structural information.
7. A device for generating structured information, which is used for extracting structured information from open field information, the device comprising:
the acquisition module is used for acquiring a source text sequence;
the generation module is used for inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence, wherein the behavior-evaluation model comprises a behavior sub-model and an evaluation sub-model, the behavior sub-model is used for generating the structured information corresponding to the source text sequence according to the input source text sequence, and the evaluation sub-model is used for training the behavior sub-model;
The generating module is specifically configured to:
inputting the source text sequence into a behavior-evaluation model to generate structured information corresponding to the source text sequence comprises:
inputting the acquired source text sequence into a behavior-evaluation model;
word embedding coding is carried out on the input source text sequence through the behavior submodel, and word vectors obtained after the word embedding coding are input into a bidirectional gating circulation unit coder for coding;
inputting the result output by the bidirectional gating cycle unit encoder into a bidirectional gating cycle unit decoder for decoding to obtain the generated structured information;
the device further comprises:
the sample acquisition module is used for acquiring a plurality of training samples; each training sample is marked with sample structural information corresponding to a sample text sequence;
a first input module for inputting the plurality of training samples into the behavior sub-model to generate predictive structural information;
a second input module for inputting the predictive structural information and the sample structural information into the evaluation submodel to generate an evaluation value; and
the training module is used for training the evaluation sub-model according to the evaluation value;
The evaluation submodel includes an encoder and a decoder; the second input module includes:
a first generation unit for inputting the sample structured information to the encoder and inputting the prediction structured information to the decoder to generate a prediction quality score;
a second generation unit for generating a bonus score according to the predictive structural information and the sample structural information;
and a third generation unit configured to generate the evaluation value according to the bonus score and the predicted quality score.
8. The apparatus for generating structured information of claim 7, wherein the second generating unit is further configured to:
generating a similarity score for each sequence position in the predicted structured information according to the sample structured information and the predicted structured information;
and generating a reward score of the predicted structural information according to the similarity score of each sequence position, wherein the reward score of the sequence position is the difference between the similarity score of the current sequence position and the similarity score of the last sequence position.
9. The apparatus for generating structured information of claim 7, wherein the loss function of the evaluation submodel is:

$$L_{c}=\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T}\left(Q(y_{t}\mid y_{1:t-1},Y_{n})-\sum_{\gamma=t}^{T}r(y_{\gamma}\mid y_{1:\gamma-1},Y_{n})\right)^{2}$$

wherein $Q$ is the predicted quality score, $r$ is the reward score, T is the sequence length corresponding to the sample structured information, N is the number of samples, $y_{t}$ and $y_{\gamma}$ are the words at the t-th sequence position and the γ-th sequence position respectively, $y_{1:t-1}$ are the words at the 1st to (t-1)-th sequence positions, $Y_{n}$ is the sample structured information, and n, γ and t are positive integers.
10. The apparatus for generating structured information of claim 7, wherein the apparatus further comprises:
and the updating module is used for carrying out gradient updating on the parameters of the behavior sub-model according to the prediction quality score.
11. The structured information generation apparatus of claim 10, wherein the parameters of the behavior submodel are gradient-updated by the following formula:

$$\nabla_{\theta}J(\theta)=\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{T}A(y_{t})\,\nabla_{\theta}\log\pi_{\theta}(y_{t}\mid y_{1:t-1},Y_{n})$$

wherein $A$ is the advantage equation:

$$A(a)=Q(a\mid y_{1:t-1},Y_{n})-\sum_{b\in V}\pi_{\theta}(b\mid y_{1:t-1},Y_{n})\,Q(b\mid y_{1:t-1},Y_{n})$$

wherein $Q$ is the predicted quality score, $\theta$ is the parameter to be trained, $V$ is the word list, a and b are words in the word list, T is the sequence length corresponding to the sample structured information, N is the number of samples, $y_{1:t-1}$ are the words at the 1st to (t-1)-th sequence positions, $Y_{n}$ is the sample structured information, $\pi$ is the behavior submodel, and n and t are both positive integers.
12. The apparatus for generating structured information of claim 7, wherein said second input module further comprises:
an acquisition unit configured to acquire the probability of each sequence position in the sample structured information;
an expansion unit configured to, when a sequence position whose probability is smaller than a preset probability threshold is determined, expand at that sequence position according to the prediction result corresponding to the sequence position;
a fourth generation unit configured to predict, using the behavior submodel, from that sequence position onward to generate new sample structured information; and
an updating unit configured to update the reward score according to the new sample structured information.
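The units of claim 12 describe a resampling loop: locate the first position whose probability drops below the threshold, keep the prefix before it, and let the behavior submodel re-predict the remainder. A minimal sketch, assuming a hypothetical `behavior_model` callable that maps a prefix to a predicted suffix (not part of the claims):

```python
def resample_low_confidence(tokens, probs, threshold, behavior_model):
    """Regenerate the suffix of a sample from its first low-confidence position.

    tokens:         list of words in the sample structured information
    probs:          per-position probabilities assigned by the behavior submodel
    threshold:      preset probability threshold
    behavior_model: hypothetical callable prefix -> predicted suffix (list of words)
    """
    for t, p in enumerate(probs):
        if p < threshold:
            # Expand at the low-confidence position: keep the confident
            # prefix and re-predict everything from position t onward.
            return tokens[:t] + behavior_model(tokens[:t])
    return tokens  # every position is confident; keep the original sample
```

The new sample returned here would then be scored to update the reward, as the updating unit of claim 12 describes.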
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of generating structured information according to any one of claims 1 to 6.
14. A non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method of generating structured information of any one of claims 1-6.
CN202010305158.5A 2020-04-17 2020-04-17 Method and device for generating structured information, electronic equipment and storage medium Active CN111597224B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010305158.5A CN111597224B (en) 2020-04-17 2020-04-17 Method and device for generating structured information, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN111597224A CN111597224A (en) 2020-08-28
CN111597224B true CN111597224B (en) 2023-09-15

Family

ID=72190304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010305158.5A Active CN111597224B (en) 2020-04-17 2020-04-17 Method and device for generating structured information, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111597224B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065330A (en) * 2021-03-22 2021-07-02 四川大学 Method for extracting sensitive information from unstructured data

Citations (13)

Publication number Priority date Publication date Assignee Title
CN105022733A (en) * 2014-04-18 2015-11-04 中科鼎富(北京)科技发展有限公司 DINFO-OEC text analysis mining method and device thereof
CN108304911A (en) * 2018-01-09 2018-07-20 中国科学院自动化研究所 Knowledge Extraction Method and system based on Memory Neural Networks and equipment
CN108763368A (en) * 2018-05-17 2018-11-06 爱因互动科技发展(北京)有限公司 The method for extracting new knowledge point
CN109483530A (en) * 2018-10-18 2019-03-19 北京控制工程研究所 A kind of legged type robot motion control method and system based on deeply study
CN109657051A (en) * 2018-11-30 2019-04-19 平安科技(深圳)有限公司 Text snippet generation method, device, computer equipment and storage medium
CN110059100A (en) * 2019-03-20 2019-07-26 广东工业大学 Based on performer-reviewer's network SQL statement building method
CN110070099A (en) * 2019-02-20 2019-07-30 北京航空航天大学 A kind of industrial data feature structure method based on intensified learning
CN110135457A (en) * 2019-04-11 2019-08-16 中国科学院计算技术研究所 Event trigger word abstracting method and system based on self-encoding encoder fusion document information
CN110222847A (en) * 2019-06-13 2019-09-10 苏州浪潮智能科技有限公司 A kind of machine learning method and device
CN110263350A (en) * 2019-03-08 2019-09-20 腾讯科技(深圳)有限公司 Model training method, device, computer readable storage medium and computer equipment
CN110489545A (en) * 2019-07-09 2019-11-22 平安科技(深圳)有限公司 File classification method and device, storage medium, computer equipment
CN110532353A (en) * 2019-08-27 2019-12-03 海南阿凡题科技有限公司 Text entities matching process, system, device based on deep learning
CN110795525A (en) * 2019-09-17 2020-02-14 腾讯科技(深圳)有限公司 Text structuring method and device, electronic equipment and computer readable storage medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US20120137367A1 (en) * 2009-11-06 2012-05-31 Cataphora, Inc. Continuous anomaly detection based on behavior modeling and heterogeneous information analysis
US10748039B2 (en) * 2018-09-27 2020-08-18 Deepmind Technologies Limited Reinforcement learning neural networks grounded in learned visual entities


Non-Patent Citations (1)

Title
Attribute-Based Transfer Learning for Behavior Recognition with Structural Information; Yuanfeng Yang; 2018 IEEE 3rd Advanced Information Technology; entire document *


Similar Documents

Publication Publication Date Title
CN112560912B (en) Classification model training method and device, electronic equipment and storage medium
CN111709248B (en) Training method and device for text generation model and electronic equipment
TWI729472B (en) Method, device and server for determining feature words
US10755048B2 (en) Artificial intelligence based method and apparatus for segmenting sentence
JP2021184237A (en) Dataset processing method, apparatus, electronic device, and storage medium
US20210334669A1 (en) Method, apparatus, device and storage medium for constructing knowledge graph
CN111783468B (en) Text processing method, device, equipment and medium
EP3961476A1 (en) Entity linking method and apparatus, electronic device and storage medium
CN111339268B (en) Entity word recognition method and device
CN111079442A (en) Vectorization representation method and device of document and computer equipment
CN111274764A (en) Language generation method and device, computer equipment and storage medium
CN111274391A (en) SPO extraction method and device, electronic equipment and storage medium
CN111859953B (en) Training data mining method and device, electronic equipment and storage medium
CN110688854A (en) Named entity recognition method, device and computer readable storage medium
JP6312467B2 (en) Information processing apparatus, information processing method, and program
CN111666759B (en) Extraction method and device of text key information, electronic equipment and storage medium
CN111950291A (en) Semantic representation model generation method and device, electronic equipment and storage medium
CN111582477B (en) Training method and device for neural network model
CN113641830B (en) Model pre-training method, device, electronic equipment and storage medium
CN114298050A (en) Model training method, entity relation extraction method, device, medium and equipment
CN110675954A (en) Information processing method and device, electronic equipment and storage medium
CN112507090A (en) Method, apparatus, device and storage medium for outputting information
CN111767833A (en) Model generation method and device, electronic equipment and storage medium
CN110991175A (en) Text generation method, system, device and storage medium under multiple modes
CN116245097A (en) Method for training entity recognition model, entity recognition method and corresponding device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant