CN114372467A - Named entity extraction method and device, electronic equipment and storage medium

Named entity extraction method and device, electronic equipment and storage medium

Info

Publication number: CN114372467A
Application number: CN202210030953.7A
Authority: CN (China)
Other languages: Chinese (zh)
Inventor: 张兆
Current Assignee: Ping An Life Insurance Company of China Ltd
Original Assignee: Ping An Life Insurance Company of China Ltd
Application filed by: Ping An Life Insurance Company of China Ltd
Legal status: Pending
Prior art keywords: extracted, entity, statement, named, subsequence

Classifications

    • G — PHYSICS > G06 — COMPUTING; CALCULATING OR COUNTING > G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/295 — Named entity recognition (G06F40/00 Handling natural language data > G06F40/20 Natural language analysis > G06F40/279 Recognition of textual entities > G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking)
    • G06F18/2415 — Classification techniques relating to the classification model, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate (G06F18/00 Pattern recognition > G06F18/20 Analysing > G06F18/24 Classification techniques)
    • G06F40/211 — Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars (G06F40/20 Natural language analysis > G06F40/205 Parsing)
    • G06F40/216 — Parsing using statistical methods (G06F40/20 Natural language analysis > G06F40/205 Parsing)


Abstract

The application relates to the field of artificial intelligence and discloses a named entity extraction method and device, an electronic device and a storage medium. The named entity extraction method comprises the following steps: acquiring a statement to be extracted; acquiring an entity type to be extracted and constructing a corresponding entity statement according to the entity type to be extracted; encoding the entity statement and the statement to be extracted together and inputting the result into a preset representation model for processing to obtain a corresponding representation matrix; calculating the probability distribution of the statement to be extracted according to the representation matrix; and extracting the named entity corresponding to the entity type to be extracted according to the probability distribution. The scheme provided by the application avoids rewriting corresponding rules every time it is applied to a different system, and does not need to rely on a corpus to select the various features that influence entity extraction.

Description

Named entity extraction method and device, electronic equipment and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a named entity extraction method and device, electronic equipment and a computer-readable storage medium.
Background
An entity refers to a word or phrase with a describable meaning; it is generally a person name, place name, organization name, product name, etc., or domain-specific content with a definite meaning, such as a disease, drug or organism name in the medical field, or specialized vocabulary in the legal field. The main task of entity extraction is to identify the text span of a named entity and classify it into a predefined category; entity extraction is the basis of question-answering systems, translation systems, knowledge graphs, and the like.
Conventional entity extraction tasks mostly use rule- and dictionary-based methods or statistics-based methods. Rule- and dictionary-based methods mostly rely on linguistic experts to manually construct rule templates; generally speaking, when the extracted rules reflect linguistic phenomena accurately, rule-based methods outperform statistics-based methods. However, these rules often depend on the specific language, domain and text style, are time-consuming to compile, can hardly cover all linguistic phenomena, are particularly prone to errors, and have poor portability, requiring linguistic experts to rewrite different rules for different systems. Statistics-based methods place higher demands on feature selection: various features that influence the task need to be selected from the text, the dependence on the corpus is high, and there are few large-scale general corpora available for building and evaluating a named entity recognition system.
Disclosure of Invention
In order to solve the above technical problems, embodiments of the present application provide a named entity extraction method and apparatus, an electronic device, and a computer-readable storage medium, which are intended to solve the technical problems in the prior art that rule-based entity extraction requires writing different extraction rules for different systems and that statistics-based entity extraction depends heavily on a corpus.
Other features and advantages of the present application will be apparent from the following detailed description, or may be learned by practice of the application.
According to an aspect of an embodiment of the present application, there is provided a named entity extraction method, including:
acquiring a statement to be extracted;
acquiring an entity type to be extracted, and constructing a corresponding entity statement according to the entity type to be extracted;
encoding the entity statement and the statement to be extracted together and inputting the result into a preset representation model for processing to obtain a corresponding representation matrix;
calculating the probability distribution of the statement to be extracted according to the representation matrix; the probability distribution comprises a first probability that each word of the statement to be extracted serves as the starting position of the named entity corresponding to the entity type to be extracted and a second probability that each word of the statement to be extracted serves as the ending position of the named entity corresponding to the entity type to be extracted;
and extracting the named entity corresponding to the entity type to be extracted according to the probability distribution.
Further, the step of constructing a corresponding entity statement according to the entity type to be extracted includes:
acquiring a text scene of the sentence to be extracted;
determining a corresponding entity sentence template according to the text scene;
and constructing a corresponding entity sentence according to the entity sentence template and the entity type to be extracted.
Further, the step of calculating the probability distribution of the sentence to be extracted according to the representation matrix includes:
acquiring a first parameter matrix and a second parameter matrix; the first parameter matrix and the second parameter matrix are trained based on a neural network model respectively;
calculating a first probability of each word of the statement to be extracted through a softmax function after dot product is carried out on the first parameter matrix and the representation matrix, and calculating a second probability of each word of the statement to be extracted through a softmax function after dot product is carried out on the second parameter matrix and the representation matrix;
and forming probability distribution of the statement to be extracted according to the first probability and the second probability of each word of the statement to be extracted.
Further, the step of extracting the named entity corresponding to the entity type to be extracted according to the probability distribution includes:
acquiring the number of the entity types to be extracted;
if the entity type number is equal to 1, processing the probability distribution through argmax to obtain a first target sequence;
and extracting the named entity corresponding to the entity type to be extracted from the first target sequence through a preset first pointer network.
Further, after the step of obtaining the number of the entity types to be extracted, the method includes:
if the number of the entity types is more than 1, obtaining probability distribution corresponding to each entity type to be extracted;
processing each probability distribution by argmax, and then merging to obtain a second target sequence;
and extracting the named entities corresponding to the entity types to be extracted from the second target sequence through a preset second pointer network.
Further, the second target sequence comprises a first token sequence and a second token sequence, the first token sequence comprises N first subsequences, the second token sequence comprises N second subsequences, the N first subsequences correspond to the N second subsequences, and one first subsequence and the corresponding second subsequence form a sequence pair; the step of extracting the named entities corresponding to the entity types to be extracted from the second target sequence through a preset second pointer network includes:
detecting whether the number of candidate positions in the first subsequence and the second subsequence is equal in each sequence pair;
and if so, combining the candidate position in the first subsequence with the corresponding candidate position in the second subsequence through the second pointer network to extract the named entity corresponding to each entity type to be extracted.
Further, after the step of detecting whether the number of candidate positions in the first subsequence and the second subsequence is equal, the method includes:
and if not, combining the candidate positions in the second subsequence with the corresponding candidate positions in the first subsequence through the second pointer network according to a preset rule to extract the named entity corresponding to the entity type to be extracted.
According to an aspect of an embodiment of the present application, there is provided a named entity extraction apparatus, including:
the first acquisition unit is configured to acquire a statement to be extracted;
the second acquisition unit is configured to acquire the entity type to be extracted and construct a corresponding entity statement according to the entity type to be extracted;
the coding unit is configured to encode the entity statement and the statement to be extracted together and input the result into a preset representation model for processing to obtain a corresponding representation matrix;
the calculation unit is configured to calculate the probability distribution of the statement to be extracted according to the representation matrix; the probability distribution comprises a first probability that each word of the statement to be extracted serves as the starting position of the named entity corresponding to the entity type to be extracted and a second probability that each word of the statement to be extracted serves as the ending position of the named entity corresponding to the entity type to be extracted;
and the extraction unit is configured to extract the named entity corresponding to the entity type to be extracted according to the probability distribution.
According to an aspect of an embodiment of the present application, there is provided an electronic device including: one or more processors; storage means for storing one or more programs that, when executed by the one or more processors, cause the electronic device to implement the named entity extraction method as described above.
According to an aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored thereon computer-readable instructions, which, when executed by a processor of a computer, cause the computer to execute the named entity extraction method as described above.
According to an aspect of embodiments herein, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the methods provided in the various alternative embodiments described above.
In the technical scheme provided by the embodiments of the present application, a corresponding entity statement is constructed for the statement to be extracted, and the statement to be extracted and the entity statement are jointly encoded to obtain a representation matrix, which is equivalent to introducing prior knowledge for the statement to be extracted and can thus bring more prompts to subsequent entity extraction.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
FIG. 1 is a flow diagram of a named entity extraction method to which the present application relates;
FIG. 2 is a block diagram of a named entity extraction apparatus to which the present application relates;
FIG. 3 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It should also be noted that: reference to "a plurality" in this application means two or more. "and/or" describe the association relationship of the associated objects, meaning that there may be three relationships, e.g., A and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Fig. 1 is a flowchart illustrating a named entity extraction method according to an exemplary embodiment, which may include steps S1 to S5, and is described in detail as follows:
step S1, obtaining a statement to be extracted;
step S2, acquiring the entity type to be extracted, and constructing a corresponding entity statement according to the entity type to be extracted;
step S3, encoding the entity statement and the statement to be extracted together and inputting the result into a preset representation model for processing to obtain a corresponding representation matrix;
step S4, calculating the probability distribution of the statement to be extracted according to the representation matrix; the probability distribution comprises a first probability that each word of the statement to be extracted serves as the starting position of the named entity corresponding to the entity type to be extracted and a second probability that each word of the statement to be extracted serves as the ending position of the named entity corresponding to the entity type to be extracted;
and step S5, extracting the named entity corresponding to the entity type to be extracted according to the probability distribution.
As described in step S1 above, in the present application the named entity extraction method may be applied to a corresponding entity extraction model, where the entity extraction model includes a representation part and an entity extraction part. The representation part may adopt a BERT (Bidirectional Encoder Representations from Transformers) model, and the entity extraction part adopts a neural network model that includes a matrix conversion layer and a fully connected layer. The statement to be extracted is a statement on which entity extraction needs to be performed; it comprises at least one character and may be a single sentence or any sentence in a text.
As described in step S2 above, a named entity generally refers to a person name, organization name, place name and any other entity identified by a name; more broadly, named entities also include numbers, dates, currencies, and the like. The types of named entities can be defined according to the problem at hand. For example, in one existing definition, named entities include three broad classes: the entity class, the time class and the numerical class. The entity class includes person names, place names and organization names; the time class includes dates, times of day, etc.; the numerical class includes currencies, measures, percentages, and the like. A corresponding entity statement is constructed according to the entity type to be extracted; the entity statement can be a word or a sentence. For example, if the entity type to be extracted is place name, an entity statement related to place names can be constructed. In other embodiments, the entity type to be extracted may be fixed in the entity extraction model, i.e., the entity extraction model always extracts entities of that fixed type.
As described in step S3 above, the representation model is a BERT model. The input of the BERT model is the original word vector of each character/word (also called a token) in the text; this vector may be initialized randomly or pre-trained with an algorithm such as Word2Vec to serve as the initial value. The output is a vector representation of each character/word in the text after full-text semantic information has been fused in. The input encoding vector of the BERT model (maximum length 512) is the sum of three embedding features: WordPiece embedding — WordPiece divides a word into a limited set of common subword units, striking a compromise between the effectiveness of whole words and the flexibility of single characters; Position Embedding — position embedding encodes the position information of each word into a feature vector and is a crucial part of introducing word-position relations into the model; and Segment Embedding — used to distinguish the two sentences, e.g., whether B is the context of A (dialogue scenario, question-answering scenario, etc.).
The entity statement and the statement to be extracted are jointly encoded to form a sentence pair. Specifically, if the statement to be extracted is "北京今天天气真好" ("The weather in Beijing is really good today") and the entity type to be extracted is place name, a corresponding entity statement is constructed, such as "找到所有地名" ("find all place names"). The two sentences form a sentence pair, namely: [CLS] 北 京 今 天 天 气 真 好 [SEP] 找 到 所 有 地 名 [SEP]; in another embodiment, the two sentences in the pair may also be arranged in the reverse order. Here [CLS] is a symbol whose feature is used for classification tasks and may be omitted for non-classification tasks, and [SEP] is a separator symbol used to separate the two sentences in the input corpus. The sentence pair is encoded by the encoder of the BERT model and processed to obtain the corresponding representation matrix of shape n × d. The prior knowledge of the entity statement is thereby introduced into the representation matrix, which explicitly tells the subsequent entity extraction part which information it should attend to.
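As an illustrative sketch only (not the patent's own implementation), the sentence-pair encoding described above can be reproduced with a generic BERT encoder; the checkpoint name and the transformers library usage below are assumptions for demonstration:

import torch
from transformers import BertModel, BertTokenizer

# Assumed Chinese BERT checkpoint; any character-level BERT encoder would do.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

sentence = "北京今天天气真好"        # statement to be extracted
entity_statement = "找到所有地名"    # constructed entity statement ("find all place names")

# The tokenizer builds the [CLS] ... [SEP] ... [SEP] pair and the segment embeddings.
inputs = tokenizer(sentence, text_pair=entity_statement, return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**inputs)

E = outputs.last_hidden_state[0]     # representation matrix of shape (n, d)
print(E.shape)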
As described in steps S4-S5 above, a representation matrix is obtained from the statement to be extracted and the entity statement and passed to the subsequent entity extraction part, so as to obtain a probability distribution in which a first probability that each word of the statement to be extracted serves as the start position of the entity and a second probability that each word serves as the end position of the entity are calculated. The final named entity is extracted from the probability distribution; specifically, the two words with the maximum first probability and the maximum second probability are selected as the start position and the end position. For example, if the statement to be extracted comprises 20 words, the word with the maximum first probability is the 5th word and the word with the maximum second probability is the 9th word, then the 5th word is taken as the start position of the named entity and the 9th word as the end position, so that the corresponding named entity can be determined.
In this embodiment, the statement to be extracted and the entity statement are encoded together to obtain a representation matrix, which is equivalent to introducing prior knowledge for the statement to be extracted and brings more prompts to subsequent entity extraction. Named entity extraction is placed under a machine reading comprehension question-answering framework, and entities with specific labels are extracted from specific perspectives. Compared with traditional extraction algorithms, this is realized by a unified algorithm; only one named entity is searched for at a time, so non-nested entities can be extracted. In addition, because prior knowledge is encoded by the representation model, the model can be told more clearly which information it needs to attend to. For example, when extracting location-related entities, a location-related question is posed, explicitly telling the model to pay special attention to location-related information; this strategy can better facilitate the entity extraction process.
In an exemplary embodiment, the step S2 of constructing the corresponding entity statement according to the entity type to be extracted includes:
step S21, acquiring the text scene of the sentence to be extracted;
step S22, determining a corresponding entity sentence template according to the text scene;
and step S23, constructing a corresponding entity sentence according to the entity sentence template and the entity type to be extracted.
In this embodiment, the entity statement needs to meet certain requirements because it carries the prior knowledge: the better the constructed entity statement, the more prompts it brings to the entity extraction model. The constructed entity statement generally needs to fit the meaning of the entity type to be extracted. For example, to extract entities whose type is an address, some definitions and descriptions of addresses can be found in advance (e.g., from the Baidu Baike encyclopedia) and used to form an entity statement template. Under different text scenes, the entities corresponding to the same entity type may differ, so the text scene of the statement to be extracted is determined before the entity statement is constructed, the corresponding entity statement template is determined according to the text scene, and the entity type to be extracted is substituted into the entity statement template to obtain the corresponding entity statement.
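The following is a minimal sketch of how such a template lookup might be implemented; the scene names and template wording are illustrative assumptions, not taken from the patent:

# Hypothetical entity-statement templates keyed by text scene.
ENTITY_STATEMENT_TEMPLATES = {
    "general": "找到所有{entity_type}",             # "find all {entity_type}"
    "medical": "找出文中提到的所有{entity_type}",    # "find all {entity_type} mentioned in the text"
}

def build_entity_statement(text_scene: str, entity_type: str) -> str:
    # Determine the template from the text scene, then fill in the entity type.
    template = ENTITY_STATEMENT_TEMPLATES.get(text_scene, ENTITY_STATEMENT_TEMPLATES["general"])
    return template.format(entity_type=entity_type)

print(build_entity_statement("general", "地名"))   # -> 找到所有地名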
In an exemplary embodiment, the step S4 of calculating the probability distribution of the sentence to be extracted according to the representation matrix includes:
step S41, acquiring a first parameter matrix and a second parameter matrix; the first parameter matrix and the second parameter matrix are trained based on a neural network model respectively;
step S42, performing dot product on the first parameter matrix and the representation matrix, and then calculating through a softmax function to obtain a first probability of each word of the statement to be extracted, and performing dot product on the second parameter matrix and the representation matrix, and then calculating through a softmax function to obtain a second probability of each word of the statement to be extracted;
step S43, forming a probability distribution of the sentence to be extracted according to the first probability and the second probability of each word of the sentence to be extracted.
In this embodiment, the first parameter matrix and the second parameter matrix are obtained by learning during the training of the entity extraction model and may be denoted T(s) and T(e); the first parameter matrix, the second parameter matrix and the representation matrix have the same dimension. The first parameter matrix and the second parameter matrix are each dot-producted with the representation matrix and then passed through a softmax function to obtain the first probability P(s) and the second probability P(e), which can be expressed by the following formulas: P(s) = softmax(E · T(s)) and P(e) = softmax(E · T(e)), where · denotes the dot product and E denotes the representation matrix. The softmax function, also called the normalized exponential function, converts prediction results ranging from negative infinity to positive infinity into corresponding probabilities: it first applies the exponential function to the prediction results, which guarantees non-negativity, and then normalizes them, i.e., divides each exponentiated result by the sum of all exponentiated results, to obtain approximate probabilities. In this embodiment, two parameter matrices are provided, so the first probability and the second probability of each word of the statement to be extracted can be calculated simultaneously.
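A minimal sketch of one plausible reading of the formulas P(s) = softmax(E · T(s)) and P(e) = softmax(E · T(e)); the exact matrix shapes are not fully specified above, so the row-wise dot product and the dimensions below are assumptions:

import torch
import torch.nn.functional as F

n, d = 16, 768                        # sequence length and hidden size (assumed)
E = torch.randn(n, d)                 # representation matrix from the representation model
T_s = torch.randn(n, d)               # first parameter matrix T(s), learned during training
T_e = torch.randn(n, d)               # second parameter matrix T(e), learned during training

start_scores = (E * T_s).sum(dim=-1)  # E · T(s): one score per word
end_scores = (E * T_e).sum(dim=-1)    # E · T(e): one score per word

P_s = F.softmax(start_scores, dim=0)  # first probability: each word as a start position
P_e = F.softmax(end_scores, dim=0)    # second probability: each word as an end position

probability_distribution = torch.stack([P_s, P_e])  # two rows: start probabilities, end probabilities
print(probability_distribution.shape)                # torch.Size([2, 16])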
In an exemplary embodiment, the step S5 of extracting the named entity corresponding to the entity type to be extracted according to the probability distribution includes:
step S51, obtaining the entity type number to be extracted;
step S52, if the entity type number is equal to 1, processing the probability distribution through argmax to obtain a first target sequence;
step S53, extracting the named entity corresponding to the entity type to be extracted from the first target sequence through a preset first pointer network.
In this embodiment, when a named entity is extracted, the obtained probability distribution includes two rows: one row corresponds to the first probability that each word of the statement to be extracted serves as the start position of the named entity, and the other row corresponds to the second probability that each word serves as the end position of the named entity. argmax is applied to each row of the probability distribution, where y = argmax f(t) means that y is the value of the parameter t at which the function f(t) takes its maximum; the first target sequence is obtained through argmax. The first target sequence includes two 0-1 sequences I(s) and I(e) of length n, where n is the number of words in the statement to be extracted: in I(s), if the k-th position is 1, the k-th token is possibly a start position; in I(e), if the m-th position is 1, the m-th token is possibly an end position. The corresponding named entity is then extracted according to a preset first pointer network. Specifically:
For example, the text sequence is "北京今天天气真好。" ("The weather in Beijing is really good today."), which contains 1 named entity related to place names. After the probability distribution is obtained, it is processed by argmax to obtain a first target sequence comprising the two sequences start_labels and end_labels. The named entity {"start": 1, "end": 3, "entity": "Beijing", "type": 5} can be expressed as:
start_labels=[5,0,0,0,0,0,0,0,0,0]
end_labels=[0,0,5,0,0,0,0,0,0,0]
in the two sequences, the position of a number with a non-0 is taken as a candidate position, named entities are extracted in a first pointer network mode, strict decoding is adopted, the candidate positions of start _ labels and end _ labels are scanned from beginning to end, and the heads and the tails with the same entity types are combined, so that the corresponding named entities can be extracted.
In an exemplary embodiment, after the step S51 of obtaining the number of the entity types to be extracted, the method includes:
step S54, if the number of the entity types is larger than 1, obtaining the probability distribution corresponding to each entity type to be extracted;
step S55, merging the probability distributions after being processed by argmax to obtain a second target sequence;
step S56, extracting, through a preset second pointer network, a named entity corresponding to each entity type to be extracted from the second target sequence.
In this embodiment, named entities of different entity types may be extracted from the same statement to be extracted. When different named entities are extracted, multiple probability distributions are obtained; each probability distribution is processed by argmax and the results are merged to obtain a second target sequence, and the named entities are then extracted from the second target sequence through the second pointer network.
For example, the statement to be extracted is "The product is a capsule and the contents are dark brown granules." and the number of entity types to be extracted is x = 5; two named entities are specifically extracted. After argmax processing, a corresponding second target sequence is obtained. The second target sequence also includes 2 token sequences, the two token sequences each include 5 subsequences, each pair of corresponding subsequences represents one entity type to be extracted, and each position in a subsequence can take one of two values. Specifically, the two named entities are {"start": 3, "end": 6, "entity": "capsule", "type": 5} and {"start": 14, "end": 16, "entity": "granules", "type": 5}; in the subsequences corresponding to entity type 5, the start_labels token sequence is marked 1 at positions 3 and 14 and the end_labels token sequence is marked 1 at positions 6 and 16, with 0 at all other positions.
By means of the second pointer network, both non-nested and nested entities can be represented. In the extraction process, the two token sequences start_labels and end_labels are obtained from the model output, and the positions marked 1 in corresponding subsequences of start_labels and end_labels are combined head-to-tail to extract the named entities.
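A minimal sketch of how the second target sequence might be assembled and decoded when several entity types are extracted; the per-type probability layout, the argmax over a 2-class axis, and the in-order pairing are assumptions for illustration:

import numpy as np

def build_second_target_sequence(per_type_distributions):
    # per_type_distributions: one (start_probs, end_probs) pair per entity type to
    # extract, each of shape (n, 2); argmax over the last axis yields a 0/1
    # subsequence, and the per-type subsequences are merged into two token sequences.
    start_labels, end_labels = [], []
    for start_probs, end_probs in per_type_distributions:
        start_labels.append(np.argmax(start_probs, axis=-1).tolist())
        end_labels.append(np.argmax(end_probs, axis=-1).tolist())
    return start_labels, end_labels

def decode_second_target_sequence(tokens, start_labels, end_labels, type_ids):
    # For each entity type, combine the 1-marked positions of the corresponding
    # start and end subsequences head-to-tail (equal candidate counts assumed here;
    # the unequal case is handled by the rule described further below).
    entities = []
    for type_id, starts, ends in zip(type_ids, start_labels, end_labels):
        start_positions = [i for i, v in enumerate(starts) if v == 1]
        end_positions = [i for i, v in enumerate(ends) if v == 1]
        for s, e in zip(start_positions, end_positions):
            entities.append({"start": s, "end": e,
                             "entity": "".join(tokens[s:e + 1]), "type": type_id})
    return entities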
In this embodiment, entities are represented in the form of a pointer network, which is more flexible than conventional representations: the second pointer network can represent both non-nested and nested entities in a unified manner, can be conveniently combined with other models, and can also be applied to other entity extraction models for extracting multiple entities simultaneously. In addition, owing to the representation form of the pointer network, parallel computation is more convenient than with traditional decoding frameworks such as a Conditional Random Field (CRF), which improves extraction speed.
In an exemplary embodiment, the second target sequence includes a first token sequence and a second token sequence, the first token sequence includes N first subsequences, the second token sequence includes N second subsequences, the N first subsequences correspond to the N second subsequences, and one first subsequence and the corresponding second subsequence form a sequence pair; the step of extracting the named entities corresponding to the entity types to be extracted from the second target sequence through a preset second pointer network includes:
detecting whether the number of candidate positions in the first subsequence and the second subsequence is equal in each sequence pair;
and if so, combining the candidate position in the first subsequence with the corresponding candidate position in the second subsequence through the second pointer network to extract the named entity corresponding to each entity type to be extracted.
In this embodiment, as described above, one first subsequence and the corresponding second subsequence in the second target sequence represent one entity type to be extracted. In one statement to be extracted, multiple named entities of the same entity type may be extracted; accordingly, two different named entities may have the same starting position but different ending positions, the same ending position but different starting positions, or different starting and ending positions. In the first two cases, the numbers of candidate positions in the first subsequence and the second subsequence may not be equal; in the last case, the numbers of candidate positions must be equal. When the numbers of candidate positions are equal, the candidate positions are combined directly to obtain the named entities: the first candidate position in the first subsequence is combined with the first candidate position in the second subsequence, the second candidate position in the first subsequence is combined with the second candidate position in the second subsequence, and so on, until all the named entities are obtained.
In an exemplary embodiment, after the step of detecting whether the number of candidate positions in the first subsequence and the second subsequence is equal, the method includes:
and if not, combining the candidate positions in the second subsequence with the corresponding candidate positions in the first subsequence through the second pointer network according to a preset rule to extract the named entity corresponding to the entity type to be extracted.
In this embodiment, as described above, when the numbers of candidate positions are not equal, two different named entities may have the same starting position but different ending positions, or the same ending position but different starting positions. When the starting positions are the same and the ending positions are different, the number of candidate positions in the first subsequence is less than that in the second subsequence, for example 2 candidate positions in the first subsequence and 3 in the second; when the ending positions are the same and the starting positions are different, the number of candidate positions in the first subsequence is greater than that in the second subsequence, for example 3 candidate positions in the first subsequence and 2 in the second. The two subsequences have the same length, and the ending position of an extracted named entity comes after its starting position; that is, when the candidate positions of the two subsequences are combined, a candidate position in the second subsequence can only be combined with candidate positions in the first subsequence that come before it. For example, if the candidate positions in the first subsequence are 2, 5 and 8 and the candidate positions in the second subsequence are 6 and 10, the candidate positions are combined as [2,6], [5,6], [2,10], [5,10] and [8,10] to obtain the corresponding named entities.
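A minimal sketch of the pairing rule just described, assuming 0-based candidate positions; an end candidate is combined with every start candidate that precedes it (the function name is illustrative):

def combine_candidates(start_positions, end_positions):
    # A candidate end position can only be combined with candidate start
    # positions that come before it, since an entity ends after it starts.
    spans = []
    for end in end_positions:
        for start in start_positions:
            if start < end:
                spans.append([start, end])
    return spans

print(combine_candidates([2, 5, 8], [6, 10]))
# -> [[2, 6], [5, 6], [2, 10], [5, 10], [8, 10]]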
Referring to fig. 2, an embodiment of the present application provides a named entity extraction apparatus, including:
a first obtaining unit 10 configured to obtain a sentence to be extracted;
a second obtaining unit 20, configured to obtain an entity type to be extracted, and construct a corresponding entity statement according to the entity type to be extracted;
the encoding unit 30 is configured to encode the entity statement and the statement to be extracted together and input the result into a preset representation model for processing to obtain a corresponding representation matrix;
a calculating unit 40 configured to calculate a probability distribution of the sentence to be extracted according to the representation matrix; the probability distribution comprises a first probability that each word of the statement to be extracted serves as the starting position of the named entity corresponding to the entity type to be extracted and a second probability that each word of the statement to be extracted serves as the ending position of the named entity corresponding to the entity type to be extracted;
and the extracting unit 50 is configured to extract the named entity corresponding to the entity type to be extracted according to the probability distribution.
In an exemplary embodiment, the second obtaining unit 20 includes:
the first acquisition subunit is configured to acquire a text scene of the sentence to be extracted;
the determining subunit is configured to determine a corresponding entity sentence template according to the text scene;
and the construction subunit is configured to construct a corresponding entity statement according to the entity statement template and the entity type to be extracted.
In an exemplary embodiment, the computing unit 40 includes:
a second obtaining subunit configured to obtain the first parameter matrix and the second parameter matrix; the first parameter matrix and the second parameter matrix are trained based on a neural network model respectively;
the calculating subunit is configured to perform dot product on the first parameter matrix and the representation matrix, and then calculate through a softmax function to obtain a first probability of each word of the statement to be extracted, and perform dot product on the second parameter matrix and the representation matrix, and then calculate through a softmax function to obtain a second probability of each word of the statement to be extracted;
and the composition subunit is configured to compose a probability distribution of the statement to be extracted according to the first probability and the second probability of each word of the statement to be extracted.
In an exemplary embodiment, the extracting unit 50 includes:
a third obtaining subunit, configured to obtain the number of the entity types to be extracted;
a first processing subunit, configured to, if the number of entity types is equal to 1, process the probability distribution by argmax to obtain a first target sequence;
and the first extraction subunit is configured to extract the named entity corresponding to the entity type to be extracted from the first target sequence through a preset first pointer network.
In an exemplary embodiment, the extracting unit 50 further includes:
a fourth obtaining subunit, configured to obtain, if the number of the entity types is greater than 1, a probability distribution corresponding to each entity type to be extracted;
the second processing subunit is configured to process each probability distribution by argmax and then merge the probability distributions to obtain a second target sequence;
and the second extraction subunit is configured to extract the named entities corresponding to the entity types to be extracted from the second target sequence through a preset second pointer network.
In an exemplary embodiment, the second extraction subunit includes:
a detection module configured to detect whether the number of candidate positions in the first subsequence and the second subsequence is equal in each sequence pair;
and the first extraction module is configured to, if the numbers of candidate positions are equal, combine the candidate positions in the first subsequence with the corresponding candidate positions in the second subsequence through the second pointer network to extract the named entities corresponding to the entity types to be extracted.
In an exemplary embodiment, the second extraction subunit includes:
and the second extraction module is configured to, if the numbers of candidate positions are not equal, combine the candidate positions in the second subsequence with the corresponding candidate positions in the first subsequence through the second pointer network according to a preset rule to extract the named entity corresponding to the entity type to be extracted.
It should be noted that the apparatus provided in the foregoing embodiment and the method provided in the foregoing embodiment belong to the same concept, and the specific manner in which each module and unit execute operations has been described in detail in the method embodiment, and is not described again here.
An embodiment of the present application further provides an electronic device, including: one or more processors; a storage device, configured to store one or more programs, which when executed by the one or more processors, cause the electronic device to implement the named entity extraction method provided in the above embodiments.
FIG. 3 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
It should be noted that the computer system 300 of the electronic device shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 3, the computer system 300 includes a Central Processing Unit (CPU)301, which can perform various appropriate actions and processes, such as executing the methods described in the above embodiments, according to a program stored in a Read-Only Memory (ROM) 302 or a program loaded from a storage portion 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data necessary for system operation are also stored. The CPU 301, ROM 302, and RAM 303 are connected to each other via a bus 304. An Input/Output (I/O) interface 305 is also connected to bus 304.
The following components are connected to the I/O interface 305: an input portion 306 including a keyboard, a mouse, and the like; an output section 307 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 308 including a hard disk and the like; and a communication section 309 including a Network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 309 performs communication processing via a network such as the internet. A drive 310 is also connected to the I/O interface 305 as needed. A removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 310 as necessary, so that a computer program read out therefrom is mounted into the storage section 308 as necessary.
In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising a computer program for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 309, and/or installed from the removable medium 311. When the computer program is executed by a Central Processing Unit (CPU)301, various functions defined in the system of the present application are executed.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with a computer program embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. The computer program embodied on the computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.
Yet another aspect of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described above. The computer-readable storage medium may be included in the electronic device described in the above embodiment, or may exist separately without being incorporated in the electronic device.
Another aspect of the application also provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the methods provided in the various embodiments described above.
The above description is only a preferred exemplary embodiment of the present application, and is not intended to limit the embodiments of the present application, and those skilled in the art can easily make various changes and modifications according to the main concept and spirit of the present application, so that the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A named entity extraction method is characterized by comprising the following steps:
acquiring a statement to be extracted;
acquiring an entity type to be extracted, and constructing a corresponding entity statement according to the entity type to be extracted;
encoding the entity statement and the statement to be extracted together and inputting the result into a preset representation model for processing to obtain a corresponding representation matrix;
calculating the probability distribution of the statement to be extracted according to the representation matrix; the probability distribution comprises a first probability that each word of the statement to be extracted serves as the starting position of the named entity corresponding to the entity type to be extracted and a second probability that each word of the statement to be extracted serves as the ending position of the named entity corresponding to the entity type to be extracted;
and extracting the named entity corresponding to the entity type to be extracted according to the probability distribution.
2. The named entity extraction method of claim 1, wherein the step of constructing the corresponding entity statement according to the entity type to be extracted comprises:
acquiring a text scene of the sentence to be extracted;
determining a corresponding entity sentence template according to the text scene;
and constructing a corresponding entity sentence according to the entity sentence template and the entity type to be extracted.
3. The named entity extraction method of claim 1, wherein the step of calculating the probability distribution of the sentence to be extracted according to the representation matrix comprises:
acquiring a first parameter matrix and a second parameter matrix; the first parameter matrix and the second parameter matrix are trained based on a neural network model respectively;
calculating a first probability of each word of the statement to be extracted through a softmax function after dot product is carried out on the first parameter matrix and the representation matrix, and calculating a second probability of each word of the statement to be extracted through a softmax function after dot product is carried out on the second parameter matrix and the representation matrix;
and forming probability distribution of the statement to be extracted according to the first probability and the second probability of each word of the statement to be extracted.
4. The method according to claim 1, wherein the step of extracting the named entity corresponding to the entity type to be extracted according to the probability distribution comprises:
acquiring the number of the entity types to be extracted;
if the entity type number is equal to 1, processing the probability distribution through argmax to obtain a first target sequence;
and extracting the named entity corresponding to the entity type to be extracted from the first target sequence through a preset first pointer network.
5. The named entity extraction method of claim 4, wherein, after the step of acquiring the number of entity types to be extracted, the method further comprises:
if the number of entity types is greater than 1, acquiring a probability distribution corresponding to each entity type to be extracted;
processing each probability distribution by argmax and then merging the results to obtain a second target sequence;
and extracting the named entities corresponding to the entity types to be extracted from the second target sequence through a preset second pointer network.
6. The named entity extraction method of claim 5, wherein the second target sequence comprises a first token sequence and a second token sequence, the first token sequence comprises N first subsequences, the second token sequence comprises N second subsequences, the N first subsequences correspond to the N second subsequences, and each first subsequence and its corresponding second subsequence form a sequence pair; the step of extracting the named entities corresponding to the entity types to be extracted from the second target sequence through a preset second pointer network comprises:
detecting, for each sequence pair, whether the number of candidate positions in the first subsequence is equal to the number of candidate positions in the second subsequence;
and if so, combining each candidate position in the first subsequence with the corresponding candidate position in the second subsequence through the second pointer network to extract the named entity corresponding to each entity type to be extracted.
7. The named entity extraction method of claim 6, wherein, after the step of detecting whether the number of candidate positions in the first subsequence is equal to the number of candidate positions in the second subsequence, the method further comprises:
if not, combining the candidate positions in the second subsequence with the corresponding candidate positions in the first subsequence through the second pointer network according to a preset rule to extract the named entity corresponding to the entity type to be extracted.
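For illustration only: a sketch of the multi-type branch in claims 5 to 7. Each entity type contributes one first subsequence (candidate starts) and one second subsequence (candidate ends) to the merged second target sequence; when the candidate counts in a sequence pair match they are combined positionally, and when they differ the nearest-following-end fallback below stands in for the "preset rule", which the claims leave unspecified.

from typing import Dict, List, Tuple

def extract_for_types(words: List[str],
                      second_target_sequence: Dict[str, Tuple[List[int], List[int]]]) -> Dict[str, List[str]]:
    # second_target_sequence maps each entity type to a sequence pair:
    # (first subsequence of candidate start tags, second subsequence of candidate end tags).
    results: Dict[str, List[str]] = {}
    for entity_type, (start_tags, end_tags) in second_target_sequence.items():
        starts = [i for i, t in enumerate(start_tags) if t == 1]
        ends = [i for i, t in enumerate(end_tags) if t == 1]
        if len(starts) == len(ends):
            # Equal candidate counts: combine positionally (the claim 6 branch).
            spans = [" ".join(words[s:e + 1]) for s, e in zip(starts, ends) if s <= e]
        else:
            # Unequal counts: assumed fallback for the "preset rule" of claim 7 --
            # pair each start with the nearest end at or after it.
            spans = []
            for s in starts:
                e = next((j for j in ends if j >= s), None)
                if e is not None:
                    spans.append(" ".join(words[s:e + 1]))
        results[entity_type] = spans
    return results

words = ["zhang", "san", "works", "in", "beijing"]
second_target_sequence = {
    "person":   ([1, 0, 0, 0, 0], [0, 1, 0, 0, 0]),   # equal counts -> positional pairing
    "location": ([0, 0, 0, 0, 1], [0, 1, 0, 0, 1]),   # unequal counts -> fallback rule
}
print(extract_for_types(words, second_target_sequence))
# -> {'person': ['zhang san'], 'location': ['beijing']}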
8. A named entity extraction apparatus, comprising:
a first acquisition unit configured to acquire a statement to be extracted;
a second acquisition unit configured to acquire an entity type to be extracted and construct a corresponding entity statement according to the entity type to be extracted;
an encoding unit configured to encode the entity statement and the statement to be extracted and input the encoded entity statement and statement to be extracted into a preset representation model for processing to obtain a corresponding representation matrix;
a calculation unit configured to calculate the probability distribution of the statement to be extracted according to the representation matrix; the probability distribution comprises a first probability that each word of the statement to be extracted serves as the starting position of the named entity corresponding to the entity type to be extracted and a second probability that each word of the statement to be extracted serves as the ending position of the named entity corresponding to the entity type to be extracted;
and an extraction unit configured to extract the named entity corresponding to the entity type to be extracted according to the probability distribution.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the electronic device to implement the named entity extraction method of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon computer-readable instructions which, when executed by a processor of a computer, cause the computer to perform the named entity extraction method of any one of claims 1 to 7.
CN202210030953.7A 2022-01-12 2022-01-12 Named entity extraction method and device, electronic equipment and storage medium Pending CN114372467A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210030953.7A CN114372467A (en) 2022-01-12 2022-01-12 Named entity extraction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210030953.7A CN114372467A (en) 2022-01-12 2022-01-12 Named entity extraction method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114372467A (en) 2022-04-19

Family

ID=81143799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210030953.7A Pending CN114372467A (en) 2022-01-12 2022-01-12 Named entity extraction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114372467A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination