CN116151241A

CN116151241A - Entity identification method and device

Info

Publication number: CN116151241A
Application number: CN202310417766.9A
Authority: CN
Inventors: 邓正秋; 何亮
Original assignee: Hunan Malanshan Video Advanced Technology Research Institute Co ltd
Current assignee: Hunan Malanshan Video Advanced Technology Research Institute Co ltd
Priority date: 2023-04-19
Filing date: 2023-04-19
Publication date: 2023-05-23
Anticipated expiration: 2043-04-19
Also published as: CN116151241B

Abstract

The invention provides an entity recognition method and a device, wherein the entity recognition method performs character embedding on an input text and generates unique vector representation for each character; determining potential entity areas and corresponding context areas in the text by enumerating span units in the input sequence; jointly modeling potential entity regions and context regions using a graph convolution network and a multi-headed attention layer; the result of the joint modeling determines the entity class of the potential entity region via a classifier. The entity identification method can efficiently and accurately identify the contained entity information from the unstructured sequence text. When the invention recognizes whether the character sequence in the text is an entity, the semantic information of the sequence is considered, the context information formed by the residual characters is fully modeled, and the entity recognition precision is effectively improved.

Description

Entity identification method and device

Technical Field

The present invention relates to the field of natural language processing technologies, and in particular, to a method and apparatus for entity identification.

Background

Natural text is typically propagated and recorded in unstructured sequences, where there is a large amount of entity information, such as names of people, places, organizations, and institutions, that express specific concepts, as shown in fig. 1. The rapid and accurate identification of entity information in unstructured sequence text is one of the key technologies for constructing question-answering systems and recommendation systems.

The entity identification in the unstructured sequence text is most complex, characteristics such as syntax, semantics and context are required to be considered at the same time, and the traditional rule-based information extraction method is difficult to meet the entity identification requirement of the unstructured sequence text. Human beings can read and acquire entity information in unstructured sequence texts, but entity identification of massive data is not enough for work.

Disclosure of Invention

Aiming at the technical problems in the related art, the invention provides an entity identification method, which comprises the following steps:

s1, performing character embedding on an input text, and generating a unique vector representation for each character to obtain a vector sequence of the input text

；

S2, inputting a text vector sequence by enumeration

The span unit in the text obtaining unit obtains the span set of the input text

；

S3, collecting the spans

Input semantic feature vector of bidirectional convolution generation span region +.>

；

S4, the semantic feature vector is processed

Inputting the two-way long-short-period memory network to obtain the context information +.>

；

S5, the context information is processed

Obtaining the joint modeling result of the semantic features and the contextual features of the span units by nonlinear transformation>

；

S6, modeling the indicated combination results

The input classifier obtains the entity class.

Specifically, the step S1 includes: s11, randomly initializing a feature matrix

An embedding matrix as a character, wherein->

Is the length of the character table, < >>

Representing the embedding dimension of each character;

s12, for each character in the input text, the character is extracted from the feature matrix according to the id of the character in the character table

The middle cables lead to respective vector representations.

Specifically, the step S3 includes:

s31, reconstructing a span sequence of the chain structure into a graph structure;

s32, constructing each node characteristic in the two-way graph convolution layer aggregation graph;

s33, accumulating and averaging all nodes in the feature map, and calculating semantic feature representation of the span unit

。

Specifically, the step S4 includes:

s41, using feature vectors

Replacing vector sequences of span units in the original input vector sequence, i.e.

Become->

;

S42, constructing two-way long-short-term memory network modeling

Is a sequence feature of (2);

s43, aggregating sequences based on self-attention mechanisms

Midspan characteristics->

And contextual characteristics->

The dependency relationship exists, and the calculation formula is as follows:

，

wherein ,

is of dimension +.>

Is a normalized exponential function;

is a feature vector of a span unit with dimension +.>

；/>

Is a feature matrix formed by state feature vectors of a two-way long-short-term memory network, and the dimension is +.>

；/>

As a joint modeling output of span semantic features and context features, the vector dimension is +.>

。

Specifically, the step S5 includes:

s51, repeatlSub-step S4, depth modeling semantic and contextual features, the output feature vectors of which are expressed as

；

S52, feature vectors

The most input is output via the following way>

Combined modeling result of semantic features and contextual features +.>

。

In a second aspect, another embodiment of the present invention discloses an entity recognition apparatus, including:

an input text vector generation unit for character embedding the input text, generating a unique vector representation for each character to obtain a vector sequence of the input text

；

A span set generating unit for inputting text for enumerating the input text vector sequence

The span unit in (1) obtaining the span set of the input text +.>

；

A semantic feature vector generation unit for aggregating the spans

；

A context information generating unit for generating the semantic feature vector

；/>

A joint modeling result generation unit of semantic features and context features for generating the context information

；

An entity acquisition unit for modeling the results of the joint modeling

The input classifier obtains the entity class.

Specifically, the input text vector generation unit includes: an embedded matrix initializing unit for randomly initializing a feature matrix

As charactersIs embedded in matrix->

Is the length of the character table, < >>

Representing the embedding dimension of each character;

a vector generation unit for generating a character matrix for each character in the input text based on its id in the character table

The middle cables lead to respective vector representations.

Specifically, the semantic feature vector generating unit includes:

a graph structure reconstructing unit for reconstructing the span sequence of the chain structure into a graph structure;

the bidirectional graph convolution construction unit is used for constructing each node characteristic in the bidirectional graph convolution layer aggregation graph;

a semantic feature representation calculation unit for calculating semantic feature representation of the span unit by cumulatively averaging each node in the feature map

。

Specifically, the context information generating unit includes:

a first vector replacement unit for using feature vectors

Replacement of vector sequences of span units in the original input vector sequence, i.e. +.>

Become->

;

Two-way long-short-term memory network construction unit for constructing two-way long-short-term memory network modeling

Is a sequence feature of (2);

a first joint modeling unit for aggregating sequences based on a self-attention mechanism

Midspan characteristics->

And contextual characteristics->

The dependency relationship exists, and the calculation formula is as follows:

，

wherein ,

is of dimension +.>

Is a normalized exponential function;

is a feature vector of a span unit with dimension +.>

；/>

；/>

。

Specifically, the unit for generating the joint modeling result of the semantic features and the contextual features comprises:

a first execution unit for repeatedly executinglA secondary context information generating unit for modeling semantic features and context feature depth, the output feature vector of which is expressed as

；/>

A second modeling unit for modeling the feature vector

The most input is output via the following way>

Combined modeling result of semantic features and contextual features +.>

。

In a third aspect, another embodiment of the present invention discloses a nonvolatile memory having instructions stored thereon, which when executed by a processor, are configured to implement an entity identification method as described above.

The entity recognition method of the invention performs character embedding on an input text, and generates unique vector representation for each character; determining potential entity areas and corresponding context areas in the text by enumerating span units in the input sequence; jointly modeling potential entity regions and context regions using a graph convolution network and a multi-headed attention layer; the result of the joint modeling determines the entity class of the potential entity region via a classifier. The entity identification method can efficiently and accurately identify the contained entity information from the unstructured sequence text. When the invention recognizes whether the character sequence in the text is an entity, the semantic information of the sequence is considered, the context information formed by the residual characters is fully modeled, and the entity recognition precision is effectively improved. According to the method and the device, through an enumeration mode, all character subsequences in the text are considered to be potential entities, and overlapped entity information in the text can be well identified.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of unstructured text provided by an embodiment of the present invention;

FIG. 2 is a flowchart of a method for entity identification according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a text embedding process provided by an embodiment of the present invention;

FIG. 4 is a span enumeration schematic of input text of length 4 provided by an embodiment of the present invention;

FIG. 5 is a schematic diagram of joint modeling of span semantic features and context features provided by an embodiment of the present invention;

FIG. 6 is a schematic diagram of a two-way long and short term memory network according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of an entity identification device according to an embodiment of the present invention;

fig. 8 is a schematic diagram of a seed entity identification device according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the invention, fall within the scope of protection of the invention.

Example 1

Referring to fig. 2, the embodiment discloses an entity identification method, which includes the following steps:

s1, character embedding is carried out on an input text, and a unique direction is generated for each characterVector sequence for quantity representation to obtain input text

；

The computer cannot directly perform calculations on text characters, and this embodiment requires that characters in the input text be mapped to vector space first.

The specific step S1 includes: s11, randomly initializing a feature matrix

An embedding matrix as a character, wherein->

Is the length of the character table, < >>

Representing the embedding dimension of each character.

Specifically, the character table of the embodiment may be obtained by counting the number of different characters in the corpus. In another embodiment the character table may also be pre-set.

The middle cables lead to respective vector representations.

Referring to fig. 3, fig. 3 is a schematic diagram of a process of text character embedding. In the embodiment, a corpus is firstly obtained, and my love my ancestor is expected to exist in the corpus, …, and my love my hometown; and (5) corpus waiting. The characters in the expected library are then counted and a character table is obtained, which includes i am, love, ancestor, state … home, county. And each character in the character table has a unique id, such as: i am (1), love (2), 3, ancestor (4), state (5), … families (V-1), county (V).

Assume that the character sequence of the input text is

，nRepresenting the character length of the input text. The input text can be represented as a vector sequence +.>

, wherein />

Is of dimension ofdIs a character vector of (a).

Referring to FIG. 3, for the character "love" in "I love my our country" in the input text, the vector sequence it generates

。

S2, inputting a text vector sequence by enumeration

The span unit in (1) obtaining the span set of the input text +.>

；

Wherein the span set represents a potential entity region of the input text and a corresponding context region;

the present embodiment defines arbitrary length contiguous subsequences in the input text as a Span unit (Span), each Span unit being considered a potential entity area to be identified. Specifically, assuming that the length of the input text sequence is N, one can enumerate

And the subsequent neural network model models all enumerated span units and judges whether the span units are entities or belong to which type of entity. Fig. 4 is a span enumeration schematic of an input text of length 4, wherein a total of 10 span units may be enumerated.

Assume that the vector sequence of the input text is

Then the span set can be obtained after enumeration

, wherein />

。

S3, collecting the spans

；

The neural network model designed in this embodiment models span sets by jointly modeling span semantic features and context featuresSEach span unit of (a)

Generating unique characteristic representation->

. The specific operation process is shown in fig. 5, and mainly comprises the following three steps:

s3, collecting the spans

；

S4, the semantic feature vector is processed

；

S5, the context information is processed

Acquiring semantic features of span units by nonlinear transformationCombined modeling of contextual features>

；

The present embodiment takes a span unit of i=k=3 as an example, and details the modeling process of the present embodiment:

wherein step S3 comprises:

s31, reconstructing the span sequence of the chain structure into a graph structure.

In the reconstruction process, the character vector in the span unit is used as a node characteristic, and the node with the front time sequence can point to the subsequent node. As shown in fig. 4, the span unit

There are three nodes, wherein->

Can point to +.>

and />

，/>

Can only point to +.>

;

S32, constructing each node characteristic in a two-way graph convolution (BiGCN) layer aggregation graph.

In particular using a non-linear function ReLU and three sets of characteristic parameters

、/>

and />

And carrying out nonlinear transformation on the characteristics of the neighborhood nodes to update the characteristic vector of each node, wherein the mathematical expression is as follows:

，

，

，

therein, wherein

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

Is used for the parameter vector of (a). />

，/>

Is a vector concatenation operation.

Operation example: [1,2,3]

[4,5,6]=[1,2,3,4,5,6]The method comprises the steps of carrying out a first treatment on the surface of the Let->

Then

。

：

，

wherein

Representing accumulation operations, e.g.)>

。

The step S4 specifically includes:

s41, using feature vectors

Become->

;

S42, constructing a two-way long-short-term memory network (BiLSTM) model

Is a sequence feature of (2);

the structure of the network is shown in FIG. 6, in which

For the input feature vector at the current time, +.>

The two feature vectors are respectively output at the previous moment, and t represents the position of a character in the input text.

The specific calculation formula is as follows:

，

，

，

，

，

，

wherein

Representation->

Feature vector at middle t position, +.>

From the previous moment ∈>

And (5) calculating.

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

Is used for the parameter vector of (a). />

，

Is a vector concatenation operation. />

Multiplication of corresponding elements in the representative vector, i.e.

。

The present embodiment uses a two-way long short-term memory network (BiLSTM) to output a state vector at each time t

Constitution of

Is expressed as +.>

。/>

S43, aggregating sequences based on self-attention mechanisms

Midspan characteristics->

Is { about the contextual characteristics>

The dependency relationship exists between the two, and the calculation formula is as follows:

，

wherein ,

is of dimension +.>

Parameter matrix of (2)Softmax is a normalized exponential function. />

Is a feature vector of a span unit with dimension +.>

。/>

。/>

。

The step S5 specifically includes:

；

S52, feature vectors

The most input is output via the following way>

Combined modeling result of semantic features and contextual features +.>

：

wherein

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

The output of max (x, y) is the larger of x and y. />

Maximum spanning Unit->

And carrying out entity recognition by a subsequent classifier on the joint modeling result in the text D to output entity categories.

S6, modeling the indicated combination results

The input classifier obtains the entity class.

Constructing a linear classifier and calculating the span unit from the normalized exponential function softmax

Probability distribution of the belonging entity class:

，

wherein

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

Parameter vector of>

Equal to the number +1 of entity categories in the corpus (non-entities are set as a class of entities). Output of classifier->

Is of dimension of

Wherein each dimension represents a stride element +.>

The represented character sequence belongs to the probability value of a certain class of entity.

The embodiment takes

Entity class corresponding to the dimension with the largest probability value in the text is used as the input textDMid span unit->

Entity recognition results of the represented character sequence. For example, there are 4 entities in the corpus altogether, if +.>

Span unit->

Represented character sequence->

Belonging to a second class of entities; if->

Span unit->

Represented character sequence->

Belongs to non-entity.

The entity recognition method of the embodiment performs character embedding on an input text, and generates unique vector representation for each character; determining potential entity areas and corresponding context areas in the text by enumerating span units in the input sequence; jointly modeling potential entity regions and context regions using a graph convolution network and a multi-headed attention layer; the result of the joint modeling determines the entity class of the potential entity region via a classifier. The entity identification method can efficiently and accurately identify the contained entity information from the unstructured sequence text. When the character sequence in the text is identified as the entity, the embodiment considers the semantic information of the sequence, fully models the context information formed by the residual characters, and effectively improves the accuracy of entity identification. In this embodiment, by means of enumeration, all character sub-sequences in the text are considered to be potential entities, so that overlapping entity information in the text can be well identified. For example, "Wuhan Yangtze bridge" is an entity, and "Wuhan" included therein is also an entity.

Example two

Referring to fig. 7, the present embodiment discloses an entity recognition apparatus, which includes the following units:

；

The specific input text vector generation unit includes: an embedded matrix initializing unit for randomly initializing a feature matrix

Embedding matrix as characterWherein->

Is the length of the character table, < >>

Representing the embedding dimension of each character.

The middle cables lead to respective vector representations.

Assume that the character sequence of the input text is

, wherein />

Is of dimension ofdIs a character vector of (a).

。

The span unit in (1) obtaining the span set of the input text +.>

；

Assume that the vector sequence of the input text is

Then the span set can be obtained after enumeration

, wherein />

。

A semantic feature vector generation unit for aggregating the spans

Language input with bidirectional graph convolution to generate span regionSense eigenvector->

；

；

；

wherein the semantic feature vector generating unit includes:

and the diagram structure reconstruction unit is used for reconstructing the span sequence of the chain structure into a diagram structure.

There are three nodes, wherein->

Can point to +.>

and />

，/>

Can only point to +.>

;

And the bidirectional graph convolution construction unit is used for constructing each node characteristic in the bidirectional graph convolution (BiGCN) layer aggregation graph.

、/>

and />

，

，

，

therein, wherein

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

Is used for the parameter vector of (a). />

，/>

Is a vector concatenation operation.

Operation example: [1,2,3]

Then

。

：

wherein

Representing accumulation operations, e.g.)>

。

The context information generation unit specifically includes:

a first vector replacement unit for using feature vectors

Become->

;

Two-way long-short-term memory network construction unit for constructing two-way long-short-term memory network (BiLSTM) modeling

Is a sequence feature of (2);

the structure of the network is shown in FIG. 6, in which

For the input feature vector at the current time, +.>

The specific calculation formula is as follows:

，

，

，

，

，

，

wherein

Representation->

Feature vector at middle t position, +.>

From the previous moment ∈>

And (5) calculating.

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

Is used for the parameter vector of (a). />

，

Is a vector concatenation operation. />

Multiplication of corresponding elements in the representative vector, i.e. +.>

。

Constitution of

Is expressed as +.>

。

Midspan characteristics->

Is { about the contextual characteristics>

wherein ,

is of dimension +.>

Is a normalized exponential function. />

Is a feature vector of a span unit with dimension +.>

。/>

。/>

。

The generation unit of the joint modeling result of the semantic features and the contextual features specifically comprises:

；

A second modeling unit for modeling the feature vector

The most input is output via the following way>

Combined modeling result of semantic features and contextual features +.>

：

，

wherein

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

The output of max (x, y) is the larger of x and y. />

Maximum spanning Unit->

In textAnd D, carrying out entity recognition by a subsequent classifier on the joint modeling result in the step D to output entity categories.

An entity acquisition unit for modeling the results of the joint modeling

The input classifier obtains the entity class.

Probability distribution of the belonging entity class:

，

wherein

Is of dimension +.>

Parameter matrix of>

Is of dimension +.>

Parameter vector of>

Is of dimension of

Wherein each dimension represents a stride element +.>

The embodiment takes

Entity recognition results of the represented character sequence. For example, there are 4 entities in total in the corpus, if

Span unit->

Represented character sequence->

Belonging to a second class of entities; if it is

Span unit->

Represented character sequence->

Belongs to non-entity.

Example III

Referring to fig. 8, fig. 8 is a schematic diagram of the structure of an entity recognition apparatus of the present embodiment. The entity identification device 20 of this embodiment comprises a processor 21, a memory 22 and a computer program stored in said memory 22 and executable on said processor 21. The steps of the above-described method embodiments are implemented by the processor 21 when executing the computer program. Alternatively, the processor 21 may implement the functions of the modules/units in the above-described device embodiments when executing the computer program.

Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory 22 and executed by the processor 21 to complete the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing the specified functions, which instruction segments are used to describe the execution of the computer program in the entity identification device 20. For example, the computer program may be divided into modules in the second embodiment, and specific functions of each module refer to the working process of the apparatus described in the foregoing embodiment, which is not described herein.

The entity identification device 20 may include, but is not limited to, a processor 21, a memory 22. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the entity identification device 20 and does not constitute a limitation of the entity identification device 20, and may include more or less components than illustrated, or may combine certain components, or different components, e.g., the entity identification device 20 may also include input-output devices, network access devices, buses, etc.

The processor 21 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. The general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, and the processor 21 is a control center of the entity identification device 20, and connects the various parts of the entire entity identification device 20 using various interfaces and lines.

The memory 22 may be used to store the computer program and/or module, and the processor 21 may implement the various functions of the entity identification device 20 by running or executing the computer program and/or module stored in the memory 22 and invoking data stored in the memory 22. The memory 22 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, the memory 22 may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, smart Media Card (SMC), secure Digital (SD) Card, flash Card (Flash Card), at least one disk storage device, flash memory device, or other volatile solid-state storage device.

Wherein the integrated modules/units of the entity identification device 20 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as a stand alone product. Based on such understanding, the present invention may implement all or part of the flow of the method of the above embodiment, or may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and the computer program may implement the steps of each of the method embodiments described above when executed by the processor 21. Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the computer readable medium contains content that can be appropriately scaled according to the requirements of jurisdictions in which such content is subject to legislation and patent practice, such as in certain jurisdictions in which such content is subject to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.

It should be noted that the above-described apparatus embodiments are merely illustrative, and the units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the device provided by the invention, the connection relation between the modules represents that the modules have communication connection, and can be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims

1. An entity identification method is characterized in that: the method comprises the following steps:

s1, performing character embedding on an input text to obtainGenerating a unique vector representation for each character to obtain a vector sequence of input text

；

S2, inputting a text vector sequence by enumeration

The span unit in (1) obtaining the span set of the input text +.>

；

S3, collecting the spans

；

S4, the semantic feature vector is processed

；

S5, the context information is processed

；

S6, modeling the indicated combination results

The input classifier obtains the entity class.

2. The method according to claim 1, characterized in that: the step S1 includes: s11, randomly initializing a feature matrix

An embedding matrix as a character, wherein->

Is the length of the character table, < >>

Representing the embedding dimension of each character;

The middle cables lead to respective vector representations.

3. The method according to claim 1, characterized in that: the step S3 includes:

。

4. A method according to claim 3, characterized in that: the step S4 includes:

s41, using feature vectors

Become->

;

S42, constructing two-way long-short-term memory network modeling

Is a sequence feature of (2);

s43, aggregating sequences based on self-attention mechanisms

Midspan characteristics->

And contextual characteristics->

The dependency relationship exists, and the calculation formula is as follows:

，

wherein ,

is of dimension +.>

Is a normalized exponential function; />

Is a feature vector of a span unit with dimension +.>

；/>

Is of a two-way long-short-term memory networkFeature matrix formed by state feature vectors, the dimension is +.>

；/>

。/>

5. The method according to claim 4, wherein: the step S5 includes:

；

S52, feature vectors

The most input is output via the following way>

Combined modeling result of semantic features and contextual features +.>

。

6. An entity recognition device, characterized in that: comprising the following units:

；

The span unit in (1) obtaining the span set of the input text +.>

；

A semantic feature vector generation unit for aggregating the spans

；

；

；

An entity acquisition unit for modeling the results of the joint modeling

The input classifier obtains the entity class.

7. The apparatus according to claim 6, wherein: the input text vector generation unit includes: an embedded matrix initializing unit for randomly initializing a feature matrix

An embedding matrix as a character, wherein->

Is the length of the character table, < >>

Representing the embedding dimension of each character;

The middle cables lead to respective vector representations.

8. The apparatus according to claim 6, wherein: the semantic feature vector generating unit includes:

。

9. The apparatus according to claim 8, wherein: the context information generation unit includes:

a first vector replacement unit for using feature vectors

Become->

;

Is a sequence feature of (2); />

Midspan characteristics->

With contextual characteristics

The dependency relationship exists, and the calculation formula is as follows:

，

wherein ,

is of dimension +.>

Is a normalized exponential function; />

Is a feature vector of a span unit with dimension +.>

；/>

；/>

。

10. The apparatus according to claim 9, wherein: the joint modeling result generating unit of the semantic features and the context features comprises:

；

A second modeling unit for modeling the feature vector

The most input is output via the following way>

Combined modeling result of semantic features and contextual features +.>

。/>