CN116167378A - Named entity recognition method and system based on adversarial transfer learning - Google Patents


Info

Publication number
CN116167378A
Authority
CN
China
Prior art keywords
named entity
word segmentation
chinese word
data set
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310133155.1A
Other languages
Chinese (zh)
Inventor
程良伦
朱志鸿
张伟文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Nengge Knowledge Technology Co ltd
Guangdong University of Technology
Original Assignee
Guangdong Nengge Knowledge Technology Co ltd
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Nengge Knowledge Technology Co ltd, Guangdong University of Technology filed Critical Guangdong Nengge Knowledge Technology Co ltd
Priority to CN202310133155.1A priority Critical patent/CN116167378A/en
Publication of CN116167378A publication Critical patent/CN116167378A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a named entity recognition method and system based on adversarial transfer learning. The method comprises: constructing a training dataset; inputting the training dataset into a preprocessing model for encoding to obtain sentence vector representations; inputting the sentence vector representations into a bidirectional long short-term memory network for feature extraction to obtain a feature set; inputting the feature set into a self-attention layer for dependency analysis to obtain a named entity task representation and a Chinese word segmentation task representation; inputting the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding to obtain a named entity sequence tag and a Chinese word segmentation sequence tag; and performing adversarial training on the untrained named entity recognition model according to the named entity sequence tag and the Chinese word segmentation sequence tag in combination with a task classifier. The embodiment of the invention can improve the accuracy of named entity recognition and can be widely applied to the technical field of artificial intelligence.

Description

Named entity recognition method and system based on adversarial transfer learning
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a named entity recognition method and system based on adversarial transfer learning.
Background
In the big data age, people can acquire ever more data, which raises the cost of extracting information and acquiring knowledge from it. Relying entirely on manual processing would be prohibitively cumbersome and complex, so automated processing of massive amounts of data calls for natural language processing techniques, among which Named Entity Recognition (NER) is a preliminary and important task in the field of Natural Language Processing (NLP). Named entity recognition with deep learning typically requires large-scale annotated data, while a large number of existing datasets are available for the Chinese Word Segmentation (CWS) task. The related art focuses only on the task-shared information between named entity recognition and Chinese word segmentation and does not filter out the private information of each task; this introduces noise into the named entity recognition method and degrades its accuracy. In view of the foregoing, the technical problems in the related art need to be solved.
Disclosure of Invention
In view of this, embodiments of the invention provide a named entity recognition method and system based on adversarial transfer learning that can improve the accuracy of named entity recognition.
In one aspect, the invention provides a named entity recognition method based on adversarial transfer learning, comprising the following steps:
constructing a training data set, wherein the training data set comprises a named entity data set and a Chinese word segmentation data set;
inputting the training data set into a preprocessing model for coding processing to obtain sentence vector representation;
inputting the sentence vector representation into a bidirectional long short-term memory network for feature extraction to obtain a feature set, wherein the feature set at least comprises named entity private features, shared features and Chinese word segmentation private features;
inputting the feature set into a self-attention layer for dependency analysis to obtain a named entity task representation and a Chinese word segmentation task representation;
inputting the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding processing to obtain a named entity sequence tag and a Chinese word segmentation sequence tag;
performing adversarial training on the untrained named entity recognition model according to the named entity sequence tag and the Chinese word segmentation sequence tag in combination with a task classifier, to obtain a trained named entity recognition model;
and acquiring a named entity to be identified, inputting the named entity to be identified into the trained named entity identification model for named entity identification processing, and obtaining a named entity identification result.
Optionally, the constructing a training dataset, the training dataset comprising a named entity dataset and a Chinese word segmentation dataset, includes:
performing data crawling processing on the data website to obtain an original data set;
labeling the original data set to obtain a named entity data set;
and selecting a data set in a field different from the named entity data set from the general data set to obtain the Chinese word segmentation data set.
Optionally, the inputting the training data set into a preprocessing model for coding processing to obtain sentence vector representation includes:
sentence embedding processing is carried out on the training data set to obtain an embedded vector;
and encoding the embedded vector according to the preprocessing model to obtain sentence vector representation.
Optionally, the inputting the sentence vector representation into a bidirectional long short-term memory network for feature extraction to obtain a feature set, wherein the feature set at least comprises named entity private features, shared features and Chinese word segmentation private features, includes:
the bidirectional long-short-term memory network comprises a named entity private network layer, a Chinese word segmentation private network layer and a shared network layer;
carrying out named entity recognition task feature extraction processing on the sentence vector representation through the named entity private network layer to obtain named entity private features;
carrying out shared information feature extraction on the sentence vector representation through the shared network layer to obtain shared features;
and extracting the Chinese word segmentation task characteristics from the sentence vector representation through the Chinese word segmentation private network layer to obtain Chinese word segmentation private characteristics.
Optionally, the inputting the private feature set into the self-attention layer for dependency analysis to obtain a named entity task representation and a Chinese word segmentation task representation includes:
the self-attention layer comprises a named entity private attention layer, a Chinese word segmentation private attention layer and a shared attention layer;
the self-attention layer learns the dependency relationship among characters of the private feature set, and extracts internal structure information to obtain named entity vector representation, chinese word segmentation vector representation and shared vector representation;
and respectively connecting the shared vector representation with the named entity vector representation and the Chinese word segmentation vector representation to obtain a named entity task representation and a Chinese word segmentation task representation.
Optionally, the step of inputting the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding to obtain a named entity sequence tag and a Chinese word segmentation sequence tag includes:
the conditional random field layer comprises a named entity conditional random field and a Chinese word segmentation conditional random field;
carrying out label marking processing on the named entity task representation through the named entity conditional random field to obtain a named entity sequence label;
and carrying out label marking processing on the Chinese word segmentation task representation through the Chinese word segmentation conditional random field to obtain a Chinese word segmentation sequence label.
Optionally, the sentence embedding processing is performed on the training data set to obtain an embedded vector, including:
converting each character in the training data set to obtain character vector representation;
marking the character vector representation to obtain word embedded vector representation;
carrying out semantic classification processing on the character vector representation to obtain a segment embedded vector representation;
performing marking position coding processing on the character vector representation to obtain a position embedded vector representation;
and carrying out vector summation on the word embedding vector representation, the segment embedding vector representation and the position embedding vector representation to obtain an embedding vector.
On the other hand, an embodiment of the invention also provides a named entity recognition system based on adversarial transfer learning, comprising:
a first module for constructing a training data set, the training data set comprising a named entity data set and a Chinese word segmentation data set;
the second module is used for inputting the training data set into a preprocessing model for coding processing to obtain sentence vector representation;
the third module is used for inputting the sentence vector representation into a bidirectional long short-term memory network for feature extraction to obtain a feature set, wherein the feature set at least comprises named entity private features, shared features and Chinese word segmentation private features;
the fourth module is used for inputting the private feature set into the self-attention layer for dependency analysis and processing to obtain a named entity task representation and a Chinese word segmentation task representation;
the fifth module is used for inputting the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding processing to obtain a named entity sequence tag and a Chinese word segmentation sequence tag;
the sixth module is used for performing adversarial training on the untrained named entity recognition model according to the named entity sequence tag and the Chinese word segmentation sequence tag in combination with the task classifier, to obtain a trained named entity recognition model;
and a seventh module, configured to obtain a named entity to be identified, input the named entity to be identified into the trained named entity identification model, and perform named entity identification processing to obtain a named entity identification result.
Optionally, the first module is configured to construct a training dataset, the training dataset including a named entity dataset and a Chinese word segmentation dataset, and includes:
the first unit is used for carrying out data crawling processing on the data website to obtain an original data set;
the second unit is used for carrying out labeling processing on the original data set to obtain a named entity data set;
and the third unit is used for selecting a data set in the field different from the named entity data set from the general data set to obtain the Chinese word segmentation data set.
Optionally, the second module is configured to input the training data set into a preprocessing model for encoding processing, to obtain sentence vector representation, and includes:
a third unit, configured to perform sentence embedding processing on the training data set to obtain an embedded vector;
and a fourth unit, configured to encode the embedded vector according to the preprocessing model, to obtain a sentence vector representation.
Compared with the prior art, the technical scheme provided by the invention has the following technical effects: according to the embodiment of the invention, feature analysis is performed on the named entity dataset and the Chinese word segmentation dataset to obtain the named entity sequence tag and the Chinese word segmentation sequence tag; adversarial training is performed on the untrained named entity recognition model in combination with the task classifier; and named entity recognition is performed by the trained named entity recognition model. In this way, word boundary information shared by the two different tasks can be introduced through adversarial training while noise from the private information of the Chinese word segmentation task is kept out, improving the accuracy of named entity recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a named entity recognition method based on adversarial transfer learning according to an embodiment of the present application;
FIG. 2 is a diagram of a named entity recognition model architecture based on adversarial transfer learning according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an embedded layer model according to an embodiment of the present application;
fig. 4 is a schematic diagram of a preprocessing model according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
First, several nouns referred to in this application are parsed:
named Entity Recognition (NER), a preliminary and important task in the field of Natural Language Processing (NLP), can be used for many downstream NLP tasks, such as relationship extraction, event extraction, question-answering, etc., and its effect directly affects the subsequent relationship extraction and event extraction tasks. Named entity recognition is mainly applied to extracting entities with specific meanings, such as characters, organizations, places, etc., from unstructured text.
Chinese Word Segmentation (CWS) tasks are tasks that identify word boundaries, and Chinese NER tasks share much similarity with CWS tasks, referred to as task sharing information. Whereas Chinese NER typically has coarser granularity boundaries than CWS, the differences between such tasks are referred to as task private information.
A bidirectional long short-term memory network (Bi-directional Long Short-Term Memory, BiLSTM) is formed by combining a forward LSTM network with a backward LSTM network; the LSTM is itself a variant of the recurrent neural network (RNN). Proposed on the basis of the unidirectional LSTM network, the bidirectional network considers both preceding and following context simultaneously, which effectively helps ensure the accuracy of sequence prediction results.
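As an illustrative sketch of the bidirectional reading order just described (a plain tanh RNN cell stands in for the LSTM cell purely for brevity; the weights and inputs are toy assumptions, not values from the patent):

```python
import math

# Toy bidirectional recurrence: read the sequence left-to-right and
# right-to-left with a simple tanh RNN cell (a stand-in for the LSTM
# cell), then concatenate both hidden states at each position, as the
# BiLSTM described above does.
def rnn_cell(x, h, w_x=0.5, w_h=0.3):
    return math.tanh(w_x * x + w_h * h)

def bidirectional(xs):
    fwd, h = [], 0.0
    for x in xs:                       # forward pass over x_1 .. x_N
        h = rnn_cell(x, h)
        fwd.append(h)
    bwd, h = [], 0.0
    for x in reversed(xs):             # backward pass over x_N .. x_1
        h = rnn_cell(x, h)
        bwd.append(h)
    bwd.reverse()
    # h_i = forward_i (+) backward_i, i.e. per-position concatenation
    return list(zip(fwd, bwd))

states = bidirectional([1.0, -0.5, 2.0])
```

Each position thus carries context from both directions, which is what lets boundary information from either side of a character inform its tag.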
In the related art, in order to integrate word boundary information from the CWS task into the NER task, a joint model has been proposed to perform the Chinese NER and CWS tasks together. However, that model focuses only on the task-shared information between Chinese NER and CWS and does not filter the private information of each task, which creates noise for both tasks. For example, the CWS task splits "Lingshan Island dock" into "Lingshan Island" and "dock", while the NER task treats "Lingshan Island dock" as a whole. Therefore, how to exploit the task-shared information while preventing the NER task from being negatively affected by the CWS task is of important research significance.
In view of the above, referring to fig. 1, an embodiment of the present invention provides a named entity recognition method based on adversarial transfer learning, including:
s101, constructing a training data set, wherein the training data set comprises a named entity data set and a Chinese word segmentation data set;
s102, inputting the training data set into a preprocessing model for coding processing to obtain sentence vector representation;
s103, inputting the sentence vector representation into a two-way long short-time memory network for feature extraction processing to obtain a feature set, wherein the feature set at least comprises named entity private features, shared features and Chinese word segmentation private features;
s104, inputting the private feature set into a self-attention layer for dependency analysis and processing to obtain a named entity task representation and a Chinese word segmentation task representation;
s105, inputting the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding processing to obtain a named entity sequence tag and a Chinese word segmentation sequence tag;
s106, performing countermeasure learning training treatment on the untrained named entity recognition model according to the named entity sequence tag and the Chinese word segmentation sequence tag in combination with a task classifier to obtain a trained named entity recognition model;
s107, acquiring a named entity to be identified, inputting the named entity to be identified into the trained named entity identification model for named entity identification processing, and obtaining a named entity identification result.
Referring to fig. 2, in an embodiment of the present invention, a named entity (NER) dataset and a Chinese word segmentation (CWS) dataset are first constructed, and the resulting training dataset is encoded by a preprocessing model (BERT): text sentences from the two datasets are input into an embedding layer and mapped to vector representations by BERT acting as a sequence encoder, where BERT is shared by the three subtasks. It should be noted that the named entity dataset constructed in the embodiment of the present invention is a Chinese named entity dataset. As a pre-trained language model, BERT further improves the generalization capability of the word embedding model and fully expresses character-level, word-level, sentence-level, and even inter-sentence relational feature information, thereby yielding the sentence vector representation. The embodiment of the invention inputs the sentence vector representation into a bidirectional long short-term memory network (BiLSTM) for feature extraction, obtaining the hidden state of each character in the Chinese sentence as the feature set, where the feature set at least comprises named entity private features, shared features, and Chinese word segmentation private features. The embodiment of the invention then inputs the hidden state of each character output by the BiLSTM layer into the self-attention layer, learns the dependency relationship between any two characters, and extracts the internal structure information of the sentence to obtain the named entity task representation and the Chinese word segmentation task representation. Finally, the embodiment of the invention introduces a task-specific conditional random field (CRF) layer for each of the two tasks, and inputs the final representations of the two tasks into their respective CRF layers for decoding to obtain the final sequence tags.
Adversarial training is then performed on the untrained named entity recognition model according to the named entity sequence tag and the Chinese word segmentation sequence tag in combination with the task classifier, to obtain a trained named entity recognition model. The named entity text to be recognized is acquired and input into the trained named entity recognition model for named entity recognition, yielding a named entity recognition result.
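In adversarial training of this kind, the shared features are typically fed to the task classifier through a gradient reversal layer so that the shared encoder learns task-invariant word boundary information. The patent does not spell out this mechanism, so the following is a minimal library-free sketch of the standard technique, with the class name and the coefficient lam as illustrative assumptions:

```python
# Sketch of a gradient reversal layer (GRL): the forward pass is the
# identity on the shared features, while the backward pass negates and
# scales the gradient, so the shared encoder is trained to make the
# task classifier FAIL to tell which task an input came from.
class GradientReversal:
    def __init__(self, lam=1.0):
        self.lam = lam  # adversarial loss trade-off coefficient (assumed)

    def forward(self, x):
        return x  # identity: shared features pass through unchanged

    def backward(self, grad):
        return [-self.lam * g for g in grad]  # reversed, scaled gradient

grl = GradientReversal(lam=0.5)
shared_features = [0.2, -1.3, 0.7]
out = grl.forward(shared_features)
grad_from_classifier = [0.1, 0.4, -0.2]
grad_to_encoder = grl.backward(grad_from_classifier)
```

Because the gradient flowing back into the shared encoder is reversed, minimizing the classifier's loss drives the shared features toward containing only information common to both tasks.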
Further as a preferred embodiment, the constructing a training dataset comprising a named entity dataset and a chinese word segmentation dataset comprises:
performing data crawling processing on the data website to obtain an original data set;
labeling the original data set to obtain a named entity data set;
and selecting a data set in a field different from the named entity data set from the general data set to obtain the Chinese word segmentation data set.
In the embodiment of the invention, different data websites are crawled using web crawler technology to obtain text data; the original data are labeled, and a Chinese named entity dataset is constructed. For the Chinese word segmentation task, general datasets from different fields are selected. Specifically, the original data are labeled with entity types using the BIO tagging scheme, wherein B marks the beginning of an entity, I marks the middle or end of an entity, and O marks a non-entity. The embodiment of the invention may adopt data from the ship field, with five entity types (person name, place, organization, ship name and ship type). A data augmentation step follows: first collect all entities of each category, for example putting all ship-name entities in one set and all ship-type entities in another; then shuffle each set internally and place its entities back into positions previously occupied by entities of the same category; finally delete duplicate sentences, thereby constructing the Chinese named entity dataset.
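The BIO scheme above can be sketched as a small conversion from annotated entity spans to per-character tags (the helper name and the example spans are hypothetical, not from the patent's dataset):

```python
# Convert character-level entity span annotations into BIO tags:
# B- marks the first character of an entity, I- the remaining
# characters, and O marks non-entity characters.
def to_bio(chars, entities):
    """entities: list of (start, end_exclusive, entity_type) spans."""
    tags = ["O"] * len(chars)
    for start, end, etype in entities:
        tags[start] = "B-" + etype
        for i in range(start + 1, end):
            tags[i] = "I-" + etype
    return tags

# Hypothetical six-character sentence containing a 3-character
# ship-name entity and a 2-character place entity.
tags = to_bio(list("ABCDEF"), [(0, 3, "SHIP"), (4, 6, "LOC")])
```

For the spans above this yields `["B-SHIP", "I-SHIP", "I-SHIP", "O", "B-LOC", "I-LOC"]`.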
Further as a preferred embodiment, the inputting the training data set into a preprocessing model for coding processing to obtain sentence vector representation includes:
sentence embedding processing is carried out on the training data set to obtain an embedded vector;
and encoding the embedded vector according to the preprocessing model to obtain sentence vector representation.
In the embodiment of the invention, text sentences of the named entity dataset and the Chinese word segmentation dataset in the training dataset are input into the embedding layer for sentence embedding to obtain embedded vectors; the preprocessing model BERT then serves as a sequence encoder that maps the discrete characters of the input sentences to vector representations, yielding the sentence vector representation.
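The sentence embedding step sums three lookups per character, as in BERT's input layer: word (token), segment, and position embeddings. A library-free sketch under the assumption of toy 2-dimensional embedding tables (all names and values are illustrative):

```python
# BERT-style input embedding: each character's final embedding is the
# elementwise sum of its word (token), segment, and position embeddings.
def embed(tokens, segment_ids, token_table, segment_table, position_table):
    out = []
    for pos, (tok, seg) in enumerate(zip(tokens, segment_ids)):
        e = [t + s + p for t, s, p in zip(token_table[tok],
                                          segment_table[seg],
                                          position_table[pos])]
        out.append(e)
    return out

# Toy 2-dimensional tables; real BERT uses learned high-dimensional ones.
token_table = {"[CLS]": [0.1, 0.0], "ship": [0.2, 0.3]}
segment_table = {0: [0.0, 0.1]}
position_table = [[0.01, 0.0], [0.0, 0.02]]
vecs = embed(["[CLS]", "ship"], [0, 0], token_table, segment_table, position_table)
```

The summed vectors are what the BERT encoder stack then consumes.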
Further as a preferred embodiment, the inputting the sentence vector representation into a bidirectional long short-term memory network for feature extraction to obtain a feature set, wherein the feature set at least comprises named entity private features, shared features and Chinese word segmentation private features, includes:
the bidirectional long-short-term memory network comprises a named entity private network layer, a Chinese word segmentation private network layer and a shared network layer;
carrying out named entity recognition task feature extraction processing on the sentence vector representation through the named entity private network layer to obtain named entity private features;
carrying out shared information feature extraction on the sentence vector representation through the shared network layer to obtain shared features;
and extracting the Chinese word segmentation task characteristics from the sentence vector representation through the Chinese word segmentation private network layer to obtain Chinese word segmentation private characteristics.
Referring to fig. 2, in the embodiment of the present invention, in order to fuse information from both sides of the sequence, feature extraction is performed using a bidirectional long short-term memory network to obtain the hidden state of each character in the Chinese sentence. The bidirectional long short-term memory network comprises a named entity private network layer (NER BiLSTM), a Chinese word segmentation private network layer (CWS BiLSTM) and a shared network layer (Shared BiLSTM). Given a sentence $x = \{c_1, c_2, \ldots, c_N\}$ from the Chinese named entity dataset or the Chinese word segmentation dataset, the hidden states of the BiLSTM layer may be represented as follows:

$$\overrightarrow{h_i} = \overrightarrow{\mathrm{LSTM}}(x_i, \overrightarrow{h_{i-1}})$$

$$\overleftarrow{h_i} = \overleftarrow{\mathrm{LSTM}}(x_i, \overleftarrow{h_{i+1}})$$

$$h_i = \overrightarrow{h_i} \oplus \overleftarrow{h_i}$$

In the above, $\overrightarrow{h_i}$ and $\overleftarrow{h_i}$ respectively represent the hidden states of the forward and backward LSTM at position $i$ of the vector representation, and $\oplus$ represents the concatenation operation.
The named entity private network layer is used to extract features for the Chinese named entity task, while the features of the shared network layer and the Chinese word segmentation private network layer are used in adversarial training to learn shared word boundary information; the named entity private network layer and the Chinese word segmentation private network layer may be collectively called private network layers. The hidden states of the private network layers and the shared network layer are respectively expressed as follows:

$$h_t^{private} = \mathrm{BiLSTM}(x_t, h_{t-1}^{private}; \theta_{private})$$

$$h_t^{shared} = \mathrm{BiLSTM}(x_t, h_{t-1}^{shared}; \theta_{shared})$$

In the above, $\theta_{private}$ and $\theta_{shared}$ respectively represent the private network layer parameters and the shared network layer parameters, $x_t$ denotes the $t$-th character input to the BiLSTM layer, $h_t^{private}$ represents the private network layer hidden state, and $h_t^{shared}$ represents the shared network layer hidden state.
Further as a preferred embodiment, the inputting the private feature set into the self-attention layer to perform dependency analysis processing to obtain a named entity task representation and a chinese word segmentation task representation includes:
the self-attention layer comprises a named entity private attention layer, a Chinese word segmentation private attention layer and a shared attention layer;
the self-attention layer learns the dependency relationship among characters of the private feature set, and extracts internal structure information to obtain named entity vector representation, chinese word segmentation vector representation and shared vector representation;
and respectively connecting the shared vector representation with the named entity vector representation and the Chinese word segmentation vector representation to obtain a named entity task representation and a Chinese word segmentation task representation.
In the embodiment of the invention, the self-attention layer comprises a named entity private attention layer, a Chinese word segmentation private attention layer and a shared attention layer. The private feature set is input into the self-attention layer, which learns the dependency relationship between any two characters and extracts the internal structure information of the sentence. Taking the named entity private attention layer as an example, let $H = \{h_1, h_2, \ldots, h_N\}$ denote the output of the named entity private network layer; the applied scaled dot-product attention is expressed as follows:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d}}\right)V$$

In the above, $Q, K, V \in \mathbb{R}^{N \times 2d_h}$ respectively represent the query matrix, key matrix and value matrix; $N$ is the total number of characters of the input sentence; $d_h$ is the dimension of the vector representation output by the unidirectional LSTM for each character; and $d$ represents the dimension of the hidden units of the BiLSTM layer, equal in value to $2d_h$.
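The scaled dot-product attention computation can be sketched with plain lists for a toy single-head example (the matrices are illustrative 2×2 values, not model outputs):

```python
import math

# Scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, computed
# row by row with plain lists for a toy 2x2 example.
def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    z = sum(exps)
    return [e / z for e in exps]

def attention(Q, K, V):
    d = len(K[0])
    out = []
    for q in Q:
        # score of query q against every key, scaled by sqrt(d)
        w = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                     for k in K])
        # each output row is a weighted sum of the value rows
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = attention(Q, K, V)
```

Each output row is a convex combination of the value rows, weighted by how strongly the corresponding query attends to each key.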
In addition, the embodiment of the invention adopts a multi-head attention mechanism: the queries, keys and values are linearly projected $h$ times, scaled dot-product attention is applied to the $h$ projections in parallel, and finally the results are concatenated and projected again to obtain a new representation. The multi-head attention is formulated as:

$$\mathrm{head}_i=\mathrm{Attention}(QW_i^{Q},KW_i^{K},VW_i^{V})$$
$$H'=(\mathrm{head}_1\oplus\cdots\oplus\mathrm{head}_h)W_o$$

where $W_i^{Q},W_i^{K},W_i^{V}\in\mathbb{R}^{2d_h\times(2d_h/h)}$ and $W_o\in\mathbb{R}^{2d_h\times 2d_h}$ are trainable parameters.
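The multi-head mechanism can be sketched as follows; the per-head weight matrices are generated randomly here purely for illustration, whereas in the model they are the trainable parameters named above:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(H, W_q, W_k, W_v, W_o, n_heads):
    """Project H into n_heads subspaces, attend in each, concatenate, project.

    H: (N, d); W_q[i], W_k[i], W_v[i]: (d, d // n_heads); W_o: (d, d).
    """
    heads = []
    for i in range(n_heads):
        Q, K, V = H @ W_q[i], H @ W_k[i], H @ W_v[i]
        A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # scaled dot-product per head
        heads.append(A @ V)                          # (N, d / n_heads)
    return np.concatenate(heads, axis=-1) @ W_o      # concatenate and project: (N, d)

N, d, h = 5, 8, 4
H = np.random.randn(N, d)
W_q = [np.random.randn(d, d // h) for _ in range(h)]
W_k = [np.random.randn(d, d // h) for _ in range(h)]
W_v = [np.random.randn(d, d // h) for _ in range(h)]
W_o = np.random.randn(d, d)
H_new = multi_head_attention(H, W_q, W_k, W_v, W_o, h)   # shape (5, 8)
```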
Further as a preferred embodiment, the inputting of the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding to obtain a named entity sequence tag and a Chinese word segmentation sequence tag includes:
the conditional random field layer comprises a named entity conditional random field and a Chinese word segmentation conditional random field;
carrying out label marking processing on the named entity task representation through the named entity conditional random field to obtain a named entity sequence label;
and carrying out label marking processing on the Chinese word segmentation task representation through the Chinese word segmentation conditional random field to obtain a Chinese word segmentation sequence label.
Referring to FIG. 2, in an embodiment of the present invention, the conditional random field layer includes a named entity conditional random field (NER CRF) and a Chinese word segmentation conditional random field (CWS CRF). Because the dependency relationships between labels differ between the Chinese named entity task and the Chinese word segmentation task, a separate conditional random field (CRF) layer is introduced for each of the two tasks. The final representations of the two tasks are input into their respective CRF layers for decoding to obtain the final sequence tags. Given a sentence $x=\{c_1,c_2,\ldots,c_N\}$ and the corresponding labels $y=\{y_1,y_2,\ldots,y_N\}$, the CRF labeling process is expressed as:

$$o_i=W_s h'_i+b_s$$
$$s(x,y)=\sum_{i=1}^{N}\left(o_{i,y_i}+T_{y_{i-1},y_i}\right)$$
$$\hat{y}=\arg\max_{y\in Y_x}s(x,y)$$

where $H'=\{h'_1,h'_2,\ldots,h'_N\}$ represents the input of the CRF layer; $W_s\in\mathbb{R}^{|T|\times 2d_h}$ and $b_s\in\mathbb{R}^{|T|}$ are trainable parameters, and $|T|$ represents the number of output tags; $o_{i,y_i}$ represents the score of the $y_i$-th tag of character $c_i$; $T$ is a transition matrix that scores pairs of adjacent labels; $Y_x$ represents all candidate tag sequences of the given sentence $x$; and $\hat{y}$ is the predicted tag sequence, obtained by decoding with the Viterbi algorithm.
The probability of outputting the tag sequence is defined as:

$$p(\bar{y}\mid x)=\frac{\exp\!\left(s(x,\bar{y})\right)}{\sum_{\tilde{y}\in Y_x}\exp\!\left(s(x,\tilde{y})\right)}$$

where $\bar{y}$ represents the actual tag sequence.
Given $T$ training samples $\{(x^{(t)},\bar{y}^{(t)})\}_{t=1}^{T}$, the loss function is defined as:

$$\mathcal{L}=-\sum_{t=1}^{T}\log p\!\left(\bar{y}^{(t)}\mid x^{(t)}\right)$$

The loss function is minimized by gradient back propagation.
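The Viterbi decoding step referenced above can be sketched as follows; `emissions` plays the role of the per-character tag scores o_i and `transitions` the role of the matrix T. This is a minimal NumPy illustration with made-up scores, not the patent's implementation:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Return the tag sequence maximizing s(x, y) = sum_i (o_{i,y_i} + T_{y_{i-1},y_i}).

    emissions: (N, T) per-character tag scores; transitions: (T, T),
    transitions[a, b] = score of moving from tag a to tag b.
    """
    N, T = emissions.shape
    score = emissions[0].copy()                 # best score of a path ending in each tag
    backptr = np.zeros((N, T), dtype=int)
    for i in range(1, N):
        # cand[a, b] = best path ending at position i-1 in tag a, then tag b at i
        cand = score[:, None] + transitions + emissions[i][None, :]
        backptr[i] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    best = [int(score.argmax())]                # best final tag
    for i in range(N - 1, 0, -1):               # follow back-pointers to recover the path
        best.append(int(backptr[i][best[-1]]))
    return best[::-1]

emissions = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 0.5]])
transitions = np.zeros((2, 2))
path = viterbi_decode(emissions, transitions)   # -> [0, 1, 0]
```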
Further as a preferred embodiment, the sentence embedding processing is performed on the training data set to obtain an embedded vector, including:
converting each character in the training data set to obtain character vector representation;
marking the character vector representation to obtain word embedded vector representation;
carrying out semantic classification processing on the character vector representation to obtain a segment embedded vector representation;
performing marking position coding processing on the character vector representation to obtain a position embedded vector representation;
and carrying out vector summation on the word embedding vector representation, the segment embedding vector representation and the position embedding vector representation to obtain an embedding vector.
Referring to FIG. 3 and FIG. 4, in an embodiment of the present invention, the text sentences of the two data sets are input to an embedding layer, and BERT serves as a sequence encoder that maps the discrete characters of the input sentence into vector representations; this encoder is shared by the three subtasks. As a pre-trained language model, BERT further improves the generalization capability of the word embedding model and fully expresses character-level, word-level, sentence-level and even inter-sentence relationship features.
FIG. 3 is a schematic diagram of the embedding layer model; the result of the embedding layer's processing of a data set is the input to BERT. The embedded vector is composed of a word embedding vector representation (Token Embeddings), a segment embedding vector representation (Segment Embeddings) and a position embedding vector representation (Position Embeddings). The word embedding layer converts each Chinese character into a 768-dimensional vector representation; before input to the word embedding layer, the input sentence is tokenized, and the two special marks [CLS] and [SEP] are added at the beginning and the end of the token sequence, respectively. The role of the segment embedding is to distinguish the tokens of the two sentences in an input sentence pair, i.e., to classify them according to whether the two sentences are semantically similar. Specifically, for the input sentence pair "[CLS] the ship berths temporarily at the Lingshan Island wharf [SEP] Lingshan Island builds a national ocean park [SEP]", the segment embedding layer assigns the first vector (index 0) to sentence 1 ("[CLS] the ship berths temporarily at the Lingshan Island wharf [SEP]") and the second vector (index 1) to sentence 2 ("Lingshan Island builds a national ocean park [SEP]"). The function of the position embedding is to encode the position information of the tokens in the input sentence: since the NER task ultimately predicts a tag sequence in view of the order of the characters of the input sentence, the position embedding layer assigns a vector to each token of the sentence so that BERT learns the sequential features of the input sequence. The word embedding vector, segment embedding vector and position embedding vector of each token are summed to obtain {E_1, E_2, …, E_N}, which is input to BERT's Transformer pre-training model for language understanding, finally obtaining the sentence vector representation {c_1, c_2, …, c_N}.
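The summation of the three embeddings can be sketched as follows. The lookup tables, vocabulary size and token ids here are hypothetical stand-ins for BERT's learned parameters, used only to show the element-wise sum:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, max_len, dim = 100, 16, 768   # dim matches the 768-dimensional vectors above

# Hypothetical lookup tables; in BERT these are learned parameters.
token_table    = rng.normal(size=(vocab_size, dim))   # one row per vocabulary token
segment_table  = rng.normal(size=(2, dim))            # index 0 = sentence 1, index 1 = sentence 2
position_table = rng.normal(size=(max_len, dim))      # one row per position

token_ids   = np.array([0, 11, 12, 13, 1])            # e.g. [CLS], three characters, [SEP]
segment_ids = np.zeros(len(token_ids), dtype=int)     # all tokens belong to sentence 1
positions   = np.arange(len(token_ids))

# The embedding fed to BERT is the element-wise sum of the three vectors per token.
E = token_table[token_ids] + segment_table[segment_ids] + position_table[positions]
```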
The overall architecture of BERT is shown in FIG. 4, where Trm is an abbreviation for Transformer.
In the related art, joint models integrate the word boundary information of the Chinese word segmentation task into the named entity task so as to perform the Chinese named entity task and the Chinese word segmentation task jointly, but they focus only on the task-shared information between Chinese NER and CWS and do not filter the private information of each task, which introduces noise into both tasks.
In summary, the embodiment of the invention has the following advantages: the named entity recognition method and system based on adversarial transfer learning integrate the task-shared word boundary information into the Chinese named entity recognition task. Adversarial transfer learning incorporates adversarial training into transfer learning and optimizes the named entity recognition task: it not only utilizes the shared information obtained from jointly training the named entity recognition and Chinese word segmentation tasks, but also prevents the private information of the Chinese word segmentation task from introducing noise. Compared with other models, the invention introduces adversarial training, exploits the word boundary information shared by the two tasks, prevents the noise of the Chinese word segmentation task's private information from being introduced, and improves the accuracy of the named entity recognition method.
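The adversarial training component is described only at a high level in this text. In adversarial transfer learning it is commonly realized with a gradient reversal layer placed between the shared feature extractor and the task discriminator; the sketch below is an assumption about one plausible realization, not the patent's stated implementation:

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; multiplies gradients by -lambda backward.

    Between the shared BiLSTM features and the task classifier, this
    pushes the shared features toward being task-indistinguishable,
    keeping task-private noise out of the shared space.
    """
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x                       # features pass through unchanged

    def backward(self, grad_out):
        return -self.lam * grad_out    # reversed, scaled gradient to the extractor

grl = GradientReversal(lam=0.5)
x = np.ones((2, 3))
fwd = grl.forward(x)       # identical to x
bwd = grl.backward(x)      # equals -0.5 * x
```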
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field programmable gate arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments described above, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and these equivalent modifications or substitutions are included in the scope of the present invention as defined in the appended claims.

Claims (10)

1. A named entity recognition method based on adversarial transfer learning, the method comprising:
constructing a training data set, wherein the training data set comprises a named entity data set and a Chinese word segmentation data set;
inputting the training data set into a preprocessing model for coding processing to obtain sentence vector representation;
inputting the sentence vector representation into a bidirectional long short-term memory network for feature extraction processing to obtain a feature set, wherein the feature set at least comprises named entity private features, shared features and Chinese word segmentation private features;
inputting the private feature set into a self-attention layer for dependency analysis and processing to obtain a named entity task representation and a Chinese word segmentation task representation;
inputting the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding processing to obtain a named entity sequence tag and a Chinese word segmentation sequence tag;
performing countermeasure learning training treatment on the untrained named entity recognition model according to the named entity sequence tag and the Chinese word segmentation sequence tag in combination with a task classifier to obtain a trained named entity recognition model;
and acquiring a named entity to be identified, inputting the named entity to be identified into the trained named entity identification model for named entity identification processing, and obtaining a named entity identification result.
2. The method of claim 1, wherein the constructing a training data set comprising a named entity data set and a Chinese word segmentation data set comprises:
performing data crawling processing on the data website to obtain an original data set;
labeling the original data set to obtain a named entity data set;
and selecting a data set in a field different from the named entity data set from the general data set to obtain the Chinese word segmentation data set.
3. The method of claim 1, wherein the inputting the training data set into a preprocessing model for encoding processing to obtain a sentence vector representation comprises:
sentence embedding processing is carried out on the training data set to obtain an embedded vector;
and encoding the embedded vector according to the preprocessing model to obtain sentence vector representation.
4. The method according to claim 1, wherein the inputting the sentence vector representation into a bidirectional long short-term memory network for feature extraction processing to obtain a feature set, the feature set comprising at least named entity private features, shared features, and Chinese word segmentation private features, comprises:
the bidirectional long short-term memory network comprises a named entity private network layer, a Chinese word segmentation private network layer and a shared network layer;
carrying out named entity recognition task feature extraction processing on the sentence vector representation through the named entity private network layer to obtain named entity private features;
carrying out shared information feature extraction processing on the sentence vector representation through the shared network layer to obtain shared features;
and extracting the Chinese word segmentation task characteristics from the sentence vector representation through the Chinese word segmentation private network layer to obtain Chinese word segmentation private characteristics.
5. The method according to claim 1, wherein the inputting the private feature set into the self-attention layer for dependency analysis processing to obtain a named entity task representation and a Chinese word segmentation task representation comprises:
the self-attention layer comprises a named entity private attention layer, a Chinese word segmentation private attention layer and a shared attention layer;
the self-attention layer learns the dependency relationship among characters of the private feature set, and extracts internal structure information to obtain a named entity vector representation, a Chinese word segmentation vector representation and a shared vector representation;
and respectively connecting the shared vector representation with the named entity vector representation and the Chinese word segmentation vector representation to obtain a named entity task representation and a Chinese word segmentation task representation.
6. The method of claim 1, wherein said inputting the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding to obtain a named entity sequence tag and a Chinese word segmentation sequence tag comprises:
the conditional random field layer comprises a named entity conditional random field and a Chinese word segmentation conditional random field;
carrying out label marking processing on the named entity task representation through the named entity conditional random field to obtain a named entity sequence label;
and carrying out label marking processing on the Chinese word segmentation task representation through the Chinese word segmentation conditional random field to obtain a Chinese word segmentation sequence label.
7. A method according to claim 3, wherein said sentence embedding of said training data set to obtain an embedded vector comprises:
converting each character in the training data set to obtain character vector representation;
marking the character vector representation to obtain word embedded vector representation;
carrying out semantic classification processing on the character vector representation to obtain a segment embedded vector representation;
performing marking position coding processing on the character vector representation to obtain a position embedded vector representation;
and carrying out vector summation on the word embedding vector representation, the segment embedding vector representation and the position embedding vector representation to obtain an embedding vector.
8. A named entity recognition system based on adversarial transfer learning, the system comprising:
a first module for constructing a training data set, the training data set comprising a named entity data set and a Chinese word segmentation data set;
the second module is used for inputting the training data set into a preprocessing model for coding processing to obtain sentence vector representation;
the third module is used for inputting the sentence vector representation into a bidirectional long short-term memory network for feature extraction processing to obtain a feature set, wherein the feature set at least comprises named entity private features, shared features and Chinese word segmentation private features;
the fourth module is used for inputting the private feature set into the self-attention layer for dependency analysis and processing to obtain a named entity task representation and a Chinese word segmentation task representation;
the fifth module is used for inputting the named entity task representation and the Chinese word segmentation task representation into a conditional random field layer for decoding processing to obtain a named entity sequence tag and a Chinese word segmentation sequence tag;
the sixth module is used for performing countermeasure learning training treatment on the untrained named entity recognition model according to the named entity sequence tag and the Chinese word segmentation sequence tag combined with the task classifier to obtain a trained named entity recognition model;
and a seventh module, configured to obtain a named entity to be identified, input the named entity to be identified into the trained named entity identification model, and perform named entity identification processing to obtain a named entity identification result.
9. The named entity recognition system of claim 8, wherein the first module configured to construct a training data set comprising a named entity data set and a Chinese word segmentation data set comprises:
the first unit is used for carrying out data crawling processing on the data website to obtain an original data set;
the second unit is used for carrying out labeling processing on the original data set to obtain a named entity data set;
and the third unit is used for selecting a data set in the field different from the named entity data set from the general data set to obtain the Chinese word segmentation data set.
10. The named entity recognition system of claim 8, wherein the second module is configured to input the training dataset into a preprocessing model for encoding, to obtain a sentence vector representation, and comprises:
a third unit, configured to perform sentence embedding processing on the training data set to obtain an embedded vector;
and a fourth unit, configured to encode the embedded vector according to the preprocessing model, to obtain a sentence vector representation.
CN202310133155.1A 2023-02-16 2023-02-16 Named entity recognition method and system based on countermeasure migration learning Pending CN116167378A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310133155.1A CN116167378A (en) 2023-02-16 2023-02-16 Named entity recognition method and system based on countermeasure migration learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310133155.1A CN116167378A (en) 2023-02-16 2023-02-16 Named entity recognition method and system based on countermeasure migration learning

Publications (1)

Publication Number Publication Date
CN116167378A true CN116167378A (en) 2023-05-26

Family

ID=86419616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310133155.1A Pending CN116167378A (en) 2023-02-16 2023-02-16 Named entity recognition method and system based on countermeasure migration learning

Country Status (1)

Country Link
CN (1) CN116167378A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117332784A (en) * 2023-09-28 2024-01-02 卓世科技(海南)有限公司 Intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning
CN117807999A (en) * 2024-02-29 2024-04-02 武汉科技大学 Domain self-adaptive named entity recognition method based on countermeasure learning
CN117807999B (en) * 2024-02-29 2024-05-10 武汉科技大学 Domain self-adaptive named entity recognition method based on countermeasure learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination