CN115310449A - Named entity identification method and device based on small sample and related medium - Google Patents

Named entity identification method and device based on small sample and related medium

Info

Publication number
CN115310449A
CN115310449A
Authority
CN
China
Prior art keywords
text
sample set
sample
label
named entity
Prior art date
Legal status
Pending
Application number
CN202211000683.1A
Other languages
Chinese (zh)
Inventor
张黔
王伟
陈焕坤
Current Assignee
China Resources Digital Technology Co Ltd
Original Assignee
China Resources Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by China Resources Digital Technology Co Ltd filed Critical China Resources Digital Technology Co Ltd
Priority to CN202211000683.1A priority Critical patent/CN115310449A/en
Publication of CN115310449A publication Critical patent/CN115310449A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Abstract

The invention discloses a small-sample-based named entity recognition method and apparatus and a related medium, wherein the method comprises the following steps: acquiring sample data and labeling the sample data with entity labels to construct a first sample set; selecting pivot characters in the first sample set and constructing a label mapping space based on the pivot characters; mapping the first sample set to a second sample set using the label mapping space; fine-tuning a pre-training language model with the second sample set; and performing named entity recognition prediction on a specified text with the fine-tuned pre-training language model. By selecting the most representative pivot characters to construct a label mapping space for mapping the sample data, and then fine-tuning the pre-training language model with the mapped second sample set, the method performs named entity recognition prediction with the fine-tuned model and can thereby improve both the efficiency and the accuracy of named entity recognition.

Description

Named entity identification method and device based on small sample and related medium
Technical Field
The invention relates to the technical field of named entity identification, in particular to a named entity identification method and device based on small samples and a related medium.
Background
Named entity recognition refers to the recognition of entities with specific meanings in text, mainly including names of people, places, organizations, proper nouns, and the like. With the continuous development of the information industry, the number of electronic texts of all kinds has grown sharply, and quickly and efficiently acquiring structured information from them has become increasingly difficult. Named entity recognition technology is therefore applied in various fields to extract key information from text accurately and efficiently.
At present, the mainstream approach to entity recognition tasks is based on deep learning: after the text is encoded, a deep learning model captures its semantic features, which are then fed into a classification layer to recognize and classify the entities in the text. One disadvantage of this approach is that it requires a sufficient number of training samples; only when trained on a large number of samples can the model effectively capture entity information. In some specific fields, however, samples are scarce, difficult to collect, and costly. In view of these problems, the prior art has also proposed neural network models based on prompt learning for small samples. However, such prompt-learning-based methods need to enumerate all potential templates or entities for inference prediction, which consumes a great deal of time, and the inconsistency between the fine-tuning objective and the pre-training objective of the language model also affects the recognition performance to a certain extent.
Disclosure of Invention
The embodiments of the invention provide a small-sample-based named entity recognition method and apparatus, a computer device, and a storage medium, aiming to improve the efficiency and accuracy of named entity recognition.
In a first aspect, an embodiment of the present invention provides a named entity identification method based on a small sample, including:
acquiring sample data, and labeling an entity label on the sample data to construct a first sample set;
selecting pivot characters in the first sample set, and constructing a label mapping space based on the pivot characters;
mapping the first set of samples to a second set of samples using the label mapping space;
fine-tuning a pre-training language model by using the second sample set;
and carrying out named entity recognition prediction on the specified text by adopting the fine-tuned pre-training language model.
In a second aspect, an embodiment of the present invention provides a named entity identification apparatus based on a small sample, including:
the label marking unit is used for acquiring sample data and marking an entity label on the sample data so as to construct a first sample set;
the character selection unit is used for selecting pivot characters in the first sample set and constructing a label mapping space based on the pivot characters;
a sample mapping unit for mapping the first sample set into a second sample set by using the label mapping space;
the model fine-tuning unit is used for fine-tuning a pre-training language model by utilizing the second sample set;
and the recognition prediction unit is used for carrying out named entity recognition prediction on the specified text by adopting the fine-tuned pre-training language model.
In a third aspect, an embodiment of the present invention provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the method for named entity identification based on small samples according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the method for named entity identification based on a small sample according to the first aspect.
The embodiments of the invention provide a small-sample-based named entity recognition method and apparatus, a computer device, and a storage medium, wherein the method comprises the following steps: acquiring sample data and labeling the sample data with entity labels to construct a first sample set; selecting pivot characters in the first sample set and constructing a label mapping space based on the pivot characters; mapping the first sample set to a second sample set using the label mapping space; fine-tuning a pre-training language model with the second sample set; and performing named entity recognition prediction on a specified text with the fine-tuned pre-training language model. By selecting the most representative pivot characters to construct a label mapping space for mapping the sample data, and then fine-tuning the pre-training language model with the mapped second sample set, the embodiments perform named entity recognition prediction with the fine-tuned model and can thereby improve both the efficiency and the accuracy of named entity recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the description of the embodiments are briefly introduced below. It is obvious that the drawings described below are only some embodiments of the present invention, and that other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a named entity identification method based on a small sample according to an embodiment of the present invention;
fig. 2 is a schematic network structure diagram of a named entity identification method based on a small sample according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a prediction flow of a named entity recognition method based on a small sample according to an embodiment of the present invention;
fig. 4 is a schematic block diagram of a named entity recognition apparatus based on a small sample according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Referring to fig. 1, fig. 1 is a schematic flow chart of a named entity identification method based on a small sample according to an embodiment of the present invention, which specifically includes: steps S101 to S105.
S101, obtaining sample data, and marking an entity label on the sample data to construct a first sample set;
S102, selecting pivot characters in the first sample set, and constructing a label mapping space based on the pivot characters;
S103, mapping the first sample set into a second sample set by using the label mapping space;
S104, fine-tuning a pre-training language model by using the second sample set;
and S105, performing named entity recognition prediction on the specified text by using the fine-tuned pre-training language model.
In this embodiment, entity labeling is performed on a small amount of sample data to obtain a first sample set; the most representative characters are then selected from the first sample set as pivot characters, and a label mapping space is constructed from the pivot characters to map the sample data in the first sample set into a corresponding second sample set. The pre-training language model is then fine-tuned with the second sample set, so that named entity recognition prediction is performed with the fine-tuned pre-training language model, improving both the efficiency and the accuracy of named entity recognition.
In one embodiment, the step S101 includes:
dividing the sample data into a named entity text and a non-named entity text;
labeling the named entity text with an entity label;
marking the non-named entity text as O;
constructing the first sample set $S_1 = \langle\text{text } X, \text{label } Y\rangle$ based on the labeling results.
In this embodiment, when constructing the first sample set, entity labels are attached to a small amount of sample data: named entity text is labeled with the corresponding entity label, such as name (PER), gender (GEN), age (AGE), or date of birth (DOB), while non-named entity text is uniformly labeled O. After labeling, the first sample set in two-tuple form, $S_1 = \langle\text{text } X, \text{label } Y\rangle$, is obtained.
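For illustration only (not part of the claimed method), the following minimal Python sketch shows one way the first sample set could be assembled from span-annotated text, assuming character-level labels and the label names used above; all function and variable names are hypothetical.

```python
# Hypothetical sketch: build the first sample set S1 = <text X, label Y>
# from span-annotated raw samples. Labels are character-level; non-entity
# characters are labeled "O". Names and data layout are assumptions.
def build_first_sample_set(raw_samples):
    """raw_samples: list of (text, spans), where spans maps a
    (start, end) character range to an entity label, e.g. {(0, 2): "PER"}."""
    sample_set = []
    for text, spans in raw_samples:
        labels = ["O"] * len(text)              # default: non-named-entity text
        for (start, end), tag in spans.items():
            for i in range(start, end):
                labels[i] = tag                 # entity characters get the entity label
        sample_set.append((list(text), labels))
    return sample_set

# Example: "张三今年30岁" with a PER span and an AGE span.
s1 = build_first_sample_set([("张三今年30岁", {(0, 2): "PER", (4, 6): "AGE"})])
```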
In an embodiment, the step S102 includes:
the tag mapping space M is constructed as follows:
Figure BDA0003807015600000041
wherein x and y represent text and corresponding entity labels in the first sample set, respectively,
Figure BDA0003807015600000042
indicating the representative degree index of the pivot character w to the entity label Li,
Figure BDA0003807015600000051
indicating that the entity label L is selected from all the characters V i Pivot character w, tf (x = w, y = l) with highest representative degree index i ) Means that all are marked with L i Idf (x = w) represents a measure of the general importance of the pivot character w.
The purpose of this embodiment is to select the most representative character (called pivot character) from the dictionary V for each label of the first sample set, thereby constructing the label mapping space M. With a single arbitrary label l i For example, the following steps are carried out:
Figure BDA0003807015600000052
wherein the content of the first and second substances,
Figure BDA0003807015600000053
defined as pivot character w to label L i Is a representative degree index of (a).
Figure BDA0003807015600000054
Indicating that the label L is selected from all the characters V i The pivot character w with the highest degree index is represented.
tf(x=w,y=l i ) Is defined as all labeled L i The frequency of occurrence of the pivot character w. The higher the frequency, the more the character can represent the label, and the specific formula is as follows:
Figure BDA0003807015600000055
wherein the formula N (-) is used to calculate the number of occurrences of characters within the first sample set that satisfy the condition. In the above formula, the pivot character w of the molecular representation is markedIs signed as i And the denominator indicates all tagged as l i The sum of the number of occurrences of the character.
idf (x = w) is defined as a measure of the general importance of the pivot character w. If the general importance is higher, the character is more common in each label sample, and the representing capability for a single label sample is weaker, the formula is as follows:
Figure BDA0003807015600000056
in the above formula, the numerator represents the number of tag types in the first sample set, and the denominator represents the number of tag types including the pivot word w.
Thus, a tag mapping space M is constructed that is capable of mapping an entity tag to a pivot character representing the tag.
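As a minimal sketch of the pivot-character selection described above (assuming the tf and idf interpretations given in this section; the function name and data layout follow the hypothetical sketch after step S101):

```python
import math
from collections import Counter, defaultdict

# Sketch of pivot-character selection: for each entity label, pick the
# character maximizing tf * idf as defined above. Data layout follows the
# earlier hypothetical sketch: (characters, labels) pairs with "O" for
# non-entity positions.
def build_label_mapping_space(sample_set):
    char_counts = defaultdict(Counter)          # label -> character frequency
    for chars, labels in sample_set:
        for ch, lb in zip(chars, labels):
            if lb != "O":
                char_counts[lb][ch] += 1

    num_labels = len(char_counts)
    mapping = {}
    for label, counts in char_counts.items():
        total = sum(counts.values())            # all character occurrences under this label
        best_char, best_score = None, float("-inf")
        for ch, n in counts.items():
            tf = n / total                      # frequency of ch among characters with this label
            df = sum(1 for c in char_counts.values() if ch in c)
            idf = math.log(num_labels / df)     # fewer label types containing ch -> higher idf
            if tf * idf > best_score:
                best_char, best_score = ch, tf * idf
        mapping[label] = best_char              # the pivot character for this label
    return mapping
```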
In one embodiment, the step S103 includes:
selecting the entity labels in the first sample set;
mapping the text corresponding to the entity labels in the first sample set according to the following formula to obtain a second sample set $S_2 = \langle\text{text } X, \text{target text } X'\rangle$ containing the text and the target text:
$$X' = \{x_1, \ldots, M(y_i), \ldots, x_n\}$$
where X' denotes the target text in the second sample set, M(·) denotes the label mapping space, $y_i$ denotes an entity label in the first sample set, and $x_1$ and $x_n$ denote text characters in the first sample set.
In this embodiment, the first sample set $S_1$ ($X = \{x_1, \ldots, x_n\}$, $Y = \{y_1, \ldots, y_n\}$) is label-mapped. If $y_i$ is an entity label, it is mapped to a pivot character; otherwise, the original text is retained. Assuming $y_i$ is an entity label, the formula for the target text X' obtained by mapping the original text X is:
$$X' = \{x_1, \ldots, M(y_i), \ldots, x_n\}$$
where M(·) is the label mapping space and the pivot character $M(y_i)$ replaces the original $x_i$. On this basis, the second sample set in two-tuple form, $S_2 = \langle\text{text } X, \text{target text } X'\rangle$, is constructed.
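Continuing the hypothetical sketches above, the label-mapping step itself is a per-position substitution:

```python
# Sketch of constructing S2 = <text X, target text X'>: entity positions
# are replaced by the pivot character of their label; non-entity positions
# keep the original character.
def map_to_second_sample_set(sample_set, mapping):
    second = []
    for chars, labels in sample_set:
        target = [mapping[lb] if lb != "O" else ch
                  for ch, lb in zip(chars, labels)]
        second.append(("".join(chars), "".join(target)))
    return second
```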
In an embodiment, the pre-training language model is a BERT pre-training model. Of course, in other embodiments, other pre-trained language models may be employed, such as the RoBERTa Chinese pre-trained model, the ERNIE pre-trained model, and so forth.
Further, the step S104 includes:
inputting the text in the second sample set into the BERT pre-training model, which outputs the corresponding feature codes;
calculating, based on the feature codes, the probability P that the input text is predicted as the target text:
$$P(x_i = x'_i \mid X) = \mathrm{softmax}(W_{LM} \cdot h_i)$$
where $x_i$ denotes the i-th input text character, $x'_i$ denotes the i-th target text character, X denotes the text of the second sample set, LM denotes the BERT pre-training model, $W_{LM}$ denotes the weight parameters of the last fully connected layer of the BERT pre-training model LM, and $h_i$ denotes the feature code of the i-th character;
performing optimization updates of the fine-tuning training with the loss function $\mathcal{L}_{LM}$ to obtain the fine-tuned BERT pre-training model LM':
$$\mathcal{L}_{LM} = -\sum_{i=1}^{n} \log P(x_i = x'_i \mid X)$$
This embodiment uses the second sample set $S_2$ ($X = \{x_1, \ldots, x_n\}$, $X' = \{x'_1, \ldots, x'_n\}$) to fine-tune the pre-training language model LM, specifically:
The input text $X = \{x_1, \ldots, x_n\}$ is processed by the pre-training language model LM to obtain the feature codes $H = \{h_1, \ldots, h_n\}$; then, based on the feature codes, the probability that the character $x_i$ in the input text is predicted as $x'_i$ in the target text is calculated:
$$P(x_i = x'_i \mid X) = \mathrm{softmax}(W_{LM} \cdot h_i)$$
The loss function $\mathcal{L}_{LM}$ of the fine-tuning training is therefore:
$$\mathcal{L}_{LM} = -\sum_{i=1}^{n} \log P(x_i = x'_i \mid X)$$
The weight parameters of the pre-training language model LM are adaptively updated during fine-tuning, and training finally yields the fine-tuned language model LM'.
As shown in fig. 2, in the training process, entity labeling is performed on a small number of samples to generate the first sample set $S_1(X, Y)$; representative pivot characters are then selected from $S_1(X, Y)$ to construct the label mapping space M, which is used to label-map $S_1(X, Y)$ into the corresponding second sample set $S_2(X, X')$. The second sample set $S_2(X, X')$ is then used to fine-tune the pre-training language model LM, yielding the optimized pre-training language model LM'.
In one embodiment, the step S105 includes:
performing character prediction on the specified text with the fine-tuned pre-training language model according to the following formulas:
$$o_i = \mathrm{softmax}(W_{LM'} \cdot e_i)$$
$$\hat{t}_i = \mathop{\arg\max}(o_i)$$
where $o_i$ denotes the character generation probability, $W_{LM'}$ denotes the weight parameters of the fine-tuned pre-training language model, $e_i$ denotes the feature code of the i-th character in the specified text, and $\hat{t}_i$ denotes the i-th predicted character;
constructing the predicted characters into a predicted text, and mapping the characters in the predicted text into entity labels using the label mapping space.
In this embodiment, referring to fig. 3, the specified text $T = \{t_1, \ldots, t_n\}$ is predicted with the fine-tuned pre-training language model LM', specifically:
The specified text $T = \{t_1, \ldots, t_n\}$ is input into the pre-training language model LM', which outputs the corresponding feature codes $E = \{e_1, \ldots, e_n\}$;
After the fully connected layer of the pre-training language model LM', the character generation probability is calculated with the softmax function, and the most likely character is taken by the argmax operation. The character $\hat{t}_i$ at position i is generated as:
$$o_i = \mathrm{softmax}(W_{LM'} \cdot e_i)$$
$$\hat{t}_i = \mathop{\arg\max}(o_i)$$
The predicted text $\hat{T} = \{\hat{t}_1, \ldots, \hat{t}_n\}$ is constructed from the predicted characters and mapped into labels by the label mapping space M. Specifically, if $\hat{t}_i$ is a pivot character in the label mapping space M, the corresponding entity label $M^{-1}(\hat{t}_i)$ is output; otherwise, the non-entity label O is output. The final predicted label result is:
$$\hat{y}_i = \begin{cases} M^{-1}(\hat{t}_i), & \hat{t}_i \in M \\ O, & \text{otherwise} \end{cases}$$
where $M^{-1}(\cdot)$ denotes the reverse mapping, i.e., obtaining the corresponding entity label from the pivot character. Specifically, the entity labels may be PER, AGE, and the like.
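A corresponding prediction sketch, reusing the model, tokenizer, and `mapping` from the hypothetical sketches above; the one-token-per-character alignment is a simplifying assumption for Chinese text.

```python
# Sketch of inference: take the argmax character at each position of the
# specified text, then reverse-map pivot characters to entity labels;
# every other position gets the non-entity label "O".
@torch.no_grad()
def predict_labels(text, model, tokenizer, mapping):
    reverse = {v: k for k, v in mapping.items()}       # pivot character -> entity label
    enc = tokenizer(text, return_tensors="pt")
    logits = model(**enc).logits                       # scores before softmax; argmax is unchanged
    pred_ids = logits.argmax(dim=-1)[0]                # most likely character per position
    pred_chars = tokenizer.convert_ids_to_tokens(pred_ids.tolist())
    pred_chars = pred_chars[1:1 + len(text)]           # drop [CLS], align with input characters
    return [reverse.get(ch, "O") for ch in pred_chars]
```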
Fig. 4 is a schematic block diagram of a named entity recognition apparatus 400 based on a small sample according to an embodiment of the present invention, where the apparatus 400 includes:
a tag labeling unit 401, configured to acquire sample data and label an entity tag for the sample data, so as to construct a first sample set;
a character selection unit 402, configured to select pivot characters in the first sample set, and construct a label mapping space based on the pivot characters;
a sample mapping unit 403, configured to map the first sample set into a second sample set by using the label mapping space;
a model fine-tuning unit 404, configured to perform fine-tuning on a pre-training language model by using the second sample set;
and the recognition prediction unit 405 is configured to perform named entity recognition prediction on the specified text by using the fine-tuned pre-training language model.
In one embodiment, the label labeling unit 401 includes:
the data dividing unit is used for dividing the sample data into a named entity text and a non-named entity text;
the first text labeling unit is used for labeling the entity label to the named entity text;
the second text labeling unit is used for labeling the non-named entity text as O;
a first sample set construction unit, configured to construct the first sample set $S_1 = \langle\text{text } X, \text{label } Y\rangle$ based on the labeling results.
In one embodiment, the character selecting unit 402 includes:
a space construction unit, configured to construct the label mapping space M according to the following formula:
$$M(l_i) = \mathop{\arg\max}_{w \in V}\; tf(x=w,\, y=l_i) \times idf(x=w)$$
where x and y respectively denote a text character and its corresponding entity label in the first sample set; $tf(x=w, y=l_i) \times idf(x=w)$ denotes the representative degree index of the pivot character w for the entity label $l_i$; $\arg\max_{w \in V}$ denotes selecting, from all characters V, the pivot character w with the highest representative degree index for the entity label $l_i$; $tf(x=w, y=l_i)$ denotes the frequency of occurrence of the pivot character w among all characters labeled $l_i$; and $idf(x=w)$ denotes a measure of the general importance of the pivot character w.
In an embodiment, the sample mapping unit 403 includes:
the label selecting unit is used for selecting the entity labels in the first sample set;
a second sample set construction unit, configured to map the text corresponding to the entity labels in the first sample set according to the following formula to obtain a second sample set $S_2 = \langle\text{text } X, \text{target text } X'\rangle$ containing the text and the target text:
$$X' = \{x_1, \ldots, M(y_i), \ldots, x_n\}$$
where X' denotes the target text in the second sample set, M(·) denotes the label mapping space, $y_i$ denotes an entity label in the first sample set, and $x_1$ and $x_n$ denote text characters in the first sample set.
In one embodiment, the pre-trained language model is a BERT pre-trained model.
In one embodiment, the model fine tuning unit 404 includes:
a text input unit, configured to input the text in the second sample set into the BERT pre-training model, which outputs the corresponding feature codes;
a probability calculation unit, configured to calculate, based on the feature codes, the probability P that the input text is predicted as the target text:
$$P(x_i = x'_i \mid X) = \mathrm{softmax}(W_{LM} \cdot h_i)$$
where $x_i$ denotes the i-th input text character, $x'_i$ denotes the i-th target text character, X denotes the text of the second sample set, LM denotes the BERT pre-training model, $W_{LM}$ denotes the weight parameters of the last fully connected layer of the BERT pre-training model LM, and $h_i$ denotes the feature code of the i-th character;
an optimization updating unit, configured to perform optimization updates of the fine-tuning training with the loss function $\mathcal{L}_{LM}$ to obtain the fine-tuned BERT pre-training model LM':
$$\mathcal{L}_{LM} = -\sum_{i=1}^{n} \log P(x_i = x'_i \mid X)$$
in an embodiment, the identifying a prediction unit 405 comprises:
a character prediction unit, configured to perform character prediction on the specified text with the fine-tuned pre-training language model according to the following formulas:
$$o_i = \mathrm{softmax}(W_{LM'} \cdot e_i)$$
$$\hat{t}_i = \mathop{\arg\max}(o_i)$$
where $o_i$ denotes the character generation probability, $W_{LM'}$ denotes the weight parameters of the fine-tuned pre-training language model, $e_i$ denotes the feature code of the i-th character in the specified text, and $\hat{t}_i$ denotes the i-th predicted character;
and a character mapping unit, configured to construct the predicted characters into a predicted text and map the characters in the predicted text into entity labels using the label mapping space.
Since the embodiment of the apparatus portion and the embodiment of the method portion correspond to each other, please refer to the description of the embodiment of the method portion for the embodiment of the apparatus portion, and details are not repeated here.
Embodiments of the present invention also provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed, the steps provided by the above embodiments can be implemented. The storage medium may include: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the present invention further provides a computer device, which may include a memory and a processor, where the memory stores a computer program, and the processor may implement the steps provided in the above embodiment when calling the computer program in the memory. Of course, the computer device may also include various network interfaces, power supplies, and the like.
The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.
It should also be noted that, in this specification, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A named entity recognition method based on small samples is characterized by comprising the following steps:
acquiring sample data, and labeling an entity label on the sample data to construct a first sample set;
selecting pivot characters in the first sample set, and constructing a label mapping space based on the pivot characters;
mapping the first set of samples to a second set of samples using the label mapping space;
fine-tuning a pre-training language model by using the second sample set;
and carrying out named entity recognition prediction on the specified text by adopting the fine-tuned pre-training language model.
2. The method according to claim 1, wherein the obtaining sample data and labeling the sample data with an entity label to construct a first sample set comprises:
dividing the sample data into a named entity text and a non-named entity text;
labeling an entity label on the named entity text;
marking the non-named entity text as O;
constructing the first sample set $S_1 = \langle\text{text } X, \text{label } Y\rangle$ based on the labeling results.
3. The method according to claim 2, wherein selecting a pivot character from the first sample set and constructing a label mapping space based on the pivot character comprises:
constructing the label mapping space M according to the following formula:
$$M(l_i) = \mathop{\arg\max}_{w \in V}\; tf(x=w,\, y=l_i) \times idf(x=w)$$
where x and y respectively denote a text character and a corresponding entity label in the first sample set; $tf(x=w, y=l_i) \times idf(x=w)$ denotes the representative degree index of the pivot character w for the entity label $l_i$; $\arg\max_{w \in V}$ denotes selecting, from all characters V, the pivot character w with the highest representative degree index for the entity label $l_i$; $tf(x=w, y=l_i)$ denotes the frequency of occurrence of the pivot character w among all characters labeled $l_i$; and $idf(x=w)$ denotes a measure of the general importance of the pivot character w.
4. The method according to claim 3, wherein the mapping the first set of samples to a second set of samples using the label mapping space comprises:
selecting the entity labels in the first sample set;
mapping the text corresponding to the entity labels in the first sample set according to the following formula to obtain a second sample set $S_2 = \langle\text{text } X, \text{target text } X'\rangle$ containing the text and the target text:
$$X' = \{x_1, \ldots, M(y_i), \ldots, x_n\}$$
where X' denotes the target text in the second sample set, M(·) denotes the label mapping space, $y_i$ denotes an entity label in the first sample set, and $x_1$ and $x_n$ denote text characters in the first sample set.
5. The small-sample-based named entity recognition method of claim 4, wherein the pre-trained language model is a BERT pre-trained model.
6. The method according to claim 5, wherein the fine-tuning of the pre-trained language model using the second set of samples comprises:
inputting the text in the second sample set into the BERT pre-training model, which outputs the corresponding feature codes;
calculating, based on the feature codes, the probability P that the input text is predicted as the target text:
$$P(x_i = x'_i \mid X) = \mathrm{softmax}(W_{LM} \cdot h_i)$$
where $x_i$ denotes the i-th input text character, $x'_i$ denotes the i-th target text character, X denotes the text of the second sample set, LM denotes the BERT pre-training model, $W_{LM}$ denotes the weight parameters of the last fully connected layer of the BERT pre-training model LM, and $h_i$ denotes the feature code of the i-th character; and
performing optimization updates of the fine-tuning training with the loss function $\mathcal{L}_{LM}$ to obtain the fine-tuned BERT pre-training model LM':
$$\mathcal{L}_{LM} = -\sum_{i=1}^{n} \log P(x_i = x'_i \mid X)$$
7. The small-sample-based named entity recognition method of claim 6, wherein the performing named entity recognition prediction on a specified text using the fine-tuned pre-trained language model comprises:
performing character prediction on the specified text with the fine-tuned pre-training language model according to the following formulas:
$$o_i = \mathrm{softmax}(W_{LM'} \cdot e_i)$$
$$\hat{t}_i = \mathop{\arg\max}(o_i)$$
where $o_i$ denotes the character generation probability, $W_{LM'}$ denotes the weight parameters of the fine-tuned pre-training language model, $e_i$ denotes the feature code of the i-th character in the specified text, and $\hat{t}_i$ denotes the i-th predicted character; and
constructing the predicted characters into a predicted text, and mapping the characters in the predicted text into entity labels using the label mapping space.
8. A named entity recognition apparatus based on a small sample, comprising:
the label marking unit is used for acquiring sample data and marking an entity label on the sample data so as to construct a first sample set;
the character selection unit is used for selecting pivot characters in the first sample set and constructing a label mapping space based on the pivot characters;
a sample mapping unit, configured to map the first sample set into a second sample set by using the label mapping space;
the model fine-tuning unit is used for fine-tuning a pre-training language model by utilizing the second sample set;
and the recognition prediction unit is used for carrying out named entity recognition prediction on the specified text by adopting the fine-tuned pre-training language model.
9. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the method for small sample based named entity recognition according to any one of claims 1 to 7 when the computer program is executed.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out a method for small-sample based named entity recognition as claimed in any one of the claims 1 to 7.
CN202211000683.1A 2022-08-19 2022-08-19 Named entity identification method and device based on small sample and related medium Pending CN115310449A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211000683.1A CN115310449A (en) 2022-08-19 2022-08-19 Named entity identification method and device based on small sample and related medium


Publications (1)

Publication Number Publication Date
CN115310449A (en) 2022-11-08

Family

ID=83863120

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211000683.1A Pending CN115310449A (en) 2022-08-19 2022-08-19 Named entity identification method and device based on small sample and related medium

Country Status (1)

Country Link
CN (1) CN115310449A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116861885A (en) * 2023-07-11 2023-10-10 贝壳找房(北京)科技有限公司 Label generation method, device, equipment and medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination