WO2022127124A1 - Meta learning-based entity category recognition method and apparatus, device and storage medium - Google Patents

Meta learning-based entity category recognition method and apparatus, device and storage medium Download PDF

Info

Publication number
WO2022127124A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
data
entity category
identified
query
Prior art date
Application number
PCT/CN2021/109617
Other languages
French (fr)
Chinese (zh)
Inventor
刘玉
徐国强
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司 filed Critical 深圳壹账通智能科技有限公司
Publication of WO2022127124A1 publication Critical patent/WO2022127124A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Definitions

  • the five traditional open-source Chinese NER datasets (MSRA, People's Daily, Weibo, CLUENER and BOSON) are not very large, and the number of entity categories they cover is limited; taken together they contain fewer than 30 entity classes.
  • the entity category tree in the real world is far larger than 30 classes.
  • the traditional approach is to label data for every entity category that exists, but this is unrealistic.
  • in practice, when a new entity category appears there are often only 10 to 100 samples of the new category, and retraining the model on these samples is not realistic because the model would inevitably suffer from class imbalance and overfitting.
  • According to various embodiments disclosed in the present application, a meta-learning-based entity category identification method, apparatus, device, and storage medium are provided.
  • a meta-learning-based entity category recognition method including:
  • a meta-learning-based entity category recognition device comprising:
  • a newly added entity category acquisition module, used to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
  • a data-to-be-identified acquisition module for acquiring data to be identified
  • An entity identification module, configured to input the reference samples and the data to be identified into a pre-generated entity category identification model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained based on meta-learning.
  • a computer device comprising a memory and one or more processors, the memory having computer-readable instructions stored therein which, when executed by the one or more processors, cause the one or more processors to execute the following steps:
  • One or more computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • the above-mentioned meta-learning-based entity category identification method, apparatus, device, and storage medium determine reference samples according to the newly added entity category and input the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified. No manual intervention and no specialized knowledge of the artificial intelligence field are required, which greatly reduces labor costs; and when a new entity category appears, the model does not need to be retrained, and only a few reference samples are needed to identify the data to be identified and determine whether the entity category is present.
  • FIG. 1 is a schematic flowchart of a meta-learning-based entity category identification method according to one or more embodiments.
  • FIG. 2 is a schematic flowchart of a meta-learning-based entity category identification method according to one or more other embodiments.
  • FIG. 3 is a structural block diagram of an apparatus for identifying entity categories based on meta-learning according to one or more embodiments.
  • FIG. 4 is a block diagram of a computer device in accordance with one or more embodiments.
  • a meta-learning-based entity category identification method is provided. This embodiment is illustrated by applying the method to a terminal; it can be understood that the method can also be applied to a server, or to a system including a terminal and a server, where it is realized through the interaction between the terminal and the server.
  • the method includes the following steps:
  • the newly added entity category may be the name of the newly added entity, and the newly added entity category may be at least one.
  • the reference samples are samples belonging to the newly added entity category; the number of reference samples may be 10 or more, but it is not large.
  • the server may establish the corresponding relationship between the newly added entity category and the corresponding reference sample, for example, by grouping.
  • the same reference sample may belong to multiple new entity categories, that is, a reference sample may be labeled with multiple entity categories.
  • the data to be identified is data that needs to be processed by entity category, which may be newly added data or previous data.
  • S106: Input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained based on meta-learning.
  • the entity category recognition model is trained based on meta-learning: multiple meta-training tasks are constructed from sample data, and the entity category recognition model is then obtained by training on the constructed meta-training tasks.
  • in a meta-training task, a small number of support samples and a large number of query samples are given, and training on these support and query samples yields an entity category recognition model that can identify the entity category of the data to be identified from only a few samples of a new entity category.
  • the server inputs the reference samples and the data to be identified into the pre-generated entity category identification model, so that the entity category identification model identifies the newly added entity category corresponding to the reference sample in each to-be-identified data.
  • the process by which the entity category identification model identifies the newly added entity category corresponding to the reference samples in each piece of data to be identified may include a step of processing the reference samples and the data to be identified, a step of calculating a high-level feature representation of the data to be identified by using the processed reference samples, and a step of processing that high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be identified.
  • the process of processing the reference samples and the data to be recognized may include: serializing the words in the reference samples and the data to be recognized, performing a high-order representation of the serialized words, and finally applying an average pooling operation to the high-order representations to obtain the vector representations corresponding to the reference samples and the data to be recognized.
  • the step of calculating the high-level feature representation of the data to be identified by the server through the processed reference samples may be performed according to the following formula:
  • the step of processing the high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be recognized includes: inputting the high-level features into a predetermined fully connected layer, which maps the feature vector of each word to a preset dimension, for example 3 dimensions.
  • the three dimensions correspond to the word labels O, B and I, i.e. the word does not belong to the category, belongs to the category and is located at the beginning of the sentence, or belongs to the category and is located in the middle of the sentence.
  • the above-mentioned new entity category and the entity category of the data to be identified can also be stored in a node of a blockchain.
  • the above-mentioned meta-learning-based entity category identification method determines reference samples according to the newly added entity category and inputs the reference samples and the data to be identified into a pre-generated entity category identification model, so as to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified. It requires no manual intervention and no specialized knowledge of the artificial intelligence field, which greatly reduces labor costs; and when a new entity category appears, the model does not need to be retrained, and only a few reference samples are needed to identify the data to be identified and determine whether the entity category is present.
  • inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified and representing the serialized words with high-order features; performing an average pooling operation on the words after the high-order feature representation to obtain vector representations of the reference samples and the data to be identified; processing the vectorized representation of the data to be identified with the vectorized representation of the reference samples to obtain high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  • the high-level feature representation of each word of the reference sample and the data to be recognized can be obtained by the following formula:
  • the average pooling operation is used to obtain a unified vector representation, which is used to represent the entire reference sample and the data to be recognized:
  • the obtained s_rep represents the feature representation of the entire reference sample, and q_rep represents the feature representation of the entire piece of data to be identified.
  • the higher-order feature representation of the data to be identified can be obtained according to the reference sample:
  • the atten function is used to calculate the contribution of each reference sample to the recognition of named entities in the data to be recognized.
  • T is a real number that controls the sharpness of the distribution obtained by the atten function.
  • k represents the serial number of the reference sample, which is related to the number of samples of the reference sample.
  • the server obtains the final feature representation of each word in the query sample, and then goes through a fully connected layer to map the feature vector dimension of each word to 3 dimensions.
  • These three dimensions correspond to the word labels O, B and I, i.e. the word does not belong to the category, belongs to the category and is located at the beginning of the sentence, or belongs to the category and is located in the middle of the sentence; this yields the newly added entity category corresponding to the reference samples in the data to be identified.
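  • As a concrete illustration of the O/B/I decoding step above, the following is a minimal sketch; it is not taken from the patent, and the tag order (0 = O, 1 = B, 2 = I) and function name are assumptions made for illustration only:

```python
# Minimal sketch: per-word 3-dimensional scores are turned into O/B/I labels,
# and consecutive B/I labels are collected into spans of the newly added
# entity category. The tag order 0=O, 1=B, 2=I is an assumption.
from typing import List, Tuple

def decode_spans(words: List[str], logits: List[List[float]]) -> List[Tuple[str, int, int]]:
    """Map 3-dim per-word scores to O/B/I tags and collect entity spans."""
    tags = ["OBI"[max(range(3), key=lambda i: scores[i])] for scores in logits]
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":                      # start of a new mention
            if start is not None:
                spans.append((" ".join(words[start:i]), start, i))
            start = i
        elif tag == "I" and start is not None:
            continue                        # extend the current mention
        else:                               # "O", or an I without a preceding B
            if start is not None:
                spans.append((" ".join(words[start:i]), start, i))
            start = None
    if start is not None:
        spans.append((" ".join(words[start:]), start, len(words)))
    return spans

# Example: scores for "I work in EF", where "EF" is recognised as the new category.
print(decode_spans(["I", "work", "in", "EF"],
                   [[2.0, 0.1, 0.1], [2.0, 0.1, 0.1], [2.0, 0.1, 0.1], [0.1, 2.0, 0.3]]))
# -> [('EF', 3, 4)]
```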
  • the training method of the entity category recognition model includes: acquiring sample data, and constructing multiple groups of meta training samples according to the sample data; and obtaining the entity category recognition model by training according to the meta training samples.
  • the sample data may be preset samples that have been classified.
  • the meta-training samples are constructed from the sample data, where each group of meta-training samples may include multiple support samples and multiple query samples; the support samples may come from multiple groups of sample data, that is, sample data belonging to different categories, and the corresponding query samples are likewise drawn from the corresponding groups.
  • the number of groups of meta-training samples can be set as required, for example 10,000, and the target recognition model is then obtained by training on the meta-training samples, for example by training on the meta-training samples in turn until the accuracy of the entity category recognition model reaches the expected level. The accuracy of the entity category recognition model can be computed from the meta-training samples, for example by inputting the support samples and query samples of a meta-training sample into the entity category recognition model to determine the entity category corresponding to each query sample and comparing it with the real entity category of that query sample; when the expected accuracy is reached, model training is completed.
  • acquiring sample data and constructing multiple groups of meta-training samples based on the sample data includes: acquiring the sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; designating a first quantity of sample data in the extracted group(s) as support samples and a second quantity of sample data as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly extracting at least one group at random from the groups to obtain multiple groups of meta-training samples.
  • the server first obtains the original sample data, then processes the original sample data to obtain a data set corresponding to each category, and then starts to construct a training set.
  • in order to train a meta-learning model, the server first needs to construct a series of meta-training samples.
  • the construction rules are as follows:
  • for each category, a first number of samples, for example 10, is randomly selected as support samples, and a second number of samples, for example 100, is randomly selected as query samples; in this way a total of 30 support samples and 3,000 query samples are obtained.
  • the server converts the data set constructed in this way into a meta-training task, the purpose of which is to train the model to classify the query samples given the support samples.
  • the server can build 10,000 such meta-training tasks.
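  • The construction rule above can be made concrete with a short sketch, shown below. The snippet is an illustrative assumption (in-memory lists, 3 categories per task) rather than the patent's implementation:

```python
# Sketch of building meta-training tasks from category-grouped sample data:
# for every task, a few categories are drawn, and each contributes a small
# support set and a larger query set. Parameters mirror the numbers in the text.
import random
from typing import Dict, List, Tuple

def build_meta_tasks(grouped: Dict[str, List[str]],
                     n_tasks: int = 10_000,
                     n_categories: int = 3,
                     n_support: int = 10,
                     n_query: int = 100) -> List[Tuple[dict, dict]]:
    """Each task = (support set, query set), both keyed by entity category."""
    tasks = []
    for _ in range(n_tasks):
        categories = random.sample(list(grouped), n_categories)   # e.g. ZH-PER, ZH-LOC, ZH-ORG
        support, query = {}, {}
        for cat in categories:
            picked = random.sample(grouped[cat], n_support + n_query)
            support[cat] = picked[:n_support]                     # few labelled examples
            query[cat] = picked[n_support:]                       # samples the model must tag
        tasks.append((support, query))
    return tasks

# Toy usage with placeholder sentences; real data would be the ZH-* datasets.
toy = {c: [f"{c} sentence {i}" for i in range(500)] for c in ["ZH-PER", "ZH-LOC", "ZH-ORG"]}
meta_tasks = build_meta_tasks(toy, n_tasks=5)
print(len(meta_tasks), len(meta_tasks[0][0]["ZH-PER"]), len(meta_tasks[0][1]["ZH-PER"]))  # 5 10 100
```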
  • obtaining sample data and grouping the sample data according to entity categories includes: obtaining sample data grouped according to initial entity categories, and grouping the sample data in the initial entity categories according to target entity categories; standardizing the sample data grouped by the target entity categories; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
  • the initial entity categories come from the open-source MSRA, People's Daily, Weibo, CLUENER and BOSON datasets collected from the Internet. Since the annotation formats of these datasets are not uniform, the data must be preprocessed and unified into the BIO annotation format.
  • specifically, traditional named entity recognition datasets are labeled in different formats, some using the BIO format and some using the BIEO format; here the BIEO format is converted into the BIO format, as sketched below.
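  • A minimal sketch of this BIEO-to-BIO unification follows; the exact tag strings (e.g. "E-PER") are an assumption, the idea is simply that every "E-" tag becomes an "I-" tag:

```python
# Unify BIEO-labelled data into BIO format: rewrite E-XXX as I-XXX,
# keep all other tags unchanged.
from typing import List

def bieo_to_bio(tags: List[str]) -> List[str]:
    """Convert BIEO tags to BIO tags by rewriting E-XXX as I-XXX."""
    return ["I-" + t[2:] if t.startswith("E-") else t for t in tags]

# BIEO: B-PER E-PER O B-LOC I-LOC E-LOC  ->  BIO: B-PER I-PER O B-LOC I-LOC I-LOC
print(bieo_to_bio(["B-PER", "E-PER", "O", "B-LOC", "I-LOC", "E-LOC"]))
```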
  • the target entity categories are marked within the initial entity categories, for example PER (person name), LOC (location), ORG (organization), TIM (time), COM (company name), ADD (specific address), GAME (game name), GOV (government department), SCENCE (attractions), BOOK (books), MOVIE (movies) and PRODUCT (products), etc.
  • each dataset is split according to a single entity category into new per-category datasets.
  • for example, the server can obtain datasets such as CLUENER-PER, CLUENER-ADD, ... etc.
  • for example, in the sentence "my name is AB, I live in CD and I work in EF", EF is the ORG entity.
  • the server can thus obtain MSRA-PER, People's Daily-PER, CLUENER-PER, Weibo-PER and BOSON-PER, five PER-related datasets in which only entities of the PER category are targets and entities of all other categories are treated as negative samples; these five PER-related datasets are mixed to form a new dataset, denoted the ZH-PER dataset.
  • in the same way the server can obtain a total of 12 such datasets, for example ZH-LOC, ZH-ORG, ZH-TIM, ZH-ADD, ZH-COM, ZH-BOOK, etc., as sketched below.
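  • The regrouping into single-category datasets can be sketched as follows; the tag names and data layout are assumptions used for illustration:

```python
# Build a single-category dataset such as ZH-PER: tags of the target category
# are kept, all other entity tags are rewritten to "O" (treated as negatives).
from typing import List, Tuple

Sentence = Tuple[List[str], List[str]]  # (words, BIO tags)

def keep_single_category(data: List[Sentence], target: str) -> List[Sentence]:
    """Keep only B-target / I-target tags; everything else becomes O."""
    out = []
    for words, tags in data:
        new_tags = [t if t.endswith("-" + target) else "O" for t in tags]
        out.append((words, new_tags))
    return out

mixed = [(["AB", "lives", "in", "CD"], ["B-PER", "O", "O", "B-LOC"])]
zh_per = keep_single_category(mixed, "PER")   # one slice of a ZH-PER-style dataset
print(zh_per)  # [(['AB', 'lives', 'in', 'CD'], ['B-PER', 'O', 'O', 'O'])]
```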
  • FIG. 2 is a flow chart of the training process of the entity category recognition model in one embodiment.
  • the entity category recognition model is obtained by training on the meta-training samples, which includes: serializing the words in the support samples and the query samples and representing the serialized words with high-order features; performing an average pooling operation on the words after the high-order feature representation to obtain vector representations of the support samples and the query samples; processing the vectorized representation of the query samples with the vectorized representation of the support samples, according to the entity category recognition model, to obtain high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in each query sample; inputting the obtained newly added entity category corresponding to the support samples in the query samples together with the real entity category of the query samples into a conditional random field layer to calculate a loss function; and training the entity category recognition model with the loss function.
  • the server starts to construct the model.
  • the Chinese pre-trained language model BERT is used to encode the feature representation of the sentence.
  • the main structure of the model is as follows:
  • after inputting the support samples and the query samples into BERT, the server obtains the high-order feature representation of each word of these samples by the following formula:
  • the server uses an average pooling operation to obtain a unified vector representation that represents the entire sample:
  • the obtained s_rep represents the feature representation of the entire support sample, and q_rep represents the feature representation of the entire query sample.
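  • The encoding and pooling steps can be sketched as follows; the checkpoint name and pooling details are assumptions for illustration, not the patent's exact implementation:

```python
# Encode support/query sentences with a Chinese pre-trained BERT and
# average-pool the per-word representations into sentence vectors (s_rep / q_rep).
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

def encode(sentences):
    """Return per-token BERT features and an average-pooled sentence vector."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        token_feats = bert(**batch).last_hidden_state           # (batch, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)                 # ignore padding positions
    pooled = (token_feats * mask).sum(1) / mask.sum(1)           # average pooling -> (batch, 768)
    return token_feats, pooled

support_feats, s_rep = encode(["我叫AB", "我住在CD"])            # support samples
query_feats, q_rep = encode(["我在EF工作"])                      # query sample
print(s_rep.shape, q_rep.shape)  # torch.Size([2, 768]) torch.Size([1, 768])
```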
  • after obtaining the feature representation of each entire sample, the server obtains the higher-order feature representation of the query sample based on the support samples:
  • T is a real number that controls the sharpness of the distribution obtained by atten.
  • k represents the serial number of the support sample; since 10 support samples are selected for each category, k is at most 10.
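  • Since the formula images are not reproduced in this text, the sketch below is only one plausible reading of the description: attention weights are computed from the concatenated support/query representations with temperature T and used to mix the support features into the query representation. The scoring layer and output layout are assumptions:

```python
# Plausible (assumed) form of the attention step: each support representation
# s_rep[k] is scored against the query representation q_rep via a small scoring
# layer over their concatenation, softmaxed with temperature T, and the weighted
# support features are combined with the query features.
import torch
import torch.nn as nn

class SupportAttention(nn.Module):
    def __init__(self, dim: int = 768, temperature: float = 1.0):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)   # atten(.) over the concatenation [s_rep_k ; q_rep]
        self.temperature = temperature       # T controls the sharpness of the distribution

    def forward(self, s_rep: torch.Tensor, q_rep: torch.Tensor) -> torch.Tensor:
        # s_rep: (K, dim) support representations; q_rep: (dim,) one query representation
        pairs = torch.cat([s_rep, q_rep.expand_as(s_rep)], dim=-1)                        # (K, 2*dim)
        weights = torch.softmax(self.score(pairs).squeeze(-1) / self.temperature, dim=0)  # (K,)
        mixed = (weights.unsqueeze(-1) * s_rep).sum(0)            # weighted contribution of each support sample
        return torch.cat([q_rep, mixed], dim=-1)                  # higher-order feature of the query sample

attn = SupportAttention(dim=768, temperature=0.5)
q_tilde = attn(torch.randn(10, 768), torch.randn(768))            # K = 10 support samples
print(q_tilde.shape)                                              # torch.Size([1536])
```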
  • the server obtains the final feature representation of each word in the query sample and then passes it through a fully connected layer to map the feature vector of each word to 3 dimensions, which represent the word labels O, B and I respectively, i.e. not in the category, in the category and at the beginning of the sentence, or in the category and in the middle of the sentence.
  • a conditional random field CRF layer is then used to calculate the final loss.
  • the model is trained with a loss function.
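  • The classification head and CRF loss can be sketched as follows; the CRF layer here comes from the third-party pytorch-crf package, which is an assumption since the patent only states that a conditional random field layer is used:

```python
# Per-word features are mapped to 3 scores (O / B / I) by a fully connected
# layer, and a CRF layer computes the training loss.
import torch
import torch.nn as nn
from torchcrf import CRF   # pip install pytorch-crf

class TaggingHead(nn.Module):
    def __init__(self, feat_dim: int = 1536, num_tags: int = 3):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_tags)    # map each word vector to O/B/I scores
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, word_feats: torch.Tensor, tags: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        emissions = self.fc(word_feats)            # (batch, seq_len, 3)
        return -self.crf(emissions, tags, mask=mask, reduction="mean")  # negative log-likelihood

head = TaggingHead()
feats = torch.randn(2, 5, 1536)                    # 2 query sentences, 5 words each
gold = torch.randint(0, 3, (2, 5))                 # gold O/B/I tags
loss = head.loss(feats, gold, mask=torch.ones(2, 5, dtype=torch.bool))
loss.backward()                                    # the model is trained with this loss
print(float(loss))
```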
  • serializing the words in the support samples and the query samples and representing the serialized words with high-order features includes: processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the following formula to obtain the high-level features of the query samples:
  • the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
  • k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
  • although the steps in FIGS. 1 and 2 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated herein, the execution of these steps is not strictly limited to this order, and they may be performed in other orders. Moreover, at least some of the steps in FIGS. 1 and 2 may include multiple sub-steps or stages; these sub-steps or stages are not necessarily executed at the same time and may be executed at different times, and their order of execution is not necessarily sequential, as they may be performed alternately with other steps or with at least some of the sub-steps or stages of other steps.
  • a meta-learning-based entity category identification device is provided, including a newly added entity category acquisition module 100, a to-be-identified data acquisition module 200, and an entity identification module 300, wherein:
  • a newly added entity category acquisition module 100 is used to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
  • the entity identification module 300 is used for inputting the reference samples and the data to be identified into the pre-generated entity category identification model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained in a meta-learning manner.
  • the above entity identification module 300 may include:
  • the conversion unit is used to serialize the words in the reference sample and the data to be recognized, and perform high-level feature representation on the serialized words;
  • the first vectorization unit is used to perform an average pooling operation on the words represented by the high-order features to obtain the vector representation of the reference sample and the data to be recognized;
  • a first high-level feature representation unit configured to process the vectorized representation of the data to be identified by referring to the vectorized representation of the sample to obtain high-level features of the data to be identified;
  • the identification unit is used to process the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
  • the above-mentioned device for identifying entity categories based on meta-learning includes:
  • the sample acquisition module is used to acquire sample data and construct multi-group training samples according to the sample data;
  • a training module, which is used to train the entity category recognition model according to the meta-training samples.
  • the above-mentioned sample acquisition module may include:
  • a grouping unit, used for acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups;
  • an extraction unit configured to determine that the first quantity of sample data in the extracted at least one grouping is a support sample, and the second quantity of sample data is a query sample;
  • the combination unit is used to obtain a set of meta training samples according to the support samples and the query samples;
  • the loop unit is used to repeatedly randomly extract at least one group from the groups to obtain multiple groups of training samples.
  • the above-mentioned grouping unit may include:
  • the grouping subunit is used to obtain sample data grouped according to the initial entity category, and group the sample data in the initial entity category according to the target entity category;
  • the standardization subunit is used to standardize the sample data grouped according to the target entity category
  • the merging subunit is used for merging the standardized target entity categories corresponding to each initial entity category to obtain groups corresponding to the target entity categories.
  • the above-mentioned training module may include:
  • the second vectorization unit is used to serialize the words in the support samples and the query samples, perform high-level feature representation on the serialized words, and perform an average pooling operation on the words after the high-level feature representation to obtain vector representations of the support samples and the query samples;
  • the second high-level feature representation unit is used to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the entity category recognition model to obtain the high-level feature of the query sample;
  • the category identification unit is used to process the high-level features of the query sample to obtain a new entity category corresponding to the support sample in the query sample;
  • the loss function generation unit is used for inputting the obtained newly added entity category corresponding to the support samples in the query samples and the real entity category of the query samples into the conditional random field layer to calculate the loss function;
  • the training unit is used to train the entity category recognition model through the loss function.
  • the above-mentioned second vectorization unit is further configured to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the following formula to obtain high-level features of the query sample:
  • the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
  • k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
  • Each module in the above-mentioned meta-learning-based entity category identification device may be implemented in whole or in part by software, hardware, and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 4 .
  • the computer device includes a processor, a memory, a network interface, and a database connected by a system bus, wherein the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions and a database.
  • the internal memory provides an environment for the execution of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions when executed by a processor, implement a meta-learning-based entity class recognition method.
  • FIG. 4 is only a block diagram of a partial structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • a computer device includes a memory and one or more processors, the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processors, causes the one or more processors to perform the following steps: acquiring a new entity category , and query the reference sample corresponding to the newly added entity category; obtain the data to be identified; input the reference sample and the data to be identified into the pre-generated entity category recognition model to identify the new addition of the corresponding reference sample in each data to be identified Entity category, where the entity category recognition model is trained based on meta-learning.
  • inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be recognized and representing the serialized words with high-level features; performing an average pooling operation on the words after the high-level feature representation to obtain vector representations of the reference samples and the data to be recognized; processing the vectorized representation of the data to be recognized with the vectorized representation of the reference samples to obtain high-level features of the data to be recognized; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be recognized.
  • the training method of the entity category recognition model implemented when the processor executes the computer-readable instructions includes: acquiring sample data, and constructing multiple groups of meta-training samples according to the sample data; training according to the meta-training samples to obtain entities Class recognition model.
  • when the processor executes the computer-readable instructions, acquiring sample data and constructing multiple groups of meta-training samples according to the sample data includes: acquiring the sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; designating a first quantity of sample data in the extracted group(s) as support samples and a second quantity of sample data as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly extracting at least one group at random from the groups to obtain multiple groups of meta-training samples.
  • when the processor executes the computer-readable instructions, obtaining sample data and grouping the sample data according to entity categories includes: obtaining sample data grouped according to initial entity categories, and grouping the sample data in the initial entity categories according to target entity categories; standardizing the sample data grouped by the target entity categories; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
  • the training of the entity category recognition model according to the meta-training samples, as implemented when the processor executes the computer-readable instructions, includes: serializing the words in the support samples and the query samples and representing the serialized words with high-order features; performing an average pooling operation on the words after the high-order feature representation to obtain vector representations of the support samples and the query samples; processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the entity category recognition model to obtain the high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples; inputting the obtained newly added entity category and the real entity category of the query samples into the conditional random field layer to calculate the loss function; and training the entity category recognition model with the loss function.
  • serializing the words in the support samples and the query samples and performing high-order feature representation on the serialized words includes: processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the following formula to obtain the high-level features of the query samples:
  • the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
  • k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
  • One or more computer-readable storage media storing computer-readable instructions are provided; when the computer-readable instructions are executed by one or more processors, they cause the one or more processors to perform the following steps: acquiring the newly added entity category, and querying the reference samples corresponding to the newly added entity category; acquiring the data to be identified; and inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category recognition model is trained in a meta-learning manner.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the reference samples and the data to be identified are input into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, which includes: serializing the words in the reference samples and the data to be recognized and representing the serialized words with high-level features; performing an average pooling operation on the words after the high-level feature representation to obtain vector representations of the reference samples and the data to be recognized; processing the vectorized representation of the data to be recognized with the vectorized representation of the reference samples to obtain high-level features of the data to be recognized; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be recognized.
  • the training method of the entity category recognition model implemented when the computer-readable instructions are executed by the processor includes: acquiring sample data, and constructing multiple groups of meta-training samples according to the sample data; training according to the meta-training samples to obtain Entity class recognition model.
  • when the computer-readable instructions are executed by the processor, acquiring sample data and constructing multiple groups of meta-training samples according to the sample data includes: acquiring the sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; designating a first quantity of sample data in the extracted group(s) as support samples and a second quantity of sample data as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly extracting at least one group at random from the groups to obtain multiple groups of meta-training samples.
  • when the computer-readable instructions are executed by the processor, obtaining sample data and grouping the sample data according to entity categories includes: obtaining sample data grouped according to the initial entity categories, and grouping the sample data in the initial entity categories according to target entity categories; standardizing the sample data grouped by the target entity categories; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
  • the training of the entity category recognition model according to the meta-training samples, as implemented when the computer-readable instructions are executed by the processor, includes: serializing the words in the support samples and the query samples and representing the serialized words with high-order features; performing an average pooling operation on the words after the high-order feature representation to obtain vector representations of the support samples and the query samples; processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the entity category recognition model to obtain the high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples; inputting the obtained newly added entity category and the real entity category of the query samples into the conditional random field layer to calculate the loss function; and training the entity category recognition model with the loss function.
  • the words in the support samples and the query samples are serialized and the serialized words are represented with high-order features, which includes: processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the following formula to obtain the high-level features of the query samples:
  • the atten function is used to calculate the contribution of each support sample to the named entity recognition of the query sample;
  • k represents the serial number of the support sample, and the value of k is related to the number of samples of the support sample.
  • the blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • Blockchain, essentially a decentralized database, is a chain of data blocks linked by cryptographic methods; each data block contains a batch of network transaction information used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in various forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A meta learning-based entity category recognition method, comprising: acquiring a newly added entity category, and querying a reference sample corresponding to the newly added entity category; acquiring data to be recognized; inputting the reference sample and the data into a pre-generated entity category recognition model so as to recognize a newly added entity category corresponding to a reference sample in each piece of the data, wherein the entity category recognition model is obtained by training on the basis of a meta learning manner.

Description

Meta-Learning-Based Entity Category Recognition Method, Apparatus, Equipment and Storage Medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the priority of the Chinese patent application filed with the Chinese Patent Office on December 15, 2020, with application number 202011472865.X and entitled "Meta-Learning-Based Entity Class Recognition Method, Apparatus, Equipment and Storage Medium", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
At present, there is considerable research on named entity recognition in the field of artificial intelligence, but there are not many datasets related to named entity recognition; Chinese named entity recognition datasets in particular are very scarce. Moreover, although relatively mature named entity recognition models exist on the market, these models usually only distinguish the three more common entity categories of person name, organization, and address. When new entity categories appear, these models cannot handle them.
However, the inventors realized that the traditional open-source Chinese named entity recognition datasets mainly include the MSRA, People's Daily, Weibo, CLUENER, and BOSON datasets. None of these five datasets is very large, and the number of entity categories is limited; taken together they contain fewer than 30 entity classes. The entity category tree in the real world, however, is far larger than 30 classes. The traditional approach is to label data for every entity category that exists, but this is unrealistic. The usual situation is that when a new entity category appears there are often only 10 to 100 samples of the new category, and it is not realistic to retrain the model with these samples, because the model is bound to suffer from class imbalance and overfitting.
Therefore, there is an urgent need for a method that can accurately identify the corresponding entity category in data when a new entity category appears.
SUMMARY OF THE INVENTION
According to various embodiments disclosed in the present application, a meta-learning-based entity category identification method, apparatus, device, and storage medium are provided.
A meta-learning-based entity category recognition method includes:
acquiring a newly added entity category, and querying reference samples corresponding to the newly added entity category;
acquiring data to be identified; and
inputting the reference samples and the data to be identified into a pre-generated entity category identification model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained in a meta-learning manner.
A meta-learning-based entity category recognition apparatus includes:
a newly added entity category acquisition module, used to acquire the newly added entity category and query the reference samples corresponding to the newly added entity category;
a to-be-identified data acquisition module, used to acquire the data to be identified; and
an entity identification module, used to input the reference samples and the data to be identified into a pre-generated entity category identification model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained in a meta-learning manner.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
acquiring a newly added entity category, and querying reference samples corresponding to the newly added entity category;
acquiring data to be identified; and
inputting the reference samples and the data to be identified into a pre-generated entity category identification model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained in a meta-learning manner.
One or more computer-readable storage media storing computer-readable instructions are provided; when executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the following steps:
acquiring a newly added entity category, and querying reference samples corresponding to the newly added entity category;
acquiring data to be identified; and
inputting the reference samples and the data to be identified into a pre-generated entity category identification model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained in a meta-learning manner.
The above-mentioned meta-learning-based entity category identification method, apparatus, device, and storage medium determine reference samples according to the newly added entity category and input the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified. No manual intervention and no specialized knowledge of the artificial intelligence field are required, which greatly reduces labor costs; and when a new entity category appears, the model does not need to be retrained, and only a few reference samples are needed to identify the data to be identified and determine whether the entity category is present.
The details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will be apparent from the description, the drawings, and the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required in the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a meta-learning-based entity category identification method according to one or more embodiments.
FIG. 2 is a schematic flowchart of a meta-learning-based entity category identification method according to one or more other embodiments.
FIG. 3 is a structural block diagram of a meta-learning-based entity category identification apparatus according to one or more embodiments.
FIG. 4 is a block diagram of a computer device according to one or more embodiments.
DETAILED DESCRIPTION
In order to make the technical solutions and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit it.
In one embodiment, as shown in FIG. 1, a meta-learning-based entity category identification method is provided. This embodiment is illustrated by applying the method to a terminal; it can be understood that the method can also be applied to a server, or to a system including a terminal and a server, where it is realized through the interaction between the terminal and the server. In this embodiment, the method includes the following steps:
S102: Acquire the newly added entity category, and query the reference samples corresponding to the newly added entity category.
Specifically, the newly added entity category may be the name of a newly added entity, and there may be at least one newly added entity category. The reference samples are samples belonging to the newly added entity category; the number of reference samples may be 10 or more, but it is not large. The server may establish the correspondence between each newly added entity category and its reference samples, for example by grouping. In addition, it should be noted that the same reference sample may belong to multiple newly added entity categories, that is, one reference sample may be labeled with multiple entity categories.
S104: Acquire the data to be identified.
Specifically, the data to be identified is data on which entity category processing needs to be performed; it may be newly added data or previous data.
S106: Input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified, wherein the entity category identification model is trained in a meta-learning manner.
Specifically, the entity category recognition model is trained in a meta-learning manner: multiple meta-training tasks are constructed from sample data, and the entity category recognition model is then obtained by training on the constructed meta-training tasks. In a meta-training task, a small number of support samples and a large number of query samples are given, and training on these support and query samples yields an entity category recognition model that can identify the entity category of the data to be identified from only a few samples of a new entity category.
The server inputs the reference samples and the data to be identified into the pre-generated entity category identification model, so that the entity category identification model identifies the newly added entity category corresponding to the reference samples in each piece of data to be identified.
The process by which the entity category identification model identifies the newly added entity category corresponding to the reference samples in each piece of data to be identified may include a step of processing the reference samples and the data to be identified, a step of calculating a high-level feature representation of the data to be identified by using the processed reference samples, and a step of processing that high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be identified.
The process of processing the reference samples and the data to be recognized may include: serializing the words in the reference samples and the data to be recognized, performing a high-order representation of the serialized words, and finally applying an average pooling operation to the high-order representations to obtain the vector representations corresponding to the reference samples and the data to be recognized.
The step in which the server calculates the high-level feature representation of the data to be identified from the processed reference samples may be performed according to the following formulas (the formulas appear as images, PCTCN2021109617-appb-000001 and PCTCN2021109617-appb-000002, in the original publication and are not reproduced here).
Here, the quantity shown in image PCTCN2021109617-appb-000003 is the high-level feature of the data to be identified obtained for q_j after modeling against the reference samples; to a certain extent it models the relationship between the reference samples and the data to be identified. The atten function is used to calculate the contribution of each reference sample to the recognition of named entities in the data to be recognized. The notation shown in image PCTCN2021109617-appb-000004 represents the splicing of two vectors into a new, longer vector. T is a real number that controls the sharpness of the distribution obtained by atten, and k represents the serial number of the reference sample, which is related to the number of reference samples.
Finally, processing the high-level feature representation to determine the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: feeding the high-level features into a predetermined fully connected layer that maps each word's feature vector to a preset dimension, for example 3; the three dimensions correspond to the word labels O, B and I, i.e. the word does not belong to the category, belongs to the category and begins the entity, or belongs to the category and is inside the entity.
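For illustration only, the following Python sketch shows how the fully connected projection and the O/B/I labels described above could be decoded into entity spans; it is not part of the application, and the hidden size, layer and helper names are assumptions.

import torch
import torch.nn as nn

LABELS = ["O", "B", "I"]  # outside / beginning of the entity / inside the entity

hidden_size = 768                              # assumed BERT hidden size
to_bio = nn.Linear(hidden_size, len(LABELS))   # hypothetical fully connected layer

def decode_entities(word_features: torch.Tensor, words: list) -> list:
    # word_features: (num_words, hidden_size) high-level features of one piece of data to be identified
    tags = [LABELS[i] for i in to_bio(word_features).argmax(dim=-1).tolist()]
    spans, current = [], []
    for word, tag in zip(words, tags):
        if tag == "B":                 # a new entity of the newly added category starts here
            if current:
                spans.append("".join(current))
            current = [word]
        elif tag == "I" and current:   # the current entity continues
            current.append(word)
        else:                          # "O" (or a stray "I") closes any open entity
            if current:
                spans.append("".join(current))
            current = []
    if current:
        spans.append("".join(current))
    return spans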
It should be emphasized that, to further ensure the privacy and security of the newly added entity categories and of the entity categories of the data to be identified, these categories may also be stored in nodes of a blockchain.
In the above meta-learning-based entity category recognition method, reference samples are determined according to the newly added entity category, and the reference samples and the data to be identified are input into a pre-generated entity category recognition model to identify, in each piece of data to be identified, the newly added entity category corresponding to the reference samples. No manual intervention and no specialized artificial-intelligence expertise are required, which greatly reduces labor costs; and when a new entity category appears, the model does not need to be retrained: only a few reference samples are needed to determine whether the entity category is present in the data to be identified.
In one embodiment, inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and in the data to be identified, and computing high-order feature representations of the serialized words; applying an average pooling operation to the high-order word representations to obtain vector representations of the reference samples and of the data to be identified; processing the vector representation of the data to be identified with the vector representations of the reference samples to obtain high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
Specifically, suppose the word sequence of a reference sample is the one denoted by Figure PCTCN2021109617-appb-000005, so that the input of the reference sample is the sequence denoted by Figure PCTCN2021109617-appb-000006; and suppose the word sequence of the data to be identified is the one denoted by Figure PCTCN2021109617-appb-000007, so that the input of the data to be identified is the sequence denoted by Figure PCTCN2021109617-appb-000008.
After the reference samples and the data to be identified are fed into BERT, the high-order feature representation of each word in the reference samples and in the data to be identified can be obtained as follows:
Figure PCTCN2021109617-appb-000009
Figure PCTCN2021109617-appb-000010
Here the words denoted by Figure PCTCN2021109617-appb-000011 and Figure PCTCN2021109617-appb-000012 are the i-th word of the reference sample and the j-th word of the data to be identified, respectively, and s_i and q_j are the high-order feature representations of these two words produced by BERT.
After the high-order feature representations of the words are obtained, an average pooling operation is applied to obtain a single vector representation for the whole reference sample and for the whole piece of data to be identified:
s_rep = MEAN_POOLING_i(s_i)
q_rep = MEAN_POOLING_j(q_j)
The resulting s_rep represents the feature of the entire reference sample, and q_rep represents the feature of the entire piece of data to be identified.
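As a concrete illustration of the encoding and average pooling steps, the sketch below uses the Hugging Face transformers implementation of Chinese BERT; the library, the model name and the simplification of pooling over all tokens (including special tokens) are assumptions, not part of the application.

import torch
from transformers import BertTokenizer, BertModel  # assumed BERT implementation

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

def encode(sentence: str):
    # Returns the per-token high-order features (s_i / q_j) and the mean-pooled
    # sentence vector (s_rep / q_rep) for one reference sample or one piece of data to be identified.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        token_feats = bert(**inputs).last_hidden_state[0]    # (seq_len, hidden)
    sent_rep = token_feats.mean(dim=0)                        # MEAN_POOLING over the tokens
    return token_feats, sent_rep

s_tokens, s_rep = encode("我叫AB，住在CD，在EF上班")   # a reference sample
q_tokens, q_rep = encode("GH出生于IJ")                  # an illustrative piece of data to be identified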
After the vector representations of the entire reference sample and of the data to be identified are obtained, a higher-order feature representation of the data to be identified can be computed from the reference samples:
Figure PCTCN2021109617-appb-000013
Figure PCTCN2021109617-appb-000014
Here the quantity denoted by Figure PCTCN2021109617-appb-000015 is the high-level feature of the data to be identified obtained after modeling q_j against the reference samples; to some extent it models the relationship between the reference samples and the data to be identified. The atten function computes the contribution of each reference sample to named entity recognition in the data to be identified. The operator denoted by Figure PCTCN2021109617-appb-000016 represents concatenating two vectors into one longer vector, T is a real number that controls the sharpness of the distribution produced by the atten function, and k is the index of a reference sample, related to the number of reference samples.
The server thus obtains the final feature representation of each word of the data to be identified, and then passes it through a fully connected layer that maps each word's feature vector to 3 dimensions; the three dimensions correspond to the word labels O, B and I, i.e. the word does not belong to the category, belongs to the category and begins the entity, or belongs to the category and is inside the entity. This yields the newly added entity category corresponding to the reference samples in the data to be identified.
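The exact attention formulas are given as images in the application; the sketch below is only one plausible reading of the step described in the text (concatenation of vectors, an atten score with temperature T over the k reference samples, and a fully connected output layer), with all layer names and shapes assumed.

import torch
import torch.nn as nn

hidden = 768
T = 0.5                                    # temperature controlling the sharpness of atten
atten_score = nn.Linear(2 * hidden, 1)     # scores the concatenation of q_rep with each reference s_rep
fuse = nn.Linear(2 * hidden, hidden)       # produces the high-level feature of a query word
to_bio = nn.Linear(hidden, 3)              # maps each word to the 3 dimensions O, B, I

def high_level_feature(q_j, q_rep, s_reps):
    # q_j: (hidden,) one word of the data to be identified; q_rep: (hidden,) its sentence vector;
    # s_reps: (K, hidden) vector representations of the K reference samples.
    pairs = torch.cat([q_rep.expand_as(s_reps), s_reps], dim=-1)        # concatenation of two vectors
    weights = torch.softmax(atten_score(pairs).squeeze(-1) / T, dim=0)  # contribution of each reference sample
    context = (weights.unsqueeze(-1) * s_reps).sum(dim=0)
    return to_bio(fuse(torch.cat([q_j, context], dim=-1)))              # 3 logits for O, B, I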
In one embodiment, training the entity category recognition model includes: acquiring sample data and constructing multiple groups of meta-training samples from the sample data; and training on the meta-training samples to obtain the entity category recognition model.
Specifically, the sample data may be preset samples that have already been classified. The meta-training samples are derived from the sample data; each meta-training sample may include multiple support samples and multiple query samples, where the support samples may include sample data from several groups, i.e. sample data belonging to different categories, and the corresponding query samples come from the same groups. The number of groups of meta-training samples may be set as needed, for example 10,000, and the model is then trained on the meta-training samples in turn until its accuracy meets expectations. The accuracy of the entity category recognition model may be evaluated with the meta-training samples, for example by inputting the support samples and query samples of a meta-training sample into the model to determine the entity categories of the query samples; if the predictions match the true entity categories of the query samples to the expected degree, training is complete.
In one embodiment, acquiring sample data and constructing multiple groups of meta-training samples from the sample data includes: acquiring the sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups; taking a first number of sample data in the drawn group(s) as support samples and a second number as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly drawing at least one group at random to obtain multiple groups of meta-training samples.
Specifically, the server first obtains the original sample data, processes it into a data set for each category, and then begins to build the training set. To train the meta-learning model, a series of meta-training tasks must first be constructed, according to the following rules:
From the entity classes obtained by the preprocessing, for example 12 entity classes, several groups are drawn at random, for example 3 classes, denoted l_1, l_2, l_3. From each of these 3 classes, a first number of samples, for example 10, is drawn at random as support samples, and a second number, for example 100, is drawn at random as query samples, so that 30 support samples and 300 query samples are obtained in total.
The server treats one data set constructed in this way as one meta-training task, whose purpose is to train the model to classify the query samples given the support samples. To train the model, the server may construct 10,000 such meta-training tasks.
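A minimal sketch of this episode construction is shown below; it is not taken from the application, and the representation of the per-class data sets as a dict of sentence lists is an assumption.

import random

def build_meta_task(datasets, n_way=3, k_support=10, q_query=100):
    # datasets: dict mapping an entity class (e.g. "ZH-PER") to its list of labeled sentences.
    # Returns one meta-training task: n_way*k_support support samples and n_way*q_query query samples.
    classes = random.sample(list(datasets), n_way)          # e.g. l_1, l_2, l_3
    support, query = [], []
    for cls in classes:
        picked = random.sample(datasets[cls], k_support + q_query)
        support += [(cls, s) for s in picked[:k_support]]
        query += [(cls, s) for s in picked[k_support:]]
    return support, query

# e.g. 10,000 such tasks for meta-training:
# tasks = [build_meta_task(datasets) for _ in range(10000)]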
In one embodiment, acquiring sample data and grouping the sample data by entity category includes: acquiring sample data grouped by initial entity category, and regrouping the sample data within each initial entity category by target entity category; standardizing the sample data grouped by target entity category; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
Specifically, the initial entity categories come from the open-source MSRA, People's Daily, Weibo, CLUENER and BOSON data sets collected from the Internet. Because the annotation formats of these data sets are not uniform, the data must be preprocessed into a unified BIO annotation format: when labeling, some traditional named entity recognition data sets use the BIO format and some use the BIEO format, so the BIEO format is converted to the BIO format here.
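For example, the BIEO-to-BIO conversion can be expressed as a one-line relabeling rule; the sketch below is illustrative and assumes the common convention in which an E- tag marks the last token of an entity.

def bieo_to_bio(tags):
    # Convert one sentence's tag sequence from BIEO to BIO: the end tag E-X becomes I-X.
    return ["I-" + tag[2:] if tag.startswith("E-") else tag for tag in tags]

# Example: ["B-PER", "E-PER", "O"] -> ["B-PER", "I-PER", "O"]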
The target entities are those annotated in the initial-category data sets, for example PER (person name), LOC (location), ORG (organization), TIM (time), COM (company name), ADD (specific address), GAME (game name), GOV (government department), SCENCE (scenic spot), BOOK (book), MOVIE (movie) and PRODUCT (product). The server counts the entity categories annotated in each of the MSRA, People's Daily, Weibo, CLUENER and BOSON data sets. MSRA annotates the three entity categories PER, LOC and ORG; letting L(MSRA) denote the set of entity categories annotated in MSRA, L(MSRA) = {PER, LOC, ORG}. Similarly, the server obtains L(People's Daily) = {PER, LOC, ORG, TIM}, L(Weibo) = {PER, ORG, LOC}, L(CLUENER) = {PER, LOC, ORG, COM, ADD, GAME, GOV, SCENCE, BOOK, MOVIE} and L(BOSON) = {PER, LOC, ORG, COM, TIM, PRODUCT}.
According to the annotated entity category sets obtained in the previous step, each data set is split into new data sets by single entity category. For example, for the MSRA data set with L(MSRA) = {PER, LOC, ORG}, consider the PER category first: all PER positive samples in MSRA are kept, the positive samples of the other categories such as LOC and ORG are all relabeled as negative samples, and the negative samples originally in MSRA remain unchanged. The newly obtained data set then contains only PER positive samples, the positive samples of all other categories have become negative samples, and finally the sentences that consist entirely of negative samples are removed; such a data set is denoted MSRA-PER. Similarly, the server obtains the MSRA-ORG and MSRA-LOC data sets, and for the other four data sets it obtains CLUENER-PER, CLUENER-ADD and so on. For example, in the sentence "My name is AB, I live in CD and I work at EF", AB is a PER entity, CD is a LOC entity and EF is an ORG entity, all of which are positive samples, while the remaining words ("my name is", "I live in", "I work at") are negative samples.
After that, the server obtains the five PER-related data sets MSRA-PER, People's Daily-PER, CLUENER-PER, Weibo-PER and BOSON-PER. As analyzed above, each of these contains only PER entities, and entities of other categories are negative samples; the five PER-related data sets are therefore merged into one new data set, denoted ZH-PER. Similarly, the server obtains 12 data sets in total, such as ZH-LOC, ZH-ORG, ZH-TIM, ZH-ADD, ZH-COM and ZH-BOOK.
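A sketch of this per-category relabeling and merging is given below; it is illustrative only, and the representation of a sentence as parallel word/tag lists is an assumption.

def keep_single_class(dataset, keep="PER"):
    # Keep only the `keep` category positive in a BIO-tagged dataset: entities of all other
    # categories are relabeled "O", and sentences with no positive token left are dropped.
    out = []
    for words, tags in dataset:
        new_tags = [t if t.endswith("-" + keep) else "O" for t in tags]
        if any(t != "O" for t in new_tags):
            out.append((words, new_tags))
    return out

# msra_per = keep_single_class(msra, "PER")
# zh_per = msra_per + renmin_per + weibo_per + cluener_per + boson_per   # merged ZH-PER data set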
In one embodiment, referring to FIG. 2, which is a flowchart of the training process of the entity category recognition model in one embodiment, training on the meta-training samples to obtain the entity category recognition model includes: serializing the words in the support samples and the query samples, computing high-order feature representations of the serialized words, and applying an average pooling operation to the high-order word representations to obtain vector representations of the support samples and the query samples; processing, by the entity category recognition model, the vector representations of the query samples with the vector representations of the support samples to obtain high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity categories in the query samples that correspond to the support samples; inputting the obtained entity categories of the query samples and the true entity categories of the query samples into a conditional random field layer to compute a loss function; and training the entity category recognition model with the loss function.
Specifically, after the meta-training tasks are constructed, the server begins to build the model. The Chinese pre-trained language model BERT is used to encode the feature representation of each sentence; the main architecture of the model is as follows:
Let the word sequence of a support sample be the one denoted by Figure PCTCN2021109617-appb-000017, so that the input of the support sample is the sequence denoted by Figure PCTCN2021109617-appb-000018; and let the word sequence of a query sample be the one denoted by Figure PCTCN2021109617-appb-000019, so that the input of the query sample is the sequence denoted by Figure PCTCN2021109617-appb-000020.
After the support samples and query samples are fed into BERT, the server obtains the high-order feature representation of each word in these samples as follows:
Figure PCTCN2021109617-appb-000021
Figure PCTCN2021109617-appb-000022
Here the words denoted by Figure PCTCN2021109617-appb-000023 and Figure PCTCN2021109617-appb-000024 are the i-th word of the support sample and the j-th word of the query sample, respectively, and s_i and q_j are the high-order feature representations of these two words produced by BERT.
After the high-order feature representations of the words are obtained, the server applies an average pooling operation to obtain a single vector representation for each whole sample:
s_rep = MEAN_POOLING_i(s_i)
q_rep = MEAN_POOLING_j(q_j)
The resulting s_rep represents the feature of the entire support sample, and q_rep represents the feature of the entire query sample.
After the representations of the whole samples are obtained, the server computes a higher-order feature representation of each query sample from the support samples:
Figure PCTCN2021109617-appb-000025
Figure PCTCN2021109617-appb-000026
Here the quantity denoted by Figure PCTCN2021109617-appb-000027 is the higher-level feature representation of q_j obtained after modeling against the support samples; to some extent it models the relationship between the support samples and the query sample. The atten function computes the contribution of each support sample to named entity recognition in the query sample. The operator denoted by Figure PCTCN2021109617-appb-000028 represents concatenating two vectors into one longer vector, T is a real number that controls the sharpness of the distribution produced by atten, and k is the index of a support sample; since 10 support samples are selected for each category, k is at most 10.
In this way the server obtains the final feature representation of each word in the query sample, and then passes it through a fully connected layer that maps each word's feature vector to 3 dimensions, corresponding to the word labels O, B and I, i.e. the word does not belong to the category, belongs to the category and begins the entity, or belongs to the category and is inside the entity. After each word has been mapped to 3 dimensions by the fully connected layer, a conditional random field (CRF) layer is used to compute the final loss, and the model is trained with this loss function.
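One training step with the CRF loss could look like the sketch below; it uses the third-party pytorch-crf package for the CRF layer and, for brevity, only optimizes the output head, both of which are assumptions made here (in practice the BERT encoder and attention parameters would also be trained).

import torch
import torch.nn as nn
from torchcrf import CRF            # third-party pytorch-crf package, assumed here

num_tags = 3                        # O, B, I
to_bio = nn.Linear(768, num_tags)   # fully connected layer mapping each word to 3 dimensions
crf = CRF(num_tags, batch_first=True)
optimizer = torch.optim.Adam(list(to_bio.parameters()) + list(crf.parameters()), lr=1e-5)

def training_step(query_word_feats, gold_tags, mask):
    # query_word_feats: (batch, seq_len, 768) final per-word features of the query samples;
    # gold_tags: (batch, seq_len) true O/B/I indices; mask: (batch, seq_len) bool padding mask.
    emissions = to_bio(query_word_feats)
    loss = -crf(emissions, gold_tags, mask=mask, reduction="mean")   # negative log-likelihood
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()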
In one embodiment, serializing the words in the support samples and the query samples and computing high-order feature representations of the serialized words includes: processing the vector representations of the query samples with the vector representations of the support samples according to the following formulas to obtain the high-level features of the query samples:
Figure PCTCN2021109617-appb-000029
Figure PCTCN2021109617-appb-000030
Here the quantity denoted by Figure PCTCN2021109617-appb-000031 is the high-level feature of the query sample obtained after modeling q_j against the support samples; the atten function computes the contribution of each support sample to named entity recognition in the query sample; the operator denoted by Figure PCTCN2021109617-appb-000032 represents concatenating two vectors into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of a support sample, whose value is related to the number of support samples.
It should be understood that although the steps in the flowcharts of FIGS. 1 and 2 are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 1 and 2 may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be executed at different times, and whose execution order is not necessarily sequential; they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 3, a meta-learning-based entity category recognition apparatus is provided, including a newly-added-entity-category acquisition module 100, a to-be-identified-data acquisition module 200 and an entity recognition module 300, where:
the newly-added-entity-category acquisition module 100 is configured to acquire a newly added entity category and query the reference samples corresponding to the newly added entity category;
the to-be-identified-data acquisition module 200 is configured to acquire data to be identified; and
the entity recognition module 300 is configured to input the reference samples and the data to be identified into a pre-generated entity category recognition model to identify, in each piece of data to be identified, the newly added entity category corresponding to the reference samples, where the entity category recognition model is trained by meta-learning.
In one embodiment, the entity recognition module 300 may include:
a conversion unit, configured to serialize the words in the reference samples and the data to be identified and compute high-order feature representations of the serialized words;
a first vectorization unit, configured to apply an average pooling operation to the high-order word representations to obtain vector representations of the reference samples and the data to be identified;
a first high-level feature representation unit, configured to process the vector representation of the data to be identified with the vector representations of the reference samples to obtain high-level features of the data to be identified; and
a recognition unit, configured to process the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
In one embodiment, the above meta-learning-based entity category recognition apparatus includes:
a sample acquisition module, configured to acquire sample data and construct multiple groups of meta-training samples from the sample data; and
a training module, configured to train on the meta-training samples to obtain the entity category recognition model.
In one embodiment, the sample acquisition module may include:
a grouping unit, configured to acquire sample data, group the sample data by entity category, and randomly draw at least one group from the groups;
an extraction unit, configured to take a first number of sample data in the drawn group(s) as support samples and a second number as query samples;
a combination unit, configured to obtain one group of meta-training samples from the support samples and the query samples; and
a loop unit, configured to repeatedly draw at least one group at random from the groups to obtain multiple groups of meta-training samples.
In one embodiment, the grouping unit may include:
a grouping subunit, configured to acquire sample data grouped by initial entity category and regroup the sample data within each initial entity category by target entity category;
a standardization subunit, configured to standardize the sample data grouped by target entity category; and
a merging subunit, configured to merge the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
In one embodiment, the training module may include:
a second vectorization unit, configured to serialize the words in the support samples and the query samples, compute high-order feature representations of the serialized words, and apply an average pooling operation to the high-order word representations to obtain vector representations of the support samples and the query samples;
a second high-level feature representation unit, configured to process, according to the entity category recognition model, the vector representations of the query samples with the vector representations of the support samples to obtain high-level features of the query samples;
a category recognition unit, configured to process the high-level features of the query samples to obtain the newly added entity categories in the query samples that correspond to the support samples;
a loss function generation unit, configured to input the obtained entity categories of the query samples and the true entity categories of the query samples into a conditional random field layer to compute a loss function; and
a training unit, configured to train the entity category recognition model with the loss function.
In one embodiment, the second vectorization unit is further configured to process the vector representations of the query samples with the vector representations of the support samples according to the following formulas to obtain the high-level features of the query samples:
Figure PCTCN2021109617-appb-000033
Figure PCTCN2021109617-appb-000034
Here the quantity denoted by Figure PCTCN2021109617-appb-000035 is the high-level feature of the query sample obtained after modeling q_j against the support samples; the atten function computes the contribution of each support sample to named entity recognition in the query sample; the operator denoted by Figure PCTCN2021109617-appb-000036 represents concatenating two vectors into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of a support sample, whose value is related to the number of support samples.
For specific limitations of the meta-learning-based entity category recognition apparatus, reference may be made to the limitations of the meta-learning-based entity category recognition method above, which are not repeated here. Each module of the above apparatus may be implemented in whole or in part by software, hardware or a combination thereof. The modules may be embedded, in hardware form, in or independently of the processor of a computer device, or stored in software form in the memory of the computer device, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 4. The computer device includes a processor, a memory, a network interface and a database connected via a system bus. The processor of the computer device provides computing and control capability. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The network interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement a meta-learning-based entity category recognition method.
Those skilled in the art will understand that the structure shown in FIG. 4 is only a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the one or more processors to perform the following steps: acquiring a newly added entity category and querying the reference samples corresponding to the newly added entity category; acquiring data to be identified; and inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify, in each piece of data to be identified, the newly added entity category corresponding to the reference samples, where the entity category recognition model is trained by meta-learning.
In one embodiment, when the processor executes the computer-readable instructions, inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified, and computing high-order feature representations of the serialized words; applying an average pooling operation to the high-order word representations to obtain vector representations of the reference samples and the data to be identified; processing the vector representation of the data to be identified with the vector representations of the reference samples to obtain high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
In one embodiment, when the processor executes the computer-readable instructions, training the entity category recognition model includes: acquiring sample data and constructing multiple groups of meta-training samples from the sample data; and training on the meta-training samples to obtain the entity category recognition model.
In one embodiment, when the processor executes the computer-readable instructions, acquiring sample data and constructing multiple groups of meta-training samples from the sample data includes: acquiring the sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups; taking a first number of sample data in the drawn group(s) as support samples and a second number as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly drawing at least one group at random to obtain multiple groups of meta-training samples.
In one embodiment, when the processor executes the computer-readable instructions, acquiring sample data and grouping the sample data by entity category includes: acquiring sample data grouped by initial entity category and regrouping the sample data within each initial entity category by target entity category; standardizing the sample data grouped by target entity category; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
In one embodiment, when the processor executes the computer-readable instructions, training on the meta-training samples to obtain the entity category recognition model includes: serializing the words in the support samples and the query samples, computing high-order feature representations of the serialized words, and applying an average pooling operation to the high-order word representations to obtain vector representations of the support samples and the query samples; processing, according to the entity category recognition model, the vector representations of the query samples with the vector representations of the support samples to obtain high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity categories in the query samples that correspond to the support samples; inputting the obtained entity categories of the query samples and the true entity categories of the query samples into a conditional random field layer to compute a loss function; and training the entity category recognition model with the loss function.
In one embodiment, when the processor executes the computer-readable instructions, serializing the words in the support samples and the query samples and computing high-order feature representations of the serialized words includes: processing the vector representations of the query samples with the vector representations of the support samples according to the following formulas to obtain the high-level features of the query samples:
Figure PCTCN2021109617-appb-000037
Figure PCTCN2021109617-appb-000038
Here the quantity denoted by Figure PCTCN2021109617-appb-000039 is the high-level feature of the query sample obtained after modeling q_j against the support samples; the atten function computes the contribution of each support sample to named entity recognition in the query sample; the operator denoted by Figure PCTCN2021109617-appb-000040 represents concatenating two vectors into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of a support sample, whose value is related to the number of support samples.
One or more computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring a newly added entity category and querying the reference samples corresponding to the newly added entity category; acquiring data to be identified; and inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify, in each piece of data to be identified, the newly added entity category corresponding to the reference samples, where the entity category recognition model is trained by meta-learning.
The computer-readable storage medium may be non-volatile or volatile.
In one embodiment, when the computer-readable instructions are executed by the processor, inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of data to be identified includes: serializing the words in the reference samples and the data to be identified, and computing high-order feature representations of the serialized words; applying an average pooling operation to the high-order word representations to obtain vector representations of the reference samples and the data to be identified; processing the vector representation of the data to be identified with the vector representations of the reference samples to obtain high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
In one embodiment, when the computer-readable instructions are executed by the processor, training the entity category recognition model includes: acquiring sample data and constructing multiple groups of meta-training samples from the sample data; and training on the meta-training samples to obtain the entity category recognition model.
In one embodiment, when the computer-readable instructions are executed by the processor, acquiring sample data and constructing multiple groups of meta-training samples from the sample data includes: acquiring the sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups; taking a first number of sample data in the drawn group(s) as support samples and a second number as query samples; obtaining one group of meta-training samples from the support samples and the query samples; and repeatedly drawing at least one group at random to obtain multiple groups of meta-training samples.
In one embodiment, when the computer-readable instructions are executed by the processor, acquiring sample data and grouping the sample data by entity category includes: acquiring sample data grouped by initial entity category and regrouping the sample data within each initial entity category by target entity category; standardizing the sample data grouped by target entity category; and merging the standardized target entity categories corresponding to each initial entity category to obtain the groups corresponding to the target entity categories.
In one embodiment, when the computer-readable instructions are executed by the processor, training on the meta-training samples to obtain the entity category recognition model includes: serializing the words in the support samples and the query samples, computing high-order feature representations of the serialized words, and applying an average pooling operation to the high-order word representations to obtain vector representations of the support samples and the query samples; processing, according to the entity category recognition model, the vector representations of the query samples with the vector representations of the support samples to obtain high-level features of the query samples; processing the high-level features of the query samples to obtain the newly added entity categories in the query samples that correspond to the support samples; inputting the obtained entity categories of the query samples and the true entity categories of the query samples into a conditional random field layer to compute a loss function; and training the entity category recognition model with the loss function.
In one embodiment, when the computer-readable instructions are executed by the processor, serializing the words in the support samples and the query samples and computing high-order feature representations of the serialized words includes: processing the vector representations of the query samples with the vector representations of the support samples according to the following formulas to obtain the high-level features of the query samples:
Figure PCTCN2021109617-appb-000041
Figure PCTCN2021109617-appb-000042
Here the quantity denoted by Figure PCTCN2021109617-appb-000043 is the high-level feature of the query sample obtained after modeling q_j against the support samples; the atten function computes the contribution of each support sample to named entity recognition in the query sample; the operator denoted by Figure PCTCN2021109617-appb-000044 represents concatenating two vectors into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of a support sample, whose value is related to the number of support samples.
The blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated and linked by cryptographic methods; each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer and an application service layer.
Those of ordinary skill in the art will understand that all or part of the processes of the above method embodiments may be implemented by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a computer-readable storage medium, which may be non-volatile or volatile, and when executed they may include the processes of the above method embodiments. Any reference to memory, storage, a database or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or an external cache. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features of the above embodiments are described; however, as long as a combination of these technical features is not contradictory, it should be regarded as falling within the scope of this specification.
The above embodiments express only several implementations of the present application and are described in relatively specific detail, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (20)

  1. A meta-learning-based entity category recognition method, comprising:
    acquiring a newly added entity category, and querying reference samples corresponding to the newly added entity category;
    acquiring data to be identified; and
    inputting the reference samples and the data to be identified into a pre-generated entity category recognition model to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, wherein the entity category recognition model is trained by meta-learning.
  2. The method according to claim 1, wherein inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify the newly added entity category corresponding to the reference samples in each piece of the data to be identified comprises:
    serializing words in the reference samples and the data to be identified, and computing high-order feature representations of the serialized words;
    applying an average pooling operation to the high-order word representations to obtain vector representations of the reference samples and the data to be identified;
    processing the vector representation of the data to be identified with the vector representations of the reference samples to obtain high-level features of the data to be identified; and
    processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  3. The method according to claim 1 or 2, wherein training the entity category recognition model comprises:
    acquiring sample data, and constructing multiple groups of meta-training samples according to the sample data; and
    training on the meta-training samples to obtain the entity category recognition model.
  4. The method according to claim 3, wherein acquiring sample data and constructing multiple groups of meta-training samples from the sample data comprises:
    acquiring sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups;
    designating a first quantity of sample data in the drawn at least one group as support samples and a second quantity of sample data as query samples;
    obtaining one group of meta-training samples from the support samples and the query samples; and
    repeatedly and randomly drawing at least one group from the groups to obtain multiple groups of meta-training samples.
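A hedged sketch of the episode (meta-training sample) construction described in claims 3–4: group labeled data by entity category, randomly draw a group, then split a first quantity into support samples and a second quantity into query samples. The counts, data layout, and field names below are assumptions for illustration.

```python
# Illustrative sketch only: build meta-training episodes from sample data grouped
# by entity category. The support/query counts and data layout are assumed.
import random

def build_episode(groups, n_support=5, n_query=10):
    """groups: dict mapping entity category -> list of labeled sentences."""
    category = random.choice(list(groups))           # randomly draw one group
    samples = random.sample(groups[category], n_support + n_query)
    support = samples[:n_support]                    # first quantity: support samples
    query = samples[n_support:]                      # second quantity: query samples
    return {"category": category, "support": support, "query": query}

def build_meta_training_set(groups, n_episodes=1000):
    # Repeat the random draw to obtain multiple groups of meta-training samples.
    return [build_episode(groups) for _ in range(n_episodes)]
```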
  5. The method according to claim 4, wherein acquiring sample data and grouping the sample data by entity category comprises:
    acquiring sample data grouped by initial entity category, and grouping the sample data within each initial entity category by target entity category;
    standardizing the sample data grouped by target entity category; and
    merging the standardized target entity categories corresponding to the respective initial entity categories to obtain groups corresponding to the target entity categories.
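One possible reading of the regrouping and standardization in claim 5, sketched below: sentences grouped by an initial (coarse) entity category are regrouped by target (fine-grained) entity category, and the labels are standardized so that each group keeps only its own category. The BIO tagging scheme and the "map other tags to O" rule are assumptions for illustration, not details fixed by the claim.

```python
# Illustrative sketch only: regroup sentences from initial-category groups into
# target-category groups, standardizing labels per group. BIO tags are assumed.
def regroup_by_target_category(initial_groups):
    """initial_groups: dict initial_category -> list of (tokens, tags) pairs."""
    target_groups = {}
    for sentences in initial_groups.values():
        for tokens, tags in sentences:
            targets = {t.split("-", 1)[1] for t in tags if "-" in t}
            for target in targets:
                # Standardize: keep only tags of this target category, map the rest to "O".
                std_tags = [t if ("-" in t and t.split("-", 1)[1] == target) else "O"
                            for t in tags]
                target_groups.setdefault(target, []).append((tokens, std_tags))
    return target_groups  # merged groups keyed by target entity category
```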
  6. The method according to claim 5, wherein training on the meta-training samples to obtain the entity category recognition model comprises:
    serializing the words in the support samples and the query samples, converting the serialized words into high-order feature representations, and performing an average pooling operation on the high-order feature representations of the words to obtain vector representations of the support samples and the query samples;
    processing, according to the entity category recognition model, the vectorized representation of the query samples with the vectorized representation of the support samples to obtain high-level features of the query samples;
    processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples;
    inputting the obtained newly added entity category corresponding to the support samples in the query samples and the true entity category of the query samples into a random field layer to compute a loss function; and
    training the entity category recognition model with the loss function.
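A hedged sketch of the loss computation in claim 6, in which predicted tag scores for the query samples and their true entity tags are fed into a random field layer. The use of the pytorch-crf package, the tag set size, and the tensor shapes are assumptions; the claim only requires "a random field layer".

```python
# Illustrative sketch only: negative log-likelihood loss from a (conditional)
# random field layer over the query samples' predicted tag scores and gold tags.
import torch
from torchcrf import CRF

num_tags = 3                      # e.g. B / I / O for the sampled category (assumed)
crf_layer = CRF(num_tags, batch_first=True)

def crf_loss(query_emissions, true_tags, mask):
    # query_emissions: (batch, seq_len, num_tags) tag scores from the high-level features
    # true_tags:       (batch, seq_len) gold tag indices of the query samples
    # mask:            (batch, seq_len) 1 for real tokens, 0 for padding
    log_likelihood = crf_layer(query_emissions, true_tags,
                               mask=mask.bool(), reduction="mean")
    return -log_likelihood        # negative log-likelihood used as the loss function
```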
  7. The method according to claim 6, wherein serializing the words in the support samples and the query samples and converting the serialized words into high-order feature representations comprises:
    processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the following formulas to obtain the high-level features of the query samples:
    [Formula: Figure PCTCN2021109617-appb-100001]
    [Formula: Figure PCTCN2021109617-appb-100002]
    wherein [Figure PCTCN2021109617-appb-100003] is the high-level feature of the query sample obtained for q_j after modeling with the support samples; the atten function computes the contribution of each support sample to named entity recognition of the query sample; [Figure PCTCN2021109617-appb-100004] denotes the concatenation of two vectors into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of a support sample, its range determined by the number of support samples.
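The formulas in claim 7 are published only as image figures, so their exact form is not reproduced here. The sketch below shows one common attention construction consistent with the symbol descriptions (a per-support-sample contribution weight, a temperature T controlling sharpness, and vector concatenation); it is an assumption for illustration, not the claimed formula itself.

```python
# Illustrative sketch only: build the high-level feature of a query sample q_j
# from the support-sample vectors s_k with temperature-controlled attention.
# The scoring function is assumed; the claimed formulas are figures in the PCT text.
import torch
import torch.nn.functional as F

def query_high_level_feature(q_j, support_vecs, T=1.0):
    # q_j:           (H,) vector representation of one query sample
    # support_vecs:  (K, H) vector representations of the K support samples
    scores = support_vecs @ q_j                  # contribution score of each support sample k
    weights = F.softmax(scores / T, dim=0)       # T controls the sharpness of the distribution
    context = (weights.unsqueeze(-1) * support_vecs).sum(dim=0)
    return torch.cat([q_j, context], dim=-1)     # concatenate two vectors into a new vector
```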
  8. A meta-learning-based entity category recognition apparatus, comprising:
    a newly added entity category acquisition module, configured to acquire a newly added entity category and query reference samples corresponding to the newly added entity category;
    a to-be-identified data acquisition module, configured to acquire data to be identified; and
    an entity recognition module, configured to input the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, wherein the entity category recognition model is trained by means of meta-learning.
  9. The apparatus according to claim 8, wherein the entity recognition module comprises:
    a conversion unit, configured to serialize the words in the reference samples and in the data to be identified, and to convert the serialized words into high-order feature representations;
    a first vectorization unit, configured to perform an average pooling operation on the high-order feature representations of the words to obtain vector representations of the reference samples and of the data to be identified;
    a first high-level feature representation unit, configured to process the vectorized representation of the data to be identified with the vectorized representation of the reference samples to obtain high-level features of the data to be identified; and
    a recognition unit, configured to process the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  10. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    acquiring a newly added entity category, and querying reference samples corresponding to the newly added entity category;
    acquiring data to be identified; and
    inputting the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, wherein the entity category recognition model is trained by means of meta-learning.
  11. The computer device according to claim 10, wherein, when the processor executes the computer-readable instructions, inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples comprises:
    serializing the words in the reference samples and in the data to be identified, and converting the serialized words into high-order feature representations;
    performing an average pooling operation on the high-order feature representations of the words to obtain vector representations of the reference samples and of the data to be identified;
    processing the vectorized representation of the data to be identified with the vectorized representation of the reference samples to obtain high-level features of the data to be identified; and
    processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  12. The computer device according to claim 10 or 11, wherein the training of the entity category recognition model involved when the processor executes the computer-readable instructions comprises:
    acquiring sample data, and constructing multiple groups of meta-training samples from the sample data; and
    training on the meta-training samples to obtain the entity category recognition model.
  13. The computer device according to claim 12, wherein, when the processor executes the computer-readable instructions, acquiring sample data and constructing multiple groups of meta-training samples from the sample data comprises:
    acquiring sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups;
    designating a first quantity of sample data in the drawn at least one group as support samples and a second quantity of sample data as query samples;
    obtaining one group of meta-training samples from the support samples and the query samples; and
    repeatedly and randomly drawing at least one group from the groups to obtain multiple groups of meta-training samples.
  14. The computer device according to claim 13, wherein, when the processor executes the computer-readable instructions, acquiring sample data and grouping the sample data by entity category comprises:
    acquiring sample data grouped by initial entity category, and grouping the sample data within each initial entity category by target entity category;
    standardizing the sample data grouped by target entity category; and
    merging the standardized target entity categories corresponding to the respective initial entity categories to obtain groups corresponding to the target entity categories.
  15. The computer device according to claim 14, wherein, when the processor executes the computer-readable instructions, training on the meta-training samples to obtain the entity category recognition model comprises:
    serializing the words in the support samples and the query samples, converting the serialized words into high-order feature representations, and performing an average pooling operation on the high-order feature representations of the words to obtain vector representations of the support samples and the query samples;
    processing, according to the entity category recognition model, the vectorized representation of the query samples with the vectorized representation of the support samples to obtain high-level features of the query samples;
    processing the high-level features of the query samples to obtain the newly added entity category corresponding to the support samples in the query samples;
    inputting the obtained newly added entity category corresponding to the support samples in the query samples and the true entity category of the query samples into a random field layer to compute a loss function; and
    training the entity category recognition model with the loss function.
  16. The computer device according to claim 15, wherein, when the processor executes the computer-readable instructions, serializing the words in the support samples and the query samples and converting the serialized words into high-order feature representations comprises:
    processing the vectorized representation of the query samples with the vectorized representation of the support samples according to the following formulas to obtain the high-level features of the query samples:
    [Formula: Figure PCTCN2021109617-appb-100005]
    [Formula: Figure PCTCN2021109617-appb-100006]
    wherein [Figure PCTCN2021109617-appb-100007] is the high-level feature of the query sample obtained for q_j after modeling with the support samples; the atten function computes the contribution of each support sample to named entity recognition of the query sample; [Figure PCTCN2021109617-appb-100008] denotes the concatenation of two vectors into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of a support sample, its range determined by the number of support samples.
  17. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring a newly added entity category, and querying reference samples corresponding to the newly added entity category;
    acquiring data to be identified; and
    inputting the reference samples and the data to be identified into a pre-generated entity category recognition model, so as to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples, wherein the entity category recognition model is trained by means of meta-learning.
  18. The storage medium according to claim 17, wherein, when the computer-readable instructions are executed by the processor, inputting the reference samples and the data to be identified into the pre-generated entity category recognition model to identify, in each piece of the data to be identified, the newly added entity category corresponding to the reference samples comprises:
    serializing the words in the reference samples and in the data to be identified, and converting the serialized words into high-order feature representations;
    performing an average pooling operation on the high-order feature representations of the words to obtain vector representations of the reference samples and of the data to be identified;
    processing the vectorized representation of the data to be identified with the vectorized representation of the reference samples to obtain high-level features of the data to be identified; and
    processing the high-level features to obtain the newly added entity category corresponding to the reference samples in the data to be identified.
  19. The storage medium according to claim 17 or 18, wherein the training of the entity category recognition model involved when the computer-readable instructions are executed by the processor comprises:
    acquiring sample data, and constructing multiple groups of meta-training samples from the sample data; and
    training on the meta-training samples to obtain the entity category recognition model.
  20. The storage medium according to claim 19, wherein, when the computer-readable instructions are executed by the processor, acquiring sample data and constructing multiple groups of meta-training samples from the sample data comprises:
    acquiring sample data, grouping the sample data by entity category, and randomly drawing at least one group from the groups;
    designating a first quantity of sample data in the drawn at least one group as support samples and a second quantity of sample data as query samples;
    obtaining one group of meta-training samples from the support samples and the query samples; and
    repeatedly and randomly drawing at least one group from the groups to obtain multiple groups of meta-training samples.
PCT/CN2021/109617 2020-12-15 2021-07-30 Meta learning-based entity category recognition method and apparatus, device and storage medium WO2022127124A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011472865.XA CN112528662A (en) 2020-12-15 2020-12-15 Entity category identification method, device, equipment and storage medium based on meta-learning
CN202011472865.X 2020-12-15

Publications (1)

Publication Number Publication Date
WO2022127124A1 true WO2022127124A1 (en) 2022-06-23

Family

ID=74999881

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109617 WO2022127124A1 (en) 2020-12-15 2021-07-30 Meta learning-based entity category recognition method and apparatus, device and storage medium

Country Status (2)

Country Link
CN (1) CN112528662A (en)
WO (1) WO2022127124A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528662A (en) * 2020-12-15 2021-03-19 深圳壹账通智能科技有限公司 Entity category identification method, device, equipment and storage medium based on meta-learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020143163A1 (en) * 2019-01-07 2020-07-16 平安科技(深圳)有限公司 Named entity recognition method and apparatus based on attention mechanism, and computer device
CN111767400A (en) * 2020-06-30 2020-10-13 平安国际智慧城市科技股份有限公司 Training method and device of text classification model, computer equipment and storage medium
CN111859937A (en) * 2020-07-20 2020-10-30 上海汽车集团股份有限公司 Entity identification method and device
CN111860580A (en) * 2020-06-09 2020-10-30 北京百度网讯科技有限公司 Recognition model obtaining and category recognition method, device and storage medium
CN112528662A (en) * 2020-12-15 2021-03-19 深圳壹账通智能科技有限公司 Entity category identification method, device, equipment and storage medium based on meta-learning

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101846824B1 (en) * 2017-12-11 2018-04-09 가천대학교 산학협력단 Automated Named-entity Recognizing Systems, Methods, and Computer-Readable Mediums
CN109783604B (en) * 2018-12-14 2024-03-19 平安科技(深圳)有限公司 Information extraction method and device based on small amount of samples and computer equipment
CN110825875B (en) * 2019-11-01 2022-12-06 科大讯飞股份有限公司 Text entity type identification method and device, electronic equipment and storage medium
CN111797394B (en) * 2020-06-24 2021-06-08 广州大学 APT organization identification method, system and storage medium based on stacking integration
CN112001179A (en) * 2020-09-03 2020-11-27 平安科技(深圳)有限公司 Named entity recognition method and device, electronic equipment and readable storage medium
CN112052684A (en) * 2020-09-07 2020-12-08 南方电网数字电网研究院有限公司 Named entity identification method, device, equipment and storage medium for power metering

Also Published As

Publication number Publication date
CN112528662A (en) 2021-03-19


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21905057

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the EP bulletin as the address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 041023)

122 Ep: pct application non-entry in european phase

Ref document number: 21905057

Country of ref document: EP

Kind code of ref document: A1