WO2022134586A1

WO2022134586A1 - Meta-learning-based target classification method and apparatus, device and storage medium

Info

Publication number: WO2022134586A1
Application number: PCT/CN2021/109571
Authority: WO
Inventors: 刘玉; 徐国强
Original assignee: 深圳壹账通智能科技有限公司
Priority date: 2020-12-21
Filing date: 2021-07-30
Publication date: 2022-06-30
Also published as: CN112613555A

Abstract

A meta-learning-based target classification method, related to the technical field of artificial intelligence. The method comprises: obtaining newly added data, and constructing a reference sample according to the newly added data (S102); obtaining a target to be classified according to the newly added data and the reference sample (S104); inputting the reference sample and said target into a pre-generated target classification model to determine a first probability that said target belongs to a classification to which the reference sample belongs (S106), wherein the target classification model is trained on the basis of a meta-learning mode; and determining the classification to which said target belongs according to the first probability (S108).

Description

Meta-learning-based object classification method, apparatus, device and storage medium

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese patent application No. 2020115233480, filed on December 21, 2020 and entitled "Meta-learning-based object classification method, device, equipment and storage medium", the entire content of which is incorporated herein by reference.

Technical field

The present application relates to a meta-learning-based object classification method, apparatus, device and storage medium.

Background art

With the development of artificial intelligence technology, technologies such as computer vision, natural language processing, and speech recognition have emerged. However, artificial intelligence is a vast field, and different researchers focus on different parts of it. For example, the field of computer vision now has more than 500 sub-tasks, and the field of natural language processing has more than 300. Faced with a large and complex body of academic papers, scholars in the field of artificial intelligence urgently need a system to classify and label emerging papers.

However, the inventors realized that traditional machine-learning-based paper classification models can only handle paper categories that appeared in the training set. When papers of a new category arrive, these models cannot classify them correctly. In addition, a new category has little data at the beginning. Since machine learning models usually require a large number of training samples, even if the papers of the new category are used as training data, a classification model with high accuracy cannot be obtained; the model performs poorly on the test set, which in turn leads to inaccurate classification of emerging papers.

SUMMARY OF THE INVENTION

According to various embodiments disclosed in the present application, a meta-learning-based object classification method, apparatus, device, and storage medium are provided.

A meta-learning-based object classification method including:

acquiring new data, and constructing a reference sample according to the new data;

obtaining the target to be classified according to the newly added data and the reference sample;

inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category to which the reference sample belongs, wherein the target classification model is obtained through training based on meta-learning; and

determining the category to which the target to be classified belongs according to the first probability.

A target classification device based on meta-learning, comprising:

a new data acquisition module, used for acquiring new data, and constructing a reference sample according to the new data;

an acquisition module for a target to be classified, configured to obtain a target to be classified according to the newly added data and the reference sample;

a model processing module, configured to input the reference sample and the target to be classified into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the category to which the reference sample belongs, wherein the target classification model is obtained through training based on meta-learning; and

a classification module, configured to determine the category to which the target to be classified belongs according to the first probability.

A computer device comprising a memory and one or more processors, the memory having computer-readable instructions stored therein, wherein the computer-readable instructions, when executed by the one or more processors, cause the one or more processors to execute the following steps:

One or more computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:

With the above meta-learning-based target classification method, apparatus, device and storage medium, the reference sample is determined according to the newly added data, so that only the reference sample and the target to be classified need to be input into the pre-generated target classification model to obtain the classification to which the target to be classified belongs. Targets in the field of artificial intelligence can thus be classified automatically, without manual intervention and without specialized knowledge of the field, which greatly reduces labor costs. When a new category of data arrives, there is no need to retrain the model; only a few support samples are needed to label the targets to be classified.

The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will be apparent from the description, drawings, and claims.

Description of drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the following briefly introduces the drawings required in the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can also be obtained from these drawings without any creative effort.

FIG. 1 is a schematic flowchart of a meta-learning-based object classification method according to one or more embodiments.

FIG. 2 is a schematic flowchart of a meta-learning-based object classification method according to another embodiment.

FIG. 3 is a structural block diagram of an apparatus for object classification based on meta-learning according to one or more embodiments.

FIG. 4 is a diagram of the internal structure of a computer device in accordance with one or more embodiments.

Detailed description

In order to make the technical solutions and advantages of the present application clearer, the present application will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application.

In one of the embodiments, as shown in FIG. 1, a meta-learning-based target classification method is provided. In this embodiment, the method is described as applied to a terminal. It can be understood that the method can also be applied to a server, or to a system including a terminal and a server, in which case it is realized through the interaction between the terminal and the server. In this embodiment, the method includes the following steps:

S102: Acquire new data, and construct a reference sample according to the new data.

Specifically, newly added data is data that has newly appeared; taking papers as an example, when papers in a new category appear, the papers belonging to that category are newly added data. The reference sample is constructed from the newly added data and is a subset of it: when a large amount of new data has accumulated over a period of time, a small part of it is extracted and classified to obtain the reference sample. The amount of extracted data is small, for example below a threshold such as 10 papers.

S104: Obtain the target to be classified according to the newly added data and the reference sample.

Specifically, the target to be classified is the data in the newly added data other than the reference sample; taking papers as an example, it is the newly added papers that have not yet been classified. In other words, the targets to be classified and the reference samples together make up all the newly added data: the reference samples are labeled and few in number, for example 10, and the large remainder are the targets to be classified.
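Steps S102 and S104 can be illustrated with a minimal Python sketch. It assumes that the reference subset is drawn at random and is labeled by hand afterwards (the labeling itself is not shown); the function name `split_new_data`, its parameters and the toy paper identifiers are hypothetical and do not come from the application.

```python
import random

def split_new_data(new_items, max_reference=10, seed=0):
    """Split newly added data into a small reference subset and the remaining targets."""
    rng = random.Random(seed)
    k = min(max_reference, len(new_items))
    # Indices of the few items that will be labeled manually (assumed random choice).
    reference_ids = set(rng.sample(range(len(new_items)), k))
    reference = [item for i, item in enumerate(new_items) if i in reference_ids]
    targets = [item for i, item in enumerate(new_items) if i not in reference_ids]
    return reference, targets

papers = [f"paper-{i}" for i in range(50)]
reference, targets = split_new_data(papers)
print(len(reference), len(targets))
```

Every newly added item ends up in exactly one of the two lists, which matches the statement that the reference samples and the targets to be classified together constitute all the newly added data.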

S106: Input the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification of the reference sample, wherein the target classification model is obtained by training based on meta-learning.

Specifically, the target classification model is obtained by training based on meta-learning, a technique in the field of artificial intelligence: multiple meta-training tasks are constructed from sample data, and the target classification model is then trained on the constructed meta-training tasks. Each meta-training task provides a small number of support samples and a large number of query samples; training on these support samples and query samples yields a target classification model that can recognize data of a new category from only a few samples of that category.

The server inputs the reference sample and the target to be classified into the pre-generated target classification model, so that the target classification model processes the reference sample and the target to be classified, and calculates the probability that the processed target to be classified belongs to the category to which the reference sample belongs.

The target classification model processing the reference samples may include: a process of vectorized representation of the reference samples and the target to be classified, and a step of calculating the first probability of the target to be classified according to the vectorized reference samples.

The process of vectorized representation may include: computing the word sequences of the reference sample and the target to be classified; processing these word sequences, for example by inputting them into a BERT model, to obtain a high-order feature representation of each word; and finally performing an average pooling operation over the high-order features of the words of the reference sample and of the target to be classified, respectively, to obtain the vectorized representations of the reference sample and the target to be classified.

The step of calculating the first probability of the target to be classified according to the vectorized reference sample may include: calculating, with a pre-trained model, the first probability of the target to be classified according to the vectorized reference sample:

The output of the sigmoid activation function is a real number between 0 and 1, so whether the target to be classified and the reference sample belong to the same category can be determined according to P. The atten function is used to calculate the contribution of each reference sample to the classification of the target to be classified. ⊙ represents the inner product of two vectors, and T is a real number that controls the sharpness of the distribution obtained by atten. k represents the serial number of the reference sample, and its range depends on the number of reference samples.
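The symbol definitions above leave the exact formula open. The following Python sketch shows one form the calculation could take, under two stated assumptions: atten is a softmax over the inner products divided by T, and P is the sigmoid of the atten-weighted sum of those inner products. These assumptions, the function names and the toy vectors are illustrative only and are not taken from the application.

```python
import math

def dot(u, v):
    """Inner product of two vectors (the ⊙ operation)."""
    return sum(a * b for a, b in zip(u, v))

def atten(reference_vecs, target_vec, T=1.0):
    """Assumed form: softmax over inner products scaled by the temperature T."""
    scores = [dot(s, target_vec) / T for s in reference_vecs]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Numerically stable sigmoid; the result lies between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def first_probability(reference_vecs, target_vec, T=1.0):
    """Assumed form: P = sigmoid(sum_k atten_k * (s_k ⊙ q))."""
    weights = atten(reference_vecs, target_vec, T)
    logit = sum(w * dot(s, target_vec) for w, s in zip(weights, reference_vecs))
    return sigmoid(logit)

reference_vecs = [[1.0, 0.0], [0.8, 0.2]]
p_similar = first_probability(reference_vecs, [1.0, 0.0])
p_dissimilar = first_probability(reference_vecs, [-1.0, 0.0])
print(p_similar > 0.5, p_dissimilar < 0.5)
```

In this sketch a smaller T concentrates the atten weights on the reference samples closest to the target, which is one way to read the statement that T controls the sharpness of the distribution.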

S108: Determine the category to which the target to be classified belongs according to the first probability.

Specifically, the server may preset a probability threshold, and use the probability threshold to determine the category to which the target to be classified belongs. Since the output of the sigmoid activation function is a real number between 0 and 1, the decision is equivalent to a binary classification problem: a probability greater than 0.5 means the categories are the same, and a probability less than 0.5 means they are different. In other embodiments, the preset probability threshold may be determined according to the output range of the activation function.

It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned new data and the classification corresponding to the new data, the above-mentioned new data and the corresponding classification of the new data can also be stored in a node of a blockchain.

The above meta-learning-based target classification method determines the reference sample according to the newly added data, so that only the reference sample and the target to be classified need to be input into the pre-generated target classification model to obtain the classification of the target to be classified. Targets in the field of artificial intelligence can thus be classified automatically, without manual intervention and without specialized knowledge of the field, which greatly reduces labor costs. When a new category of data arrives, there is no need to retrain the model; only a few support samples are needed to label the targets to be classified.

In one embodiment, the newly added data includes a plurality of categories. Constructing a reference sample according to the newly added data includes: grouping the newly added data according to the categories, and constructing a reference sample corresponding to each group. Inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the category to which the reference sample belongs includes: inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each category.

Specifically, the newly added data may belong to multiple groups, and the same item, for example the same paper, may carry multiple labels. The server therefore first obtains a small amount of sample data from the newly added data, groups this sample data, and constructs the reference samples corresponding to each group. It should be noted that one item of sample data can be assigned to several groups at the same time, that is, the same item has several labels, so the constructed groups may contain duplicate reference samples.

Correspondingly, the target classification model outputs multiple probabilities, one for each group of reference samples, so the number of probabilities equals the number of groups. From these probabilities the server can determine the multiple categories to which the target to be classified belongs, which achieves the technical effect of attaching several labels to one paper at the same time. Existing machine-learning-based paper classification models are generally single-label, that is, a paper can belong to only one sub-category. In reality a paper can have multiple labels, and some papers span several fields, so giving such a paper a single label is inappropriate.

In the above embodiment, since multiple sets of reference samples are constructed, it is possible to simultaneously tag a paper with multiple labels.
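As a minimal Python sketch of this multi-label decision, suppose the model has already produced one first probability per group of reference samples. The function name `assign_labels`, the category names and the probability values are hypothetical; the 0.5 threshold is the example value used in this description.

```python
def assign_labels(probabilities, threshold=0.5):
    """Return every category whose probability exceeds the threshold."""
    return sorted(c for c, p in probabilities.items() if p > threshold)

# One probability per group of reference samples (illustrative values only).
probs = {
    "image segmentation": 0.91,
    "object detection": 0.64,
    "machine translation": 0.07,
}
labels = assign_labels(probs)
print(labels)  # ['image segmentation', 'object detection']
```

Because each category is decided independently, a target can receive several labels, one label, or none.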

In one of the embodiments, the training method of the target classification model includes: acquiring sample data, constructing multiple groups of meta-training samples according to the sample data; and performing training according to the meta-training samples to obtain the target classification model.

Specifically, the sample data may be preset samples that have already been classified, such as classified papers. The meta-training samples are obtained by processing the sample data. Each group of meta-training samples can include multiple support samples and multiple query samples; the support samples can cover several groups of sample data, that is, sample data belonging to different categories, and the query samples are drawn from the same groups. The number of groups of meta-training samples can be set as required, for example 10,000, and the target classification model is then obtained by training on the meta-training samples. The accuracy of the target classification model can also be evaluated with the meta-training samples: for example, the support samples and the query samples in a group of meta-training samples are input into the target classification model to determine the classification of each query sample, this classification is compared with the actual classification, and if the result meets expectations, model training is complete.

In one embodiment, training according to the meta-training samples to obtain the target classification model includes: serializing the words of each support sample and each query sample of each group of meta-training samples; performing high-order feature processing on each serialized word to obtain the corresponding high-order feature representation; performing an average pooling operation on the high-order feature representations to obtain the vector representation corresponding to each support sample and the vector representation corresponding to each query sample; and training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model.

Specifically, word serialization refers to converting the words of the support sample and of the query sample into ordered sequences. The input of the support sample is its word sequence with the special word [CLS] placed at the beginning and the special word [SEP] placed at the end, and the input of the query sample is constructed from its word sequence in the same way.

[CLS] and [SEP] are two special words in BERT. They are added during BERT pre-training so that the model can locate the boundaries of the input sentence; therefore, when BERT is fine-tuned on a downstream task, these two special words must also be added, one at the beginning and the other at the end. In the notation of this embodiment, S is the first letter of "support" and marks the words of the support sample, Q is the first letter of "query" and marks the words of the query sample, m represents the total number of words in the support sample, and n represents the total number of words in the query sample.
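The construction of the model input can be sketched in Python as follows. Whitespace splitting stands in for a real BERT tokenizer, which is an assumption made to keep the example self-contained; the function name `build_model_input` and the sample texts are hypothetical.

```python
def build_model_input(text):
    """Wrap a sample's word sequence in the [CLS] and [SEP] special words."""
    # Whitespace splitting is a stand-in for a real BERT tokenizer (assumption).
    words = text.split()
    return ["[CLS]"] + words + ["[SEP]"]

support_input = build_model_input("few shot text classification")
query_input = build_model_input("meta learning for paper tagging")
print(support_input)
print(query_input)
```

A support sample with m words therefore yields an input of length m + 2, and a query sample with n words an input of length n + 2.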

The high-order feature representation can be produced by the BERT model, which yields a high-order feature representation for each word: s_i for the i-th word of the support sample and q_j for the j-th word of the query sample.

The vectorized representation can be obtained by an average pooling operation, for example by the following formula:

s_rep = MEAN_POOLING_i(s_i)

q_rep = MEAN_POOLING_j(q_j)

The obtained s_rep represents the feature representation of the entire support sample, and q_rep represents the feature representation of the entire query sample.
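The average pooling operation can be sketched in plain Python as below. The per-word vectors are toy values standing in for the high-order features produced by BERT, and the function name `mean_pooling` is chosen only to mirror the notation above.

```python
def mean_pooling(word_vectors):
    """Average per-word feature vectors into one vector for the whole sample."""
    count = len(word_vectors)
    dim = len(word_vectors[0])
    return [sum(vec[d] for vec in word_vectors) / count for d in range(dim)]

# Toy per-word features for a three-word sample (illustrative values only).
s_words = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
s_rep = mean_pooling(s_words)
print(s_rep)  # [3.0, 4.0]
```

The pooled vector has the same dimension as each per-word feature regardless of how many words the sample contains, so support and query samples of different lengths can be compared directly.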

In one embodiment, acquiring sample data and constructing multiple groups of meta-training samples according to the sample data includes: crawling classified sample data from a preset website, and grouping the sample data according to classification; randomly extracting at least one group from the groups, and determining a first quantity of sample data in each extracted group as support samples and a second quantity of sample data as query samples; obtaining a group of meta-training samples from the support samples and the query samples; and repeating the step of randomly extracting at least one group to obtain multiple groups of meta-training samples.

Specifically, relatively accurate classifications of papers already exist on the Internet. For example, Papers with Code, a relatively mature website for classifying and labeling work in the field of artificial intelligence, already provides manually curated categories of artificial intelligence papers together with the papers in each category. Crawling this data yields labeled paper-category datasets without any manual re-labeling, which greatly reduces the workload. The sub-tasks of each field are crawled from this website; there are about 16 major categories, more than 400 middle categories, and more than 1,200 sub-categories. For each sub-category, the corresponding paper titles, paper abstracts and paper download addresses are crawled.

Specifically, the server randomly selects at least one group from the more than 1,200 sub-categories above, for example 10 groups, denoted l₁, l₂, ..., l₁₀. From each of these 10 groups, a first number of samples, for example 10, is randomly selected as support samples, and a second number, for example 100, is randomly selected as query samples, giving a total of 100 support samples and 1,000 query samples. The dataset constructed in this way is one meta-training task, whose purpose is to train the model to classify the query samples given the support samples. To train the model, 10,000 such meta-training tasks can be constructed.
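The construction of one meta-training task can be sketched in Python as follows. The sketch assumes that support and query samples within a group do not overlap and that groups with too few samples are skipped; neither point is stated in this description. The function name `build_meta_task`, its parameters and the toy data are hypothetical.

```python
import random

def build_meta_task(groups, n_groups=10, n_support=10, n_query=100, rng=None):
    """Sample one meta-training task: labeled support and query samples."""
    rng = rng or random.Random()
    # Only groups large enough to supply disjoint support and query sets (assumption).
    eligible = [g for g, items in groups.items() if len(items) >= n_support + n_query]
    chosen = rng.sample(eligible, n_groups)
    support, query = [], []
    for label in chosen:
        picked = rng.sample(groups[label], n_support + n_query)
        support += [(item, label) for item in picked[:n_support]]
        query += [(item, label) for item in picked[n_support:]]
    return support, query

# Toy stand-in for the crawled data: 20 sub-categories with 150 papers each.
groups = {f"category-{g}": [f"paper-{g}-{i}" for i in range(150)] for g in range(20)}
support, query = build_meta_task(groups, rng=random.Random(0))
print(len(support), len(query))  # 100 1000
```

Calling `build_meta_task` repeatedly, for example 10,000 times, would produce the multiple groups of meta-training samples described above.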

In one of the embodiments, randomly extracting at least one group from the groups includes: randomly extracting a preset number of groups from the groups, the preset number being greater than or equal to 2. Training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model includes: obtaining the real classification corresponding to each query sample; calculating the model classification corresponding to each query sample according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, the model classification including a number of second probabilities corresponding to the preset number; and training according to the real classification and the second probabilities to obtain the target classification model.

In this embodiment, multi-label classification, in which one paper is given several labels at the same time, is achieved by setting the number of groups. The target classification model outputs multiple probabilities for the target to be classified, one for the category of each group of reference samples, so the number of probabilities equals the number of groups. The server can then determine from these probabilities the multiple categories to which the target to be classified belongs, which achieves the technical effect of attaching several labels to one paper at the same time.

In one embodiment, the target classification model is obtained by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, including: calculating the class probability of the support sample corresponding to each query sample according to the following formula:

Here, the output of the sigmoid activation function is a real number between 0 and 1; atten is used to calculate the contribution of each support sample to the classification of the query sample; ⊙ represents the inner product of two vectors; T is a real number that controls the sharpness of the distribution obtained by atten; and k represents the serial number of the support sample, whose range depends on the number of support samples. The target classification model is obtained by training the formula according to the real grouping and the class probability of each query sample.

Specifically, the probability that the query sample belongs to a given category is calculated. Since each meta-training task contains 10 categories, 10 such probabilities are obtained, and whether the query sample belongs to a category is determined by whether the corresponding probability is greater than 0.5. The model classification of the query sample obtained in this way is compared with the real grouping of the query sample to construct a loss function, and the above formula is trained with it, for example by training the parameters in the above sigmoid activation function and atten function, to obtain the target classification model.
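The description does not name the loss function. As one common choice for per-category probabilities, the Python sketch below uses binary cross-entropy averaged over the 10 categories of a meta-training task; this choice, the function name `bce_loss` and the probability values are assumptions made for illustration.

```python
import math

def bce_loss(probabilities, targets, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probabilities, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probabilities)

# A query sample that truly belongs to the first of 10 categories.
real_grouping = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
good = bce_loss([0.9] + [0.1] * 9, real_grouping)
bad = bce_loss([0.1] + [0.9] * 9, real_grouping)
uninformed = bce_loss([0.5] * 10, real_grouping)
print(good < uninformed < bad)
```

A lower loss corresponds to probabilities that fall on the correct side of the 0.5 threshold for more categories, so minimizing it pushes the model classification toward the real grouping.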

Specifically, referring to FIG. 2, FIG. 2 is a flowchart of a meta-learning-based target classification method in another embodiment. In this embodiment, we first draw on Papers with Code, a relatively mature website for classifying and labeling work in the field of artificial intelligence, which already provides manually curated categories of artificial intelligence papers together with the papers in each category. Crawling this data lets us form labeled paper-category datasets without labeling them ourselves. The sub-tasks of each field are crawled from this website; there are about 16 major categories, more than 400 middle categories, and more than 1,200 sub-categories. For each sub-category, the corresponding paper titles, paper abstracts and paper download addresses are crawled.

Second, after the categories, titles and abstracts of these papers have been crawled, the training set is built. The title and abstract are concatenated as the model input, and the category of the paper is used as the label. To train the model, a series of meta-training samples first needs to be constructed. The construction rules are as follows: 10 categories are randomly selected from the 1,200 categories, denoted l₁, l₂, ..., l₁₀. From each of these 10 categories, 10 samples are randomly selected as support samples and 100 samples are randomly selected as query samples, giving a total of 100 support samples and 1,000 query samples. In this embodiment, the dataset constructed in one such round is one meta-training task, whose purpose is to train the model to classify the query samples given the support samples. To train the model, this embodiment constructs 10,000 such meta-training tasks.

After building 10,000 meta-training tasks, we start building the model. In this embodiment, the Chinese pre-trained language model BERT is used to encode the feature representation of the sentence. The main structure of the model is as follows:

The input of each support sample is its word sequence with the special word [CLS] at the beginning and the special word [SEP] at the end, and the input of each query sample is constructed from its word sequence in the same way.

After the support samples and query samples are input into BERT, a high-order feature representation is obtained for each word of these samples: s_i for the i-th word of the support sample and q_j for the j-th word of the query sample.

After obtaining the high-level feature representations of these words, the server uses an average pooling operation to obtain a unified vector representation that represents the entire sample:

s_rep = MEAN_POOLING_i(s_i)

q_rep = MEAN_POOLING_j(q_j)

After obtaining the feature representation of the entire sample, the server calculates the class probability of the query sample according to the support sample:

The output of the sigmoid activation function is a real number between 0 and 1, so P can be used to determine whether the categories of the query sample and the support sample are the same. The atten function is used to calculate the contribution of each support sample to the classification of the query sample. ⊙ represents the inner product of two vectors, and T is a real number that controls the sharpness of the distribution obtained by atten. k represents the serial number of the support sample; since this embodiment selects 10 support samples for each category, the maximum value of k is 10.

In this way, for a given category, the server can calculate the probability that the query sample belongs to this category. Since each meta-training task contains 10 categories, the server obtains 10 such probabilities, and whether the query sample belongs to each category can be determined by whether the corresponding probability is greater than 0.5.

It should be understood that although the steps in the flowcharts of FIG. 1 and FIG. 2 are shown in sequence according to the arrows, these steps are not necessarily executed in the sequence shown by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited to the order, and these steps may be performed in other orders. Moreover, at least a part of the steps in FIG. 1 and FIG. 2 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but may be executed at different times; nor are they necessarily executed sequentially, but may be performed in turn or alternately with other steps, or with at least a part of the sub-steps or stages of other steps.

In one of the embodiments, as shown in FIG. 3, a meta-learning-based target classification device is provided, including: a newly added data acquisition module 100, a to-be-classified target acquisition module 200, a model processing module 300, and a classification module 400, wherein:

The newly added data acquisition module 100 is used for acquiring new data and constructing a reference sample according to the new data;

The to-be-classified target acquisition module 200 is used for obtaining the target to be classified according to the newly added data and the reference sample;

The model processing module 300 is used for inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the category to which the reference sample belongs, wherein the target classification model is obtained through training based on meta-learning;

The classification module 400 is configured to determine the category to which the target to be classified belongs according to the first probability.

In one embodiment, the above-mentioned newly added data includes multiple categories, and the above-mentioned newly added data acquisition module 100 includes:

The grouping unit is used to group the newly added data according to the classification, and construct a reference sample corresponding to each grouping;

The above-mentioned model processing module 300 is further configured to input the reference sample and the object to be classified into the pre-generated object classification model, so as to determine the first probability that the object to be classified belongs to each classification.
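As a rough illustration of the grouping unit, the sketch below groups labelled newly added data by classification. It assumes that the labelled items of each group serve directly as that group's reference samples; the function name and the example data are hypothetical.

```python
from collections import defaultdict


def build_reference_samples(new_data):
    """Group (text, category) pairs by category.

    Assumption: the items collected under each category are used as the
    reference samples for that category.
    """
    groups = defaultdict(list)
    for text, category in new_data:
        groups[category].append(text)
    return dict(groups)


# Hypothetical newly added data: (document, classification) pairs.
new_data = [("paper A", "vision"), ("paper B", "language"), ("paper C", "vision")]
reference_samples = build_reference_samples(new_data)
print(reference_samples)
```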

In one embodiment, the above-mentioned meta-learning-based target classification device further includes:

The sample data acquisition module is used to acquire sample data and construct multiple groups of meta-training samples according to the sample data;

The training module is used to train the target classification model according to the meta-training samples.

In one embodiment, the above-mentioned training module includes:

The serialization unit is used to serialize the words of each support sample and each query sample in each group of meta-training samples;

The feature processing unit is used to perform high-order feature processing on each serialized word to obtain a corresponding high-order feature representation;

The vectorization unit is used to perform an average pooling operation on the high-order feature representation to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample;

The training unit is used for training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model.
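The average pooling performed by the vectorization unit can be sketched as follows. This is a minimal illustration that assumes each serialized word has already been mapped to a fixed-length high-order feature vector; the encoder that produces those vectors is not shown, and the function name and input values are hypothetical.

```python
def average_pool(token_vectors):
    """Average a list of equal-length per-word feature vectors into one vector."""
    if not token_vectors:
        raise ValueError("at least one token vector is required")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    # Average each dimension over all words in the sample.
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


# Hypothetical high-order features for a three-word sample.
pooled = average_pool([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(pooled)
```

The result is a single vector per sample, regardless of how many words the sample contains, which is what allows support and query samples of different lengths to be compared.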

In one embodiment, the above-mentioned sample data acquisition module may include:

The grouping unit is used to crawl the classified sample data on the preset website, and group the sample data according to the classification;

The extraction unit is used to randomly extract at least one group from the groups, and to designate a first quantity of the sample data in the extracted at least one group as support samples and a second quantity of the sample data as query samples;

The combination unit is used to obtain a set of meta training samples according to the support samples and the query samples;

The loop unit is used for repeating the step of randomly extracting at least one group from the groups to obtain multiple groups of meta training samples.
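The grouping, extraction, combination, and loop units together describe episodic sampling. The sketch below is one possible reading under stated assumptions: the sample data has already been crawled and grouped by classification, and the first and second quantities are drawn per extracted group. All function names, parameter names, and data are hypothetical.

```python
import random


def sample_episode(groups, n_way, n_support, n_query, rng):
    """Build one set of meta-training samples (support samples, query samples)."""
    # Randomly extract n_way groups (categories).
    chosen = rng.sample(sorted(groups), n_way)
    support, query = [], []
    for category in chosen:
        # Draw disjoint support and query items from the same group.
        items = rng.sample(groups[category], n_support + n_query)
        support += [(item, category) for item in items[:n_support]]
        query += [(item, category) for item in items[n_support:]]
    return support, query


def build_meta_training_set(groups, n_tasks, n_way, n_support, n_query, seed=0):
    """Repeat the random extraction to obtain multiple sets of meta-training samples."""
    rng = random.Random(seed)
    return [sample_episode(groups, n_way, n_support, n_query, rng)
            for _ in range(n_tasks)]


# Hypothetical grouped sample data: category -> list of documents.
demo_groups = {name: [f"{name}-doc{i}" for i in range(5)] for name in "abcd"}
episodes = build_meta_training_set(demo_groups, n_tasks=3, n_way=2,
                                   n_support=2, n_query=1)
```

Each element of `episodes` is one meta-training task; here every task holds 2 categories with 2 support samples and 1 query sample per category.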

In one of the embodiments, the above-mentioned extraction unit is also used to randomly extract a preset number of groups from the group, and the preset number of groups is greater than or equal to 2;

The above training unit includes:

The real classification acquisition sub-unit is used to obtain the real classification corresponding to the query sample;

The model classification acquisition subunit is used to calculate the model classification corresponding to each query sample according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, where the model classification includes a number of second probabilities corresponding to the preset number;

The training subunit is used for training according to the real classification and the second probability to obtain the target classification model.

In one embodiment, the above-mentioned training module may include:

The category probability calculation unit is used to calculate the category probability of the support sample corresponding to each query sample according to the following formula:

Among them, the output of the sigmoid activation function is a real number between 0 and 1; atten is used to calculate the contribution of each support sample to the classification of the query sample; ⊙ represents the inner product of two vectors; T is a real number used to control the sharpness of the distribution obtained by atten; k represents the serial number of the support sample, and the value of k is related to the number of support samples;

The target classification model generation unit is used for training the formula according to the real grouping and the category probability of each query sample to obtain the target classification model.
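The formula itself is not reproduced in this text, so the following is only a sketch of one computation that is consistent with the symbol definitions above; it should not be read as the formula of the application. It assumes that atten is a softmax over the inner products between the query vector and each support vector, scaled by T, and that the class probability is the sigmoid of the attention-weighted sum of those inner products. All function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    """Numerically stable sigmoid; the output lies between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def attention_weights(query, supports, temperature):
    """Assumed form of atten: softmax over k of (query . support_k) / T."""
    scores = [dot(query, s) / temperature for s in supports]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probability(query, supports, temperature=1.0):
    """Assumed class probability: sigmoid of the attention-weighted inner products."""
    weights = attention_weights(query, supports, temperature)
    score = sum(w * dot(query, s) for w, s in zip(weights, supports))
    return sigmoid(score)


# Hypothetical vector representations for one query and two support samples.
weights = attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], temperature=0.5)
p = class_probability([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], temperature=0.5)
```

Under this assumed form, a smaller T concentrates the attention on the support samples most similar to the query, while a larger T spreads it more evenly.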

For the specific definition of the meta-learning-based target classification device, reference may be made to the above definition of the meta-learning-based target classification method, which will not be repeated here. Each module in the above-mentioned device for object classification based on meta-learning can be implemented in whole or in part by software, hardware and combinations thereof. The above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided, the computer device may be a server, and the internal structure diagram of which may be shown in FIG. 4 . The computer device includes a processor, memory, a network interface, and a database connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium, an internal memory. The non-volatile storage medium stores an operating system, computer readable instructions and a database. The internal memory provides an environment for the execution of the operating system and computer-readable instructions in the non-volatile storage medium. The database of the computer equipment is used for storing newly added data and its corresponding classified data. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer readable instructions, when executed by a processor, implement a meta-learning based object classification method.

Those skilled in the art can understand that the structure shown in FIG. 4 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring newly added data, and constructing a reference sample according to the newly added data; obtaining a target to be classified according to the newly added data and the reference sample; inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification of the reference sample, wherein the target classification model is obtained by training in a meta-learning manner; and determining the classification to which the target to be classified belongs according to the first probability.

In one embodiment, the newly added data involved when the processor executes the computer-readable instructions includes a plurality of categories; constructing a reference sample according to the newly added data, as implemented when the processor executes the computer-readable instructions, includes: grouping the newly added data according to classification, and constructing a reference sample corresponding to each group; and inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification of the reference sample includes: inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each classification.

In one embodiment, the training method of the target classification model implemented when the processor executes the computer-readable instructions includes: acquiring sample data, constructing multiple sets of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the target classification model.

In one embodiment, training according to the meta-training samples to obtain the target classification model, as implemented when the processor executes the computer-readable instructions, includes: serializing the words of each support sample and each query sample in each group of meta-training samples; performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation; performing an average pooling operation on the high-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model.

In one embodiment, acquiring sample data and constructing multiple groups of meta-training samples according to the sample data, as implemented when the processor executes the computer-readable instructions, includes: crawling sample data that has already been classified on a preset website, and grouping the sample data according to classification; randomly extracting at least one group from the groups, and designating a first quantity of the sample data in the extracted at least one group as support samples and a second quantity of the sample data as query samples; obtaining a set of meta-training samples according to the support samples and the query samples; and repeating the step of randomly extracting at least one group from the groups to obtain multiple sets of meta-training samples.

In one embodiment, randomly extracting at least one group from the groups, as implemented when the processor executes the computer-readable instructions, includes: randomly extracting a preset number of groups from the groups, the preset number being greater than or equal to 2; and training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model includes: obtaining the real classification corresponding to each query sample; calculating the model classification corresponding to each query sample according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, where the model classification includes a number of second probabilities corresponding to the preset number; and training according to the real classification and the second probabilities to obtain the target classification model.

In one embodiment, training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model, as implemented when the processor executes the computer-readable instructions, includes: calculating the class probability of the support sample corresponding to each query sample:

The target classification model is obtained by training the formula according to the real grouping and class probability of each query sample.

One or more computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring newly added data, and constructing a reference sample according to the newly added data; obtaining a target to be classified according to the newly added data and the reference sample; inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification of the reference sample, wherein the target classification model is obtained by training in a meta-learning manner; and determining the classification to which the target to be classified belongs according to the first probability.

Wherein, the computer-readable storage medium may be non-volatile or volatile.

In one embodiment, the newly added data involved when the computer-readable instructions are executed by the processor includes a plurality of categories; constructing a reference sample according to the newly added data, as implemented when the computer-readable instructions are executed by the processor, includes: grouping the newly added data according to classification, and constructing a reference sample corresponding to each group; and inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification of the reference sample includes: inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each classification.

In one embodiment, the training method of the target classification model implemented when the computer-readable instructions are executed by the processor includes: acquiring sample data, and constructing multiple groups of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the target classification model.

In one embodiment, training according to the meta-training samples to obtain the target classification model, as implemented when the computer-readable instructions are executed by the processor, includes: serializing the words of each support sample and each query sample in each group of meta-training samples; performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation; performing an average pooling operation on the high-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model.

In one embodiment, acquiring sample data and constructing multiple groups of meta-training samples according to the sample data, as implemented when the computer-readable instructions are executed by the processor, includes: crawling sample data that has already been classified on a preset website, and grouping the sample data according to classification; randomly extracting at least one group from the groups, and designating a first quantity of the sample data in the extracted at least one group as support samples and a second quantity of the sample data as query samples; obtaining a set of meta-training samples according to the support samples and the query samples; and repeating the step of randomly extracting at least one group from the groups to obtain multiple sets of meta-training samples.

In one embodiment, randomly extracting at least one group from the groups, as implemented when the computer-readable instructions are executed by the processor, includes: randomly extracting a preset number of groups from the groups, the preset number being greater than or equal to 2; and training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model includes: obtaining the real classification corresponding to each query sample; calculating the model classification corresponding to each query sample according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, where the model classification includes a number of second probabilities corresponding to the preset number; and training according to the real classification and the second probabilities to obtain the target classification model.

In one embodiment, training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model, as implemented when the computer-readable instructions are executed by the processor, includes: calculating the class probability of the support sample corresponding to each query sample:

The blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated with one another by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity (anti-counterfeiting) of its information and to generate the next block. The blockchain can include the underlying blockchain platform, the platform product service layer, and the application service layer.

The embodiments of the present application may acquire and process related data based on artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.

The basic technologies of artificial intelligence generally include technologies such as sensors, special artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.

Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through computer-readable instructions. The computer-readable instructions can be stored in a computer-readable storage medium, which may be volatile or non-volatile. When executed, the computer-readable instructions may include the processes of the above-described method embodiments. Any reference to memory, storage, database or other medium used in the various embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this specification.

The above-mentioned embodiments only represent several embodiments of the present application, and the descriptions thereof are specific and detailed, but should not be construed as a limitation on the scope of the invention patent. It should be pointed out that for those skilled in the art, without departing from the concept of the present application, several modifications and improvements can be made, which all belong to the protection scope of the present application. Therefore, the scope of protection of the patent of the present application shall be subject to the appended claims.

Claims

A meta-learning-based object classification method including:

acquiring new data, and constructing a reference sample according to the new data;

obtaining the target to be classified according to the newly added data and the reference sample;

The reference sample and the target to be classified are input into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification of the reference sample, wherein the target classification model is obtained by training in a meta-learning manner; and

The category to which the object to be classified belongs is determined according to the first probability.
The method according to claim 1, wherein the newly added data includes a plurality of categories; and the constructing a reference sample according to the newly added data comprises:

Grouping the newly added data according to categories, and constructing a reference sample corresponding to each grouping; and

The inputting the reference sample and the target to be classified into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification of the reference sample includes:

The reference sample and the object to be classified are input into a pre-generated object classification model to determine the first probability that the object to be classified belongs to each classification.
The method according to claim 1 or 2, wherein the training method of the target classification model comprises:

obtaining sample data, and constructing multiple sets of meta-training samples based on the sample data; and

The target classification model is obtained by training according to the meta-training samples.
The method according to claim 3, wherein the target classification model obtained by training according to the meta-training samples comprises:

Serialize the words of each support sample and each query sample in each group of meta-training samples;

Perform high-order feature processing on each word after serialization to obtain the corresponding high-order feature representation;

performing an average pooling operation on the higher-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and

The target classification model is obtained by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample.
The method according to claim 4, wherein the obtaining sample data, and constructing multiple sets of meta-training samples based on the sample data comprises:

Crawling the sample data that has been classified on the preset website, and grouping the sample data according to the classification;

Randomly extract at least one grouping from the grouping, and determine that the first quantity of sample data in the extracted at least one grouping is a support sample, and the second quantity of sample data is a query sample;

obtaining a set of meta-training samples from the support samples and the query samples; and

The step of randomly extracting at least one group from the groups is repeated to obtain multiple sets of meta training samples.
The method of claim 5, wherein said randomly extracting at least one grouping from said groupings comprises:

Randomly extract a preset number of groups from the grouping, and the preset number of groups is greater than or equal to 2;

The target classification model is obtained by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, including:

obtaining the real classification corresponding to the query sample;

According to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, a model classification corresponding to each query sample is calculated, where the model classification includes a number of second probabilities corresponding to the preset number; and

A target classification model is obtained by training according to the true classification and the second probability.
The method according to claim 4, wherein the target classification model obtained by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample comprises:

The class probability of the support sample corresponding to each query sample is calculated according to the following formula:

Among them, the output of the sigmoid activation function is a real number between 0 and 1; atten is used to calculate the contribution of each support sample to the classification of the query sample; ⊙ represents the inner product of two vectors; T is a real number used to control the sharpness of the distribution obtained by atten; k represents the serial number of the support sample, and the value of k is related to the number of support samples; and

The target classification model is obtained by training the formula according to the real grouping of each query sample and the class probability.
A target classification device based on meta-learning, comprising:

a new data acquisition module, used for acquiring new data, and constructing a reference sample according to the new data;

an acquisition module for a target to be classified, configured to obtain a target to be classified according to the newly added data and the reference sample;

a model processing module, configured to input the reference sample and the target to be classified into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the category to which the reference sample belongs, wherein the target classification model is obtained by training in a meta-learning manner; and

a classification module, configured to determine the classification to which the target to be classified belongs according to the first probability.
The device according to claim 8, wherein the newly added data includes a plurality of categories, and the newly added data acquisition module comprises:

a grouping unit for grouping the newly added data according to classification, and constructing a reference sample corresponding to each grouping; and

The model processing module is configured to input the reference sample and the object to be classified into a pre-generated object classification model, so as to determine the first probability that the object to be classified belongs to each classification.
A computer device comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:

acquiring new data, and constructing a reference sample according to the new data;

obtaining the target to be classified according to the newly added data and the reference sample;

The reference sample and the target to be classified are input into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification of the reference sample, wherein the target classification model is obtained by training in a meta-learning manner; and

The category to which the object to be classified belongs is determined according to the first probability.
The computer device of claim 10, wherein the newly added data involved when the processor executes the computer-readable instructions includes a plurality of categories; and the constructing a reference sample according to the newly added data, which is implemented when the processor executes the computer-readable instructions, includes:

Grouping the newly added data according to categories, and constructing a reference sample corresponding to each grouping; and

The inputting the reference sample and the target to be classified into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification of the reference sample, which is implemented when the processor executes the computer-readable instructions, includes:

The reference sample and the object to be classified are input into a pre-generated object classification model to determine the first probability that the object to be classified belongs to each classification.
The computer device according to claim 10 or 11, wherein the training method of the target classification model involved when the processor executes the computer-readable instructions comprises:

obtaining sample data, and constructing multiple sets of meta-training samples based on the sample data; and

The target classification model is obtained by training according to the meta-training samples.
The computer device according to claim 12, wherein the obtaining the target classification model by training according to the meta-training samples, which is implemented when the processor executes the computer-readable instructions, comprises:

Serialize the words of each support sample and each query sample in each group of meta-training samples;

Perform high-order feature processing on each word after serialization to obtain the corresponding high-order feature representation;

performing an average pooling operation on the higher-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and

The target classification model is obtained by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample.
The computer device according to claim 13, wherein the obtaining sample data, and constructing multiple sets of meta-training samples based on the sample data, which is implemented when the processor executes the computer-readable instructions, comprises:

Crawling the sample data that has been classified on the preset website, and grouping the sample data according to the classification;

Randomly extract at least one grouping from the grouping, and determine that the first quantity of sample data in the extracted at least one grouping is a support sample, and the second quantity of sample data is a query sample;

obtaining a set of meta-training samples from the support samples and the query samples; and

The step of randomly extracting at least one group from the groups is repeated to obtain multiple sets of meta training samples.
The computer device of claim 14, wherein the randomly extracting at least one group from the groups, which is implemented when the processor executes the computer-readable instructions, comprises:

Randomly extract a preset number of groups from the grouping, and the preset number of groups is greater than or equal to 2;

When the processor executes the computer-readable instructions, the target classification model obtained by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample includes:

obtaining the real classification corresponding to the query sample;

According to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, a model classification corresponding to each query sample is calculated, where the model classification includes a number of second probabilities corresponding to a preset number of samples ;and

A target classification model is obtained by training according to the true classification and the second probability.
The computer device according to claim 13, wherein the obtaining of the target classification model by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, implemented when the processor executes the computer-readable instructions, comprises:

calculating, according to the following formula, the class probability of the support sample corresponding to each query sample:

wherein the output of the sigmoid activation function is a real number between 0 and 1; atten is used to calculate the contribution of each support sample to the classification of the query sample; ⊙ denotes the inner product of two vectors; T is a real number used to control the sharpness of the distribution produced by atten; and k denotes the index of a support sample, the value of k depending on the number of support samples; and

obtaining the target classification model by training the formula according to the true group of each query sample and the class probability.
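The formula referred to in this claim is not reproduced in the present text, so the following sketch should not be read as the claimed formula. It shows only one plausible form that is consistent with the symbol descriptions above, under two assumptions: atten is taken to be a softmax over the inner products scaled by 1/T, and the class probability is taken to be the sigmoid of the attention-weighted sum of those inner products. All function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    """Sigmoid activation; its output is a real number between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_weights(query, supports, temperature):
    """Assumed form of atten: softmax over the inner products divided by T."""
    scores = [dot(query, s) / temperature for s in supports]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probability(query, supports, temperature=1.0):
    """Assumed form: sigmoid of the attention-weighted sum of inner products."""
    weights = attention_weights(query, supports, temperature)
    pooled = sum(w * dot(query, s) for w, s in zip(weights, supports))
    return sigmoid(pooled)
```

Under these assumptions, a small T concentrates the weights on the support samples most similar to the query sample, while a large T spreads them almost uniformly. This matches the description of T as controlling the sharpness of the distribution produced by atten.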
One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:

acquiring newly added data, and constructing a reference sample according to the newly added data;

obtaining a target to be classified according to the newly added data and the reference sample;

inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification to which the reference sample belongs, wherein the target classification model is obtained by training in a meta-learning manner; and

determining the classification to which the target to be classified belongs according to the first probability.
18. The storage medium of claim 17, wherein the newly added data involved when the computer-readable instructions are executed by the processor comprises a plurality of categories; the constructing of the reference sample according to the newly added data, implemented when the computer-readable instructions are executed by the processor, comprises:

grouping the newly added data according to category, and constructing a reference sample corresponding to each group; and

the inputting of the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification to which the reference sample belongs, implemented when the computer-readable instructions are executed by the processor, comprises:

inputting the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each classification.
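The per-category use of the first probability can be sketched as follows. The trained target classification model is represented by an arbitrary callable `model(reference_samples, target)` that returns a probability. The word-overlap `toy_model` is an invented placeholder, not the trained model, and choosing the category with the highest first probability is an assumed decision rule, since the claims state only that the classification is determined according to the first probability.

```python
def classify(target, references, model):
    """Score the target against each category's reference samples and pick the best.

    references: dict mapping category -> list of reference samples
    model: callable (reference_samples, target) -> first probability in [0, 1]
    """
    probabilities = {category: model(refs, target)
                     for category, refs in references.items()}
    # Assumed decision rule: the category with the highest first probability wins.
    best = max(probabilities, key=probabilities.get)
    return best, probabilities


def toy_model(reference_samples, target):
    """Placeholder scorer: best word-overlap ratio with any reference sample."""
    target_words = set(target.split())
    return max(len(target_words & set(ref.split())) / len(target_words)
               for ref in reference_samples)


# Toy reference samples, invented for this example only.
references = {
    "cv": ["image segmentation network"],
    "nlp": ["text translation model"],
}
best, probabilities = classify("neural text translation", references, toy_model)
print(best)  # nlp
```

Because the reference samples are supplied at inference time, a new category can be handled by adding an entry to `references` without retraining the model, which is the property the meta-learning setup relies on.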
The storage medium according to claim 17 or 18, wherein the training manner of the target classification model involved when the computer-readable instructions are executed by the processor comprises:

obtaining sample data, and constructing multiple groups of meta-training samples according to the sample data; and

obtaining the target classification model by training according to the meta-training samples.
The storage medium according to claim 19, wherein the obtaining of the target classification model by training according to the meta-training samples, implemented when the computer-readable instructions are executed by the processor, comprises:

serializing the words of each support sample and each query sample in each group of meta-training samples;

performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation;

performing an average pooling operation on the high-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and

obtaining the target classification model by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample.