WO2022134586A1 - Meta-learning-based target classification method and apparatus, device and storage medium - Google Patents

Meta-learning-based target classification method and apparatus, device and storage medium

Info

Publication number
WO2022134586A1
WO2022134586A1 PCT/CN2021/109571 CN2021109571W WO2022134586A1 WO 2022134586 A1 WO2022134586 A1 WO 2022134586A1 CN 2021109571 W CN2021109571 W CN 2021109571W WO 2022134586 A1 WO2022134586 A1 WO 2022134586A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
target
training
classified
classification model
Prior art date
Application number
PCT/CN2021/109571
Other languages
French (fr)
Chinese (zh)
Inventor
刘玉
徐国强
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2022134586A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • the present application relates to a meta-learning-based object classification method, apparatus, device and storage medium.
  • According to various embodiments disclosed in the present application, a meta-learning-based object classification method, apparatus, device, and storage medium are provided.
  • a meta-learning-based object classification method including:
  • the reference sample and the target to be classified are input into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the category of the reference sample, wherein the target classification model is obtained through training based on meta-learning;
  • the category to which the object to be classified belongs is determined according to the first probability.
  • a target classification device based on meta-learning comprising:
  • a new data acquisition module used for acquiring new data, and constructing a reference sample according to the new data
  • an acquisition module for a target to be classified configured to obtain a target to be classified according to the newly added data and the reference sample
  • a model processing module configured to input the reference sample and the target to be classified into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the category to which the reference sample belongs, wherein the target classification model is obtained through training based on meta-learning;
  • a classification module configured to determine the classification to which the object to be classified belongs according to the probability.
  • a computer device comprising a memory and one or more processors, the memory having computer-readable instructions stored therein, wherein the computer-readable instructions, when executed by the one or more processors, cause the one or more processors to execute the following steps:
  • the reference sample and the target to be classified are input into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the category of the reference sample, wherein the target classification model is obtained through training based on meta-learning;
  • the category to which the object to be classified belongs is determined according to the first probability.
  • One or more computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • the reference sample and the target to be classified are input into a pre-generated target classification model to determine the first probability that the target to be classified belongs to the category of the reference sample, wherein the target classification model is obtained through training based on meta-learning;
  • the category to which the object to be classified belongs is determined according to the first probability.
  • the above-mentioned meta-learning-based target classification method, apparatus, device and storage medium determine the reference sample according to the newly added data, so that only the reference sample and the target to be classified need to be input into the pre-generated target classification model to obtain the category of the target to be classified.
  • targets in the field of artificial intelligence can thus be classified automatically, without manual intervention and without specialist knowledge of the field, which greatly reduces labor costs. When data of a new type arrives, there is no need to retrain the model; only a few labeled supporting samples are needed to classify the targets to be classified.
  • FIG. 1 is a schematic flowchart of a meta-learning-based object classification method according to one or more embodiments.
  • FIG. 2 is a schematic flowchart of a meta-learning-based object classification method according to one or more other embodiments.
  • FIG. 3 is a structural block diagram of an apparatus for object classification based on meta-learning according to one or more embodiments.
  • FIG. 4 is a diagram of the internal structure of a computer device in accordance with one or more embodiments.
  • a meta-learning-based target classification method is provided.
  • in this embodiment, the method is described as applied to a terminal. It can be understood that the method can also be applied to a server, or to a system including a terminal and a server, where it is realized through the interaction between the terminal and the server.
  • the method includes the following steps:
  • S102 Acquire new data, and construct a reference sample according to the new data.
  • specifically, new data is data that has been newly added; taking papers as an example, when papers in a new category appear, the papers belonging to that category are new data.
  • the reference sample is constructed based on the newly added data.
  • the reference sample is a subset of the newly added data: over a period of time a large amount of new data accumulates, a small portion of it is extracted, and the extracted portion is classified to obtain the reference sample. The amount of extracted data is small, for example below a threshold such as 10 articles.
  • the target to be classified is the data in the newly added data other than the reference sample; taking papers as an example, it is the newly added, unclassified papers. In other words, the target to be classified and the reference sample together make up all the newly added data: the reference samples are labeled and few in number (for example, 10), and the large remainder is the target to be classified.
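The split described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `split_new_data`, the random selection, and the fixed seed are assumptions, and the default of 10 only mirrors the "for example, 10" in the text. The manual labeling of the extracted pool is outside the sketch.

```python
import random

def split_new_data(new_items, n_reference=10, seed=0):
    """Split newly added items into a small pool to label and the remainder.

    Hypothetical helper: random selection and the seed are assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(new_items)
    rng.shuffle(shuffled)
    to_label = shuffled[:n_reference]     # labeled by hand -> reference samples
    to_classify = shuffled[n_reference:]  # targets to be classified
    return to_label, to_classify

papers = [f"paper-{i}" for i in range(50)]
reference_pool, targets = split_new_data(papers)
print(len(reference_pool), len(targets))  # 10 40
```

The two parts are disjoint and together cover all the new data, which matches the statement that the reference sample and the target to be classified make up all the newly added data.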
  • S106 Input the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the classification of the reference sample, wherein the target classification model is obtained by training based on meta-learning.
  • the target classification model is obtained by training based on meta-learning in the field of artificial intelligence technology, wherein multiple meta-training tasks are constructed according to sample data, and then the target classification model is obtained by training through the constructed meta-training tasks.
  • each meta-training task provides a small number of support samples and a large number of query samples; training on these support and query samples yields a target classification model that can recognize data of a new category from only a few samples of that category.
  • the server inputs the reference sample and the target to be classified into the pre-generated target classification model, so that the target classification model processes the reference sample and the target to be classified, and calculates the probability that the target to be classified belongs to the category to which the reference sample belongs.
  • the target classification model processing the reference samples may include: a process of vectorized representation of the reference samples and the target to be classified, and a step of calculating the first probability of the target to be classified according to the vectorized reference samples.
  • the process of vectorized representation may include: obtaining the word sequences of the reference sample and the target to be classified; processing these word sequences, for example by inputting them into the BERT model, to obtain a high-level feature representation of each word; and finally performing an average pooling operation over the high-level features of the words of the reference sample and of the target to be classified, respectively, to obtain the corresponding vectorized representations.
  • the step of calculating the first probability of the object to be classified according to the vectorized reference sample may include: calculating the first probability from the vectorized reference sample and the vectorized target to be classified using the pre-trained model:
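The average pooling step can be illustrated without an actual encoder. In the sketch below the per-word feature vectors are made-up numbers standing in for the high-level features a model such as BERT would produce; `mean_pool` is a hypothetical helper name.

```python
def mean_pool(token_features):
    """Average per-word feature vectors into one vector for the whole sample."""
    if not token_features:
        raise ValueError("need at least one word")
    n = len(token_features)
    dim = len(token_features[0])
    return [sum(tok[d] for tok in token_features) / n for d in range(dim)]

# Stand-in for encoder output: 3 words, 2 features each (illustrative values).
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(mean_pool(features))  # [3.0, 4.0]
```

Pooling gives every sample a vector of the same dimension regardless of its length, which is what allows reference samples and targets to be compared directly.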
  • where the output of the sigmoid activation function is a real number between 0 and 1, so whether the target to be classified and the reference sample belong to the same category can be determined from P.
  • the atten function is used to calculate the contribution of each reference sample to the classification of the target to be classified.
  • the product symbol in the formula represents the inner product of two vectors
  • T is a real number that controls the sharpness of the distribution obtained by atten.
  • k represents the serial number of the reference sample, and its value is related to the number of reference samples.
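The formula itself did not survive in this copy of the text, so the sketch below is only one plausible reading of the symbol definitions, not the formula of the application. It assumes that `atten` is a softmax over the inner products divided by T, and that P is the sigmoid of the attention-weighted sum of those inner products. Both assumptions, and all function names, are illustrative.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def atten(scores, temperature):
    """Assumed form of atten: softmax over scores / temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def first_probability(reference_vecs, target_vec, temperature=1.0):
    """Assumed form of P: sigmoid of the attention-weighted similarity."""
    scores = [dot(r, target_vec) for r in reference_vecs]  # one score per k
    weights = atten(scores, temperature)
    pooled = sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-pooled))

refs = [[0.9, 0.1], [0.8, 0.3], [0.1, 0.9]]  # illustrative reference vectors
target = [1.0, 0.2]
p = first_probability(refs, target, temperature=0.5)
print(round(p, 3))
```

Under this reading, a smaller T concentrates the weights on the most similar reference samples, which is one way to interpret "sharpness of the distribution".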
  • S108 Determine the category to which the target to be classified belongs according to the first probability.
  • the server may preset a probability threshold and use it to determine the category to which the object to be classified belongs. Since the output of the sigmoid activation function is a real number between 0 and 1, the decision is equivalent to a binary classification problem: a probability greater than 0.5 means the categories are the same, and a probability less than 0.5 means they are different. In other embodiments, the preset probability threshold may be determined according to the range of the output of the activation function.
  • the above-mentioned new data and the corresponding classification of the new data can also be stored in a node of a blockchain.
  • the above-mentioned meta-learning-based target classification method determines the reference samples according to the newly added data. In this way, only the reference samples and the target to be classified need to be input into the pre-generated target classification model, that is, the classification of the target to be classified can be obtained automatically.
  • the newly added data includes a plurality of categories; constructing a reference sample according to the newly added data includes: grouping the newly added data according to the categories, and constructing a reference sample corresponding to each group; inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the category to which the reference sample belongs includes: inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each category.
  • since the newly added data may belong to multiple groups, and a single item (for example, one paper) may carry multiple tags, the server first obtains a small amount of sample data from the newly added data, then groups this sample data and constructs the reference samples corresponding to each group. It should be noted that a piece of sample data can be assigned to several groups at the same time, that is, the same data has multiple tags, so the constructed groups may contain duplicate reference samples.
  • accordingly, the target classification model outputs multiple probabilities, equal in number to the groups of reference samples: after passing through the target classification model, the probability that the target to be classified belongs to the category of each reference sample group is obtained, so that the server can determine from these probabilities the multiple categories to which the target belongs, thereby achieving the technical effect of labeling a paper with multiple labels at the same time.
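A minimal sketch of this multi-label decision follows. It assumes one probability per reference group and the 0.5 threshold mentioned earlier in the text; the category names, the probability values, and `assign_labels` are made up for illustration.

```python
def assign_labels(class_probs, threshold=0.5):
    """Keep every category whose probability exceeds the threshold."""
    return [label for label, p in class_probs.items() if p > threshold]

# One probability per reference group (illustrative values).
probs = {
    "object detection": 0.91,
    "image segmentation": 0.64,
    "machine translation": 0.08,
}
print(assign_labels(probs))  # ['object detection', 'image segmentation']
```

Because each category is decided independently, a target can receive several labels, or none if no probability exceeds the threshold.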
  • existing machine-learning-based paper classification models are generally single-label, that is, a paper can belong to only one sub-category. In reality a paper can have multiple labels, and for a paper that spans multiple fields a single label is inappropriate.
  • the training method of the target classification model includes: acquiring sample data, constructing multiple groups of meta-training samples according to the sample data; and performing training according to the meta-training samples to obtain the target classification model.
  • the sample data may be preset samples that have been classified, such as papers that have been classified.
  • the meta-training samples are obtained by processing the sample data. Each group of meta-training samples can include multiple support samples and multiple query samples; the support samples can include sample data from multiple groups, that is, sample data belonging to different categories, and the corresponding query samples are likewise drawn from the corresponding groups.
  • the number of groups of meta-training samples can be set as required, such as 10,000, and then the target classification model is obtained by training the meta-training samples.
  • the accuracy of the target classification model can be evaluated with the meta-training samples. For example, the support samples and the query samples of a group of meta-training samples are input into the target classification model to determine the classification of the query samples; the result is compared with the actual classification, and if it meets expectations, the model training is complete.
  • training according to the meta-training samples to obtain the target classification model includes: serializing the words of each support sample and each query sample in each group of meta-training samples; performing high-order feature processing on each serialized word to obtain the corresponding high-order feature representation; performing an average pooling operation on the high-order feature representations to obtain the vector representation corresponding to each support sample and the vector representation corresponding to each query sample; and training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model.
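The comparison with the actual classification could be scored in several ways; the text does not specify a metric. The sketch below uses exact match of the predicted and true label sets per query sample, which is an assumption, as is the name `episode_accuracy`.

```python
def episode_accuracy(predicted, actual):
    """Fraction of query samples whose predicted label set equals the true set."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    hits = sum(1 for p, a in zip(predicted, actual) if set(p) == set(a))
    return hits / len(actual)

predicted = [["a"], ["a", "b"], ["c"]]
actual = [["a"], ["b", "a"], ["d"]]
print(episode_accuracy(predicted, actual))  # 2 of 3 query samples match
```

Training would be considered complete once this score reaches whatever expectation has been set for the model.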
  • word serialization refers to converting each word in the support sample and the query sample into an ordered sequence. For example, let the word sequence of the support sample be $S_1, S_2, \dots, S_m$; then the input of the support sample is $[\mathrm{CLS}], S_1, S_2, \dots, S_m, [\mathrm{SEP}]$. Let the word sequence of the query sample be $Q_1, Q_2, \dots, Q_n$; then the input of the query sample is $[\mathrm{CLS}], Q_1, Q_2, \dots, Q_n, [\mathrm{SEP}]$. Here CLS and SEP are two special words in BERT. They are added during BERT pre-training so that the model can locate the input sentence, and they are therefore also added when BERT is fine-tuned for downstream tasks.
  • S is the first letter of "support" and denotes a word of the support sample
  • Q is the first letter of "query" and denotes a word of the query sample
  • m represents the total number of words in the support sample
  • n represents the total number of words in the query sample.
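The construction of the model input can be sketched as below. Splitting on whitespace is a simplification made for illustration; BERT tokenizers split text into sub-word pieces, and the helper name `build_input` is hypothetical.

```python
def build_input(words):
    """Wrap a word sequence with the special tokens [CLS] and [SEP]."""
    return ["[CLS]", *words, "[SEP]"]

support_words = "meta learning for paper classification".split()
model_input = build_input(support_words)
print(model_input)
# ['[CLS]', 'meta', 'learning', 'for', 'paper', 'classification', '[SEP]']
```

The same wrapping applies to query samples; only the word sequence differs.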
  • the high-order feature representation can be performed by the BERT model, for example, the high-order feature representation of each word can be obtained by the following formula:
  • the vectorized representation can be obtained by an average pooling operation, for example by the following formula:
  • the obtained $s_{rep}$ represents the feature representation of the entire support sample
  • $q_{rep}$ represents the feature representation of the entire query sample.
  • acquiring sample data and constructing multiple groups of meta-training samples according to the sample data includes: crawling classified sample data from a preset website, and grouping the sample data according to the classification; randomly extracting at least one group from the groups, and determining a first quantity of sample data in each extracted group as support samples and a second quantity of sample data as query samples; obtaining a group of meta-training samples from the support samples and the query samples; and repeating the step of randomly extracting at least one group to obtain multiple groups of meta-training samples.
  • the network already has manually sorted categories of papers in the field of artificial intelligence, together with the papers in each category. Crawling this data yields labeled paper-category datasets without manual re-labeling, which greatly reduces the workload.
  • the sub-tasks of each field are crawled from this website: there are about 16 major categories, more than 400 middle categories, and more than 1,200 sub-categories. For each sub-category, the corresponding paper title, paper abstract and paper download address are crawled.
  • the server randomly selects at least one group from the more than 1,200 sub-categories, for example 10 groups, which can be expressed as $l_1, l_2, \dots, l_{10}$. From each of the 10 groups $l_1, l_2, \dots, l_{10}$, the first number of samples (for example, 10) is randomly selected as support samples and the second number of samples (for example, 100) is randomly selected as query samples, giving a total of 100 support samples and 1,000 query samples.
  • the dataset thus constructed is a meta-training task whose purpose is to train the model to classify the query samples given the support samples. To train the model, 10,000 such meta-training tasks can be constructed.
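The construction of one meta-training task can be sketched as follows, using the example figures from the text (10 groups, 10 support and 100 query samples per group). The synthetic `groups` dictionary stands in for the crawled data; `build_task`, the seed, and the choice to draw support and query samples without overlap are assumptions.

```python
import random

def build_task(groups, n_groups=10, n_support=10, n_query=100, rng=None):
    """Sample one meta-training task: (support, query) lists of (item, label)."""
    rng = rng or random.Random()
    chosen = rng.sample(sorted(groups), n_groups)  # pick the categories
    support, query = [], []
    for label in chosen:
        # Draw support and query items together so they never overlap.
        picked = rng.sample(groups[label], n_support + n_query)
        support += [(item, label) for item in picked[:n_support]]
        query += [(item, label) for item in picked[n_support:]]
    return support, query

# Synthetic stand-in for crawled data: 20 categories of 120 papers each.
groups = {f"cat-{i}": [f"cat-{i}/paper-{j}" for j in range(120)] for i in range(20)}
support, query = build_task(groups, rng=random.Random(0))
print(len(support), len(query))  # 100 1000
```

Calling `build_task` repeatedly, for example 10,000 times, would yield the collection of tasks used for training. Each selected category must contain at least `n_support + n_query` samples, otherwise `random.sample` raises `ValueError`.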
  • randomly extracting at least one group from the group includes: randomly extracting a preset number of groups from the group, and the preset number of groups is greater than or equal to 2; according to the vector representation corresponding to each support sample and each The vector representation corresponding to the query sample is trained to obtain the target classification model, including: obtaining the real classification corresponding to the query sample; calculating the model classification corresponding to each query sample according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample , the model classification includes a number of second probabilities corresponding to the preset number; the target classification model is obtained by training according to the real classification and the second probability.
  • in this way a paper can be labeled with multiple labels at the same time, that is, multi-label labeling of the target to be classified is realized.
  • after passing through the target classification model, multiple probabilities are obtained, equal in number to the groups of reference samples, each giving the probability that the target to be classified belongs to the category of one reference sample group. The server can then determine from these probabilities the multiple categories to which the target belongs.
  • the target classification model is obtained by training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, including: calculating the class probability of the support sample corresponding to each query sample according to the following formula:
  • where the output of the sigmoid activation function is a real number between 0 and 1
  • atten is used to calculate the contribution of each support sample to the classification of the query sample
  • the product symbol in the formula represents the inner product of two vectors
  • T is a real number used to control the sharpness of the distribution obtained by atten
  • k represents the serial number of the support sample, and its value is related to the number of support samples
  • the target classification model is obtained by training with the above formula according to the real grouping and the class probability of each query sample.
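The text does not name the loss used to compare the class probabilities with the real grouping. Since each probability answers a same-category or different-category question, binary cross-entropy is a natural candidate; the sketch below uses it purely as an assumption, and `binary_cross_entropy` is an illustrative name.

```python
import math

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

# Target 1: the query sample belongs to the support category; 0: it does not.
good = binary_cross_entropy([0.9, 0.1], [1, 0])
bad = binary_cross_entropy([0.1, 0.9], [1, 0])
print(good < bad)  # True
```

Minimizing such a loss over many meta-training tasks pushes the probability towards 1 for the true categories of a query sample and towards 0 for the others.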
  • Fig. 2 is a flowchart of a meta-learning-based target classification method in another embodiment.
  • the network already has manually organized categories of papers in the field of artificial intelligence, as well as the papers in each category. Crawling this data yields labeled paper-category datasets without manual re-labeling. The sub-tasks of each field are crawled from this website: there are about 16 major categories, more than 400 middle categories, and more than 1,200 sub-categories. For each sub-category, the corresponding paper title, paper abstract and paper download address are crawled.
  • after the categories, titles, and abstracts of these papers have been crawled, construction of the training set begins.
  • the title and abstract are concatenated as the model input, and the category of the paper is used as the label.
  • this embodiment first needs to construct a series of meta-training samples.
  • the construction rules are as follows: 10 categories are randomly selected from the 1,200 categories, which may be expressed as $l_1, l_2, \dots, l_{10}$. From each of the 10 categories $l_1, l_2, \dots, l_{10}$, 10 samples are randomly selected as support samples and 100 samples are randomly selected as query samples, so a total of 100 support samples and 1,000 query samples is obtained.
  • the data set constructed in one such pass is a meta-training task, whose purpose is to train the model to classify the query samples given the support samples. To train the model, this example constructs 10,000 such meta-training tasks.
  • the Chinese pre-trained language model BERT is used to encode the feature representation of the sentence.
  • the main structure of the model is as follows:
  • the server uses an average pooling operation to obtain a unified vector representation that represents the entire sample:
  • the obtained $s_{rep}$ represents the feature representation of the entire support sample
  • $q_{rep}$ represents the feature representation of the entire query sample.
  • the server calculates the class probability of the query sample according to the support sample:
  • where the output of the sigmoid activation function is a real number between 0 and 1, so P can be used to determine whether the query sample and the support sample belong to the same category.
  • the atten function is used to calculate the contribution of each support sample to the classification of the query sample.
  • the product symbol in the formula represents the inner product of two vectors
  • T is a real number that controls the sharpness of the distribution obtained by atten.
  • k represents the serial number of the support sample; because this example selects 10 support samples for each category, the maximum value of k is 10.
  • the server can thus calculate the probability that the query sample belongs to a given category. Since each meta-training task contains 10 categories, the server obtains 10 such probabilities, and whether the query sample belongs to each category is determined by whether the corresponding probability is greater than 0.5.
  • it should be understood that although the steps in the flowcharts of FIG. 1 and FIG. 2 are shown in sequence according to the arrows, these steps are not necessarily executed in the sequence shown by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited to that order, and the steps may be performed in other orders. Moreover, at least some of the steps in FIG. 1 and FIG. 2 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same time and may be executed at different times; nor are they necessarily executed sequentially, but may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
  • a meta-learning-based target classification device is provided, including: a new data acquisition module 100, a to-be-classified target acquisition module 200, a model processing module 300, and a classification module 400, wherein:
  • the new data acquisition module 100 is used for acquiring new data and constructing a reference sample according to the new data;
  • the target to be classified acquisition module 200 is used for obtaining the target to be classified according to the newly added data and the reference sample;
  • the model processing module 300 is used for inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the category of the reference sample, wherein the target classification model is obtained through training based on meta-learning;
  • the classification module 400 is configured to determine the classification to which the object to be classified belongs according to the probability.
  • the above-mentioned newly added data includes multiple categories, and the above-mentioned newly added data acquisition module 100 includes:
  • the grouping unit is used to group the newly added data according to the classification, and construct a reference sample corresponding to each grouping;
  • the above-mentioned model processing module 300 is further configured to input the reference sample and the object to be classified into the pre-generated object classification model, so as to determine the first probability that the object to be classified belongs to each classification.
  • the above-mentioned meta-learning-based target classification device further includes:
  • the sample data acquisition module is used to acquire sample data and construct multiple groups of training samples according to the sample data;
  • the training module is used to train the target classification model according to the meta-training samples.
  • the above-mentioned training module includes:
  • the serialization unit is used to serialize the words of each support sample and query sample of each group of training samples
  • the feature processing unit is used to perform high-level feature processing on each serialized word to obtain a corresponding high-level feature representation
  • the vectorization unit is used to perform an average pooling operation on the high-order feature representation to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample;
  • the training unit is used for training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model.
  • the above-mentioned sample data acquisition module may include:
  • the grouping unit is used to crawl the classified sample data on the preset website, and group the sample data according to the classification
  • an extraction unit configured to randomly extract at least one grouping from the grouping, and determine that the first quantity of sample data in the extracted at least one grouping is a support sample, and the second quantity of sample data is a query sample;
  • the combination unit is used to obtain a set of meta training samples according to the support samples and the query samples;
  • the loop unit is used for repeating the step of randomly extracting at least one group from the groups to obtain multiple groups of meta training samples.
  • the above-mentioned extraction unit is also used to randomly extract a preset number of groups from the group, and the preset number of groups is greater than or equal to 2;
  • the above training unit includes:
  • the real classification acquisition sub-unit is used to obtain the real classification corresponding to the query sample
  • the model classification acquisition subunit is used to calculate the model classification corresponding to each query sample according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, wherein the model classification includes a number of second probabilities corresponding to the preset number;
  • the training subunit is used for training according to the real classification and the second probability to obtain the target classification model.
  • the above-mentioned training module may include:
  • the category probability calculation unit is used to calculate the category probability of the support sample corresponding to each query sample according to the following formula:
  • where the output of the sigmoid activation function is a real number between 0 and 1; atten is used to calculate the contribution of each support sample to the classification of the query sample; ⊙ represents the inner product of two vectors; T is a real number used to control the sharpness of the distribution obtained by atten; and k represents the serial number of the support sample, whose range depends on the number of support samples;
  • the target classification model generation unit is used for training the formula according to the real grouping and the category probability of each query sample to obtain the target classification model.
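The formula itself did not survive in this text; only its symbols are described. One form that is consistent with those descriptions (a sigmoid output, a per-support-sample weight atten, an inner product ⊙, a real number T and a support index k) is sketched below. It is an assumption for illustration only and may differ from the formula in the filing; the symbols q (query vector), s_k (vector of the k-th support sample) and K (number of support samples) are introduced here and do not appear in the original.

```latex
P(q) = \operatorname{sigmoid}\!\Big(\sum_{k=1}^{K} \operatorname{atten}(q, s_k)\,\big(q \odot s_k\big)\Big),
\qquad
\operatorname{atten}(q, s_k) = \frac{\exp\!\big((q \odot s_k)/T\big)}{\sum_{j=1}^{K} \exp\!\big((q \odot s_j)/T\big)}
```

Under this reading, a smaller T concentrates the weights on the support samples most similar to the query, which matches the description of T as controlling the sharpness of the distribution obtained by atten.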
  • Each module in the above-mentioned device for object classification based on meta-learning can be implemented in whole or in part by software, hardware and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • In one embodiment, a computer device is provided; the computer device may be a server, and its internal structure may be as shown in FIG. 4.
  • the computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions and a database.
  • the internal memory provides an environment for the execution of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used for storing newly added data and its corresponding classified data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer readable instructions when executed by a processor, implement a meta-learning based object classification method.
  • FIG. 4 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device includes a memory and one or more processors, the memory stores computer-readable instructions, and the computer-readable instructions, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring newly added data, and constructing a reference sample according to the newly added data; obtaining a target to be classified according to the newly added data and the reference sample; inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification to which the reference sample belongs, wherein the target classification model is trained based on meta-learning; and determining, according to the first probability, the classification to which the target to be classified belongs.
  • the newly added data handled when the processor executes the computer-readable instructions includes a plurality of classifications; constructing a reference sample according to the newly added data includes: grouping the newly added data by classification, and constructing a reference sample corresponding to each group; and inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability includes: inputting the reference samples and the target to be classified into the pre-generated target classification model to determine a first probability that the target to be classified belongs to each classification.
  • the training method of the target classification model implemented when the processor executes the computer-readable instructions includes: acquiring sample data, constructing multiple sets of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the target classification model.
  • training according to the meta-training samples to obtain the target classification model includes: serializing the words of each support sample and each query sample in each set of meta-training samples; performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation; performing an average pooling operation on the high-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and training according to these vector representations to obtain the target classification model.
  • acquiring sample data and constructing multiple sets of meta-training samples according to the sample data, as implemented when the processor executes the computer-readable instructions, includes: crawling already-classified sample data from a preset website, and grouping the sample data by classification; randomly extracting at least one group from the groups, and designating a first quantity of sample data in the extracted group or groups as support samples and a second quantity of sample data as query samples; obtaining a set of meta-training samples from the support samples and the query samples; and repeating the step of randomly extracting at least one group from the groups to obtain multiple sets of meta-training samples.
  • randomly extracting at least one group from the groups, as implemented when the processor executes the computer-readable instructions, includes: randomly extracting a preset number of groups from the groups, the preset number being greater than or equal to 2;
  • training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model includes: obtaining the real classification corresponding to each query sample; calculating the model classification corresponding to each query sample according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, the model classification including a number of second probabilities corresponding to the preset number; and training according to the real classification and the second probabilities to obtain the target classification model.
  • training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model includes: calculating, according to the following formula, the class probability of the support sample corresponding to each query sample:
  • where the output of the sigmoid activation function is a real number between 0 and 1; atten is used to calculate the contribution of each support sample to the classification of the query sample; ⊙ represents the inner product of two vectors; T is a real number used to control the sharpness of the distribution obtained by atten; and k represents the serial number of the support sample, whose range depends on the number of support samples;
  • the target classification model is obtained by training the formula according to the real grouping and class probability of each query sample.
  • One or more computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring newly added data, and constructing a reference sample according to the newly added data; obtaining a target to be classified according to the newly added data and the reference sample; inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification to which the reference sample belongs, wherein the target classification model is trained based on meta-learning; and determining, according to the first probability, the classification to which the target to be classified belongs.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the newly added data handled when the computer-readable instructions are executed by the processor includes a plurality of classifications; constructing a reference sample according to the newly added data includes: grouping the newly added data by classification, and constructing a reference sample corresponding to each group; and inputting the reference sample and the target to be classified into the pre-generated target classification model to determine the first probability includes: inputting the reference samples and the target to be classified into the pre-generated target classification model to determine a first probability that the target to be classified belongs to each classification.
  • the training method of the target classification model realized when the computer readable instructions are executed by the processor includes: acquiring sample data, constructing multiple groups of meta training samples according to the sample data; training according to the meta training samples to obtain the target classification model .
  • training according to the meta-training samples to obtain the target classification model includes: serializing the words of each support sample and each query sample in each set of meta-training samples; performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation; performing an average pooling operation on the high-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and training according to these vector representations to obtain the target classification model.
  • acquiring sample data and constructing multiple sets of meta-training samples according to the sample data, as implemented when the computer-readable instructions are executed by the processor, includes: crawling already-classified sample data from a preset website, and grouping the sample data by classification; randomly extracting at least one group from the groups, and designating a first quantity of sample data in the extracted group or groups as support samples and a second quantity of sample data as query samples; obtaining a set of meta-training samples from the support samples and the query samples; and repeating the step of randomly extracting at least one group from the groups to obtain multiple sets of meta-training samples.
  • randomly extracting at least one group from the groups, as implemented when the computer-readable instructions are executed by the processor, includes: randomly extracting a preset number of groups from the groups, the preset number being greater than or equal to 2;
  • training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model includes: obtaining the real classification corresponding to each query sample; calculating the model classification corresponding to each query sample according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, the model classification including a number of second probabilities corresponding to the preset number; and training according to the real classification and the second probabilities to obtain the target classification model.
  • training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model includes: calculating, according to the following formula, the class probability of the support sample corresponding to each query sample:
  • where the output of the sigmoid activation function is a real number between 0 and 1; atten is used to calculate the contribution of each support sample to the classification of the query sample; ⊙ represents the inner product of two vectors; T is a real number used to control the sharpness of the distribution obtained by atten; and k represents the serial number of the support sample, whose range depends on the number of support samples;
  • the target classification model is obtained by training the formula according to the real grouping and class probability of each query sample.
  • the blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain, essentially a decentralized database, is a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the embodiments of the present application may acquire and process related data based on artificial intelligence technology.
  • Artificial intelligence is the theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • the basic technologies of artificial intelligence generally include technologies such as sensors, special artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Abstract

A meta-learning-based target classification method, related to the technical field of artificial intelligence. The method comprises: obtaining newly added data, and constructing a reference sample according to the newly added data (S102); obtaining a target to be classified according to the newly added data and the reference sample (S104); inputting the reference sample and said target into a pre-generated target classification model to determine a first probability that said target belongs to a classification to which the reference sample belongs (S106), wherein the target classification model is trained on the basis of a meta-learning mode; and determining the classification to which said target belongs according to the first probability (S108).

Description

Meta-learning-based target classification method, apparatus, device and storage medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese patent application No. 2020115233480, filed with the China Patent Office on December 21, 2020 and entitled "Meta-learning-based target classification method, apparatus, device and storage medium", the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to a meta-learning-based target classification method, apparatus, device and storage medium.
Background
With the development of artificial intelligence, technologies such as computer vision, natural language processing and speech recognition have emerged. Researchers in this vast field each have their own focus: computer vision alone already has more than 500 subtasks, and natural language processing has more than 300. Faced with a large and varied body of academic papers, scholars in the field urgently need a system for classifying and labeling newly published papers.
However, the inventors realized that traditional machine-learning-based paper classification models can only handle paper categories that appeared in the training set; once papers of a new category arrive, these models cannot classify them correctly. In addition, a new category has little data at first. Because machine learning models usually need a large number of training samples, even if the papers of the new category are used as training data, a highly accurate classification model cannot be obtained. The model therefore performs poorly on the test set, and newly published papers are classified inaccurately.
Summary
According to various embodiments disclosed in the present application, a meta-learning-based target classification method, apparatus, device and storage medium are provided.
A meta-learning-based target classification method, comprising:
acquiring newly added data, and constructing a reference sample according to the newly added data;
obtaining a target to be classified according to the newly added data and the reference sample;
inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification to which the reference sample belongs, wherein the target classification model is trained based on meta-learning; and
determining, according to the first probability, the classification to which the target to be classified belongs.
A meta-learning-based target classification apparatus, comprising:
a newly added data acquisition module, configured to acquire newly added data and construct a reference sample according to the newly added data;
a to-be-classified target acquisition module, configured to obtain a target to be classified according to the newly added data and the reference sample;
a model processing module, configured to input the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification to which the reference sample belongs, wherein the target classification model is trained based on meta-learning; and
a classification module, configured to determine, according to the probability, the classification to which the target to be classified belongs.
A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
acquiring newly added data, and constructing a reference sample according to the newly added data;
obtaining a target to be classified according to the newly added data and the reference sample;
inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification to which the reference sample belongs, wherein the target classification model is trained based on meta-learning; and
determining, according to the first probability, the classification to which the target to be classified belongs.
One or more computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
acquiring newly added data, and constructing a reference sample according to the newly added data;
obtaining a target to be classified according to the newly added data and the reference sample;
inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification to which the reference sample belongs, wherein the target classification model is trained based on meta-learning; and
determining, according to the first probability, the classification to which the target to be classified belongs.
With the above meta-learning-based target classification method, apparatus, device and storage medium, a reference sample is determined from the newly added data, so that only the reference sample and the target to be classified need to be input into the pre-generated target classification model to obtain the classification to which the target belongs. Targets in the field of artificial intelligence can thus be classified automatically, without manual intervention or specialized knowledge of the field, which greatly reduces labor cost. When data of a new category arrives, the model does not need to be retrained; a few support samples suffice to label the targets to be classified.
The details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will be apparent from the description, the drawings, and the claims.
Brief Description of the Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a meta-learning-based target classification method according to one or more embodiments.
FIG. 2 is a schematic flowchart of a meta-learning-based target classification method according to another one or more embodiments.
FIG. 3 is a structural block diagram of a meta-learning-based target classification apparatus according to one or more embodiments.
FIG. 4 is an internal structure diagram of a computer device according to one or more embodiments.
Detailed Description
To make the technical solutions and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application and not to limit it.
In one embodiment, as shown in FIG. 1, a meta-learning-based target classification method is provided. This embodiment is described with the method applied to a terminal. It can be understood that the method can also be applied to a server, or to a system that includes a terminal and a server, in which case it is implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:
S102: Acquire newly added data, and construct a reference sample according to the newly added data.
Specifically, newly added data is data that has been newly added. Taking papers as an example, when papers of a new category appear, the papers belonging to that category are newly added data. The reference sample is constructed from the newly added data and is a subset of it: over a period of time a large amount of data is newly added, and a small part of it is classified to obtain the reference sample. In other words, part of the newly added data is extracted and then classified to obtain the reference sample. The amount of extracted data is small, for example below a threshold such as 10 papers.
S104: Obtain the target to be classified according to the newly added data and the reference sample.
Specifically, the target to be classified is the newly added data other than the reference sample; taking papers as an example, these are the newly added papers that have not yet been classified. The targets to be classified and the reference sample together make up all of the newly added data: the reference sample is labeled and small in number, for example 10 items, while the remaining, larger part consists of the targets to be classified.
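Steps S102 and S104 amount to partitioning the newly added data into a small set to be labeled and a remainder to be classified. A minimal sketch, assuming the reference items are drawn at random (the text does not say how they are chosen) and using a hypothetical helper name `split_new_data`:

```python
import random


def split_new_data(new_items, n_reference=10, seed=0):
    """Split newly added items into a small reference set and the remaining targets.

    The reference items are the ones that would then be labeled; everything
    else becomes the targets to be classified. Random selection is an
    assumption made for this sketch.
    """
    rng = random.Random(seed)
    items = list(new_items)
    rng.shuffle(items)
    n_reference = min(n_reference, len(items))
    return items[:n_reference], items[n_reference:]
```

For 25 newly added papers and `n_reference=10`, this yields 10 reference items and 15 targets to be classified, with no item in both sets.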
S106: Input the reference sample and the target to be classified into the pre-generated target classification model to determine a first probability that the target to be classified belongs to the classification to which the reference sample belongs, wherein the target classification model is trained based on meta-learning.
Specifically, the target classification model is trained by meta-learning, a technique from the field of artificial intelligence: multiple meta-training tasks are constructed from sample data, and the model is trained on these tasks. A meta-training task provides a small number of support samples and a large number of query samples; training on them yields a target classification model that can distinguish data of a new category given only a few samples of that category.
The server inputs the reference sample and the target to be classified into the pre-generated target classification model, so that the model processes both and calculates the probability that the processed target to be classified belongs to the classification to which the reference sample belongs.
The processing performed by the target classification model may include a process of vectorizing the reference sample and the target to be classified, and a step of calculating the first probability of the target to be classified from the vectorized reference sample.
The vectorization process may include: computing the word sequences of the reference sample and of the target to be classified; processing these word sequences to obtain a high-order feature representation of each word, for example by inputting them into a BERT model; and finally applying an average pooling operation to the high-order features of the words of the reference sample and of the target to be classified, respectively, to obtain their vectorized representations.
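The three vectorization stages (word sequence, per-word high-order features, average pooling) can be sketched as follows. The encoder here is a deliberately trivial, hypothetical stand-in (`toy_word_features`) so that the example stays self-contained; in the described method this stage would be a model such as BERT. Only the pooling step is meant to be representative.

```python
def tokenize(text):
    """Turn a document into a lowercase word sequence (stand-in for real tokenization)."""
    return text.lower().split()


def toy_word_features(word, dim=4):
    """Hypothetical stand-in for the encoder: one deterministic vector per word."""
    total = sum(ord(ch) for ch in word)
    return [((total * (i + 1)) % 17) / 17.0 for i in range(dim)]


def mean_pool(vectors):
    """Average pooling: element-wise mean over a list of equal-length vectors."""
    if not vectors:
        raise ValueError("cannot pool an empty sequence")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def embed(text, dim=4):
    """Word sequence -> per-word features -> one pooled vector for the document."""
    return mean_pool([toy_word_features(word, dim) for word in tokenize(text)])
```

Average pooling gives every document a vector of the same length regardless of how many words it contains, which is what allows reference samples and targets to be compared by inner product later on.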
其中,根据向量化表示的参照样本计算待分类目标的第一概率的步骤可以包括:根据预先训练的模型来根据向量化表示的参照样本计算待分类目标的第一概率:Wherein, the step of calculating the first probability of the object to be classified according to the vectorized reference sample may include: calculating the first probability of the object to be classified according to the vectorized reference sample according to a pre-trained model:
Figure PCTCN2021109571-appb-000001
Figure PCTCN2021109571-appb-000002
Here the output of the Sigmoid activation function is a real number between 0 and N (between 0 and 1 for the standard Sigmoid), so P can be used to determine whether the target to be classified and the reference samples belong to the same category. The atten function calculates the contribution of each reference sample to the classification of the target to be classified. ⊙ denotes the inner product of two vectors, and T is a real number that controls the sharpness of the distribution produced by atten. k is the index of a reference sample, and its range depends on the number of reference samples.
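The exact formulas are given in figures that are not reproduced in this text, so the following is only a minimal sketch of one plausible reading of the description. It assumes that atten is a softmax over inner products divided by T, and that P is the Sigmoid of the attention-weighted sum of those inner products. The function names `atten_weights` and `first_probability` are hypothetical and do not come from the source.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors (the ⊙ operation in the text)."""
    return sum(a * b for a, b in zip(u, v))

def atten_weights(query, references, temperature=1.0):
    """Assumed form of atten: softmax over inner products scaled by T.

    A smaller temperature gives a sharper distribution over reference samples.
    """
    scores = [dot(query, ref) / temperature for ref in references]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Standard Sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def first_probability(query, references, temperature=1.0):
    """Assumed form of P: Sigmoid of the attention-weighted similarity."""
    weights = atten_weights(query, references, temperature)
    pooled = sum(w * dot(query, ref) for w, ref in zip(weights, references))
    return sigmoid(pooled)
```

Under these assumptions, lowering `temperature` concentrates the weight on the reference sample most similar to the target, which matches the described role of T.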
S108: Determine the category to which the target to be classified belongs according to the first probability.
Specifically, the server may preset a probability threshold and use it to determine the category to which the target to be classified belongs. Because the output of the Sigmoid activation function is a real number between 0 and N, for example between 0 and 1, each decision is equivalent to a binary classification problem: a value greater than 0.5 means the categories are the same, and a value less than 0.5 means they are different. In other embodiments, the preset probability threshold may be determined according to the output range of the Sigmoid activation function.
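The threshold decision can be sketched as follows. The text does not say how the threshold is derived from the output range; taking the midpoint of the range is an assumption made here for illustration, and both function names are hypothetical.

```python
def threshold_for_range(low=0.0, high=1.0):
    """Assumed rule: use the midpoint of the activation's output range."""
    return (low + high) / 2.0

def is_same_category(probability, threshold=0.5):
    """True when the probability exceeds the preset threshold."""
    return probability > threshold
```

For the standard range of 0 to 1 this reproduces the 0.5 threshold stated in the text.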
It should be emphasized that, to further ensure the privacy and security of the newly added data and its corresponding categories, they may also be stored in a node of a blockchain.
In the above meta-learning-based target classification method, reference samples are determined from the newly added data, so the category of the target to be classified is obtained simply by inputting the reference samples and the target into the pre-generated target classification model. Targets in the field of artificial intelligence can thus be classified automatically, without manual intervention or specialized knowledge of the field, which greatly reduces labor costs. Moreover, when data of a new category arrives, the model does not need to be retrained; a few support samples suffice to label the target to be classified.
In one embodiment, the newly added data includes multiple categories. Constructing reference samples from the newly added data includes: grouping the newly added data by category, and constructing the reference samples corresponding to each group. Inputting the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target belongs to the category of the reference samples includes: inputting the reference samples and the target into the pre-generated target classification model to determine the first probability that the target belongs to each category.
Specifically, the newly added data may fall into multiple groups, and a single item may carry several labels (for example, one paper may have multiple labels). The server therefore first obtains a small amount of sample data from the newly added data, then groups that sample data and constructs the reference samples corresponding to each group. Note that an obtained sample may be assigned to several groups at once, which is how a single item carries multiple labels; as a result, the same reference sample may appear in more than one group.
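The grouping step can be sketched as follows, assuming each newly added item arrives as a pair of a sample and its list of labels. The function name `build_reference_groups` and the optional `per_group` cap are hypothetical.

```python
from collections import defaultdict

def build_reference_groups(labeled_samples, per_group=None):
    """Group samples by label; a multi-label sample joins every matching group.

    labeled_samples: iterable of (sample, labels) pairs.
    per_group: optional cap on the number of reference samples kept per group.
    """
    groups = defaultdict(list)
    for sample, labels in labeled_samples:
        for label in labels:
            if per_group is None or len(groups[label]) < per_group:
                groups[label].append(sample)
    return dict(groups)
```

A paper labeled with two categories therefore appears as a reference sample in both groups, as the paragraph above describes.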
Correspondingly, the target classification model outputs multiple probabilities, one for each group of reference samples, so the number of probabilities equals the number of groups: for each group, the model gives the probability that the target to be classified belongs to that group's category. From these probabilities the server can determine all the categories to which the target belongs, which makes it possible to assign several labels to one paper at the same time. Existing machine-learning-based paper classification models are generally single-label, meaning a paper can belong to only one subcategory. In reality a paper can have multiple labels, and some papers span several fields, so assigning such a paper a single label is inappropriate.
In the above embodiment, because multiple groups of reference samples are constructed, a paper can be given multiple labels at the same time.
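A minimal sketch of the multi-label decision follows. It assumes a scoring function `score_fn(target, references)` that returns the first probability for one group; that function, like the name `assign_labels`, is hypothetical and stands in for the trained target classification model.

```python
def assign_labels(target, reference_groups, score_fn, threshold=0.5):
    """Score the target against every group and keep labels above the threshold.

    reference_groups: mapping from label to that group's reference samples.
    score_fn: hypothetical callable returning a probability for one group.
    """
    probabilities = {
        label: score_fn(target, refs) for label, refs in reference_groups.items()
    }
    chosen = [label for label, p in probabilities.items() if p > threshold]
    return chosen, probabilities
```

Because each group is scored independently, the number of probabilities equals the number of groups, and any number of labels can pass the threshold.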
In one embodiment, the target classification model is trained by: acquiring sample data and constructing multiple groups of meta-training samples from it; and training on the meta-training samples to obtain the target classification model.
Specifically, the sample data may be preset samples that have already been classified, such as classified papers. The meta-training samples are obtained by processing the sample data. Each meta-training sample may include multiple support samples and multiple query samples; the support samples may cover several groups of sample data, that is, sample data belonging to different categories, and the query samples are drawn from the same groups. The number of groups of meta-training samples can be set as needed, for example ten thousand. The model is then trained on the meta-training samples, for example one group after another, until its accuracy reaches the expected level. The accuracy can be computed on the meta-training samples: the support samples and query samples of a meta-training sample are input into the target classification model to determine the category of each query sample, and if the result, compared with the true categories of the query samples, meets expectations, training is complete.
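The stopping check can be sketched as follows. The text does not define the accuracy metric or the expected level; exact-match accuracy over label sets and the default target of 0.9 are assumptions, and both function names are hypothetical.

```python
def subset_accuracy(predicted_labels, true_labels):
    """Fraction of query samples whose predicted label set equals the true set."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("prediction and ground-truth lists differ in length")
    if not true_labels:
        raise ValueError("at least one query sample is required")
    hits = sum(set(p) == set(t) for p, t in zip(predicted_labels, true_labels))
    return hits / len(true_labels)

def training_complete(accuracy, target_accuracy=0.9):
    """target_accuracy is a hypothetical value; the text only says 'expected'."""
    return accuracy >= target_accuracy
```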
In one embodiment, training on the meta-training samples to obtain the target classification model includes: serializing the words of each support sample and each query sample in every group of meta-training samples; applying high-order feature processing to each serialized word to obtain its high-order feature representation; applying an average pooling operation to the high-order feature representations to obtain a vector representation of each support sample and of each query sample; and training on those vector representations to obtain the target classification model.
Specifically, word serialization means converting the words of a support sample or query sample into an ordered sequence. For example, let the word sequence of a support sample be
Figure PCTCN2021109571-appb-000003
Then the input for the support sample is
Figure PCTCN2021109571-appb-000004
The word sequence of a query sample is
Figure PCTCN2021109571-appb-000005
Then the input for the query sample is
Figure PCTCN2021109571-appb-000006
Here CLS and SEP are two special tokens specific to BERT. BERT adds them during pre-training so that the model can locate the input sentence, so they must also be added when BERT is fine-tuned on a downstream task: one at the beginning and one at the end. S is the first letter of "support" and indicates that
Figure PCTCN2021109571-appb-000007
is a word of the support sample; Q is the first letter of "query" and indicates that
Figure PCTCN2021109571-appb-000008
is a word of the query sample. m is the total number of words in the support sample, and n is the total number of words in the query sample.
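The serialization step can be sketched as follows, assuming the words are already split into a list. The function name `build_bert_input` is hypothetical; placing `[CLS]` first and `[SEP]` last follows the standard BERT input convention. In practice a BERT tokenizer would add these tokens itself.

```python
CLS_TOKEN = "[CLS]"
SEP_TOKEN = "[SEP]"

def build_bert_input(words):
    """Wrap an ordered word sequence with BERT's two special tokens."""
    return [CLS_TOKEN] + list(words) + [SEP_TOKEN]
```

A support sample of m words therefore yields an input of m + 2 tokens, and a query sample of n words yields n + 2 tokens.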
The high-order feature representation can be produced by a BERT model; for example, the high-order feature representation of each word is obtained by the following formulas:
Figure PCTCN2021109571-appb-000009
Figure PCTCN2021109571-appb-000010
where
Figure PCTCN2021109571-appb-000011
and
Figure PCTCN2021109571-appb-000012
are the i-th and j-th words of the support sample and the query sample, respectively.
The vectorized representation can be obtained by an average pooling operation, for example by the following formulas:
$s_{rep} = \mathrm{MEAN\_POOLING}_i(s_i)$
$q_{rep} = \mathrm{MEAN\_POOLING}_j(q_j)$
The resulting $s_{rep}$ is the feature representation of the entire support sample, and $q_{rep}$ is the feature representation of the entire query sample.
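Average pooling over per-word feature vectors can be sketched as follows. The function name `mean_pooling` is illustrative, and plain Python lists stand in for the tensors a BERT model would produce.

```python
def mean_pooling(token_vectors):
    """Average equal-length per-word feature vectors into one sample vector."""
    if not token_vectors:
        raise ValueError("at least one token vector is required")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]
```

Applying this to the high-order features of a support sample gives its sample-level vector, and applying it to those of a query sample gives the query's sample-level vector.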
In one embodiment, acquiring sample data and constructing multiple groups of meta-training samples from it includes: crawling already-classified sample data from a preset website and grouping the sample data by category; randomly drawing at least one group from the groups, and taking a first number of samples in each drawn group as support samples and a second number of samples as query samples; obtaining one group of meta-training samples from the support samples and query samples; and repeating the step of randomly drawing at least one group to obtain multiple groups of meta-training samples.
Specifically, fairly accurate paper classifications already exist online. For example, Papers with Code, a relatively mature website for classifying and labeling work in the field of artificial intelligence, already provides manually curated categories of artificial intelligence papers together with the papers in each category. Crawling this data yields labeled paper-category datasets without any relabeling, which greatly reduces the workload. The subtasks of each field are crawled from the website, about 16 major categories, more than 400 middle categories, and more than 1,200 subcategories in total; for each subcategory, the corresponding paper titles, abstracts, and download addresses are crawled.
Specifically, the server randomly draws at least one group from the more than 1,200 subcategories, for example 10 groups, denoted $l_1, l_2, \ldots, l_{10}$. From each of these 10 groups it randomly draws a first number of samples, for example 10, as support samples, and a second number, for example 100, as query samples, giving 100 support samples and 1,000 query samples in total. A dataset constructed in this way is called a meta-training task; its purpose is to train the model to classify the query samples given the support samples. To train the model, 10,000 such meta-training tasks may be constructed.
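Constructing one meta-training task can be sketched as follows. The defaults mirror the example numbers in the text (10 groups, 10 support samples and 100 query samples per group). Keeping support and query samples disjoint within a group is an assumption, since the text does not say whether they may overlap, and the function name `build_meta_task` is hypothetical.

```python
import random

def build_meta_task(groups, n_classes=10, n_support=10, n_query=100, rng=None):
    """Sample one meta-training task from a mapping of label to samples."""
    if rng is None:
        rng = random.Random()
    needed = n_support + n_query
    eligible = [label for label, items in groups.items() if len(items) >= needed]
    if len(eligible) < n_classes:
        raise ValueError("not enough groups with sufficient samples")
    chosen = rng.sample(eligible, n_classes)
    support, query = [], []
    for label in chosen:
        picked = rng.sample(groups[label], needed)  # drawn without replacement
        support.extend((sample, label) for sample in picked[:n_support])
        query.extend((sample, label) for sample in picked[n_support:])
    return {"classes": chosen, "support": support, "query": query}
```

Calling the function repeatedly, for example 10,000 times, yields the collection of meta-training tasks used for training.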
In one embodiment, randomly drawing at least one group from the groups includes: randomly drawing a preset number of groups, the preset number being greater than or equal to 2. Training on the vector representation of each support sample and of each query sample to obtain the target classification model includes: obtaining the true category of each query sample; calculating, from the vector representations of the support samples and the query samples, the model classification of each query sample, the model classification including as many second probabilities as the preset number of groups; and training according to the true categories and the second probabilities to obtain the target classification model.
To support multi-labeling, that is, assigning several labels to one paper at the same time, this embodiment sets the number of groups. The target classification model then outputs multiple probabilities, one for each group of reference samples, so the number of probabilities equals the number of groups: for each group, the model gives the probability that the target to be classified belongs to that group's category. From these probabilities the server can determine all the categories to which the target belongs, which makes it possible to assign several labels to one paper at the same time.
In one embodiment, training on the vector representation of each support sample and of each query sample to obtain the target classification model includes calculating, for each query sample, the class probability with respect to the support samples according to the following formulas:
Figure PCTCN2021109571-appb-000013
Figure PCTCN2021109571-appb-000014
Here the output of the Sigmoid activation function is a real number between 0 and 1; atten calculates the contribution of each support sample to the classification of the query sample; ⊙ denotes the inner product of two vectors; T is a real number that controls the sharpness of the distribution produced by atten; and k is the index of a support sample, whose range depends on the number of support samples. The formulas are trained according to the true group of each query sample and the class probabilities to obtain the target classification model.
Specifically, the probability that a query sample belongs to a given category is calculated. Because each meta-training task contains 10 categories, 10 such probabilities are obtained, and whether each probability is greater than 0.5 indicates whether the query sample belongs to the corresponding category. The model categories obtained for the query sample are compared with its true group to construct a loss function, which is used to train the above formulas, for example the parameters of the Sigmoid activation function and the atten function, and thereby obtain the target classification model.
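The text says only that a loss function is constructed from the predicted and true categories. As an assumed, commonly used choice for independent per-category probabilities, the sketch below computes the mean binary cross-entropy over the probabilities of one query sample. The function name and the clipping constant `eps` are illustrative.

```python
import math

def binary_cross_entropy(probabilities, targets, eps=1e-12):
    """Mean binary cross-entropy over per-category probabilities.

    probabilities: predicted probability for each category.
    targets: 1 if the query sample truly belongs to the category, else 0.
    """
    if len(probabilities) != len(targets) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, y in zip(probabilities, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep the logarithm finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probabilities)
```

With 10 categories per meta-training task, each query sample contributes 10 terms, and a multi-label sample simply has more than one target equal to 1.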
Specifically, refer to FIG. 2, which is a flowchart of the meta-learning-based target classification method in another embodiment. This embodiment first draws on Papers with Code, a relatively mature website for classifying and labeling work in the field of artificial intelligence. The website already provides manually curated categories of artificial intelligence papers together with the papers in each category, so crawling this data yields labeled paper-category datasets without any relabeling. The subtasks of each field are crawled from the website, about 16 major categories, more than 400 middle categories, and more than 1,200 subcategories in total; for each subcategory, the corresponding paper titles, abstracts, and download addresses are crawled.
Second, after the categories, titles, and abstracts of these papers have been crawled, the training set is constructed. The title and abstract are concatenated as the model input, and the category of the paper is used as the label. To train the model, a series of meta-training samples must first be constructed, according to the following rule: 10 categories are randomly drawn from the 1,200 categories, denoted $l_1, l_2, \ldots, l_{10}$. From each of these 10 categories, 10 samples are randomly drawn as support samples and 100 samples as query samples, giving 100 support samples and 1,000 query samples in total. In this embodiment, a dataset constructed in this way is called a meta-training task; its purpose is to train the model to classify the query samples given the support samples. To train the model, this embodiment constructs 10,000 such meta-training tasks.
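Building one labeled example from a crawled record can be sketched as follows. The dictionary layout, the function name `make_example`, and the single-space separator are assumptions; the text states only that the title and abstract are concatenated and that the category is the label.

```python
def make_example(title, abstract, category, separator=" "):
    """Concatenate title and abstract as model input; use the category as label."""
    text = f"{title.strip()}{separator}{abstract.strip()}"
    return {"text": text, "label": category}
```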
After the 10,000 meta-training tasks have been constructed, the model is built. This embodiment uses the Chinese pre-trained language model BERT to encode the feature representation of sentences. The main architecture of the model is as follows:
Let the word sequence of a support sample be
Figure PCTCN2021109571-appb-000015
Then the input for the support sample is
Figure PCTCN2021109571-appb-000016
The word sequence of a query sample is
Figure PCTCN2021109571-appb-000017
Then the input for the query sample is
Figure PCTCN2021109571-appb-000018
After the support samples and query samples are input into BERT, the high-order feature representation of each word of these samples is obtained by the following formulas:
Figure PCTCN2021109571-appb-000019
Figure PCTCN2021109571-appb-000020
where
Figure PCTCN2021109571-appb-000021
and
Figure PCTCN2021109571-appb-000022
are the i-th and j-th words of the support sample and the query sample, respectively.
After the high-order feature representations of the words are obtained, the server applies an average pooling operation to obtain a single vector that represents the entire sample:
$s_{rep} = \mathrm{MEAN\_POOLING}_i(s_i)$
$q_{rep} = \mathrm{MEAN\_POOLING}_j(q_j)$
The resulting $s_{rep}$ is the feature representation of the entire support sample, and $q_{rep}$ is the feature representation of the entire query sample.
After the feature representation of each entire sample is obtained, the server calculates the class probability of the query sample from the support samples:
Figure PCTCN2021109571-appb-000023
Figure PCTCN2021109571-appb-000024
Here the output of the Sigmoid activation function is a real number between 0 and 1, so P can be used to determine whether the query sample and the support samples belong to the same category. The atten function calculates the contribution of each support sample to the classification of the query sample. ⊙ denotes the inner product of two vectors, and T is a real number that controls the sharpness of the distribution produced by atten. k is the index of a support sample; because 10 support samples are selected for each category, k is at most 10.
In this way, for a given category, the server can calculate the probability that the query sample belongs to it. Because each meta-training task contains 10 categories, the server obtains 10 such probabilities, and whether each probability is greater than 0.5 indicates whether the query sample belongs to the corresponding category.
It should be understood that although the steps in the flowcharts of FIG. 1 and FIG. 2 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, there is no strict restriction on the order of these steps, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 1 and FIG. 2 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be performed at different times; nor are they necessarily performed sequentially, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 3, a meta-learning-based target classification apparatus is provided, including a newly added data acquisition module 100, a to-be-classified target acquisition module 200, a model processing module 300, and a classification module 400, wherein:
the newly added data acquisition module 100 is configured to acquire newly added data and construct reference samples from the newly added data;
the to-be-classified target acquisition module 200 is configured to obtain the target to be classified according to the newly added data and the reference samples;
the model processing module 300 is configured to input the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the category of the reference samples, wherein the target classification model is trained by meta-learning;
the classification module 400 is configured to determine the category to which the target to be classified belongs according to the first probability.
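The cooperation of the four modules can be sketched as a small class. The three callables passed to the constructor are hypothetical stand-ins for modules 100, 200 and 300, and the comparison against a threshold plays the role of module 400; none of these names come from the source.

```python
class TargetClassificationApparatus:
    """Illustrative wiring of modules 100 to 400 under assumed interfaces."""

    def __init__(self, build_references, get_target, model, threshold=0.5):
        self.build_references = build_references  # stands in for module 100
        self.get_target = get_target              # stands in for module 200
        self.model = model                        # stands in for module 300
        self.threshold = threshold                # used by module 400

    def classify(self, new_data):
        references = self.build_references(new_data)
        target = self.get_target(new_data, references)
        probability = self.model(references, target)
        return probability > self.threshold, probability
```

As the description notes, each module may equally be implemented in hardware or in a combination of software and hardware.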
In one embodiment, the newly added data includes multiple categories, and the newly added data acquisition module 100 includes:
a grouping unit, configured to group the newly added data by category and construct the reference samples corresponding to each group;
the model processing module 300 is further configured to input the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each category.
In one embodiment, the meta-learning-based target classification apparatus further includes:
a sample data acquisition module, configured to acquire sample data and construct multiple groups of meta-training samples from the sample data;
a training module, configured to train on the meta-training samples to obtain the target classification model.
In one embodiment, the training module includes:
a serialization unit, configured to serialize the words of each support sample and each query sample in every group of meta-training samples;
a feature processing unit, configured to apply high-order feature processing to each serialized word to obtain its high-order feature representation;
a vectorization unit, configured to apply an average pooling operation to the high-order feature representations to obtain a vector representation of each support sample and of each query sample;
a training unit, configured to train on the vector representation of each support sample and of each query sample to obtain the target classification model.
In one embodiment, the sample data acquisition module may include:
a grouping unit, configured to crawl already-classified sample data from a preset website and group the sample data by category;
an extraction unit, configured to randomly draw at least one group from the groups, and to take a first number of samples in each drawn group as support samples and a second number of samples as query samples;
a combination unit, configured to obtain one group of meta-training samples from the support samples and query samples;
a loop unit, configured to repeat the step of randomly drawing at least one group from the groups to obtain multiple groups of meta-training samples.
In one embodiment, the extraction unit is further configured to randomly draw a preset number of groups from the groups, the preset number being greater than or equal to 2;
the training unit includes:
a true classification acquisition subunit, configured to obtain the true category of each query sample;
a model classification acquisition subunit, configured to calculate the model classification of each query sample from the vector representations of the support samples and the query samples, the model classification including as many second probabilities as the preset number of groups;
a training subunit, configured to train according to the true categories and the second probabilities to obtain the target classification model.
In one embodiment, the training module may include:
a class probability calculation unit, configured to calculate, for each query sample, the class probability with respect to the support samples according to the following formulas:
Figure PCTCN2021109571-appb-000025
Figure PCTCN2021109571-appb-000026
where the output of the Sigmoid activation function is a real number between 0 and 1; atten computes the contribution of each support sample to the classification of the query sample; ⊙ denotes the inner product of two vectors; T is a real number that controls the sharpness of the distribution produced by atten; and k is the index of a support sample, the value of k depending on the number of support samples;
a target classification model generation unit, configured to train the formula according to the true group of each query sample and the class probability, so as to obtain the target classification model.
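The formula images are not reproduced in this text, so the exact expressions cannot be restated here. The sketch below shows one plausible reading of the symbol definitions, and it is an assumption rather than the claimed formula: atten is taken to be a softmax over the query-support inner products divided by T, and the class probability is taken to be the Sigmoid of the attention-weighted sum of those inner products. All function names are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors (the role of ⊙ in the text)."""
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    """Sigmoid activation; its output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_weights(query, supports, temperature):
    """ASSUMED form of atten: softmax over (query . support_k) / T.

    A smaller temperature T gives a sharper distribution over the
    support samples; a larger T gives a flatter one.
    """
    scores = [dot(query, s) / temperature for s in supports]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probability(query, supports, temperature=1.0):
    """ASSUMED form: Sigmoid of the attention-weighted similarity."""
    weights = attention_weights(query, supports, temperature)
    score = sum(w * dot(query, s) for w, s in zip(weights, supports))
    return sigmoid(score)
```

Under this reading, k ranges over the support samples of one category, so the number of terms in the weighted sum equals the number of support samples.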
For the specific limitations of the meta-learning-based target classification apparatus, refer to the limitations of the meta-learning-based target classification method above, which are not repeated here. Each module of the apparatus may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 4. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database stores the newly added data and its corresponding classification data. The network interface is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement a meta-learning-based target classification method.
Those skilled in the art will understand that the structure shown in FIG. 4 is only a block diagram of the part of the structure relevant to the solution of the present application, and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring newly added data, and constructing reference samples according to the newly added data; obtaining a target to be classified according to the newly added data and the reference samples; inputting the reference samples and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference samples, wherein the target classification model is trained by meta-learning; and determining the category of the target to be classified according to the first probability.
In one embodiment, the newly added data involved when the processor executes the computer-readable instructions includes a plurality of categories. Constructing reference samples according to the newly added data, as implemented when the processor executes the computer-readable instructions, includes: grouping the newly added data by category, and constructing the reference samples corresponding to each group. Inputting the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the category of the reference samples includes: inputting the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each category.
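The grouping and per-category scoring described in this embodiment can be sketched as follows. This is an illustration under stated assumptions: the trained target classification model is represented by an arbitrary callable `model(reference_samples, target)` that returns a first probability, `overlap_model` is a toy stand-in and not the patented model, and choosing the category with the highest first probability is one natural decision rule that the text itself does not spell out.

```python
from collections import defaultdict


def build_reference_samples(new_data):
    """Group newly added (item, category) pairs; each group serves as
    the reference samples for its category."""
    groups = defaultdict(list)
    for item, category in new_data:
        groups[category].append(item)
    return dict(groups)


def classify(target, references, model):
    """Compute a first probability per category and pick the largest.

    `model` is a hypothetical callable standing in for the pre-generated
    target classification model.
    """
    probs = {cat: model(refs, target) for cat, refs in references.items()}
    best = max(probs, key=probs.get)
    return best, probs


def overlap_model(refs, target):
    """Toy stand-in: fraction of the target's words seen in the references."""
    ref_words = {w for r in refs for w in r.split()}
    words = target.split()
    return sum(w in ref_words for w in words) / len(words)


new_data = [
    ("cheap loan offer", "finance"),
    ("football match score", "sports"),
]
references = build_reference_samples(new_data)
best, probs = classify("loan offer today", references, overlap_model)
```

Because the model receives the reference samples at inference time, adding a category only requires adding a group of reference samples, which is the usual motivation for this kind of few-shot setup.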
In one embodiment, the training of the target classification model, as implemented when the processor executes the computer-readable instructions, includes: acquiring sample data, and constructing multiple sets of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the target classification model.
In one embodiment, training according to the meta-training samples to obtain the target classification model, as implemented when the processor executes the computer-readable instructions, includes: serializing the words of each support sample and each query sample in each set of meta-training samples; performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation; performing an average pooling operation on the high-order feature representations to obtain a vector representation of each support sample and a vector representation of each query sample; and training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model.
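The serialize, encode, and average-pool sequence can be illustrated with a toy example. The sketch makes several assumptions that are not in the source: words are split on whitespace, serialization means mapping each word to an integer index, and a fixed lookup table stands in for the high-order feature encoder, whose architecture this passage does not specify.

```python
def build_vocab(texts):
    """Assign each distinct word an integer index (assumed serialization)."""
    vocab = {}
    for text in texts:
        for word in text.split():
            vocab.setdefault(word, len(vocab))
    return vocab


def serialize(text, vocab):
    """Turn a sample into a sequence of word indices."""
    return [vocab[w] for w in text.split() if w in vocab]


def encode(token_ids, table):
    """Stand-in for high-order feature processing: one vector per word.

    A real system would use a trained encoder here; `table` is a
    hypothetical fixed lookup table used only for illustration.
    """
    return [table[i] for i in token_ids]


def mean_pool(features):
    """Average the per-word vectors into one vector for the sample."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]


vocab = build_vocab(["a b", "b c"])
table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vector = mean_pool(encode(serialize("a c", vocab), table))
```

Average pooling yields a vector of the same dimension regardless of sample length, which is what allows support and query samples to be compared by inner product later on.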
In one embodiment, acquiring sample data and constructing multiple sets of meta-training samples according to the sample data, as implemented when the processor executes the computer-readable instructions, includes: crawling already-classified sample data from a preset website, and grouping the sample data by category; randomly extracting at least one group from the groups, and designating a first quantity of the sample data in the extracted at least one group as support samples and a second quantity of the sample data as query samples; obtaining one set of meta-training samples from the support samples and the query samples; and repeating the step of randomly extracting at least one group from the groups to obtain multiple sets of meta-training samples.
In one embodiment, randomly extracting at least one group from the groups, as implemented when the processor executes the computer-readable instructions, includes: randomly extracting a preset number of groups from the groups, the preset number being greater than or equal to 2. Training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model includes: acquiring the true classification corresponding to each query sample; calculating the model classification corresponding to each query sample according to the vector representation of each support sample and the vector representation of each query sample, the model classification comprising a number of second probabilities corresponding to the preset number; and training according to the true classification and the second probabilities to obtain the target classification model.
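The text says training uses the true classification and the second probabilities but does not name a loss function. A common choice for this kind of objective is cross-entropy, sketched below purely as an assumption; the function names and the `(second_probs, true_index)` layout are hypothetical.

```python
import math


def cross_entropy(second_probs, true_index):
    """Negative log of the probability assigned to the true group.

    ASSUMPTION: the source does not specify the loss; cross-entropy is
    used here only as a typical example.
    """
    eps = 1e-12  # guard against log(0)
    return -math.log(max(second_probs[true_index], eps))


def episode_loss(batch):
    """Mean loss over the query samples of one set of meta-training samples.

    `batch` is a list of (second_probs, true_index) pairs, where
    second_probs holds one probability per extracted group.
    """
    return sum(cross_entropy(p, y) for p, y in batch) / len(batch)
```

With a preset number of 2 groups, each query sample contributes a list of two second probabilities, and the loss falls as the probability assigned to the true group rises.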
In one embodiment, training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model, as implemented when the processor executes the computer-readable instructions, includes: calculating the class probability of the support samples corresponding to each query sample according to the following formulas:
Figure PCTCN2021109571-appb-000027
Figure PCTCN2021109571-appb-000028
where the output of the Sigmoid activation function is a real number between 0 and 1; atten computes the contribution of each support sample to the classification of the query sample; ⊙ denotes the inner product of two vectors; T is a real number that controls the sharpness of the distribution produced by atten; and k is the index of a support sample, the value of k depending on the number of support samples;
training the formula according to the true group of each query sample and the class probability to obtain the target classification model.
One or more computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring newly added data, and constructing reference samples according to the newly added data; obtaining a target to be classified according to the newly added data and the reference samples; inputting the reference samples and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference samples, wherein the target classification model is trained by meta-learning; and determining the category of the target to be classified according to the first probability.
The computer-readable storage medium may be non-volatile or volatile.
In one embodiment, the newly added data involved when the computer-readable instructions are executed by the processor includes a plurality of categories. Constructing reference samples according to the newly added data, as implemented when the computer-readable instructions are executed by the processor, includes: grouping the newly added data by category, and constructing the reference samples corresponding to each group. Inputting the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to the category of the reference samples includes: inputting the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each category.
In one embodiment, the training of the target classification model, as implemented when the computer-readable instructions are executed by the processor, includes: acquiring sample data, and constructing multiple sets of meta-training samples according to the sample data; and training according to the meta-training samples to obtain the target classification model.
In one embodiment, training according to the meta-training samples to obtain the target classification model, as implemented when the computer-readable instructions are executed by the processor, includes: serializing the words of each support sample and each query sample in each set of meta-training samples; performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation; performing an average pooling operation on the high-order feature representations to obtain a vector representation of each support sample and a vector representation of each query sample; and training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model.
In one embodiment, acquiring sample data and constructing multiple sets of meta-training samples according to the sample data, as implemented when the computer-readable instructions are executed by the processor, includes: crawling already-classified sample data from a preset website, and grouping the sample data by category; randomly extracting at least one group from the groups, and designating a first quantity of the sample data in the extracted at least one group as support samples and a second quantity of the sample data as query samples; obtaining one set of meta-training samples from the support samples and the query samples; and repeating the step of randomly extracting at least one group from the groups to obtain multiple sets of meta-training samples.
In one embodiment, randomly extracting at least one group from the groups, as implemented when the computer-readable instructions are executed by the processor, includes: randomly extracting a preset number of groups from the groups, the preset number being greater than or equal to 2. Training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model includes: acquiring the true classification corresponding to each query sample; calculating the model classification corresponding to each query sample according to the vector representation of each support sample and the vector representation of each query sample, the model classification comprising a number of second probabilities corresponding to the preset number; and training according to the true classification and the second probabilities to obtain the target classification model.
In one embodiment, training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model, as implemented when the computer-readable instructions are executed by the processor, includes: calculating the class probability of the support samples corresponding to each query sample according to the following formula:
Figure PCTCN2021109571-appb-000029
where the output of the Sigmoid activation function is a real number between 0 and 1; atten computes the contribution of each support sample to the classification of the query sample; ⊙ denotes the inner product of two vectors; T is a real number that controls the sharpness of the distribution produced by atten; and k is the index of a support sample, the value of k depending on the number of support samples;
training the formula according to the true group of each query sample and the class probability to obtain the target classification model.
The blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, where each data block contains information on a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
The embodiments of the present application may acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (AI) comprises the theories, methods, technologies, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be carried out by computer-readable instructions that direct the relevant hardware. The computer-readable instructions may be stored in a computer-readable storage medium, which may be volatile or non-volatile. When executed, the computer-readable instructions may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, any combination that involves no contradiction should be regarded as falling within the scope of this specification.
The above embodiments express only several implementations of the present application. Their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art may make a number of variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be determined by the appended claims.

Claims (20)

  1. A meta-learning-based target classification method, comprising:
    acquiring newly added data, and constructing reference samples according to the newly added data;
    obtaining a target to be classified according to the newly added data and the reference samples;
    inputting the reference samples and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference samples, wherein the target classification model is trained by meta-learning; and
    determining the category of the target to be classified according to the first probability.
  2. The method according to claim 1, wherein the newly added data includes a plurality of categories, and the constructing reference samples according to the newly added data comprises:
    grouping the newly added data by category, and constructing the reference samples corresponding to each group; and
    the inputting the reference samples and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference samples comprises:
    inputting the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each category.
  3. The method according to claim 1 or 2, wherein the training of the target classification model comprises:
    acquiring sample data, and constructing multiple sets of meta-training samples according to the sample data; and
    training according to the meta-training samples to obtain the target classification model.
  4. The method according to claim 3, wherein the training according to the meta-training samples to obtain the target classification model comprises:
    serializing the words of each support sample and each query sample in each set of meta-training samples;
    performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation;
    performing an average pooling operation on the high-order feature representations to obtain a vector representation of each support sample and a vector representation of each query sample; and
    training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model.
  5. The method according to claim 4, wherein the acquiring sample data and constructing multiple sets of meta-training samples according to the sample data comprises:
    crawling already-classified sample data from a preset website, and grouping the sample data by category;
    randomly extracting at least one group from the groups, and designating a first quantity of the sample data in the extracted at least one group as support samples and a second quantity of the sample data as query samples;
    obtaining one set of meta-training samples from the support samples and the query samples; and
    repeating the step of randomly extracting at least one group from the groups to obtain multiple sets of meta-training samples.
  6. The method according to claim 5, wherein the randomly extracting at least one group from the groups comprises:
    randomly extracting a preset number of groups from the groups, the preset number being greater than or equal to 2;
    the training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model comprises:
    acquiring the true classification corresponding to each query sample;
    calculating the model classification corresponding to each query sample according to the vector representation of each support sample and the vector representation of each query sample, the model classification comprising a number of second probabilities corresponding to the preset number; and
    training according to the true classification and the second probabilities to obtain the target classification model.
  7. The method according to claim 4, wherein the training according to the vector representation of each support sample and the vector representation of each query sample to obtain the target classification model comprises:
    calculating the class probability of the support samples corresponding to each query sample according to the following formulas:
    Figure PCTCN2021109571-appb-100001
    Figure PCTCN2021109571-appb-100002
    where the output of the Sigmoid activation function is a real number between 0 and 1; atten computes the contribution of each support sample to the classification of the query sample; ⊙ denotes the inner product of two vectors; T is a real number that controls the sharpness of the distribution produced by atten; and k is the index of a support sample, the value of k depending on the number of support samples; and
    training the formula according to the true group of each query sample and the class probability to obtain the target classification model.
  8. A meta-learning-based target classification apparatus, comprising:
    a newly-added-data acquisition module, configured to acquire newly added data and construct reference samples according to the newly added data;
    a to-be-classified target acquisition module, configured to obtain a target to be classified according to the newly added data and the reference samples;
    a model processing module, configured to input the reference samples and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference samples, wherein the target classification model is trained by meta-learning; and
    a classification module, configured to determine the category of the target to be classified according to the probability.
  9. The apparatus according to claim 8, wherein the newly added data includes a plurality of categories, and the newly-added-data acquisition module comprises:
    a grouping unit, configured to group the newly added data by category and construct the reference samples corresponding to each group; and
    the model processing module is configured to input the reference samples and the target to be classified into the pre-generated target classification model to determine the first probability that the target to be classified belongs to each category.
  10. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    acquiring new data, and constructing a reference sample according to the new data;
    obtaining a target to be classified according to the new data and the reference sample;
    inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference sample, wherein the target classification model is trained by meta-learning; and
    determining the category to which the target to be classified belongs according to the first probability.
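The four steps of claim 10 can be illustrated with a minimal sketch. It assumes the new data arrives as labelled feature vectors, and it uses a hypothetical `toy_score` function (a sigmoid of the mean inner product) purely as a stand-in for the trained target classification model, which the claims do not specify at this level of detail.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def build_reference_samples(new_data: List[Tuple[Vector, str]]) -> Dict[str, List[Vector]]:
    """Group labelled new data by category; each group acts as that category's reference samples."""
    groups: Dict[str, List[Vector]] = defaultdict(list)
    for vector, label in new_data:
        groups[label].append(vector)
    return dict(groups)


def toy_score(target: Vector, references: List[Vector]) -> float:
    """Hypothetical stand-in for the trained model: sigmoid of the mean inner product."""
    mean_dot = sum(sum(a * b for a, b in zip(target, ref)) for ref in references) / len(references)
    return 1.0 / (1.0 + math.exp(-mean_dot))


def classify(
    target: Vector,
    references: Dict[str, List[Vector]],
    score: Callable[[Vector, List[Vector]], float] = toy_score,
) -> Tuple[str, Dict[str, float]]:
    """Compute a first probability per category and pick the most probable category."""
    first_probs = {label: score(target, refs) for label, refs in references.items()}
    best = max(first_probs, key=first_probs.get)
    return best, first_probs


# Illustrative data only: two categories with hand-made 2-d vectors.
new_data = [([1.0, 0.0], "a"), ([0.9, 0.1], "a"), ([0.0, 1.0], "b")]
references = build_reference_samples(new_data)
label, probs = classify([1.0, 0.0], references)
```

Because the reference samples are passed in at inference time, a new category can be handled by adding a group to `references` without retraining, which is the property the claims attribute to the meta-learned model.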
  11. The computer device according to claim 10, wherein the new data involved when the processor executes the computer-readable instructions comprises a plurality of categories; and the constructing a reference sample according to the new data, implemented when the processor executes the computer-readable instructions, comprises:
    grouping the new data by category, and constructing a reference sample corresponding to each group; and
    the inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference sample, implemented when the processor executes the computer-readable instructions, comprises:
    inputting the reference sample and the target to be classified into the pre-generated target classification model to determine a first probability that the target to be classified belongs to each category.
  12. The computer device according to claim 10 or 11, wherein the training of the target classification model involved when the processor executes the computer-readable instructions comprises:
    acquiring sample data, and constructing multiple sets of meta-training samples according to the sample data; and
    training according to the meta-training samples to obtain the target classification model.
  13. The computer device according to claim 12, wherein the training according to the meta-training samples to obtain the target classification model, implemented when the processor executes the computer-readable instructions, comprises:
    serializing the words of each support sample and each query sample in each set of meta-training samples;
    performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation;
    performing an average pooling operation on the high-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and
    training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model.
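The serialization and average-pooling steps of claim 13 can be sketched as follows. The claim does not say which encoder produces the high-order features, so this sketch assumes a random embedding lookup as a placeholder; the names `build_vocab`, `serialize`, `make_embeddings`, `mean_pool` and `encode` are all hypothetical.

```python
import random
from typing import Dict, List


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign an integer id to every word; id 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def serialize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Turn a sample into a sequence of word ids."""
    return [vocab.get(word, 0) for word in text.lower().split()]


def make_embeddings(vocab_size: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Placeholder for the high-order feature extractor: one random vector per word id."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def mean_pool(token_vectors: List[List[float]]) -> List[float]:
    """Average the per-word vectors into one fixed-size sample vector."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]


def encode(text: str, vocab: Dict[str, int], embeddings: List[List[float]]) -> List[float]:
    """Serialize, look up features, and pool; assumes the text has at least one word."""
    return mean_pool([embeddings[i] for i in serialize(text, vocab)])


vocab = build_vocab(["loan rate", "card fee"])
embeddings = make_embeddings(len(vocab), dim=4)
support_vector = encode("loan rate", vocab, embeddings)
query_vector = encode("card rate", vocab, embeddings)
```

Average pooling gives every sample a vector of the same dimension regardless of its word count, so support and query vectors can be compared directly in the later training step.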
  14. The computer device according to claim 13, wherein the acquiring sample data and constructing multiple sets of meta-training samples according to the sample data, implemented when the processor executes the computer-readable instructions, comprises:
    crawling already-classified sample data from a preset website, and grouping the sample data by category;
    randomly selecting at least one group from the groups, and designating a first number of sample data items in the selected at least one group as support samples and a second number of sample data items as query samples;
    obtaining one set of meta-training samples from the support samples and the query samples; and
    repeating the step of randomly selecting at least one group from the groups to obtain multiple sets of meta-training samples.
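The episode construction of claim 14 can be sketched as below. The crawling step is omitted and the grouped data is assumed to be already in memory; the function names and the default values for the number of groups, support samples and query samples are illustrative assumptions, not values taken from the application.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (text, category label)


def sample_episode(
    groups: Dict[str, List[str]],
    n_way: int,
    n_support: int,
    n_query: int,
    rng: random.Random,
) -> Tuple[List[Sample], List[Sample]]:
    """Pick n_way groups at random, then split disjoint support and query samples from each."""
    chosen = rng.sample(sorted(groups), n_way)
    support: List[Sample] = []
    query: List[Sample] = []
    for label in chosen:
        picked = rng.sample(groups[label], n_support + n_query)
        support += [(text, label) for text in picked[:n_support]]
        query += [(text, label) for text in picked[n_support:]]
    return support, query


def build_meta_training_set(
    groups: Dict[str, List[str]],
    n_episodes: int,
    n_way: int = 2,
    n_support: int = 2,
    n_query: int = 1,
    seed: int = 0,
) -> List[Tuple[List[Sample], List[Sample]]]:
    """Repeat the random selection to obtain multiple sets of meta-training samples."""
    rng = random.Random(seed)
    return [sample_episode(groups, n_way, n_support, n_query, rng) for _ in range(n_episodes)]
```

Sampling without replacement inside each group keeps the support and query samples of one episode disjoint, so the model is always evaluated on samples it has not seen as references.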
  15. The computer device according to claim 1, wherein the randomly selecting at least one group from the groups, implemented when the processor executes the computer-readable instructions, comprises:
    randomly selecting a preset number of groups from the groups, the preset number being greater than or equal to 2;
    the training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model, implemented when the processor executes the computer-readable instructions, comprises:
    obtaining the true category corresponding to the query sample;
    calculating, according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample, a model classification corresponding to each query sample, the model classification comprising a number of second probabilities corresponding to the preset number; and
    training according to the true category and the second probabilities to obtain the target classification model.
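The comparison between the true category and the second probabilities can be sketched as a loss function. The claim does not name a loss, so the binary cross-entropy used here is an assumption, chosen because each second probability is described as an independent value between 0 and 1; `episode_loss` is a hypothetical name.

```python
import math
from typing import Dict


def episode_loss(true_label: str, second_probs: Dict[str, float], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy over the preset number of categories for one query sample."""
    loss = 0.0
    for label, prob in second_probs.items():
        target = 1.0 if label == true_label else 0.0
        prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss -= target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob)
    return loss / len(second_probs)


# A confident, correct prediction yields a lower loss than a confident, wrong one.
good = episode_loss("a", {"a": 0.9, "b": 0.1})
bad = episode_loss("a", {"a": 0.1, "b": 0.9})
```

Minimizing such a loss over many episodes pushes the probability for the true category toward 1 and the probabilities for the other sampled categories toward 0.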
  16. The computer device according to claim 13, wherein the training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model, implemented when the processor executes the computer-readable instructions, comprises:
    calculating the class probability of the support samples corresponding to each query sample according to the following formulas:
    [Formula image: PCTCN2021109571-appb-100003]
    [Formula image: PCTCN2021109571-appb-100004]
    where the output of the Sigmoid activation function is a real number between 0 and 1; atten computes the contribution of each support sample to the classification of the query sample; ⊙ denotes the inner product of two vectors; T is a real number that controls the sharpness of the distribution produced by atten; k is the index of a support sample, and the value of k depends on the number of support samples; and
    training the formula according to the true group of each query sample and the class probability to obtain the target classification model.
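The formulas of claim 16 survive only as image placeholders, so their exact form cannot be reproduced here. The sketch below is one plausible reading of the verbal description, not the applicant's formula: it assumes `atten` is a softmax over inner products divided by T, and that the class probability is a sigmoid of the inner product between the query vector and the attention-weighted sum of support vectors. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two vectors (the ⊙ operation in the claim's description)."""
    return sum(x * y for x, y in zip(a, b))


def attention(query: Vector, supports: List[Vector], temperature: float) -> List[float]:
    """Assumed form of atten: softmax over k of (query ⊙ support_k) / T."""
    scores = [dot(query, s) / temperature for s in supports]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probability(query: Vector, supports: List[Vector], temperature: float = 1.0) -> float:
    """Assumed form: sigmoid of the query's inner product with the weighted support vector."""
    weights = attention(query, supports, temperature)
    weighted = [sum(w * s[d] for w, s in zip(weights, supports)) for d in range(len(query))]
    return 1.0 / (1.0 + math.exp(-dot(query, weighted)))


query = [1.0, 0.0]
supports = [[1.0, 0.0], [0.0, 1.0]]
sharp = attention(query, supports, temperature=0.1)   # concentrates on the closest support
flat = attention(query, supports, temperature=10.0)   # close to uniform
```

Under this reading, a small T makes the attention distribution sharper, so the most similar support sample dominates, while a large T spreads the weight more evenly across the k support samples.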
  17. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring new data, and constructing a reference sample according to the new data;
    obtaining a target to be classified according to the new data and the reference sample;
    inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference sample, wherein the target classification model is trained by meta-learning; and
    determining the category to which the target to be classified belongs according to the first probability.
  18. The storage medium according to claim 17, wherein the new data involved when the computer-readable instructions are executed by the processor comprises a plurality of categories; and the constructing a reference sample according to the new data, implemented when the computer-readable instructions are executed by the processor, comprises:
    grouping the new data by category, and constructing a reference sample corresponding to each group; and
    the inputting the reference sample and the target to be classified into a pre-generated target classification model to determine a first probability that the target to be classified belongs to the category of the reference sample, implemented when the computer-readable instructions are executed by the processor, comprises:
    inputting the reference sample and the target to be classified into the pre-generated target classification model to determine a first probability that the target to be classified belongs to each category.
  19. The storage medium according to claim 17 or 18, wherein the training of the target classification model involved when the computer-readable instructions are executed by the processor comprises:
    acquiring sample data, and constructing multiple sets of meta-training samples according to the sample data; and
    training according to the meta-training samples to obtain the target classification model.
  20. The storage medium according to claim 19, wherein the training according to the meta-training samples to obtain the target classification model, implemented when the computer-readable instructions are executed by the processor, comprises:
    serializing the words of each support sample and each query sample in each set of meta-training samples;
    performing high-order feature processing on each serialized word to obtain a corresponding high-order feature representation;
    performing an average pooling operation on the high-order feature representations to obtain a vector representation corresponding to each support sample and a vector representation corresponding to each query sample; and
    training according to the vector representation corresponding to each support sample and the vector representation corresponding to each query sample to obtain the target classification model.
PCT/CN2021/109571 2020-12-21 2021-07-30 Meta-learning-based target classification method and apparatus, device and storage medium WO2022134586A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011523348.0 2020-12-21
CN202011523348.0A CN112613555A (en) 2020-12-21 2020-12-21 Object classification method, device, equipment and storage medium based on meta learning

Publications (1)

Publication Number Publication Date
WO2022134586A1 (en) 2022-06-30

Family

ID=75243956

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109571 WO2022134586A1 (en) 2020-12-21 2021-07-30 Meta-learning-based target classification method and apparatus, device and storage medium

Country Status (2)

Country Link
CN (1) CN112613555A (en)
WO (1) WO2022134586A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112613555A (en) * 2020-12-21 2021-04-06 深圳壹账通智能科技有限公司 Object classification method, device, equipment and storage medium based on meta learning
CN113392642B (en) * 2021-06-04 2023-06-02 北京师范大学 Automatic labeling system and method for child care cases based on meta learning
CN113689234B (en) * 2021-08-04 2024-03-15 华东师范大学 Platform-related advertisement click rate prediction method based on deep learning
CN113505861B (en) * 2021-09-07 2021-12-24 广东众聚人工智能科技有限公司 Image classification method and system based on meta-learning and memory network
CN114842246A (en) * 2022-04-19 2022-08-02 清华大学 Social media pressure category detection method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109961089A (en) * 2019-02-26 2019-07-02 中山大学 Small sample and zero sample image classification method based on metric learning and meta learning
CN111582360A (en) * 2020-05-06 2020-08-25 北京字节跳动网络技术有限公司 Method, apparatus, device and medium for labeling data
US20200327445A1 (en) * 2019-04-09 2020-10-15 International Business Machines Corporation Hybrid model for short text classification with imbalanced data
CN111985581A (en) * 2020-09-09 2020-11-24 福州大学 Sample-level attention network-based few-sample learning method
CN112613555A (en) * 2020-12-21 2021-04-06 深圳壹账通智能科技有限公司 Object classification method, device, equipment and storage medium based on meta learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919317B (en) * 2018-01-11 2024-06-04 华为技术有限公司 Machine learning model training method and device
CN111639181A (en) * 2020-04-30 2020-09-08 深圳壹账通智能科技有限公司 Paper classification method and device based on classification model, electronic equipment and medium
CN111767400B (en) * 2020-06-30 2024-04-26 平安国际智慧城市科技股份有限公司 Training method and device for text classification model, computer equipment and storage medium

Also Published As

Publication number Publication date
CN112613555A (en) 2021-04-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21908591

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02/10/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21908591

Country of ref document: EP

Kind code of ref document: A1