CN113378993A - Artificial intelligence based classification method, device, equipment and storage medium - Google Patents

Artificial intelligence based classification method, device, equipment and storage medium

Info

Publication number
CN113378993A
Authority
CN
China
Prior art keywords: sample, feature, groups, model, cross
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110779021.8A
Other languages
Chinese (zh)
Other versions
CN113378993B (en)
Inventor
康焱
刘洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202110779021.8A
Publication of CN113378993A
Application granted
Publication of CN113378993B
Status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The application provides a classification method, apparatus, device, and computer-readable storage medium based on artificial intelligence. The method includes: grouping a first sample set of a first service scenario and a second sample set of a second service scenario respectively according to sample attribute features to obtain a plurality of feature groups and individual features in each service scenario; performing a cross operation on the plurality of feature groups in each service scenario to obtain a plurality of cross feature groups; training at least one feature extraction model according to the plurality of feature groups and the plurality of cross feature groups; training a category prediction model according to the individual features in the first service scenario and the first sample feature characterizations output by each feature extraction model, while continuing to train the at least one feature extraction model; and predicting the category label of a sample to be tested in the second service scenario according to the trained at least one feature extraction model and the trained category prediction model. Through the method and the apparatus, prediction accuracy can be improved while improving transfer learning efficiency.

Description

Artificial intelligence based classification method, device, equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a classification method, apparatus, device, and computer-readable storage medium based on artificial intelligence.
Background
With the rapid development of artificial intelligence, machine learning can achieve good performance through supervised training on large amounts of labeled data. However, large labeled datasets are limited in both number and application domains, and manually labeling a sufficient amount of training data is often costly.
Moreover, deep learning models with strong migration capability lack interpretability, while deep learning models with strong interpretability have weak migration capability. As a result, a deep learning model may fail to reach the target performance even after being trained for a long time during transfer learning, so transfer learning efficiency is low. The related art offers no effective solution for improving transfer learning efficiency while ensuring prediction accuracy.
Disclosure of Invention
The embodiments of the present application provide a classification method, apparatus, device, and computer-readable storage medium based on artificial intelligence, which can improve transfer learning efficiency while achieving the target performance.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides a classification method based on artificial intelligence, which comprises the following steps:
grouping a first sample set of a first service scenario and a second sample set of a second service scenario respectively according to sample attribute features to obtain a plurality of feature groups and individual features in each service scenario;
performing a cross operation on the plurality of feature groups in each service scenario to obtain a plurality of cross feature groups;
training at least one feature extraction model according to the plurality of feature groups and the plurality of cross feature groups;
training a category prediction model according to the individual features in the first service scenario and the first sample feature characterizations output by each feature extraction model, and continuing to train the at least one feature extraction model;
and predicting the category label of a sample to be tested in the second service scenario according to the trained at least one feature extraction model and the trained category prediction model.
The embodiments of the present application provide a classification apparatus based on artificial intelligence, including:
a feature grouping module, configured to group a first sample set of a first service scenario and a second sample set of a second service scenario respectively according to sample attribute features to obtain a plurality of feature groups and individual features in each service scenario;
a cross module, configured to perform a cross operation on the plurality of feature groups in each service scenario to obtain a plurality of cross feature groups;
a first training module, configured to train at least one feature extraction model according to the plurality of feature groups and the plurality of cross feature groups;
a second training module, configured to train a category prediction model according to the individual features in the first service scenario and the first sample feature characterizations output by each feature extraction model, and to continue to train the at least one feature extraction model;
a prediction module, configured to predict the category label of a sample to be tested in the second service scenario according to the trained at least one feature extraction model and the trained category prediction model.
In the foregoing solution, the feature grouping module is further configured to perform the following processing for each sample in the first sample set of the first service scenario: grouping a plurality of individual features having the same sample attribute in each sample to obtain a plurality of first sample feature groups, and taking the individual features not used for grouping in each sample as first sample individual features; and to perform the following processing for each sample in the second sample set of the second service scenario: grouping a plurality of individual features having the same sample attribute in each sample to obtain a plurality of second sample feature groups, and taking the individual features not used for grouping in each sample as second sample individual features.
In the above scheme, the cross module is further configured to perform a cross operation on any two of the plurality of first sample feature groups according to a cross operation function to obtain a plurality of first sample cross feature groups; and to perform a cross operation on any two of the plurality of second sample feature groups according to the cross operation function to obtain a plurality of second sample cross feature groups.
In the above scheme, the first training module is further configured to: determine, according to the feature extraction models respectively corresponding to the plurality of first sample feature groups, the first sample feature characterizations respectively corresponding to the plurality of first sample feature groups; determine, according to the feature extraction models respectively corresponding to the plurality of second sample feature groups, the second sample feature characterizations respectively corresponding to the plurality of second sample feature groups, where the feature extraction models corresponding to a first sample feature group and a second sample feature group having the same sample attribute are the same; determine, according to the feature extraction models respectively corresponding to the plurality of first sample cross feature groups, the first sample feature characterizations respectively corresponding to the plurality of first sample cross feature groups; determine, according to the feature extraction models respectively corresponding to the plurality of second sample cross feature groups, the second sample feature characterizations respectively corresponding to the plurality of second sample cross feature groups, where the feature extraction models corresponding to a first sample cross feature group and a second sample cross feature group having the same sample attribute are the same; determine the domain distinguishing loss value of each domain distinguishing model according to the domain distinguishing model corresponding to a first sample feature characterization and a second sample feature characterization having the same sample attribute, and update, according to that domain distinguishing loss value, the parameters of the feature extraction model corresponding to the first and second sample feature characterizations having the same sample attribute; and obtain, based on the updated parameters of each feature extraction model, the trained feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups.
In the above scheme, the second training module is further configured to: call the category prediction model based on the individual features and the first sample feature characterizations in the first service scenario to obtain a first sample prediction category label output by the category prediction model; determine a category prediction loss value according to the first sample prediction category label and the first sample true category label; and update the parameters of the category prediction model according to the category prediction loss value, obtaining the trained category prediction model based on the updated parameters. Continuing to train the at least one feature extraction model includes: continuing to update, according to the category prediction loss value, the parameters of the feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups, and obtaining, based on the updated parameters, the further trained feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups.
In the above scheme, the first training module is further configured to: determine, according to the feature extraction model, the first sample feature characterizations respectively corresponding to the plurality of feature groups and the plurality of cross feature groups, and determine, according to the feature extraction model, the second sample feature characterizations respectively corresponding to the plurality of feature groups and the plurality of cross feature groups; determine the corresponding domain distinguishing loss values according to each first sample feature characterization, each second sample feature characterization, and the corresponding domain distinguishing models; and update the parameters of the feature extraction model according to each domain distinguishing loss value, obtaining the trained feature extraction model based on the updated parameters.
In the above scheme, the first training module is further configured to: call, based on each first sample feature characterization, the domain distinguishing model corresponding to that characterization to obtain the first sample prediction domain label output by the domain distinguishing model, and determine the corresponding domain distinguishing loss value according to the first sample prediction domain label and the first sample true domain label; and call, based on each second sample feature characterization, the domain distinguishing model corresponding to that characterization to obtain the second sample prediction domain label output by the domain distinguishing model, and determine the corresponding domain distinguishing loss value according to the second sample prediction domain label and the second sample true domain label, where a first sample feature characterization and a second sample feature characterization having the same sample attribute correspond to the same domain distinguishing model.
In the above scheme, the second training module is further configured to: call the category prediction model based on the individual features and the first sample feature characterizations in the first service scenario, obtain the first sample prediction category label output by the category prediction model, and determine the category prediction loss value according to the first sample prediction category label and the first sample true category label; and update the parameters of the category prediction model according to the category prediction loss value, obtaining the trained category prediction model based on the updated parameters. Continuing to train the at least one feature extraction model includes: continuing to update the parameters of the feature extraction model according to the category prediction loss value, and obtaining the trained feature extraction model based on the updated parameters.
In the above scheme, the prediction module is further configured to: call, based on the plurality of feature groups in the second service scenario, the trained feature extraction models respectively corresponding to the plurality of feature groups to obtain the updated second sample feature characterizations output by the trained feature extraction models and respectively corresponding to the plurality of feature groups; call, based on the plurality of cross feature groups in the second service scenario, the trained feature extraction models respectively corresponding to the plurality of cross feature groups to obtain the updated second sample feature characterizations output by the trained feature extraction models and respectively corresponding to the plurality of cross feature groups; and call the trained category prediction model based on the updated second sample feature characterizations and the individual features in the second service scenario to obtain the category label of the second service scenario output by the trained category prediction model.
In the above scheme, the prediction module is further configured to: call the feature extraction model based on the plurality of feature groups and the plurality of cross feature groups in the second service scenario to obtain the updated second sample feature characterizations output by the feature extraction model and respectively corresponding to the plurality of feature groups and the plurality of cross feature groups; and call the trained category prediction model based on the updated second sample feature characterizations and the individual features in the second service scenario to obtain the category label of the second service scenario output by the trained category prediction model.
An embodiment of the present application provides an electronic device, including:
a memory for storing executable instructions;
a processor, configured to implement the artificial intelligence-based classification method provided in the embodiments of the present application when executing the executable instructions stored in the memory.
The embodiment of the application provides a computer-readable storage medium, which stores executable instructions for causing a processor to execute, so as to implement the artificial intelligence-based classification method provided by the embodiment of the application.
The embodiment of the application has the following beneficial effects:
The first sample set of the first service scenario and the second sample set of the second service scenario are respectively grouped according to sample attribute features to obtain a plurality of feature groups and individual features in each service scenario, and a cross operation is performed on the plurality of feature groups to obtain a plurality of cross feature groups. This improves the accuracy of the grouping, so that the high-order features subsequently learned from each feature group and cross feature group are more interpretable. The feature extraction models are trained further while the category prediction model is trained, which improves training efficiency and effectively improves the performance of the feature extraction models, so that the category label of a sample to be tested can be predicted accurately according to the trained at least one feature extraction model and the trained category prediction model. Prediction accuracy is thereby improved together with transfer learning efficiency.
Drawings
FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of the present application;
FIG. 2 is a schematic flow chart of an artificial intelligence-based classification method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a sample set of business scenarios provided by an embodiment of the present application;
FIG. 4 is a schematic flow chart of an artificial intelligence-based classification method provided by an embodiment of the present application;
FIG. 5 is a schematic flow chart of an artificial intelligence-based classification method provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the interleaving operation provided by the embodiments of the present application;
FIG. 7 is a flowchart illustrating an artificial intelligence based classification method according to an embodiment of the present application;
FIG. 8 is a schematic flow chart diagram of an artificial intelligence-based classification method provided by an embodiment of the present application;
FIG. 9 is a flowchart illustrating an artificial intelligence based classification method according to an embodiment of the present application;
FIG. 10 is a schematic flow chart diagram of an artificial intelligence-based classification method provided by an embodiment of the present application;
FIG. 11 is a flowchart illustrating an artificial intelligence based classification method according to an embodiment of the present application;
FIG. 12 is a schematic flow chart diagram of an artificial intelligence-based classification method provided by an embodiment of the present application;
FIG. 13 is a schematic flow chart diagram illustrating an artificial intelligence based classification method according to an embodiment of the present application;
FIG. 14 is a schematic diagram of the principle of transfer learning model training provided by an embodiment of the present application;
FIG. 15 is a schematic diagram of the principle of transfer learning model prediction provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first", "second", and "third" are used only to distinguish similar objects and do not denote a particular order; it is understood that, where permissible, the specific order or sequence may be interchanged so that the embodiments of the application described herein can be practiced in an order other than that shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Artificial intelligence: the discipline of using computers to simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning). It mainly studies the principles by which computers can realize intelligence and builds computers that can carry out higher-level applications inspired by human intelligence.
2) Transfer learning: a machine learning method in which a pre-trained model is reused in another task. Transfer learning is related to problems such as multi-task learning and concept drift and is not a dedicated subfield of machine learning. Nevertheless, transfer learning is very popular in certain deep learning problems, for example where training a deep model requires large amounts of resources or where large datasets are used to pre-train a model. Transfer learning only works if the deep model features learned in the first task generalize; this kind of migration in deep learning is referred to as inductive transfer. It narrows the search space of possible models in an advantageous way by reusing a model fitted to a different but related task.
3) Deep learning: learning the intrinsic laws and representation levels of sample data, which greatly helps the interpretation of data such as text, images, and sound. Its ultimate goal is to enable machines to have human-like analysis and learning capability and to recognize data such as text, images, and sound. Deep learning is a complex machine learning algorithm whose results in speech and image recognition far exceed the earlier related art.
In order to better understand the artificial intelligence based classification method provided in the embodiments of the present application, first, an artificial intelligence based classification method in the related art is described:
machine learning can achieve good performance and effect by performing supervised training on a large amount of marked data, however, a large-sized marked data set is limited in quantity and application fields, and manually marking a sufficient amount of training and data often requires high cost. Aiming at the problem, a transfer learning method is generally adopted to solve the problem, namely, a discriminator is trained to adjust parameters of a transfer learning network, so that the distribution deviation between data in a source field and data in a target field is reduced under the transfer learning network after the parameters are adjusted, and the transfer learning network has a better effect when the target field is applied to finish a target task. However, the lack of interpretability of deep learning models makes them difficult to use for migratory learning in applications that require model interpretability (e.g., financial risk control), and low complexity deep learning models have a weak ability to learn migratable knowledge from raw data and thus are not robust in migratory ability. This creates a contradiction, the deep learning model with strong migration capability lacks interpretability, and the deep learning model with strong interpretability has weak migration capability, so that the deep learning model cannot take account of both interpretability and migration capability. Meanwhile, in the transfer learning, because the data of different fields or scenes cannot be effectively distinguished, the trained model can reach the target performance only after being trained for a longer time in the transfer learning process, so that the transfer learning efficiency is low, a computer needs to consume a large amount of resources and computing power, and the utilization rate of computer computing power resources is low.
In the implementation process of the embodiment of the application, the following problems are found in the related art:
with the surprising effect of deep learning models in many practical applications, deep learning models have become the mainstream of modern artificial intelligence. One important factor that supports the success of these deep learning models is the massive amount of high-quality annotation data.
Obtaining labeled data, however, is often costly because of the large human and material costs of screening and labeling the data, particularly in highly specialized fields (e.g., the financial and medical fields). A relatively common scheme for dealing with the difficulty of obtaining high-quality labeled data is to transfer knowledge from a domain with rich data resources to the data-deficient domain through a domain-adaptive method, thereby reducing the requirement for labeled data.
The traditional domain adaptive method achieves the purpose of domain adaptation by training a migratable (neural network) feature extraction model on all features. However, training only one feature extraction model based on all features has the following two disadvantages:
the output of the feature extraction model lacks interpretability. For each original sample, the output of the feature extraction model is a feature vector, and we do not know how the feature extraction model is computed to obtain the feature vector.
The feature extraction model achieves the purpose of field adaptation by extracting high-order features with unchanged fields from original data. Training a single feature extraction model on all features may not be able to learn good domain invariant features. Since features with different variations may have a negative effect on the learning of domain-invariant high-order features.
Based on the above analysis, the applicant finds that, in the related art, a deep learning model with strong transfer learning capability lacks interpretability, while a deep learning model with strong interpretability has weak migration capability, so a deep learning model that is trained for a long time during transfer learning often cannot reach the target performance, and transfer learning efficiency is low. To solve the above problem, the embodiments of the present application provide an artificial intelligence based classification method that improves transfer learning efficiency while ensuring prediction accuracy.
The embodiments of the present application provide a classification method, apparatus, device, and computer-readable storage medium based on artificial intelligence, which can improve the efficiency of migration learning and achieve target performance, and an exemplary application of the classification device based on artificial intelligence provided in the embodiments of the present application is described below.
Taking an example that the electronic device provided by the embodiment of the present application is a server, referring to fig. 1, fig. 1 is a schematic structural diagram of an artificial intelligence based server 400 provided by the embodiment of the present application, and the server 400 shown in fig. 1 includes: at least one processor 410, memory 450, at least one network interface 420, and a user interface 430. The various components in server 400 are coupled together by a bus system 440. It is understood that the bus system 440 is used to enable communications among the components. The bus system 440 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 440 in FIG. 1.
The processor 410 may be an integrated circuit chip having signal processing capability, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components; the general-purpose processor may be a microprocessor, any conventional processor, or the like.
The memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 450 optionally includes one or more storage devices physically located remote from processor 410.
The memory 450 includes either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 450 described in embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 450 is capable of storing data, examples of which include programs, modules, and data structures, or a subset or superset thereof, to support various operations, as exemplified below.
An operating system 451, including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
a network communication module 452 for communicating with other computing devices via one or more (wired or wireless) network interfaces 420; exemplary network interfaces 420 include: Bluetooth, Wireless Fidelity (WiFi), Universal Serial Bus (USB), etc.;
in some embodiments, the artificial intelligence based classification apparatus provided in the embodiments of the present application may be implemented in software, and fig. 1 illustrates an apparatus 455 stored in a memory 450, which may be software in the form of programs and plug-ins, and includes the following software modules: a feature grouping module 4551, a cross-over module 4552, a first training module 4553, a second training module 4554 and a prediction module 4555, which are logical and thus can be arbitrarily combined or further split depending on the functions implemented. The functions of the respective modules will be explained below.
In other embodiments, the artificial intelligence based classification apparatus provided in the embodiments of the present application may be implemented in hardware. As an example, it may be a processor in the form of a hardware decoding processor programmed to execute the artificial intelligence based classification method provided in the embodiments of the present application; for example, the processor in the form of a hardware decoding processor may adopt one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
The artificial intelligence based classification method provided by the embodiment of the present application will be described below with reference to an exemplary application and implementation of the terminal provided by the embodiment of the present application.
Referring to fig. 2, fig. 2 is an alternative flowchart of the artificial intelligence based classification method provided in the embodiment of the present application, which will be described with reference to the steps shown in fig. 2.
Step 11: grouping the first sample set of the first service scenario and the second sample set of the second service scenario respectively according to sample attribute features to obtain a plurality of feature groups and individual features in each service scenario.
Referring to fig. 3, fig. 3 is a schematic diagram of a sample set of service scenarios provided in an embodiment of the present application, and a first sample set of a first service scenario and a second sample set of a second service scenario will be schematically illustrated in conjunction with the sample set of the service scenario illustrated in fig. 3.
The first service scenario and the second service scenario may be two service scenarios that are similar but different.
For example, the first service scenario is credit scoring and the second service scenario is fraud scoring. The first service scenario may have a large number of samples while the second service scenario has only a certain number of samples; that is, the number of first samples is far greater than the number of second samples. Each sample in the first sample set corresponds to a category label and a domain label, each sample in the second sample set corresponds to a domain label, and the domain label of a first sample differs from that of a second sample. For example, the domain label corresponds to the service scenario: the domain label of each sample in the first sample set is 1, and the domain label of each sample in the second sample set is 0. FIG. 3 schematically depicts one first sample of the first service scenario and one second sample of the second service scenario.
For another example, the first service scenario may be private (personal) account transfers and the second service scenario public (corporate) account transfers; the first service scenario may be a loan service and the second a deposit service; or the first service scenario may be a financial service and the second a credit card service. The first and second service scenarios are not limited to the financial field and may also be service scenarios in the manufacturing field, etc.; the specific type of service scenario does not constitute a limitation of the present application.
In some embodiments, the samples may be user samples in the corresponding service scenario: the first samples of the first service scenario may be user samples in a private-transfer service scenario, and the second samples of the second service scenario may be user samples in a transfer-to-public-account service scenario. The first sample set of the first service scenario may then be the set of all user samples in the private-transfer service scenario, and the second sample set of the second service scenario the set of all user samples in the transfer-to-public-account service scenario. Typically, the number of users making private transfers is much larger than the number making public-account transfers, i.e., the number of first samples is much larger than the number of second samples.
For example, one user may have business requirements for both a public account and a private account, and thus may be both a first sample of the first service scenario and a second sample of the second service scenario; another user may have business requirements only for a private account, and is thus only a first sample of the first service scenario. That is, the same sample may appear in both the first sample set and the second sample set, but the two sets are not identical. The first and second service scenarios can therefore be two similar but not identical service scenarios; during transfer learning, two similar but not identical service scenarios can complete transfer learning more efficiently, whereas transfer learning between two completely different service scenarios takes longer than between two similar ones.
In some embodiments, the sample attribute features may be attribute features of the user, for example, social attribute features, work attribute features, habit attribute features, financing attribute features, consumption attribute features, age attribute features, and the like; the specific type of sample attribute feature does not constitute a limitation of the present application.
For example, the user sample set in the private-transfer service scenario and the user sample set in the transfer-to-public-account service scenario are grouped respectively according to the sample attribute features, so that a plurality of feature groups and individual features in each service scenario can be obtained.
For example, in the private-transfer service scenario, the individual features of a user sample may be grouped according to social attribute features, work attribute features, habit attribute features, financing attribute features, and the like. For example, one feature group of the user sample may be a work attribute feature group, which may contain at least two individual features, such as a work address feature, a work type feature, and a salary feature.
In some embodiments, the samples in the first set of samples of the first traffic scenario and the samples in the second set of samples of the second traffic scenario have the same feature space. The first sample set of the first service scenario and the second sample set of the second service scenario may be used for a classification problem, a regression problem, and the like, and the specific use of the first sample set of the first service scenario and the second sample set of the second service scenario does not constitute a limitation of the present application.
In some embodiments, referring to fig. 4, fig. 4 is a schematic flowchart of an artificial intelligence based classification method provided in an embodiment of the present application, and step 11 may be implemented by step 111 and step 112 shown in fig. 4, which will be described in detail with reference to each step.
Step 111: for each sample of the first sample set of the first service scenario, perform the following processing: grouping a plurality of individual features having the same sample attribute in the sample to obtain a plurality of first sample feature groups, and taking the individual features not used for grouping as the first sample individual features.
For example, for a first sample in the first sample set of the first service scenario (see FIG. 3), the plurality of first sample feature groups may be F_1^A, F_2^A, F_3^A, …, F_i^A, …, F_N^A. For example, the first sample feature group F_1^A may be composed of a plurality of first sample individual features having the same sample attribute, and the sample attribute of the individual features in F_1^A differs from that of the individual features in F_2^A.
For example, referring to FIG. 3, the first sample individual features may be f_1^A, f_2^A, …, f_j^A, …, f_M^A. It is understood that these individual features are not used for grouping, i.e., f_1^A, f_2^A, …, f_j^A, …, f_M^A do not belong to any of the first sample feature groups F_1^A, F_2^A, F_3^A, …, F_i^A, …, F_N^A.
Step 112: for each sample of the second sample set of the second service scenario, perform the following processing: grouping a plurality of individual features having the same sample attribute in the sample to obtain a plurality of second sample feature groups, and taking the individual features not used for grouping as the second sample individual features.
For example, grouping is performed for each sample in the second sample set of the second service scenario. For one second sample of the second sample set (see FIG. 3), the plurality of second sample feature groups may be F_1^B, F_2^B, F_3^B, …, F_i^B, …, F_N^B. For example, the second sample feature group F_1^B may be composed of a plurality of second sample individual features having the same sample attribute, and the sample attribute of the individual features in F_1^B differs from that of the individual features in F_2^B.
For example, referring to FIG. 3, the second sample individual features may be f_1^B, f_2^B, …, f_j^B, …, f_M^B. It is understood that these individual features are not used for grouping, i.e., f_1^B, f_2^B, …, f_j^B, …, f_M^B do not belong to any of the second sample feature groups F_1^B, F_2^B, F_3^B, …, F_i^B, …, F_N^B.
In some embodiments, a first sample of the first service scenario and a second sample of the second service scenario may be obtained first, and the features of each sample grouped according to the sample attribute features: the features of the first sample are grouped according to the sample attribute features to obtain a preset number of first sample feature groups, and the features of the second sample are grouped according to the sample attribute features to obtain a preset number of second sample feature groups. In this way the features in each group describe a certain attribute of the sample, and each sample feature group can be a relatively abstract description of some aspect of the sample; for example, features such as marriage and dating and WeChat contacts may be grouped together, and this group of features describes the social attributes of the sample.
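To make the grouping concrete, the following is a minimal Python sketch of how one sample's features might be split into attribute groups and individual features. The attribute names and the ATTRIBUTE_OF mapping are illustrative assumptions, not part of the patent; the patent only requires that features sharing a sample attribute end up in the same group and that the remaining features stay individual.

```python
from collections import defaultdict

# Hypothetical mapping from feature name to sample attribute; features mapped
# to None are not assigned to any attribute group.
ATTRIBUTE_OF = {
    "work_address": "work", "work_type": "work", "salary": "work",
    "marriage_dating": "social", "wechat_contacts": "social",
    "age": None,
}

def group_features(sample: dict):
    """Split one sample's features into attribute groups and individual features."""
    groups = defaultdict(dict)
    individual = {}
    for name, value in sample.items():
        attr = ATTRIBUTE_OF.get(name)
        if attr is None:
            individual[name] = value
        else:
            groups[attr][name] = value
    # A "group" with fewer than two features is demoted to individual features.
    for attr in [a for a, g in groups.items() if len(g) < 2]:
        individual.update(groups.pop(attr))
    return dict(groups), individual

sample = {"work_address": 3, "work_type": 1, "salary": 12000,
          "marriage_dating": 0, "wechat_contacts": 210, "age": 34}
feature_groups, individual_features = group_features(sample)
# feature_groups      -> {"work": {...3 features...}, "social": {...2 features...}}
# individual_features -> {"age": 34}
```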
Step 12: performing a cross operation on the plurality of feature groups in each service scenario to obtain a plurality of cross feature groups.
In some embodiments, the cross operation may cross any two feature groups of the plurality of feature groups; the cross operation may be concatenation, an inner product, an outer product, or the like, and the specific type of cross operation does not constitute a limitation of the embodiments of the present application.
In some embodiments, referring to fig. 5, fig. 5 is a schematic flowchart of an artificial intelligence based classification method provided in an embodiment of the present application, and step 12 may be implemented by step 121 and step 122 shown in fig. 5, and will be described with reference to each step.
Step 121: performing a cross operation on any two of the plurality of first sample feature groups according to a cross operation function to obtain a plurality of first sample cross feature groups.
In some embodiments, the expression of the cross operation function may be:
g(F_i, F_j), i ≠ j
where F_i and F_j represent two different first sample feature groups, and the cross operation function g(F_i, F_j) performs a cross operation on F_i and F_j; the specific operation type of the cross operation function does not constitute a limitation of the present application.
Step 122: performing a cross operation on any two of the plurality of second sample feature groups according to the cross operation function to obtain a plurality of second sample cross feature groups.
In some embodiments, the expression of the cross operation function is likewise:
g(F_i, F_j), i ≠ j
where F_i and F_j represent two different second sample feature groups, and the cross operation function g(F_i, F_j) performs a cross operation on F_i and F_j; the specific operation type of the cross operation function does not constitute a limitation of the present application.
Referring to FIG. 6, FIG. 6 is a schematic diagram of the cross operation provided by an embodiment of the present application. In FIG. 6, the plurality of second sample feature groups F_1, F_2, F_3 may be crossed as shown: F_1 and F_2 may be crossed by the cross operation function g(F_1, F_2) to obtain the second sample cross feature group of F_1 and F_2; F_1 and F_3 may be crossed by g(F_1, F_3) to obtain the second sample cross feature group of F_1 and F_3; and F_2 and F_3 may be crossed by g(F_2, F_3) to obtain the second sample cross feature group of F_2 and F_3.
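The following is a minimal sketch of the cross operation, assuming each feature group has already been encoded as a numeric vector. Concatenation is used as the cross operation function g (the inner-product and outer-product variants mentioned above would simply replace the body of g), and the pairwise enumeration mirrors FIG. 6; all names are illustrative.

```python
import itertools
import numpy as np

def g(F_i: np.ndarray, F_j: np.ndarray) -> np.ndarray:
    """Cross operation function: here concatenation; an inner product
    (np.dot) or outer product (np.outer) could be used instead."""
    return np.concatenate([F_i, F_j])

# Feature groups F_1, F_2, F_3 of one sample, as numeric vectors.
feature_groups = {
    "F1": np.array([0.2, 1.5]),
    "F2": np.array([3.0, 0.1]),
    "F3": np.array([0.7, 0.7]),
}

# Cross any two of the feature groups (i != j), as in FIG. 6.
cross_feature_groups = {
    (a, b): g(feature_groups[a], feature_groups[b])
    for a, b in itertools.combinations(feature_groups, 2)
}
# -> keys ("F1","F2"), ("F1","F3"), ("F2","F3")
```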
Step 13: training at least one feature extraction model according to the plurality of feature groups and the plurality of cross feature groups.
In some embodiments, the feature groups may include the first sample feature groups and the second sample feature groups, and the cross feature groups may include the first sample cross feature groups and the second sample cross feature groups. The feature extraction model achieves domain adaptation by extracting domain-invariant high-order features from the raw data.
The feature extraction model may be a deep neural network model, and the feature extraction model may be used for feature learning.
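As an illustrative sketch only (the patent requires a deep neural network per feature group but does not prescribe its architecture; the layer sizes below are assumptions):

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Maps one feature group (or cross feature group) to a feature
    characterization r. One such model is kept per group, and the model is
    shared between the first and second service scenarios for groups with
    the same sample attribute."""
    def __init__(self, in_dim: int, repr_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, repr_dim),
        )

    def forward(self, F: torch.Tensor) -> torch.Tensor:
        return self.net(F)
```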
In some embodiments, referring to fig. 7, fig. 7 is a schematic flowchart of a classification method based on artificial intelligence provided in an embodiment of the present application, and step 13 may be implemented by steps 1311 to 1316 shown in fig. 7, which will be described with reference to the steps.
Step 1311, determining first sample feature characterizations corresponding to the plurality of first sample feature groups respectively according to the feature extraction models corresponding to the plurality of first sample feature groups respectively.
Referring to FIG. 14, FIG. 14 is a schematic diagram of the principle of transfer learning model training provided by an embodiment of the present application. In FIG. 14, according to the feature extraction models R_1, R_2, R_3 respectively corresponding to the plurality of first sample feature groups F_1^A, F_2^A, F_3^A, the first sample feature characterizations r_1^A, r_2^A, r_3^A respectively corresponding to those feature groups are determined.
It will be appreciated that FIG. 14 only schematically shows three first sample feature groups F_1^A, F_2^A, F_3^A; there may also be F_4^A, F_5^A, etc., and the number of first sample feature groups may also be 2, 4, 5, etc. The actual number of first sample feature groups does not limit the present application.
Step 1312, determining second sample feature characterizations corresponding to the plurality of second sample feature groups respectively according to the feature extraction models corresponding to the plurality of second sample feature groups respectively, wherein the feature extraction models corresponding to the first sample feature group and the second sample feature group with the same sample attributes are the same.
For example, referring to FIG. 14, according to the feature extraction models R_1, R_2, R_3 respectively corresponding to the plurality of second sample feature groups F_1^B, F_2^B, F_3^B, the second sample feature characterizations r_1^B, r_2^B, r_3^B respectively corresponding to those feature groups are determined.
It will be appreciated that FIG. 14 only schematically shows three second sample feature groups F_1^B, F_2^B, F_3^B; there may also be F_4^B, F_5^B, etc., and the number of second sample feature groups may also be 2, 4, 5, etc. The actual number of second sample feature groups does not limit the present application.
In some embodiments, the feature extraction models corresponding to a first sample feature group and a second sample feature group having the same sample attribute are the same. Referring to FIG. 14, the first sample feature group F_1^A and the second sample feature group F_1^B having the same sample attribute correspond to the same feature extraction model; likewise for F_2^A and F_2^B, and for F_3^A and F_3^B.
Step 1313, determining first sample feature characterizations corresponding to the plurality of first sample cross feature groups respectively according to the feature extraction models corresponding to the plurality of first sample cross feature groups respectively.
Referring to FIG. 14, according to the feature extraction models R_4, R_5, R_6 respectively corresponding to the plurality of first sample cross feature groups F_1^A F_2^A, F_1^A F_3^A, F_2^A F_3^A, the first sample feature characterizations r_4^A, r_5^A, r_6^A respectively corresponding to those cross feature groups are determined.
It will be appreciated that FIG. 14 only schematically shows three first sample cross feature groups F_1^A F_2^A, F_1^A F_3^A, F_2^A F_3^A; there may also be F_4^A F_5^A, etc., and the number of first sample cross feature groups may also be 6, etc. The actual number of first sample cross feature groups does not limit the present application.
Step 1314: determining second sample feature characterizations respectively corresponding to the plurality of second sample cross feature groups according to the feature extraction models respectively corresponding to the plurality of second sample cross feature groups, where the feature extraction models corresponding to a first sample cross feature group and a second sample cross feature group having the same sample attribute are the same.
For example, according to the feature extraction models R_4, R_5, R_6 respectively corresponding to the plurality of second sample cross feature groups F_1^B F_2^B, F_1^B F_3^B, F_2^B F_3^B, the second sample feature characterizations r_4^B, r_5^B, r_6^B respectively corresponding to those cross feature groups are determined.
It will be appreciated that FIG. 14 only schematically shows three second sample cross feature groups F_1^B F_2^B, F_1^B F_3^B, F_2^B F_3^B; there may also be F_4^B F_5^B, etc., and the number of second sample cross feature groups may also be 6, etc. The actual number of second sample cross feature groups does not limit the present application.
For example, referring to FIG. 14, the first sample cross feature group F_1^A F_2^A and the second sample cross feature group F_1^B F_2^B having the same sample attributes correspond to the same feature extraction model; likewise for F_1^A F_3^A and F_1^B F_3^B, and for F_2^A F_3^A and F_2^B F_3^B.
Step 1315: determining the domain distinguishing loss value of each domain distinguishing model according to the domain distinguishing model corresponding to a first sample feature characterization and a second sample feature characterization having the same sample attribute, and updating, according to that domain distinguishing loss value, the parameters of the feature extraction model corresponding to the first and second sample feature characterizations having the same sample attribute.
The domain distinguishing model may be a differentiable classification model, and the domain distinguishing model may be used to determine whether the feature characterization learned by the feature extraction model is a first sample feature characterization or a second sample feature characterization.
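A minimal sketch of a domain distinguishing model follows, assuming domain label 1 for the first service scenario and 0 for the second, as in the earlier example. Binary cross-entropy is used as the domain distinguishing loss; the patent does not prescribe a particular loss function.

```python
import torch
import torch.nn as nn

class DomainDiscriminator(nn.Module):
    """Predicts whether a feature characterization r comes from the first
    service scenario (domain label 1) or the second (domain label 0)."""
    def __init__(self, repr_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(repr_dim, 16), nn.ReLU(),
            nn.Linear(16, 1),  # logit for "first scenario"
        )

    def forward(self, r: torch.Tensor) -> torch.Tensor:
        return self.net(r)

domain_loss = nn.BCEWithLogitsLoss()
# L_d = domain_loss(D(r_A), ones) + domain_loss(D(r_B), zeros)
```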
For example, referring to FIG. 14, according to the domain distinguishing model D_1 corresponding to the first sample feature characterization r_1^A and the second sample feature characterization r_1^B having the same sample attribute, the domain distinguishing loss value L_{d,1} of D_1 is determined; according to the domain distinguishing model D_2 corresponding to r_2^A and r_2^B, the domain distinguishing loss value L_{d,2} of D_2 is determined; and according to the domain distinguishing model D_3 corresponding to r_3^A and r_3^B, the domain distinguishing loss value L_{d,3} of D_3 is determined.
For example, referring to FIG. 14, the parameters of the feature extraction model R_1 corresponding to r_1^A and r_1^B are updated according to the domain distinguishing loss value L_{d,1} of D_1; the parameters of R_2 corresponding to r_2^A and r_2^B are updated according to L_{d,2} of D_2; and the parameters of R_3 corresponding to r_3^A and r_3^B are updated according to L_{d,3} of D_3.
Step 1316, obtaining trained feature extraction models corresponding to the plurality of feature groups and the plurality of cross feature groups, respectively, based on the updated parameters of each feature extraction model.
For example, referring to FIG. 14, based on the updated parameters of each feature extraction model R_1, R_2, R_3, R_4, R_5, R_6, the trained feature extraction models R_1, R_2, R_3, R_4, R_5, R_6 respectively corresponding to the plurality of feature groups and the plurality of cross feature groups are obtained.
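The per-group update of steps 1311 to 1316 might look like the sketch below, reusing the FeatureExtractor and DomainDiscriminator sketches above. The adversarial objective is realized by negating the domain loss when updating the extractor, a common implementation choice (equivalent to a gradient reversal layer) that the patent text does not mandate; all names are illustrative.

```python
import torch

def domain_adversarial_step(R_k, D_k, F_A, F_B, opt_R, opt_D, bce):
    """One update for the k-th group: F_A / F_B are batches of that group's
    features from the first / second service scenario."""
    r_A, r_B = R_k(F_A), R_k(F_B)
    ones = torch.ones(len(F_A), 1)
    zeros = torch.zeros(len(F_B), 1)

    # 1) Train the domain distinguishing model D_k to tell the domains apart.
    L_d = bce(D_k(r_A.detach()), ones) + bce(D_k(r_B.detach()), zeros)
    opt_D.zero_grad(); L_d.backward(); opt_D.step()

    # 2) Update R_k so its characterizations become domain-invariant
    #    (maximize the discriminator's loss, hence the minus sign).
    L_adv = -(bce(D_k(R_k(F_A)), ones) + bce(D_k(R_k(F_B)), zeros))
    opt_R.zero_grad(); L_adv.backward(); opt_R.step()
```

Here opt_R and opt_D would be optimizers (e.g., torch.optim.Adam) over the parameters of R_k and D_k respectively, and bce is nn.BCEWithLogitsLoss().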
In some embodiments, referring to fig. 8, fig. 8 is a flowchart illustrating a classification method based on artificial intelligence according to an embodiment of the present application, and step 13 may be implemented by steps 1321 to 1323 shown in fig. 8, which will be described with reference to the steps.
Step 1321, determining first sample feature characterizations corresponding to the plurality of feature groups and the plurality of cross feature groups respectively according to the feature extraction model, and determining second sample feature characterizations corresponding to the plurality of feature groups and the plurality of cross feature groups respectively according to the feature extraction model.
For example, the first sample feature characterizations r_1^A, r_2^A, r_3^A, r_4^A, r_5^A, r_6^A respectively corresponding to the feature groups F_1^A, F_2^A, F_3^A and the cross feature groups F_1^A F_2^A, F_1^A F_3^A, F_2^A F_3^A are determined according to the feature extraction model, and the second sample feature characterizations r_1^B, r_2^B, r_3^B, r_4^B, r_5^B, r_6^B respectively corresponding to the feature groups F_1^B, F_2^B, F_3^B and the cross feature groups F_1^B F_2^B, F_1^B F_3^B, F_2^B F_3^B are determined according to the feature extraction model.
Step 1322, determining corresponding domain distinguishing loss values according to each first sample feature characterization, each second sample feature characterization, and the respectively corresponding domain distinguishing models.
For example, the respectively corresponding domain distinguishing loss values L_d,1, L_d,2, L_d,3, L_d,4, L_d,5, L_d,6 are determined according to each first sample feature characterization r_1^A, r_2^A, r_3^A, r_4^A, r_5^A, r_6^A, each second sample feature characterization r_1^B, r_2^B, r_3^B, r_4^B, r_5^B, r_6^B, and the respectively corresponding domain distinguishing models D_1, D_2, D_3, D_4, D_5, D_6.
In some embodiments, referring to fig. 10, fig. 10 is a flowchart illustrating an artificial intelligence based classification method provided in an embodiment of the present application, and step 1322 may be implemented through steps 13221 to 13222 shown in fig. 10, which will be described in conjunction with the steps.
Step 13221, calling a domain distinguishing model corresponding to each first sample feature characterization based on each first sample feature characterization to obtain a first sample prediction domain label of each first sample feature characterization, and determining a corresponding domain distinguishing loss value according to the first sample prediction domain label and the first sample real domain label.
Step 13222, calling a domain distinguishing model corresponding to each second sample feature characterization based on each second sample feature characterization to obtain a second sample prediction domain label of each second sample feature characterization, and determining a corresponding domain distinguishing loss value according to the second sample prediction domain label and the second sample real domain label, wherein a first sample feature characterization and a second sample feature characterization having the same sample attribute correspond to the same domain distinguishing model.
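The text does not fix the form of the domain distinguishing loss; one standard choice consistent with steps 13221 and 13222 is the binary cross-entropy between the predicted and real domain labels (scenario A labeled d = 0, scenario B labeled d = 1), sketched here in LaTeX:

    L_{d,i} = -\frac{1}{n_A} \sum_{x \in A} \log\bigl(1 - D_i(r_i^A(x))\bigr)
              - \frac{1}{n_B} \sum_{x \in B} \log D_i(r_i^B(x))

where D_i(.) denotes the predicted probability that a characterization comes from the second service scenario, and n_A, n_B are the numbers of first- and second-scenario samples in the batch.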
Step 1323, updating the parameters of the feature extraction model according to each domain distinguishing loss value, and obtaining the trained feature extraction model based on the updated parameters.
For example, the parameters of the feature extraction model are updated according to each domain distinguishing loss value L_d,1, L_d,2, L_d,3, L_d,4, L_d,5, L_d,6, and the trained feature extraction model is obtained based on the updated parameters.
Step 14, training a class prediction model according to the monomer features in the first service scenario and the first sample feature characterizations output by each feature extraction model, and continuing to train at least one feature extraction model.
The class prediction model may be any differentiable machine learning model, such as a linear regression model or a logistic regression model. Such models are highly interpretable and can explain the degree to which each feature group contributes to the prediction result.
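Since the text only requires a differentiable, interpretable model, one plausible sketch (the logistic-regression choice and the dimensions are assumptions) is a single linear layer over the concatenated monomer features and the N + Z feature characterizations; reading the weight vector slice by slice then exposes each feature group's contribution:

    import torch
    import torch.nn as nn

    class CategoryPredictor(nn.Module):
        """Logistic regression over [monomer features ; r_1 ; ... ; r_{N+Z}].
        Each input block owns one weight slice, keeping contributions readable."""
        def __init__(self, monomer_dim: int, repr_dim: int, n_reprs: int):
            super().__init__()
            self.linear = nn.Linear(monomer_dim + n_reprs * repr_dim, 1)

        def forward(self, monomer: torch.Tensor, reprs) -> torch.Tensor:
            x = torch.cat([monomer] + list(reprs), dim=-1)
            return torch.sigmoid(self.linear(x))  # predicted category probability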
In some embodiments, referring to fig. 9, fig. 9 is a schematic flowchart of an artificial intelligence based classification method provided in an embodiment of the present application, and step 14 may be implemented by steps 141 to 143 shown in fig. 9, which will be described in detail with reference to the steps.
Step 141, a category prediction model is invoked based on the monomer feature and the first sample feature representation in the first service scenario, and a first sample prediction category label output by the category prediction model is obtained.
For example, referring to FIG. 14, the category prediction model is called based on the monomer features f_1^A, f_2^A, f_3^A in the first service scenario and the first sample feature characterizations r_1^A, r_2^A, r_3^A, r_4^A, r_5^A, r_6^A, to obtain the first sample prediction category label output by the category prediction model.
Step 142, determining a class prediction loss value according to the first sample prediction class label and the first sample real class label.
For example, referring to FIG. 14, the class prediction loss value L_cls^A is determined based on the first sample prediction class label and the first sample real class label.
Step 143, updating the parameters of the class prediction model according to the class prediction loss value, and obtaining the trained class prediction model based on the updated parameters.
For example, referring to FIG. 14, the parameters of the class prediction model are updated according to the class prediction loss value L_cls^A, and the trained class prediction model is obtained based on the updated parameters. It can be understood that the parameters of the class prediction model may be the parameters used by the class prediction model for prediction.
In some embodiments, step 14 may be implemented by: and continuously updating parameters of the feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups according to the category prediction loss values, and acquiring continuously trained feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups based on the updated parameters.
For example, referring to FIG. 14, the parameters of the feature extraction models R_1, R_2, R_3, R_4, R_5, R_6 respectively corresponding to the feature groups F_1^A, F_2^A, F_3^A and the cross feature groups F_1^A F_2^A, F_1^A F_3^A, F_2^A F_3^A continue to be updated according to the class prediction loss value, and the continuously trained feature extraction models R_1, R_2, R_3, R_4, R_5, R_6 respectively corresponding to the plurality of feature groups and the plurality of cross feature groups are obtained based on the updated parameters.
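A minimal sketch, under the same assumptions as the sketches above, of how the class prediction loss continues to train the extractors together with the predictor: a single backward pass sends gradients through the predictor into every feature extraction model.

    import torch.nn.functional as F

    def class_step(predictor, extractors, opt, monomer_a, groups_a, labels_a):
        """Update the predictor and all extractors from labeled scenario-A data.
        labels_a: float tensor of shape (batch, 1) with the real class labels."""
        reprs = [R(g) for R, g in zip(extractors, groups_a)]  # r_1^A..r_6^A
        pred = predictor(monomer_a, reprs)
        loss_cls = F.binary_cross_entropy(pred, labels_a)     # L_cls^A
        opt.zero_grad()
        loss_cls.backward()  # gradients reach the predictor and the extractors
        opt.step()           # opt is assumed to cover all their parameters
        return loss_cls.item()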
In some embodiments, referring to fig. 11, fig. 11 is a flowchart illustrating an artificial intelligence based classification method provided in an embodiment of the present application, and step 14 may be implemented by steps 144 to 145 shown in fig. 11, which will be described in detail with reference to the steps.
Step 144, calling the class prediction model based on the monomer features and the first sample feature characterizations in the first service scenario to obtain a first sample prediction class label output by the class prediction model, and determining a class prediction loss value according to the first sample prediction class label and the first sample real class label.
For example, the category prediction model is called based on the monomer features f_1^A, f_2^A, f_3^A in the first service scenario and the first sample feature characterizations r_1^A, r_2^A, r_3^A, r_4^A, r_5^A, r_6^A to obtain the first sample prediction category label output by the category prediction model, and the category prediction loss value L_cls^A is determined according to the first sample prediction category label and the first sample real category label.
Step 145, updating the parameters of the class prediction model according to the class prediction loss value, and obtaining the trained class prediction model based on the updated parameters.
For example, the parameters of the class prediction model are updated according to the class prediction loss value L_cls^A, and the trained class prediction model is obtained based on the updated parameters.
In some embodiments, step 14 may be implemented by: and continuously updating parameters of the feature extraction model according to the class prediction loss value, and obtaining the trained feature extraction model based on the updated parameters.
In some embodiments, the parameters of the feature extraction model R are continuously updated according to the class prediction loss value L_cls^A, and the trained feature extraction model is obtained based on the updated parameters.
Step 15, predicting the class label of the sample to be tested of the second service scenario according to the trained at least one feature extraction model and the trained class prediction model.
In some embodiments, referring to fig. 12, fig. 12 is a schematic flowchart of an artificial intelligence based classification method provided in an embodiment of the present application, and step 15 may be implemented by steps 151 to 153 shown in fig. 12, which will be described in conjunction with the steps.
Step 151, calling the trained feature extraction models respectively corresponding to the feature groups based on the feature groups in the second service scenario to obtain updated second sample feature characterizations respectively corresponding to the feature groups.
In some embodiments, referring to fig. 15, fig. 15 is a schematic diagram of a migration learning model prediction provided by an embodiment of the present application.
For example, referring to fig. 15, based on the feature groups F_1^B, F_2^B, F_3^B in the second service scenario, the trained feature extraction models R_1, R_2, R_3 respectively corresponding to F_1^B, F_2^B, F_3^B are called to obtain the updated second sample feature characterizations r_1^B, r_2^B, r_3^B respectively corresponding to the feature groups.
Step 152, calling the trained feature extraction models corresponding to the plurality of cross feature groups respectively based on the plurality of cross feature groups in the second service scenario, and obtaining updated second sample feature representations corresponding to the plurality of cross feature groups respectively.
For example, referring to FIG. 15, based on the cross feature groups F_1^B F_2^B, F_1^B F_3^B, F_2^B F_3^B in the second service scenario, the trained feature extraction models R_4, R_5, R_6 respectively corresponding to F_1^B F_2^B, F_1^B F_3^B, F_2^B F_3^B are called to obtain the updated second sample feature characterizations r_4^B, r_5^B, r_6^B respectively corresponding to the cross feature groups.
Step 153, calling the trained class prediction model based on the updated second sample feature characterizations and the monomer features in the second service scenario to obtain the class label of the second service scenario output by the trained class prediction model.
For example, referring to FIG. 15, the trained category prediction model is called based on the updated second sample feature characterizations r_1^B, r_2^B, r_3^B, r_4^B, r_5^B, r_6^B and the monomer features f_1^B, f_2^B, f_3^B in the second service scenario, to obtain the category label of the second service scenario output by the trained category prediction model.
Referring to fig. 13, fig. 13 is a schematic flowchart of an artificial intelligence based classification method provided in an embodiment of the present application, and step 15 may be implemented by steps 154 to 155 shown in fig. 13, which will be described in detail with reference to the steps.
Step 154, calling a feature extraction model based on the plurality of feature groups and the plurality of cross feature groups in the second service scenario to obtain updated second sample feature characterizations corresponding to the plurality of feature groups and the plurality of cross feature groups, respectively.
For example, based on the feature groups F_1^B, F_2^B, F_3^B and the cross feature groups F_1^B F_2^B, F_1^B F_3^B, F_2^B F_3^B in the second service scenario, the feature extraction model R is called to obtain the updated second sample feature characterizations r_1^B, r_2^B, r_3^B, r_4^B, r_5^B, r_6^B respectively corresponding to the feature groups and the cross feature groups.
Step 155, calling the trained class prediction model based on the updated second sample feature characterizations and the monomer features in the second service scenario to obtain the class label of the second service scenario.
For example, the trained class prediction model is called based on the updated second sample feature characterizations r_1^B, r_2^B, r_3^B, r_4^B, r_5^B, r_6^B and the monomer features f_1^B, f_2^B, f_3^B in the second service scenario to obtain the class label of the second service scenario.
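Prediction for the second service scenario (steps 151 to 155) then reduces to a forward pass through the frozen, trained models; a hedged sketch reusing the classes from the earlier sketches:

    import torch

    @torch.no_grad()
    def predict_scenario_b(predictor, extractors, monomer_b, groups_b):
        """Class labels for scenario-B samples using the trained models."""
        reprs = [R(g) for R, g in zip(extractors, groups_b)]  # r_1^B..r_6^B
        probs = predictor(monomer_b, reprs)
        return (probs > 0.5).long()  # assumed binary decision threshold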
Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
For example, consider two similar but different service scenarios: a first service scenario and a second service scenario (such as a credit scoring scenario and a fraud scoring scenario).
There are a large number of samples in a first business scenario (e.g., a credit scoring business scenario), each sample corresponding to a category label and a domain label. There is a certain number of samples in the second service scenario (e.g., fraud scoring service scenario), each sample corresponding to a domain tag, but no category tag.
In two practical application scenes of credit scoring and fraud scoring, the feature extraction model and the category prediction model can be trained through sample features in the credit scoring scene and the fraud scoring scene to obtain the trained feature extraction model and the trained category prediction model. And predicting the category label of the fraud scoring scene through the trained feature extraction model and the category prediction model.
The training of the feature extraction model and the category prediction model uses the sample features in two service scenes, so that the trained feature extraction model and the trained category prediction model have better transfer learning capability.
The first service scenario (e.g., the credit scoring scenario) and the second service scenario (e.g., the fraud scoring scenario) have the same feature space (i.e., the same features). In both scenarios, the same subset of features is selected according to expert knowledge and grouped in the same way. The features in each group describe some attribute of the sample, so each feature group can be seen as a relatively abstract description of one aspect of the sample. For example, features such as dating contacts, microblog contacts, and WeChat contacts may be grouped together; this feature group describes the social attributes of the sample. The higher-order features learned from each feature group are therefore interpretable.
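For instance, an expert-knowledge grouping of the kind just described could be written down as a simple mapping; all column names below are hypothetical and only illustrate the structure:

    # Hypothetical expert grouping; the column names are illustrative only.
    feature_groups = {
        "social": ["dating_app_contacts", "weibo_contacts", "wechat_contacts"],
        "income": ["monthly_salary", "annual_bonus"],
        "credit": ["card_count", "credit_utilization"],
    }
    monomer_features = ["age", "region", "tenure_months"]  # left ungrouped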
However, expert-knowledge-based feature grouping ignores the interactions between features belonging to different feature groups. The crossing of features belonging to different feature groups is therefore achieved by pairwise crossing of the feature groups.
The N feature groups in the first service scenario are crossed pairwise to obtain Z feature group crosses, where Z = N(N-1)/2. Each feature group cross is itself a feature group in nature. The Z feature group crosses are merged with the N feature groups of the first service scenario to obtain an expanded feature group set.
The N feature groups in the second service scenario are crossed pairwise in the same way to obtain Z feature group crosses, and the Z feature group crosses are merged with the N feature groups of the second service scenario to obtain an expanded feature group set.
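Since pairwise crossing of N groups yields Z = N(N-1)/2 crosses, the expansion can be sketched as follows, with simple column concatenation standing in for the unspecified cross operation function:

    from itertools import combinations

    def expand_with_crosses(groups: dict) -> dict:
        """Merge the N original feature groups with their Z pairwise crosses."""
        expanded = dict(groups)
        for (name_a, cols_a), (name_b, cols_b) in combinations(groups.items(), 2):
            # Assumed cross operation: concatenate the two groups' columns.
            expanded[f"{name_a}x{name_b}"] = cols_a + cols_b
        return expanded

    # With N = 3 original groups this yields Z = 3 crosses, 6 groups in total.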
For each pair of corresponding feature groups in the feature group set of the credit scoring scenario and the feature group set of the fraud scoring scenario:
a feature extraction model is trained to perform feature learning on the feature groups; the inputs of the feature extraction model are the feature groups, and the corresponding outputs are the learned feature characterizations;
a domain distinguishing model is trained so as to draw the feature distributions of the two feature groups closer together.
The feature extraction model is a deep neural network model used for characterization learning on the features, i.e., learning transferable knowledge, so as to achieve domain adaptation.
The domain distinguishing model is an arbitrary differentiable classification model used for judging whether the characterization learned by the feature extraction model is from a first service scene (for example, a credit scoring scene) or a second service scene (for example, a fraud scoring scene).
A class prediction model and all N + Z feature extraction models are trained based on all samples and class labels of the first business scenario. The class prediction model is a differentiable machine learning model with high interpretability, such as a linear regression model or a logistic regression model. Based on its parameters and model structure, the class prediction model can explain the contribution of each feature group to the prediction result, so the reason for the prediction result can be explained.
The number of training rounds is preset. In each round of model training, based on the samples of the first and second service scenarios and the corresponding class labels and domain labels, the models can be trained with a gradient descent algorithm through the following steps.
Each feature group of an input first-service-scenario sample and each feature group of an input second-service-scenario sample are input into the corresponding feature extraction models, respectively, to obtain N + Z feature characterizations r_i^A and N + Z feature characterizations r_i^B.
Each feature characterization r_i^A is input into the corresponding domain distinguishing model to obtain a predicted domain label. A domain distinguishing loss value is calculated by the domain distinguishing loss function L_d,i based on the predicted domain label and the true domain label of r_i^A. The domain distinguishing model is updated based on the domain distinguishing loss value, and the feature extraction model is trained in a domain-adversarial learning manner based on the domain distinguishing loss value.
Each feature characterization r_i^B is input into the corresponding domain distinguishing model to obtain a predicted domain label. A domain distinguishing loss value is calculated by the domain distinguishing loss function L_d,i based on the predicted domain label and the true domain label of r_i^B. The domain distinguishing model is updated based on the domain distinguishing loss value, and the feature extraction model is trained in a domain-adversarial learning manner based on the domain distinguishing loss value.
All M monomer features of an input first-service-scenario sample, together with the N + Z feature characterizations r_i^A obtained by the feature extraction models, are input into the category prediction model to obtain a prediction category label. A category prediction loss value L_cls^A is calculated by the category prediction loss function (given in the source only as a formula image) based on the prediction category label and the real category label. The category prediction model and all N + Z feature extraction models are updated based on the category prediction loss value.
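That category prediction loss function is plausibly the usual binary cross-entropy over the labeled first-scenario samples; a hedged reconstruction in LaTeX:

    L_{cls}^{A} = -\frac{1}{n_A} \sum_{j=1}^{n_A}
        \bigl[ y_j \log \hat{y}_j + (1 - y_j) \log(1 - \hat{y}_j) \bigr]

where \hat{y}_j is the category prediction model's output for the j-th first-scenario sample and y_j is its real category label.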
Updating the category prediction model and all N + Z feature extraction models amounts to training them, so that the trained category prediction model and the N + Z trained feature extraction models are obtained.
Each feature group of a sample to be predicted in the second service scenario is input into the corresponding one of the N + Z trained feature extraction models to obtain N + Z feature characterizations r_i^B.
The M monomer features of the second-service-scenario sample and the N + Z feature characterizations r_i^B are input into the trained category prediction model to obtain the prediction category label of the second service scenario.
The feature extraction models and the category prediction model are trained on the monomer features and feature groups of both the first-service-scenario samples and the second-service-scenario samples, so the trained models adapt well to the two different scenarios. When the class label of the second service scenario is predicted with the trained feature extraction models and the trained class prediction model, the class label can therefore be predicted accurately.
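Putting the pieces together, one preset-round training schedule consistent with the above might look like the following sketch; the opts dictionary and the in-memory data layout are assumptions, and it reuses adversarial_step and class_step from the earlier sketches.

    def train(rounds, extractors, discriminators, predictor, opts, data_a, data_b):
        """Alternate domain-adversarial updates and class-prediction updates."""
        for _ in range(rounds):
            for i, (R_i, D_i) in enumerate(zip(extractors, discriminators)):
                adversarial_step(R_i, D_i, opts["adv"][i],
                                 data_a["groups"][i], data_b["groups"][i])
            class_step(predictor, extractors, opts["cls"],
                       data_a["monomer"], data_a["groups"], data_a["labels"])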
Continuing with FIG. 1, an exemplary structure implemented as software modules of the artificial intelligence based classification device 455 provided by the embodiments of the present application is described. In some embodiments, as shown in FIG. 1, the software modules of the artificial intelligence based classification device 455 stored in the memory 450 may include: a feature grouping module 4551, a cross module 4552, a first training module 4553, a second training module 4554, and a prediction module 4555.
The feature grouping module 4551 is configured to perform grouping processing on the first sample set of the first service scenario and the second sample set of the second service scenario according to the sample attribute features, respectively, to obtain a plurality of feature groups and individual features in each service scenario;
the cross module 4552 is configured to perform a cross operation on the multiple feature groups in each service scenario to obtain multiple cross feature groups;
a first training module 4553, configured to train at least one feature extraction model according to a plurality of feature sets and a plurality of cross feature sets;
the second training module 4554 is configured to characterize a training category prediction model according to the individual features in the first service scenario and the first sample feature output by each feature extraction model, and continue to train at least one feature extraction model;
and the predicting module 4555 is configured to predict the class label of the to-be-detected sample of the second service scenario according to the trained at least one feature extraction model and the trained class prediction model.
In some embodiments, the feature grouping module 4551 is further configured to perform the following for each sample in the first set of samples of the first traffic scenario: grouping a plurality of monomer features with the same sample attribute in each sample to obtain a plurality of first sample feature groups, and taking the monomer features which are not used for grouping in each sample as the first sample monomer features; performing the following for each sample in a second set of samples of a second traffic scenario: and grouping a plurality of monomer features of each sample with the same sample attribute to obtain a plurality of second sample feature groups, and taking the monomer features which are not used for grouping in each sample as the monomer features of the second sample.
In some embodiments, the cross module 4552 is further configured to perform a cross operation on any two of the plurality of first sample feature groups according to a cross operation function to obtain a plurality of first sample cross feature groups; and perform a cross operation on any two of the second sample feature groups according to the cross operation function to obtain a plurality of second sample cross feature groups.
In some embodiments, the first training module 4553 is further configured to determine, according to the feature extraction models respectively corresponding to the plurality of first sample feature groups, first sample feature characterizations respectively corresponding to the plurality of first sample feature groups; determine second sample feature characterizations respectively corresponding to the plurality of second sample feature groups according to the feature extraction models respectively corresponding to the plurality of second sample feature groups, wherein the feature extraction models corresponding to a first sample feature group and a second sample feature group with the same sample attributes are the same; determine first sample feature characterizations respectively corresponding to the plurality of first sample cross feature groups according to the feature extraction models respectively corresponding to the plurality of first sample cross feature groups; determine second sample feature characterizations respectively corresponding to the plurality of second sample cross feature groups according to the feature extraction models respectively corresponding to the plurality of second sample cross feature groups, wherein the feature extraction models corresponding to a first sample cross feature group and a second sample cross feature group with the same sample attributes are the same; determine a domain distinguishing loss value of the domain distinguishing model according to the domain distinguishing model corresponding to each first sample feature characterization and each second sample feature characterization with the same sample attribute, and update the parameters of the feature extraction model corresponding to the first sample feature characterization and the second sample feature characterization with the same sample attribute according to the domain distinguishing loss value of the domain distinguishing model; and obtain trained feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups based on the updated parameters of each feature extraction model.
In some embodiments, the second training module 4554 is further configured to invoke a category prediction model based on the monomer feature and the first sample feature characterization in the first service scenario, so as to obtain a first sample prediction category label output by the category prediction model; determining a category prediction loss value according to the first sample prediction category label and the first sample real category label; updating parameters of the category prediction model according to the category prediction loss value, and acquiring a trained category prediction model based on the updated parameters; continuing to train at least one feature extraction model, comprising: and continuously updating parameters of the feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups according to the category prediction loss values, and acquiring continuously trained feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups based on the updated parameters.
In some embodiments, the first training module 4553 is further configured to determine first sample feature characterizations corresponding to the plurality of feature groups and the plurality of cross feature groups, respectively, according to the feature extraction model, and determine second sample feature characterizations corresponding to the plurality of feature groups and the plurality of cross feature groups, respectively, according to the feature extraction model; determine corresponding domain distinguishing loss values according to each first sample feature characterization, each second sample feature characterization, and the respectively corresponding domain distinguishing models; and update the parameters of the feature extraction model according to each domain distinguishing loss value, and obtain the trained feature extraction model based on the updated parameters.
In some embodiments, the first training module 4553 is further configured to call the domain distinguishing model corresponding to each first sample feature characterization based on each first sample feature characterization, obtain the first sample prediction domain label of each first sample feature characterization output by the domain distinguishing model, and determine the corresponding domain distinguishing loss value according to the first sample prediction domain label and the first sample real domain label; and call the domain distinguishing model corresponding to each second sample feature characterization based on each second sample feature characterization, obtain the second sample prediction domain label of each second sample feature characterization output by the domain distinguishing model, and determine the corresponding domain distinguishing loss value according to the second sample prediction domain label and the second sample real domain label, wherein a first sample feature characterization and a second sample feature characterization having the same sample attribute correspond to the same domain distinguishing model.
In some embodiments, the second training module 4554 is further configured to invoke a category prediction model based on the monomer feature and the first sample feature characterization in the first service scenario, obtain a first sample prediction category label output by the category prediction model, and determine a category prediction loss value according to the first sample prediction category label and the first sample true category label; updating parameters of the category prediction model according to the category prediction loss value, and acquiring a trained category prediction model based on the updated parameters; continuing to train at least one feature extraction model, comprising: and continuously updating parameters of the feature extraction model according to the class prediction loss value, and obtaining the trained feature extraction model based on the updated parameters.
In some embodiments, the predicting module 4555 is further configured to call, based on a plurality of feature groups in a second service scenario, trained feature extraction models corresponding to the plurality of feature groups, and obtain updated second sample feature characterizations, output by the trained feature extraction models, corresponding to the plurality of feature groups, respectively; calling the trained feature extraction models corresponding to the plurality of cross feature groups respectively based on the plurality of cross feature groups in the second service scene to obtain updated second sample feature representations output by the trained feature extraction models and corresponding to the plurality of cross feature groups respectively; and calling the trained class prediction model based on the updated second sample characteristic feature and the monomer feature in the second service scene to obtain a class label of the second service scene output by the trained class prediction model.
In some embodiments, the predicting module 4555 is further configured to invoke a feature extraction model based on the plurality of feature groups and the plurality of cross feature groups in the second service scenario, and obtain updated second sample feature characterizations, which are output by the feature extraction model and respectively correspond to the plurality of feature groups and the plurality of cross feature groups; and calling the trained category prediction model based on the updated second sample characteristic feature and the monomer feature under the second service scene to obtain a category label of the second service scene output by the trained category prediction model.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the artificial intelligence based classification method described in the embodiment of the present application.
In some embodiments, the computer-readable storage medium may be a memory such as an FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disc, or CD-ROM, or may be any of various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, for example, in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
In summary, the embodiment of the present application has the following beneficial effects:
(1) The first sample set of the first service scenario and the second sample set of the second service scenario are each grouped according to sample attribute features to obtain multiple feature groups and monomer features in each service scenario. Each feature group is a description of a specific attribute of the sample, so the high-order features subsequently learned from each feature group are interpretable. The accuracy of the grouping processing can be further improved by performing the cross operation on the feature groups to obtain multiple cross feature groups, so that the high-order features learned from each feature group and each cross feature group have higher interpretability. At least one feature extraction model is trained according to the feature groups and the cross feature groups; a class prediction model is then trained according to the monomer features in the first service scenario and the first sample feature characterizations output by each feature extraction model, while the at least one feature extraction model continues to be trained. Continuing to train the feature extraction model while training the class prediction model improves training efficiency and effectively improves the performance of the feature extraction model. The class label of the sample to be tested of the second service scenario is predicted according to the trained at least one feature extraction model and the trained class prediction model, so the class label of the sample to be tested can be predicted accurately. Prediction accuracy can thus be improved while the transfer learning efficiency is improved.
(2) The knowledge of each feature group can be migrated through that feature group's feature extraction model.
(3) By interpreting the model parameters or the model structure of the class prediction model (model G), the contribution of each feature group to the prediction result can be explained, so the reason for the prediction result can be explained, achieving both model migration and model interpretability.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (14)

1. A classification method based on artificial intelligence is characterized by comprising the following steps:
grouping a first sample set of a first service scene and a second sample set of a second service scene according to sample attribute characteristics respectively to obtain a plurality of characteristic groups and monomer characteristics in each service scene;
performing cross operation on the plurality of feature groups under each service scene to obtain a plurality of cross feature groups;
training at least one feature extraction model from the plurality of feature sets and the plurality of cross feature sets;
training a class prediction model according to the monomer features in the first service scenario and the first sample feature characterizations output by each feature extraction model, and continuing to train the at least one feature extraction model;
and predicting the class label of the sample to be tested of the second service scene according to the trained at least one feature extraction model and the trained class prediction model.
2. The method of claim 1, wherein the grouping the first sample set of the first service scenario and the second sample set of the second service scenario according to the sample attribute features to obtain a plurality of feature groups and individual features in each service scenario comprises:
performing the following for each sample of the first set of samples of the first traffic scenario: grouping a plurality of monomer features with the same sample attribute in each sample to obtain a plurality of first sample feature groups, and taking the monomer features which are not used for grouping in each sample as first sample monomer features;
performing the following for each sample of a second set of samples of the second traffic scenario: and grouping the plurality of individual features of each sample with the same sample attribute to obtain a plurality of second sample feature groups, and taking the individual features of each sample which are not used for grouping as the individual features of the second sample.
3. The method of claim 2, wherein the interleaving the plurality of feature groups in each of the service scenarios to obtain a plurality of interleaved feature groups comprises:
performing cross operation on any two of the first sample feature groups according to a cross operation function to obtain a plurality of first sample cross feature groups;
and performing cross operation on any two of the second sample feature groups according to the cross operation function to obtain a plurality of second sample cross feature groups.
4. The method of claim 1,
the feature groups and the cross feature groups respectively correspond to one feature extraction model;
training at least one feature extraction model from the plurality of feature sets and the plurality of cross-feature sets, comprising:
determining first sample feature characterizations corresponding to the first sample feature groups respectively according to the feature extraction models corresponding to the first sample feature groups respectively;
determining second sample feature characterizations respectively corresponding to the second sample feature groups according to feature extraction models respectively corresponding to the second sample feature groups, wherein the feature extraction models corresponding to the first sample feature group and the second sample feature group with the same sample attributes are the same;
determining first sample feature characterizations corresponding to the first sample cross feature groups respectively according to feature extraction models corresponding to the first sample cross feature groups respectively;
determining second sample feature characterizations corresponding to the second sample cross feature groups respectively according to feature extraction models corresponding to the second sample cross feature groups respectively, wherein the feature extraction models corresponding to the first sample cross feature group and the second sample cross feature group with the same sample attributes are the same;
determining a domain distinguishing loss value of the domain distinguishing model according to a domain distinguishing model corresponding to the first sample feature characterization and the second sample feature characterization which have the same sample attribute, and updating parameters of a feature extraction model corresponding to the first sample feature characterization and the second sample feature characterization which have the same sample attribute according to the domain distinguishing loss value of the domain distinguishing model;
and obtaining trained feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups based on the updated parameters of each feature extraction model.
5. The method of claim 1,
the feature groups and the cross feature groups respectively correspond to one feature extraction model;
the characterizing a training class prediction model according to the individual features and the first sample features output by each feature extraction model in the first service scenario includes:
calling the category prediction model based on the monomer characteristic and the first sample characteristic representation in the first service scene to obtain a first sample prediction category label output by the category prediction model;
determining a class prediction loss value according to the first sample prediction class label and the first sample real class label;
updating parameters of the class prediction model according to the class prediction loss value, and acquiring the trained class prediction model based on the updated parameters;
the continuing to train the at least one feature extraction model comprises:
and continuously updating parameters of the feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups according to the category prediction loss values, and acquiring continuously trained feature extraction models respectively corresponding to the plurality of feature groups and the plurality of cross feature groups based on the updated parameters.
6. The method of claim 1,
the plurality of feature groups and the plurality of cross feature groups correspond to the same feature extraction model;
training at least one feature extraction model from the plurality of feature sets and the plurality of cross-feature sets, comprising:
determining first sample feature characterizations respectively corresponding to the plurality of feature groups and the plurality of cross feature groups according to the feature extraction model, and determining second sample feature characterizations respectively corresponding to the plurality of feature groups and the plurality of cross feature groups according to the feature extraction model;
determining corresponding domain distinguishing loss values according to each first sample feature characterization, each second sample feature characterization, and the respectively corresponding domain distinguishing models;
and updating parameters of the feature extraction model according to each domain distinguishing loss value, and obtaining the trained feature extraction model based on the updated parameters.
7. The method of claim 6, wherein determining respective corresponding domain discrimination loss values from each of the first sample characterization and each of the second sample characterization and respective corresponding domain discrimination models comprises:
calling a domain distinguishing model corresponding to each first sample feature characterization based on each first sample feature characterization to obtain a first sample prediction domain label of each first sample feature characterization, and determining a corresponding domain distinguishing loss value according to the first sample prediction domain label and a first sample real domain label;
calling a domain distinguishing model corresponding to each second sample feature characterization based on each second sample feature characterization to obtain a second sample prediction domain label of each second sample feature characterization, and determining a corresponding domain distinguishing loss value according to the second sample prediction domain label and a second sample real domain label, wherein a first sample feature characterization and a second sample feature characterization having the same sample attribute correspond to the same domain distinguishing model.
8. The method of claim 1,
the plurality of feature groups and the plurality of cross feature groups correspond to the same feature extraction model;
the characterizing a training class prediction model according to the individual features and the first sample features output by each feature extraction model in the first service scenario includes:
calling the category prediction model based on the monomer characteristic and the first sample characteristic characterization under the first service scene to obtain a first sample prediction category label output by the category prediction model, and determining a category prediction loss value according to the first sample prediction category label and a first sample real category label;
updating parameters of the category prediction model according to the category prediction loss value, and acquiring the trained category prediction model based on the updated parameters;
the continuing to train the at least one feature extraction model comprises:
and continuously updating the parameters of the feature extraction model according to the class prediction loss value, and acquiring the trained feature extraction model based on the updated parameters.
9. The method of claim 1,
the feature groups and the cross feature groups respectively correspond to one feature extraction model;
the predicting the class label of the sample to be tested of the second service scene according to the trained at least one feature extraction model and the trained class prediction model comprises:
calling trained feature extraction models corresponding to the feature groups respectively based on the feature groups under the second service scene to obtain updated second sample feature representations corresponding to the feature groups respectively;
calling trained feature extraction models corresponding to the plurality of cross feature groups respectively based on the plurality of cross feature groups in the second service scene to obtain updated second sample feature representations corresponding to the plurality of cross feature groups respectively;
calling the trained class prediction model based on the updated second sample feature characterizations and the monomer features in the second service scenario to obtain the class label of the second service scenario output by the trained class prediction model.
10. The method of claim 8,
the plurality of feature groups and the plurality of cross feature groups correspond to the same feature extraction model;
the predicting the class label of the sample to be tested of the second service scene according to the trained at least one feature extraction model and the trained class prediction model comprises:
calling the feature extraction model based on the plurality of feature groups and the plurality of cross feature groups in the second service scene to obtain updated second sample feature representations respectively corresponding to the plurality of feature groups and the plurality of cross feature groups;
and calling the trained category prediction model based on the updated second sample feature characterizations and the monomer features in the second service scenario to obtain the category label of the second service scenario.
11. A classification device based on artificial intelligence, comprising:
the characteristic grouping module is used for grouping a first sample set of a first service scene and a second sample set of a second service scene according to the attribute characteristics of the samples respectively to obtain a plurality of characteristic groups and monomer characteristics in each service scene;
the cross module is used for performing cross operation on the plurality of feature groups under each service scene to obtain a plurality of cross feature groups;
a first training module for training at least one feature extraction model based on the plurality of feature sets and the plurality of cross feature sets;
the second training module is configured to train a category prediction model according to the monomer features in the first service scenario and the first sample feature characterizations output by each feature extraction model, and to continue training the at least one feature extraction model;
and the prediction module is used for predicting the class label of the sample to be tested of the second service scene according to the trained at least one feature extraction model and the trained class prediction model.
12. An electronic device, characterized in that the electronic device comprises:
a memory for storing executable instructions;
a processor for implementing the artificial intelligence based classification method of any one of claims 1 to 10 when executing executable instructions stored in the memory.
13. A computer-readable storage medium having stored thereon executable instructions for, when executed by a processor, implementing the artificial intelligence based classification method of any one of claims 1 to 10.
14. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the artificial intelligence based classification method of any one of claims 1 to 10.
CN202110779021.8A 2021-07-09 2021-07-09 Artificial intelligence based classification method, apparatus, device and storage medium Active CN113378993B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110779021.8A CN113378993B (en) 2021-07-09 2021-07-09 Artificial intelligence based classification method, apparatus, device and storage medium


Publications (2)

Publication Number Publication Date
CN113378993A true CN113378993A (en) 2021-09-10
CN113378993B CN113378993B (en) 2023-01-10

Family

ID=77581552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110779021.8A Active CN113378993B (en) 2021-07-09 2021-07-09 Artificial intelligence based classification method, apparatus, device and storage medium

Country Status (1)

Country Link
CN (1) CN113378993B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111768008A (en) * 2020-06-30 2020-10-13 平安科技(深圳)有限公司 Federal learning method, device, equipment and storage medium
CN112396189A (en) * 2020-11-27 2021-02-23 中国银联股份有限公司 Method and device for multi-party construction of federal learning model
CN112418441A (en) * 2020-12-02 2021-02-26 深圳前海微众银行股份有限公司 Data processing method, device and equipment based on transfer learning and storage medium
CN112418443A (en) * 2020-12-02 2021-02-26 深圳前海微众银行股份有限公司 Data processing method, device and equipment based on transfer learning and storage medium
CN112418442A (en) * 2020-12-02 2021-02-26 深圳前海微众银行股份有限公司 Data processing method, device, equipment and storage medium for federal transfer learning
CN112836822A (en) * 2021-02-26 2021-05-25 浙江工业大学 Federal learning strategy optimization method and device based on width learning


Also Published As

Publication number Publication date
CN113378993B (en) 2023-01-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant