CN110781970B - Classifier generation method, device, equipment and storage medium


Publication number: CN110781970B
Authority: CN (China)
Prior art keywords: sample, domain, target domain, source domain, target
Legal status: Active
Application number: CN201911046719.8A
Other languages: Chinese (zh)
Other versions: CN110781970A
Inventors: 刘紫薇, 宋辉, 吕培立, 董井然, 陈守志
Current assignee: Tencent Technology Shenzhen Co Ltd
Original assignee: Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201911046719.8A
Publication of CN110781970A (application)
Application granted
Publication of CN110781970B (grant)


Classifications

    • G - Physics
    • G06 - Computing; Calculating or Counting
    • G06F - Electric Digital Data Processing
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 - Complex mathematical operations
    • G06F17/16 - Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization


Abstract

The invention provides a classifier generation method, a classifier generation device, an electronic device, and a storage medium. The method includes: constructing the maximum mean difference between a source domain and a target domain based on samples of the source domain and samples of the target domain; separating the maximum mean difference of the source domain and the target domain to obtain a separated mean difference; decomposing an optimization target according to the separated mean difference to obtain a transformation matrix for the source domain; predicting the samples of the target domain based on the samples of the source domain and their corresponding labels, the samples of the target domain, and the transformation matrix for the source domain, to obtain labels corresponding to the samples of the target domain; and generating a classifier corresponding to the target domain based on the samples of the source domain and the labels corresponding to the samples of the target domain. The invention separates the maximum mean difference, which accounts for the large-scale computation in transfer learning, thereby reducing the computational complexity of transfer learning.

Description

Classifier generation method, device, equipment and storage medium
Technical Field
The present invention relates to artificial intelligence technologies, and in particular, to a method for generating a classifier, a method and apparatus for classifying data to be classified, an electronic device, and a storage medium.
Background
Artificial Intelligence (AI) is a comprehensive technology of computer science; by studying the design principles and implementation methods of various intelligent machines, it enables machines to perceive, reason, and make decisions. Artificial intelligence is a comprehensive discipline covering a wide range of fields, such as natural language processing and machine learning/deep learning. As the technology develops, artificial intelligence will be applied in more fields and take on increasingly important value.
Transfer learning is one of the important applications in the field of artificial intelligence and is widely used in dialogue systems, face recognition systems, intelligent hardware, and the like; that is, transfer learning is a basic component of these complex systems.
However, the large-scale computations involved in transfer learning greatly increase the computational complexity of the knowledge transfer process, so that normal transfer learning, and subsequent operations based on it, cannot be performed.
Disclosure of Invention
The embodiments of the invention provide a classifier generation method and apparatus, an electronic device, and a storage medium, which can separate the maximum mean difference that accounts for the large-scale computation in transfer learning, thereby reducing the computational complexity of transfer learning.
The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides a generation method of a classifier, which comprises the following steps:
constructing the maximum mean difference between a source domain and a target domain based on a sample of the source domain and a sample of the target domain;
separating the maximum mean difference of the source domain and the target domain to obtain a separated mean difference;
Decomposing an optimization target according to the separated mean value difference to obtain a transformation matrix aiming at the source domain;
Predicting the sample of the target domain based on the sample of the source domain, the label corresponding to the sample, the sample of the target domain and the transformation matrix aiming at the source domain to obtain the label corresponding to the sample of the target domain;
And generating a classifier corresponding to the target domain based on the sample of the source domain and the label corresponding to the sample of the target domain.
The embodiment of the invention provides a classification method of data to be classified, which is applied to a classifier of a corresponding target domain in the embodiment of the invention;
The method comprises the following steps:
determining data to be classified in the target domain;
And classifying the data to be classified through the classifier of the corresponding target domain to obtain the label corresponding to the data to be classified.
The embodiment of the invention provides a generation device of a classifier, which comprises the following steps:
the construction module is used for constructing the maximum mean difference between the source domain and the target domain based on the sample of the source domain and the sample of the target domain;
The separation module is used for separating the maximum mean difference between the source domain and the target domain;
the processing module is used for decomposing the optimization target according to the mean value difference obtained by separation to obtain a transformation matrix aiming at the source domain;
the prediction module is used for predicting the sample of the target domain based on the sample of the source domain, the label corresponding to the sample, the sample of the target domain and the transformation matrix aiming at the source domain to obtain the label corresponding to the sample of the target domain;
and the generation module is used for generating a classifier corresponding to the target domain based on the sample of the source domain and the label corresponding to the sample of the target domain.
In the above technical solution, the construction module is further configured to map the source domain sample and the target domain sample through a mapping space, so as to obtain a mapped source domain sample and a mapped target domain sample;
And respectively carrying out mean value processing on the mapped source domain samples and the mapped target domain samples, correspondingly obtaining the mean value of the source domain samples and the mean value of the target domain samples, and determining the difference between the mean value of the source domain samples and the mean value of the target domain samples as the maximum mean value difference between the source domain and the target domain.
In the above technical solution, the separation module is further configured to separate the maximum mean difference in a matrix form, so as to obtain a mean difference after separation in a vector form.
In the above technical solution, the maximum mean difference is an N×N matrix, where N is the sum of the number of samples in the source domain and the number of samples in the target domain, and N is a natural number greater than 2;
the separation module is further configured to perform matrix decomposition on the maximum mean difference to obtain a Cartesian product of two separated mean differences;
wherein the separated mean difference is an N×1 vector.
In the above technical solution, the processing module is further configured to combine the sample of the source domain and the sample of the target domain to obtain a feature matrix;
And carrying out generalized eigenvalue decomposition on an optimization target according to the eigenvalue matrix and the separated mean value difference to obtain a transformation matrix aiming at the source domain.
In the above technical solution, the processing module is further configured to multiply the feature matrix and the separated mean difference to obtain an intermediate matrix;
and carrying out generalized eigenvalue decomposition on the optimization target based on the intermediate matrix to obtain a transformation matrix aiming at the source domain.
In the above technical solution, the construction module is further configured to construct a maximum mean difference between the C-th class sample of the source domain and the C-th class sample of the target domain based on the C-th class sample of the source domain and the C-th class sample of the target domain, where C is less than or equal to C, C is the total number of sample types of the source domain, and C are natural numbers;
The separation module is further used for separating the maximum mean value difference of the class c samples of the source domain and the target domain to obtain the mean value difference of the class c samples after separation;
The processing module is also used for summing the mean differences of the C separated samples to obtain the sum of the mean differences after separation;
And decomposing the optimization target according to the sum to obtain a transformation matrix aiming at the source domain.
In the above technical solution, the intermediate matrix is U_c = X·e_c, where X is the feature matrix and e_c is the separated mean difference of the class-c samples;
the processing module is further configured to determine a representation of the optimization target:

min_A Σ_{c=1}^{C} tr(A^T X M_c X^T A) + λ‖A‖_F²,  subject to  A^T X H X^T A = I,

where C is the total number of sample types in the source domain, tr(·) is the trace of a matrix, A is the transformation matrix, M_c is the maximum mean difference of the class-c samples, λ is the regularization term coefficient, H is the centering matrix, and I is the identity matrix;
to replace each product X M_c X^T in the optimization target with U_c·U_c^T; and
to perform, based on U_c·U_c^T, generalized eigenvalue decomposition on the optimization target according to the following formula, obtaining the transformation matrix A for the source domain:

(Σ_{c=1}^{C} U_c U_c^T + λI) A = X H X^T A Φ,

where Φ is the Lagrange multiplier.
In the above technical solution, the prediction module is further configured to obtain a transformed source domain sample according to the source domain sample and the transformation matrix for the source domain;
constructing a classification model of the classifier according to the transformed source domain sample and the label corresponding to the source domain sample;
And predicting the sample of the target domain through the classification model to obtain a label corresponding to the sample of the target domain.
In the above technical solution, the generating module is further configured to train the classifier based on the sample of the target domain and the label corresponding to the sample of the target domain, to obtain the classifier corresponding to the target domain.
In the above technical solution, the sample of the source domain is a first user behavior feature in a first financial scene, and the sample of the target domain is a second user behavior feature in a second financial scene;
The construction module is further configured to construct a maximum mean difference between the first financial scenario and the second financial scenario based on the first user behavior feature in the first financial scenario and the second user behavior feature in the second financial scenario;
The prediction module is further configured to perform default prediction on the second user of the second financial scene based on the first user behavior feature, the default label corresponding to the first user, the second user behavior feature, and the transformation matrix for the first financial scene, so as to obtain the default label corresponding to the second user of the second financial scene;
The generation module is further configured to generate a classifier corresponding to the second financial scenario based on the second user behavior feature of the second financial scenario and the default label corresponding to the second user.
The embodiment of the invention provides a device for classifying data to be classified, which comprises:
the determining module is used for determining data to be classified in the target domain;
And the classification module is used for classifying the data to be classified through a classifier corresponding to the target domain to obtain a label corresponding to the data to be classified.
The embodiment of the invention provides a generation device of a classifier, which comprises the following components:
a memory for storing executable instructions;
And the processor is used for realizing the generation method of the classifier provided by the embodiment of the invention when executing the executable instructions stored in the memory.
The embodiment of the invention provides a device for classifying data to be classified, which comprises:
a memory for storing executable instructions;
And the processor is used for realizing the classification method of the data to be classified provided by the embodiment of the invention when executing the executable instructions stored in the memory.
The embodiment of the invention provides a storage medium which stores executable instructions for realizing the generation method of the classifier provided by the embodiment of the invention or realizing the classification method of the data to be classified provided by the embodiment of the invention when being executed by a processor.
The embodiment of the invention has the following beneficial effects:
Separating the maximum mean difference, which accounts for the large-scale computation, reduces the computational complexity, so that the optimization target can be decomposed to obtain a transformation matrix for subsequent classification operations. Separating the maximum mean difference avoids the continued multiplication of large matrices (the maximum mean difference matrices) and increases the scale of data that can be computed in a given computing environment, so that normal transfer learning can be performed, the performance of transfer learning is significantly improved, and the method is suitable for various classification application scenarios.
Drawings
Fig. 1 is an application scenario schematic diagram of a classifier generation system 10 provided in an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a classifier generating device 500 according to an embodiment of the present invention;
fig. 3 to 7 are schematic flow diagrams of a method for generating a classifier according to an embodiment of the present invention;
Fig. 8 is an application scenario schematic diagram of the classification system 20 for data to be classified according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a classification device 600 for data to be classified according to an embodiment of the present invention;
fig. 10 is a flowchart of a method for classifying data to be classified according to an embodiment of the present invention;
fig. 11 is a flowchart of the overall algorithm of JDA (Joint Distribution Adaptation) according to an embodiment of the present invention.
Detailed Description
The present invention will be further described in detail with reference to the accompanying drawings, for the purpose of making the objects, technical solutions and advantages of the present invention more apparent, and the described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present invention.
In the following description, the terms "first", "second", and the like are merely used to distinguish between similar objects and do not represent a particular ordering of the objects, it being understood that the "first", "second", or the like may be interchanged with one another, if permitted, to enable embodiments of the invention described herein to be practiced otherwise than as illustrated or described herein.
In the embodiment of the application, the relevant data collection and processing should be strictly according to the requirements of relevant national laws and regulations when the example is applied, the informed consent or independent consent of the personal information body is obtained, and the subsequent data use and processing behaviors are developed within the authorized range of the laws and regulations and the personal information body.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
Before describing embodiments of the present invention in further detail, the terms and terminology involved in the embodiments of the present invention will be described, and the terms and terminology involved in the embodiments of the present invention will be used in the following explanation.
1) Transfer learning: a learning process that exploits similarities between data, tasks, or models to apply a model learned in an old domain to a new domain. Put simply, similarity is the basis of transfer: the two scenarios involved in the transfer need to share a certain similarity, yet they also differ, so a model trained with source-domain data cannot be used directly for target-domain prediction. Transfer learning solves this problem; it relaxes the machine-learning assumption that training and test data obey the same distribution, and can train a well-performing classifier by combining the source domain (which has a large number of labeled samples) when the target domain has only a small number of labels (too few to train a classifier).
2) Domain: a domain in transfer learning consists of data and the probability distribution that generates the data; it is generally denoted D, with P denoting a probability distribution. There are two basic domains in transfer learning: the source domain and the target domain. The source domain is the object to be transferred from and typically has sufficient labeled samples; the target domain is the object to which knowledge is ultimately transferred (the object whose label information needs to be obtained). Transfer learning refers to completing the transfer of knowledge from the source domain to the target domain.
3) Task: the learning target in transfer learning, consisting of labels and a function. According to whether labels exist in the source domain and the target domain, transfer learning can be divided into inductive transfer, transductive transfer, and unsupervised transfer. The case where the source domain has a large number of labels and the target domain has only a small number of labels (insufficient to train a classifier) belongs to inductive transfer learning.
4) Maximum mean difference (MMD, Maximum Mean Discrepancy): a common distance measure in transfer learning, used mainly to measure the difference between two different but related distributions (the difference between the source domain and the target domain). Briefly, the MMD describes the difference between the means of the two distributions after they are mapped.
5) And (3) image identification: techniques for processing, analyzing, and understanding images with computers to identify targets and objects in various different modes are one practical application for applying deep learning algorithms. The image recognition technology is generally divided into face recognition and article recognition, wherein the face recognition is mainly applied to security inspection, identity verification and mobile payment; the article identification is mainly applied to the article circulation process, in particular to the unmanned retail fields such as unmanned goods shelves, intelligent retail cabinets and the like.
6) And (3) target detection: the method is also called target extraction, is image segmentation based on target geometric and statistical characteristics, integrates target segmentation and recognition, and has accuracy and instantaneity which are important capabilities of the whole system. Especially in complex scenes, when multiple targets need to be processed in real time, automatic extraction and recognition of the targets are particularly important. With the development of computer technology and the wide application of computer vision principle, the real-time identification and research of targets by utilizing computer image processing technology is getting more and more popular, and the dynamic real-time identification and positioning of targets has wide application value in the aspects of intelligent traffic systems, intelligent systems, target detection, surgical instrument positioning in medical navigation surgery and the like.
7) And (3) voice recognition: techniques for a machine to convert speech signals to corresponding text or commands through a recognition and understanding process. The speech recognition technology mainly comprises three aspects of a feature extraction technology, a pattern matching criterion and a model training technology.
The classifier described in the embodiment of the invention can be applied to various classification fields, for example, the classification fields such as an image recognition neural network, a target detection neural network, a voice recognition neural network, a face detection neural network, default detection and the like, namely, the classifier in the embodiment of the invention is not limited to a certain classification field.
In order to solve at least the above technical problems of the related art, embodiments of the present invention provide a classifier generation method, apparatus, electronic device, and storage medium, which can separate the maximum mean difference that accounts for the large-scale computation in transfer learning, thereby reducing computational complexity, saving computation cost, and allowing the generated classifier to be applied to subsequent classification operations. An exemplary application of the classifier generation device provided by the embodiments of the present invention is described below. The device may be a server, for example a server deployed in the cloud, which performs a series of processes on samples of a source domain, labels corresponding to the samples of the source domain, and samples of a target domain provided by other devices or users, obtains a classifier corresponding to the target domain, and provides that classifier to users for subsequent classification operations. The device may also be any of various types of user terminals, such as a notebook computer, a tablet computer, a desktop computer, or a mobile device (e.g., a mobile phone or a personal digital assistant), which obtains a classifier corresponding to the target domain according to the samples of the source domain, the labels corresponding to the samples of the source domain, and the samples of the target domain input by the user on the terminal, and provides the classifier to the user for subsequent classification operations.
As an example, referring to fig. 1, fig. 1 is a schematic application scenario of a classifier generating system 10 provided in an embodiment of the present invention, where a terminal 200 is connected to a server 100 through a network 300, and the network 300 may be a wide area network or a local area network, or a combination of the two.
In some embodiments, the terminal 200 locally executes the method for generating a classifier provided in the embodiments of the present invention to obtain a classifier corresponding to a target domain according to a sample of a source domain, a label corresponding to the sample of the source domain, and a sample of the target domain input by a user, for example, a classifier generating assistant is installed on the terminal 200, the user inputs the sample of the source domain, the label corresponding to the sample of the source domain, and the sample of the target domain in the classifier generating assistant, and the terminal 200 obtains the classifier corresponding to the target domain according to the input sample of the source domain, the label corresponding to the sample of the source domain, and the sample of the target domain, and displays the classifier corresponding to the target domain on the display interface 210 of the terminal 200, so that the user can perform applications such as image recognition, target detection, and speech recognition according to the classifier corresponding to the target domain.
In some embodiments, the terminal 200 may also send, to the server 100 through the network 300, the samples of the source domain, the labels corresponding to the samples of the source domain, and the samples of the target domain input by the user on the terminal 200, and invoke the classifier generation function provided by the server 100; the server 100 then obtains the classifier corresponding to the target domain through the classifier generation method provided by the embodiment of the present invention. For example, a classifier generation assistant is installed on the terminal 200; the user inputs the samples of the source domain, the labels corresponding to the samples of the source domain, and the samples of the target domain in the classifier generation assistant, which sends them to the server 100 through the network 300. After receiving them, the server 100 performs a series of processing on the samples of the source domain and the samples of the target domain to obtain the classifier corresponding to the target domain, and returns it to the classifier generation assistant, which displays the classifier corresponding to the target domain on the display interface 210 of the terminal 200, so that the user can perform applications such as image recognition, target detection, and speech recognition according to the classifier corresponding to the target domain.
In one implementation scenario, to obtain a classifier for image recognition, a server or terminal may construct a maximum mean difference of a source domain and a target domain based on an image sample of the source domain and an image sample of the target domain; separating the maximum mean difference of the source domain and the target domain to obtain a separated mean difference; decomposing the optimization target according to the separated mean value difference to obtain a transformation matrix aiming at the source domain; predicting the image sample of the target domain based on the image sample of the source domain, the label corresponding to the image sample, the image sample of the target domain and the transformation matrix aiming at the source domain to obtain the label corresponding to the image sample of the target domain; based on the image sample of the source domain and the label corresponding to the image sample of the target domain, a classifier corresponding to the target domain is generated, so that the image of the target domain can be classified according to the classifier corresponding to the target domain to obtain the label corresponding to the image of the target domain, for example, the image of the target domain is classified according to the classifier corresponding to the target domain to obtain the label corresponding to the image of the target domain (car, automobile, bus, etc.). The maximum mean value difference is separated, so that the calculation complexity is reduced, the optimization target can be decomposed, a transformation matrix is obtained, otherwise, the memory overflow occurs in the calculation process of the computer, the transfer learning cannot be performed, the classifier corresponding to the target domain cannot be generated, and the image cannot be identified.
In one implementation scenario, to obtain a classifier for target detection, a server or terminal may construct a maximum mean difference between a source domain and a target domain based on a target object sample of the source domain and a target object sample of the target domain; separating the maximum mean difference of the source domain and the target domain to obtain a separated mean difference; decomposing the optimization target according to the separated mean value difference to obtain a transformation matrix aiming at the source domain; predicting the target object sample of the target domain based on the target object sample of the source domain and the label corresponding to the target object sample, the target object sample of the target domain and the transformation matrix aiming at the source domain to obtain the label corresponding to the target object sample of the target domain; and generating a classifier corresponding to the target domain based on the target object sample of the source domain and the label corresponding to the target object sample of the target domain, so that the target object of the target domain can be classified according to the classifier corresponding to the target domain to obtain the label corresponding to the image of the target domain, thereby realizing target detection, for example, classifying the target object of the target domain according to the classifier corresponding to the target domain to obtain the label (tree, pedestrian, vehicle and the like) corresponding to the target object of the target domain, and detecting the pedestrian. By separating the maximum mean value difference, the calculation complexity is reduced, so that the optimization target can be decomposed to obtain a transformation matrix, otherwise, the computer has memory overflow in the calculation process, transfer learning cannot be performed, a classifier corresponding to the target domain cannot be generated, the target object cannot be identified, and target detection cannot be realized.
In one implementation scenario, to obtain a classifier for speech recognition, a server or terminal may construct a maximum mean difference of a source domain and a target domain based on a speech sample of the source domain and a speech sample of the target domain; separating the maximum mean difference of the source domain and the target domain to obtain a separated mean difference; decomposing the optimization target according to the separated mean value difference to obtain a transformation matrix aiming at the source domain; predicting the voice sample of the target domain based on the voice sample of the source domain, the label corresponding to the voice sample, the voice sample of the target domain and the transformation matrix aiming at the source domain to obtain the label corresponding to the voice sample of the target domain; based on the voice sample of the source domain and the label corresponding to the voice sample of the target domain, a classifier corresponding to the target domain is generated, so that the voice of the target domain can be classified according to the classifier corresponding to the target domain, the label corresponding to the voice of the target domain is obtained, and therefore voice recognition is achieved, for example, the voice of the target domain is classified according to the classifier corresponding to the target domain, the label (Xiaoming, xiaohong, xiaoqiang and the like) corresponding to the voice of the target domain is obtained, and therefore people corresponding to the voice of the target domain are detected. By separating the maximum mean value difference, the calculation complexity is reduced, so that the optimization target can be decomposed to obtain a transformation matrix, otherwise, the memory overflow occurs in the calculation process of the computer, the transfer learning cannot be performed, the classifier corresponding to the target domain cannot be generated, and the voice cannot be recognized.
In one implementation scenario, in order to obtain a classifier for face recognition, a server or a terminal may construct a maximum mean difference between a source domain and a target domain based on a face sample of the source domain and a face sample of the target domain; separating the maximum mean difference of the source domain and the target domain to obtain a separated mean difference; decomposing the optimization target according to the separated mean value difference to obtain a transformation matrix aiming at the source domain; predicting the face sample of the target domain based on the face sample of the source domain, the label corresponding to the face sample, the face sample of the target domain and the transformation matrix aiming at the source domain to obtain the label corresponding to the face sample of the target domain; based on the face sample of the source domain and the label corresponding to the face sample of the target domain, a classifier corresponding to the target domain is generated, so that the face of the target domain can be classified according to the classifier corresponding to the target domain, the label corresponding to the face of the target domain is obtained, and face recognition is achieved, for example, the face of the target domain is classified according to the classifier corresponding to the target domain, the label (Xiaoming, xiaohong, xiaoqiang and the like) corresponding to the face of the target domain is obtained, and therefore the person corresponding to the face of the target domain is detected. By separating the maximum mean value difference, the calculation complexity is reduced, so that the optimization target can be decomposed to obtain a transformation matrix, otherwise, the memory overflows in the calculation process of the computer, the transfer learning cannot be performed, the classifier corresponding to the target domain cannot be generated, and the face cannot be recognized.
Continuing to describe the structure of the classifier generating device provided in the embodiment of the present invention, the classifier generating device may be various terminals, such as a mobile phone, a computer, etc., or may be a server 100 as shown in fig. 1.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a generating device 500 of a classifier according to an embodiment of the present invention, where the generating device 500 of a classifier shown in fig. 2 includes: at least one processor 510, a memory 550, at least one network interface 520, and a user interface 530. The various components in the classifier's generation device 500 are coupled together by a bus system 540. It is appreciated that the bus system 540 is used to enable connected communications between these components. The bus system 540 includes a power bus, a control bus, and a status signal bus in addition to the data bus. The various buses are labeled as bus system 540 in fig. 2 for clarity of illustration.
The processor 510 may be an integrated circuit chip with signal-processing capability, such as a general-purpose processor (for example, a microprocessor or any conventional processor), a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
The user interface 530 includes one or more output devices 531 that enable presentation of media content, including one or more speakers and/or one or more visual displays. The user interface 530 also includes one or more input devices 532, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
Memory 550 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The non-volatile memory may be a read-only memory (ROM) and the volatile memory may be a random access memory (RAM). The memory 550 described in embodiments of the present invention is intended to comprise any suitable type of memory. Memory 550 may optionally include one or more storage devices physically located remote from processor 510.
In some embodiments, memory 550 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 551 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
Network communication module 552 is used to reach other computing devices via one or more (wired or wireless) network interfaces 520; exemplary network interfaces 520 include Bluetooth, Wi-Fi, Universal Serial Bus (USB), and the like;
A display module 553 for enabling presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 531 (e.g., a display screen, speakers, etc.) associated with the user interface 530;
The input processing module 554 is configured to detect one or more user inputs or interactions from one of the one or more input devices 532 and translate the detected inputs or interactions.
In some embodiments, the generation apparatus of the classifier provided by the embodiments of the present invention may be implemented by combining software and hardware. As an example, it may be a processor in the form of a hardware decoding processor, which is programmed to perform the generation method of the classifier provided by the embodiments of the present invention; for example, the processor in the form of a hardware decoding processor may use one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components.
In other embodiments, the generation device of the classifier provided in the embodiments of the present invention may be implemented in software, and fig. 2 shows the generation device 555 of the classifier stored in the memory 550, which may be software in the form of a program or a plug-in, and includes a series of modules including a construction module 5551, a separation module 5552, a processing module 5553, a prediction module 5554, and a generation module 5555; the building module 5551, the separating module 5552, the processing module 5553, the predicting module 5554 and the generating module 5555 are configured to implement the generating method of the classifier provided by the embodiment of the invention.
It can be understood from the foregoing that the method for generating the classifier provided by the embodiment of the present invention may be implemented by various types of generating devices of the classifier, such as an intelligent terminal, a server, and the like.
The method for generating the classifier provided by the embodiment of the invention is described below in connection with exemplary application and implementation of the server provided by the embodiment of the invention. Referring to fig. 3, fig. 3 is a flowchart of a method for generating a classifier according to an embodiment of the present invention, and is described with reference to the steps shown in fig. 3.
In step 101, a maximum mean difference between the source domain and the target domain is constructed based on the samples of the source domain and the samples of the target domain.
In order to perform migration learning on the source domain and the target domain, a maximum mean difference between the source domain and the target domain needs to be constructed according to a sample of the source domain and a sample of the target domain, so that a classifier for the target domain is generated according to the maximum mean difference.
Referring to fig. 4, fig. 4 is a schematic flow chart of an alternative embodiment of the present invention, and in some embodiments, fig. 4 illustrates that step 101 in fig. 3 may be implemented by steps 1011 to 1012 illustrated in fig. 4.
In step 1011, mapping processing is performed on the source domain samples and the target domain samples through the mapping space, so as to obtain mapped source domain samples and mapped target domain samples.
In order to generate the maximum mean difference between the source domain and the target domain, the source-domain samples X_s and the target-domain samples X_t can be mapped through the mapping space to obtain the mapped source-domain samples φ(X_s) and the mapped target-domain samples φ(X_t), where φ(·) denotes the mapping into the mapping space.
In step 1012, the mapped source domain samples and the mapped target domain samples are subjected to mean value processing, so as to obtain a mean value of the source domain samples and a mean value of the target domain samples, and a difference between the mean value of the source domain samples and the mean value of the target domain samples is determined as a maximum mean value difference between the source domain and the target domain.
After the source-domain samples X_s and the target-domain samples X_t are mapped, the mapped source-domain samples φ(X_s) and the mapped target-domain samples φ(X_t) can each be averaged to obtain the mean of the source-domain samples and the mean of the target-domain samples, and the difference between the two means is determined as the maximum mean difference between the source domain and the target domain:

MMD(X_s, X_t) = ‖ (1/n_s) Σ_{i=1}^{n_s} φ(x_i^s) - (1/n_t) Σ_{j=1}^{n_t} φ(x_j^t) ‖²,

where n_s denotes the number of source-domain samples X_s and n_t denotes the number of target-domain samples X_t.
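By way of illustration only (this sketch is not part of the patent text), the following Python code computes the empirical maximum mean difference under the simplifying assumption of an identity mapping φ(x) = x; the function and variable names are hypothetical.

```python
import numpy as np

def linear_mmd(Xs, Xt):
    """Squared maximum mean difference between source and target samples,
    assuming the identity feature map phi(x) = x. Samples are rows:
    Xs has shape (n_s, d), Xt has shape (n_t, d)."""
    mean_s = Xs.mean(axis=0)      # mean of the mapped source-domain samples
    mean_t = Xt.mean(axis=0)      # mean of the mapped target-domain samples
    diff = mean_s - mean_t        # difference between the two means
    return float(diff @ diff)     # squared norm of that difference

# hypothetical usage with synthetic data
rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(100, 5))
Xt = rng.normal(0.5, 1.0, size=(80, 5))
print(linear_mmd(Xs, Xt))
```

With a non-trivial mapping φ, the same quantity would normally be evaluated through a kernel; the identity map is used here only to keep the sketch short.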
In step 102, the maximum mean difference between the source domain and the target domain is separated, and the separated mean difference is obtained.
Because the maximum mean value difference of the source domain and the target domain is a large-scale matrix, in order to reduce the calculated amount, the maximum mean value difference of the source domain and the target domain can be separated to obtain the mean value difference after separation, thereby avoiding memory overflow in the process of calculating the large matrix and causing the incapability of carrying out subsequent classification operation.
In some embodiments, separating the maximum mean difference of the source domain and the target domain to obtain a separated mean difference comprises: and separating the maximum mean value difference in the matrix form to obtain the separated mean value difference in the vector form.
Since the maximum mean difference is data in a matrix form, in order to reduce the calculation amount, the maximum mean difference in a matrix form can be converted into a separated mean difference in a vector form.
In some embodiments, separating the maximum mean difference of the source domain and the target domain to obtain a separated mean difference comprises: and carrying out matrix decomposition on the maximum mean value difference to obtain Cartesian products of the two separated mean value differences.
The maximum mean difference is an N×N matrix and the separated mean difference is an N×1 vector, where N is the sum of the number of samples in the source domain and the number of samples in the target domain, and N is a natural number greater than 2.
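As an informal sketch of the separation (assuming, as is conventional for MMD, that the first n_s entries of the separated vector correspond to source samples and the remaining n_t entries to target samples), the N×1 vector e below satisfies e·e^T = M, i.e. the N×N maximum mean difference matrix is recovered as the product of two copies of the separated mean difference:

```python
import numpy as np

def separated_mmd_vector(n_s, n_t):
    """Separated mean-difference vector e of shape (N, 1), whose product
    e @ e.T reproduces the N x N maximum mean difference matrix M.
    The ordering (source samples first, then target samples) is an assumption."""
    e = np.concatenate([np.full(n_s, 1.0 / n_s),
                        np.full(n_t, -1.0 / n_t)])
    return e.reshape(-1, 1)

# hypothetical check on a tiny example
n_s, n_t = 4, 3
e = separated_mmd_vector(n_s, n_t)
M = e @ e.T                         # the full N x N matrix, built here only for comparison
assert M.shape == (n_s + n_t, n_s + n_t)
assert e.shape == (n_s + n_t, 1)    # the separated form stores only N values instead of N*N
```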
In step 103, the optimization target is decomposed according to the separated mean value difference, and a transformation matrix for the source domain is obtained.
The calculated amount can be greatly reduced through the separated mean value difference, so that the optimization target can be decomposed through the separated mean value difference to obtain a transformation matrix aiming at the source domain, and the sample of the target domain can be predicted according to the transformation matrix aiming at the source domain to generate a label corresponding to the sample of the target domain.
Referring to fig. 5, fig. 5 is an alternative flow chart provided by an embodiment of the present invention, and in some embodiments, step 103 in fig. 3 is shown in fig. 5 may be implemented by steps 1031 to 1032 shown in fig. 5.
In step 1031, the source domain samples and the target domain samples are combined to obtain a feature matrix.
In order to obtain a transformation matrix for the source domain according to the separated mean value difference, the samples of the source domain and the samples of the target domain need to be fused, that is, the samples of the source domain and the samples of the target domain are combined, so that a feature matrix containing the samples of the source domain and the samples of the target domain is obtained.
In step 1032, the generalized eigenvalue decomposition is performed on the optimization target according to the eigenvalue matrix and the separated mean value difference, so as to obtain a transformation matrix for the source domain.
After the feature matrix containing the sample of the source domain and the sample of the target domain is obtained, the generalized feature value decomposition can be carried out on the optimization target according to the feature matrix and the average value difference after separation, so that the transformation matrix aiming at the source domain is obtained.
In some embodiments, performing generalized eigenvalue decomposition on the optimization target according to the eigenvalue matrix and the separated mean value difference to obtain a transformation matrix for the source domain, including: multiplying the feature matrix by the mean value difference after separation to obtain an intermediate matrix; and carrying out generalized eigenvalue decomposition on the optimization target based on the intermediate matrix to obtain a transformation matrix aiming at the source domain.
In order to reduce the calculated amount related to the maximum mean difference in the optimization target, the feature matrix and the separated mean difference can be multiplied to obtain an intermediate matrix, and the optimization target is calculated according to the intermediate matrix, so that a transformation matrix aiming at the source domain is obtained.
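A minimal sketch of this step, assuming the feature matrix X stores the combined source and target samples as columns (d×N) and e is the separated mean difference from the previous sketch, is given below; it also checks that the intermediate matrix reproduces the large matrix product without ever forming the N×N maximum mean difference:

```python
import numpy as np

def intermediate_matrix(X, e):
    """U = X @ e: multiplying the feature matrix (d x N) by the separated
    mean difference (N x 1) gives a d x 1 intermediate matrix."""
    return X @ e

# hypothetical check: X M X^T equals U U^T when M = e e^T
d, n_s, n_t = 5, 4, 3
X = np.random.default_rng(1).normal(size=(d, n_s + n_t))
e = np.concatenate([np.full(n_s, 1.0 / n_s),
                    np.full(n_t, -1.0 / n_t)]).reshape(-1, 1)
U = intermediate_matrix(X, e)
M = e @ e.T                                   # N x N, built only to verify the identity
assert np.allclose(X @ M @ X.T, U @ U.T)      # same value, no N x N product needed
```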
Referring to fig. 6, fig. 6 is a schematic flow chart of an alternative embodiment of the present invention, and in some embodiments, fig. 6 illustrates that steps 101-103 in fig. 3 may be implemented by steps 101D-103D illustrated in fig. 6.
In step 101D, a maximum mean difference between the class c samples of the source domain and the target domain is constructed based on the class c samples of the source domain and the class c samples of the target domain.
Wherein c ≤ C, C is the total number of sample types in the source domain, and both c and C are natural numbers. The sample types of the source domain are the same as the sample types of the target domain, i.e., the source domain is similar to the target domain; for example, the sample types of the source domain are "user defaulted" and "user did not default", and the sample types of the target domain are likewise "user defaulted" and "user did not default".
In some embodiments, mapping is performed on the c-type sample of the source domain and the c-type sample of the target domain through a mapping space, so as to obtain the mapped c-type sample of the source domain and the mapped c-type sample of the target domain; and respectively carrying out average value processing on the c-th sample of the source domain and the c-th sample of the target domain after mapping, correspondingly obtaining the average value of the c-th sample of the source domain and the average value of the c-th sample of the target domain, and determining the difference of the average values as the maximum average value difference between the c-th sample of the source domain and the c-th sample of the target domain.
In step 102D, the maximum mean difference between the class c samples in the source domain and the target domain is separated, so as to obtain the mean difference between the class c samples after separation.
In some embodiments, the maximum mean difference of the class c samples of the source domain and the target domain in the matrix form is separated, and the maximum mean difference of the separated class c samples in the vector form is obtained.
In some embodiments, the maximum mean difference of the class-c samples is subjected to matrix decomposition to obtain a Cartesian product of the mean differences of two separated class-c samples, where the maximum mean difference of the class-c samples is an M×M matrix, the separated mean difference is an M×1 vector, M is the sum of the number of class-c samples in the source domain and the number of class-c samples in the target domain, and M is a natural number greater than 2.
In step 103D, summing the mean differences of the C separated samples to obtain a sum of the mean differences after separation; and decomposing the optimization target according to the sum to obtain a transformation matrix aiming at the source domain.
The separated mean differences of the class-1 samples, the class-2 samples, ..., and the class-C samples obtained above are summed to obtain the sum of the separated mean differences, and the optimization target is decomposed according to this sum to obtain the transformation matrix for the source domain.
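As an illustration of the class-wise construction (a sketch only: it assumes full-length N×1 vectors with zeros for samples outside class c, that source labels and target pseudo-labels are available as arrays, and hypothetical function names), the separated vectors e_c and the accumulated quantity Σ_c U_c U_c^T used by the decomposition could be formed as follows:

```python
import numpy as np

def classwise_vectors(ys, yt_pseudo, classes):
    """Separated mean-difference vectors e_c, one per class c. Each e_c is an
    (n_s + n_t) x 1 vector with zeros for samples not belonging to class c;
    classes missing on either side are left as zero vectors (an assumption)."""
    n_s, n_t = len(ys), len(yt_pseudo)
    vectors = []
    for c in classes:
        e = np.zeros(n_s + n_t)
        src_idx = np.where(ys == c)[0]               # class-c samples in the source domain
        tgt_idx = np.where(yt_pseudo == c)[0] + n_s  # class-c samples in the target domain
        if len(src_idx) and len(tgt_idx):
            e[src_idx] = 1.0 / len(src_idx)
            e[tgt_idx] = -1.0 / len(tgt_idx)
        vectors.append(e.reshape(-1, 1))
    return vectors

def accumulated_objective_matrix(X, vectors):
    """Sum over classes of U_c @ U_c.T with U_c = X @ e_c, i.e. the quantity
    X (sum_c M_c) X^T, computed without forming any N x N matrix M_c."""
    d = X.shape[0]
    total = np.zeros((d, d))
    for e in vectors:
        U = X @ e
        total += U @ U.T
    return total
```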
In some embodiments, the intermediate matrix is U_c = X·e_c, where X is the feature matrix and e_c is the separated mean difference of the class-c samples. Performing generalized eigenvalue decomposition on the optimization target based on the intermediate matrix to obtain the transformation matrix for the source domain includes: determining a representation of the optimization target,

min_A Σ_{c=1}^{C} tr(A^T X M_c X^T A) + λ‖A‖_F²,  subject to  A^T X H X^T A = I,

where C is the total number of sample types in the source domain, tr(·) is the trace of a matrix, A is the transformation matrix, M_c is the maximum mean difference of the class-c samples, λ is the regularization term coefficient, H is the centering matrix, and I is the identity matrix; replacing each product X M_c X^T in the optimization target with U_c·U_c^T; and, based on U_c·U_c^T, performing generalized eigenvalue decomposition on the optimization target according to the following formula to obtain the transformation matrix A for the source domain:

(Σ_{c=1}^{C} U_c U_c^T + λI) A = X H X^T A Φ,

where Φ is the Lagrange multiplier.
The optimization target is determined first. In it, each term X M_c X^T is a continued product of large matrices, whereas the intermediate matrix U_c = X·e_c is small in comparison, so X M_c X^T in the optimization target can be replaced with U_c·U_c^T; based on U_c·U_c^T, generalized eigenvalue decomposition can then be performed on the optimization target to obtain the transformation matrix A for the source domain.
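A sketch of the resulting generalized eigenvalue step is shown below. It assumes samples are stored as columns of X, uses scipy's symmetric generalized eigensolver, adds a small ridge so that the right-hand matrix is positive definite (a numerical convenience, not part of the described method), and keeps the eigenvectors of the k smallest eigenvalues as the transformation matrix A:

```python
import numpy as np
from scipy.linalg import eigh

def solve_transformation(X, vectors, lam=1.0, k=10, ridge=1e-6):
    """Solve (sum_c U_c U_c^T + lam*I) A = X H X^T A Phi for A.
    X: combined feature matrix (d x N); vectors: separated mean differences e_c (N x 1)."""
    d, N = X.shape
    lhs = lam * np.eye(d)
    for e in vectors:
        U = X @ e                                    # intermediate matrix U_c = X e_c
        lhs += U @ U.T                               # stands in for X M_c X^T
    col_sum = X.sum(axis=1, keepdims=True)           # X @ 1, a d x 1 vector
    rhs = X @ X.T - (col_sum @ col_sum.T) / N        # equals X H X^T without forming H
    rhs += ridge * np.eye(d)                         # keep the right-hand side positive definite
    w, V = eigh(lhs, rhs)                            # generalized symmetric eigenproblem, ascending eigenvalues
    return V[:, :k]                                  # eigenvectors of the k smallest eigenvalues
```

Because both sides of the eigenproblem are only d×d, the memory cost in this sketch is governed by the feature dimension rather than by the number of samples, which is the point of the separation.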
In step 104, the sample of the target domain is predicted based on the sample of the source domain and the label corresponding to the sample, the sample of the target domain, and the transformation matrix for the source domain, so as to obtain the label corresponding to the sample of the target domain.
Since the samples of the source domain have corresponding labels, the samples of the target domain have no corresponding labels. Therefore, it is necessary to predict the sample of the target domain according to the sample of the source domain and the label corresponding to the sample, the sample of the target domain, and the transformation matrix for the source domain, so as to obtain the label corresponding to the sample of the target domain.
In some embodiments, predicting the sample of the target domain based on the sample of the source domain and the label corresponding to the sample, the sample of the target domain, and the transformation matrix for the source domain, to obtain the label corresponding to the sample of the target domain includes: obtaining a transformed source domain sample according to the source domain sample and a transformation matrix for the source domain; constructing a classification model of the classifier according to the transformed source domain samples and the labels corresponding to the source domain samples; and predicting the sample of the target domain through the classification model to obtain the label corresponding to the sample of the target domain.
After the transformation matrix for the source domain is obtained, the samples of the source domain can be transformed based on it to obtain transformed source-domain samples. A classification model of the classifier is then constructed from the transformed source-domain samples and the labels corresponding to the source-domain samples, i.e., the model is built on the transformed samples and their labels. After modeling is completed, the samples of the target domain can be predicted through the classification model, yielding the probability that a target-domain sample belongs to a certain class of label; when that probability is greater than or equal to a set threshold, the sample is determined to belong to that class. Such a label is a pseudo label rather than a manually annotated one, and the set threshold is a reference value preset in the server by the user.
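The pseudo-labelling step described above might look as follows in a sketch that assumes a binary task and uses scikit-learn's LogisticRegression as the classification model (the patent does not prescribe a particular model; the function and variable names are illustrative):

```python
# Sketch of the pseudo-labelling step: build a model on the transformed source samples
# and threshold the predicted probabilities for the target-domain samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

def pseudo_label(Xs, ys, Xt, A, threshold=0.5):
    """Xs: w x n_s source samples, ys: their labels, Xt: w x n_t target samples,
    A: w x k transformation matrix for the source domain, threshold: preset reference value."""
    Zs = A.T @ Xs                                  # transformed source-domain samples (k x n_s)
    Zt = A.T @ Xt                                  # target samples mapped with the same transform
    clf = LogisticRegression(max_iter=1000).fit(Zs.T, ys)
    proba = clf.predict_proba(Zt.T)[:, 1]          # probability of belonging to the positive class
    return (proba >= threshold).astype(int)        # pseudo labels for the target-domain samples
```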
In step 105, a classifier corresponding to the target domain is generated based on the sample of the source domain and the label corresponding to the sample of the target domain.
After predicting the sample of the target domain to obtain the label corresponding to the sample of the target domain, a classifier corresponding to the target domain can be generated according to the sample of the source domain and the label corresponding to the sample of the target domain, so that the data to be classified of the target domain can be classified later.
In some embodiments, generating a classifier for a target domain based on a sample of a source domain and a label corresponding to the sample of the target domain includes: training the classifier based on the sample of the source domain and the label corresponding to the sample of the target domain to obtain the classifier corresponding to the target domain.
After the labels corresponding to the samples of the target domain are obtained, the classifier is trained with a neural-network training method based on the samples of the source domain and the labels corresponding to the samples of the target domain, yielding the trained classifier corresponding to the target domain. That is, the labels corresponding to the samples of the target domain are used to update the maximum mean difference between the source domain and the target domain, the transformation matrix for the source domain is solved iteratively from that maximum mean difference, and the classifier (which belongs to the neural network model) is thereby trained from the samples of the source domain and the labels corresponding to the samples of the target domain, generating the classifier for the target domain so that data to be classified in the target domain can be classified subsequently.
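A sketch of the iterative loop described in this paragraph, reusing the illustrative solve_transform and pseudo_label helpers above; build_e_vectors is a hypothetical helper that would assemble the separated mean-difference vectors and influence coefficients from the source labels and the current pseudo labels:

```python
# Sketch of the iterative training loop: pseudo labels refresh the class-conditional
# MMD terms, which in turn refresh the transformation matrix.
import numpy as np

def train_domain_adapted_classifier(Xs, ys, Xt, n_iter=10, k=10, lam=1.0):
    X = np.hstack([Xs, Xt])                       # combined feature matrix (w x n)
    yt_pseudo = None                              # no target labels before the first iteration
    for _ in range(n_iter):
        e_list, alpha = build_e_vectors(ys, yt_pseudo, Xs.shape[1], Xt.shape[1])
        A = solve_transform(X, e_list, alpha, lam=lam, k=k)
        yt_pseudo = pseudo_label(Xs, ys, Xt, A)   # refreshed target-domain pseudo labels
    return A, yt_pseudo
```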
Referring to fig. 7, fig. 7 is a schematic flowchart of an alternative method provided by an embodiment of the present invention, in which the classifier generation method is applied to a financial scene to classify users and determine whether a user defaults. In some embodiments, steps 101-105 in fig. 3 may be implemented by steps 101E-105E illustrated in fig. 7.
In step 101E, a maximum mean difference between the first financial scenario and the second financial scenario is constructed based on the first user behavioral characteristics in the first financial scenario and the second user behavioral characteristics in the second financial scenario.
In order to perform transfer learning between the first financial scene and the second financial scene, the maximum mean difference between the two scenes is constructed from the first user behavior features in the first financial scene and the second user behavior features in the second financial scene, so that a classifier corresponding to the second financial scene can subsequently be generated from this maximum mean difference. Here, "first user" does not refer to one particular user but to the users in the first financial scene collectively, and likewise "second user" refers to the users in the second financial scene.
In step 102E, the maximum mean difference between the first financial scene and the second financial scene is separated, so as to obtain a separated mean difference.
Because the maximum mean difference between the first financial scene and the second financial scene is a large-scale matrix, it can be separated to obtain the separated mean difference in order to reduce the amount of computation, thereby avoiding the memory overflow that would otherwise occur when computing the large matrix and would prevent the subsequent default-classification operation from being performed.
In step 103E, the optimization objective is decomposed according to the separated mean difference, so as to obtain a transformation matrix for the first financial scene.
The separated mean difference greatly reduces the amount of computation, so the optimization objective can be decomposed through the separated mean difference to obtain a transformation matrix for the first financial scene, and the second users in the second financial scene can then be predicted according to this transformation matrix to generate the default labels corresponding to the second users in the second financial scene.
In step 104E, based on the first user behavior feature, the default label corresponding to the first user, the second user behavior feature, and the transformation matrix for the first financial scene, default prediction is performed on the second user in the second financial scene, so as to obtain the default label corresponding to the second user in the second financial scene.
The first users have corresponding default labels, whereas the second users do not. Therefore, the second users need to be predicted according to the first user behavior features and the default labels corresponding to the first users, the second user behavior features in the second financial scene, and the transformation matrix for the first financial scene, so as to obtain the default labels (i.e., whether the users default) corresponding to the second users in the second financial scene.
In step 105E, a classifier corresponding to the second financial scenario is generated based on the first user behavioral characteristics of the first financial scenario and the breach tags corresponding to the second user.
After predicting the second user in the second financial scene to obtain the default label corresponding to the second user in the second financial scene, a classifier corresponding to the second financial scene can be generated according to the user behavior characteristics of the first user in the first financial scene and the label corresponding to the second user in the second financial scene so as to classify the user in the second financial scene later.
The method for generating a classifier provided by the embodiment of the present invention has been described above in connection with an exemplary application and implementation of the server. The following further describes how the modules in the classifier generation device 555 provided by the embodiment of the present invention cooperate to generate the classifier.
A construction module 5551, configured to construct a maximum mean difference between a source domain and a target domain based on a sample of the source domain and a sample of the target domain;
A separation module 5552, configured to separate a maximum mean difference between the source domain and the target domain;
A processing module 5553, configured to decompose an optimization target according to the mean difference obtained by the separation, so as to obtain a transformation matrix for the source domain;
A prediction module 5554, configured to predict, based on the sample of the source domain and the label corresponding to the sample, the sample of the target domain, and the transformation matrix for the source domain, the sample of the target domain to obtain the label corresponding to the sample of the target domain;
A generating module 5555, configured to generate a classifier corresponding to the target domain based on the sample of the source domain and the label corresponding to the sample of the target domain.
In the above technical solution, the building module 5551 is further configured to map the source domain sample and the target domain sample through a mapping space, so as to obtain a mapped source domain sample and a mapped target domain sample; and respectively carrying out mean value processing on the mapped source domain samples and the mapped target domain samples, correspondingly obtaining the mean value of the source domain samples and the mean value of the target domain samples, and determining the difference between the mean value of the source domain samples and the mean value of the target domain samples as the maximum mean value difference between the source domain and the target domain.
In the above technical solution, the separation module 5552 is further configured to separate the maximum mean difference in a matrix form, so as to obtain a mean difference after separation in a vector form.
In the above technical solution, the maximum mean difference is an N×N matrix, where N is the sum of the number of samples in the source domain and the number of samples in the target domain, and N is a natural number greater than 2; the separation module 5552 is further configured to perform matrix decomposition on the maximum mean difference to obtain the Cartesian product of two separated mean differences, wherein each separated mean difference is an N×1 vector.
In the above technical solution, the processing module 5553 is further configured to combine the sample of the source domain and the sample of the target domain to obtain a feature matrix; and carrying out generalized eigenvalue decomposition on an optimization target according to the eigenvalue matrix and the separated mean value difference to obtain a transformation matrix aiming at the source domain.
In the above technical solution, the processing module 5553 is further configured to multiply the feature matrix and the separated mean difference to obtain an intermediate matrix; and carrying out generalized eigenvalue decomposition on the optimization target based on the intermediate matrix to obtain a transformation matrix aiming at the source domain.
In the above technical solution, the construction module 5551 is further configured to construct the maximum mean difference between the class-c samples of the source domain and the class-c samples of the target domain based on the class-c samples of the source domain and the class-c samples of the target domain, where c is less than or equal to C, C is the total number of sample types of the source domain, and c and C are natural numbers;
The separation module 5552 is further configured to separate the maximum mean difference between the class c samples of the source domain and the target domain, so as to obtain a mean difference between the class c samples after separation;
the processing module 5553 is further configured to sum the mean differences of the C separated samples to obtain a sum of the mean differences after separation; and decomposing the optimization target according to the sum to obtain a transformation matrix aiming at the source domain.
In the above technical solution, the intermediate matrix is $U_c = X \cdot e_c$, where X is the feature matrix and $e_c$ is the separated mean difference of the class-c samples. The processing module 5553 is also configured to determine a representation of the optimization objective,

$$\min_{A}\ \mathrm{tr}\Big(A^{T} X \sum_{c=0}^{C} M_c X^{T} A\Big) + \lambda \lVert A \rVert_F^{2},$$

which satisfies $A^{T} X H X^{T} A = I$, where C is the total number of sample types in the source domain, tr is the trace of a matrix, A is the transformation matrix, $M_c$ is the maximum mean difference of the class-c samples, λ is the regularization term coefficient, H is the centering matrix, and I is the identity matrix; to replace each $X M_c X^{T}$ in the optimization objective with $\alpha_c U_c U_c^{T}$, where $\alpha_c$ is the influence coefficient produced by norm normalization; and, based on $\alpha_c U_c U_c^{T}$, to perform generalized eigenvalue decomposition on the optimization objective according to

$$\Big(\sum_{c=0}^{C} \alpha_c U_c U_c^{T} + \lambda I\Big) A = X H X^{T} A \Phi$$

to obtain the transformation matrix A for the source domain, where Φ is the Lagrange multiplier.
In the above technical solution, the prediction module 5554 is further configured to obtain a transformed source domain sample according to the source domain sample and the transformation matrix for the source domain; constructing a classification model of the classifier according to the transformed source domain sample and the label corresponding to the source domain sample; and predicting the sample of the target domain through the classification model to obtain a label corresponding to the sample of the target domain.
In the above technical solution, the generating module 5555 is further configured to train the classifier based on the sample of the source domain and the label corresponding to the sample of the target domain, to obtain the classifier corresponding to the target domain.
In the above technical solution, the sample of the source domain is a first user behavior feature in a first financial scene, and the sample of the target domain is a second user behavior feature in a second financial scene; the building module 5551 is further configured to build a maximum mean difference between the first financial scenario and the second financial scenario based on the first user behavior feature in the first financial scenario and the second user behavior feature in the second financial scenario; the prediction module 5554 is further configured to perform default prediction on the second user of the second financial scene based on the first user behavior feature, the default label corresponding to the first user, the second user behavior feature, and the transformation matrix for the first financial scene, so as to obtain the default label corresponding to the second user of the second financial scene;
The generating module 5555 is further configured to generate a classifier corresponding to the second financial scenario based on the first user behavior feature of the first financial scenario and the default label corresponding to the second user.
Based on the above description of the method and structure for generating the classifier, the following description will provide an example application of the device for classifying data according to the embodiment of the present invention, and referring to fig. 8, fig. 8 is a schematic application scenario diagram of the system 20 for classifying data according to the embodiment of the present invention, where the terminal 200 is connected to the server 100 through the network 300, and the network 300 may be a wide area network or a local area network, or a combination of both.
In some embodiments, the terminal 200 locally executes the method for classifying the data to be classified according to the embodiment of the present invention to obtain the tag corresponding to the data to be classified according to the data to be classified in the target domain input by the user, for example, a classification assistant is installed on the terminal 200, the user inputs the data to be classified in the target domain in the classification assistant, the terminal 200 obtains the tag corresponding to the data to be classified according to the data to be classified in the input target domain, and the tag corresponding to the data to be classified is displayed on the display interface 210 of the terminal 200.
In some embodiments, the terminal 200 may also send the data to be classified in the target domain, input by the user on the terminal 200, to the server 100 through the network 300 and call the classification function provided by the server 100, which obtains the tag corresponding to the data to be classified by using the classification method for data to be classified provided by the embodiment of the present invention. For example, a classification assistant is installed on the terminal 200; the user inputs the data to be classified in the target domain into the classification assistant; the terminal sends this data to the server 100 through the network 300; after receiving it, the server 100 classifies the data to be classified in the target domain to obtain the corresponding tag and returns the tag to the classification assistant, which displays it on the display interface 210 of the terminal 200; alternatively, the server 100 directly outputs the tag corresponding to the data to be classified.
The classification system for data to be classified having been described, the structure of the classification device is described next. Referring to fig. 9, fig. 9 is a schematic structural diagram of a device 600 for classifying data according to an embodiment of the present invention; the device 600 shown in fig. 9 includes at least one processor 610, a memory 650, at least one network interface 620, and a user interface 630. The functions of the processor 610, the memory 650, the at least one network interface 620, and the user interface 630 are similar to those of the processor 510, the memory 550, the at least one network interface 520, and the user interface 530; likewise, the functions of the output device 631 and the input device 632 are similar to those of the output device 531 and the input device 532, and the functions of the operating system 651, the network communication module 652, the display module 653, and the input processing module 654 are similar to those of the operating system 551, the network communication module 552, the display module 553, and the input processing module 554, respectively, and are not repeated here.
In other embodiments, the device for classifying data according to the embodiments of the present invention may be implemented in software, and fig. 9 shows a device 655 for classifying data stored in a memory 650, which may be software in the form of a program, a plug-in, or the like, and includes a series of modules, including a determining module 6551 and a classifying module 6552; the determining module 6551 and the classifying module 6552 are configured to implement the method for classifying data to be classified according to the embodiment of the present invention.
It can be understood from the foregoing that the method for classifying data to be classified according to the embodiments of the present invention may be implemented by various types of classifying devices for data to be classified, such as an intelligent terminal, a server, and the like.
The method for classifying the data to be classified provided by the embodiment of the invention is described below in connection with exemplary application and implementation of the server provided by the embodiment of the invention. Referring to fig. 10, fig. 10 is a flowchart of a method for classifying data according to an embodiment of the present invention, and the description will be made with reference to the steps shown in fig. 10.
In step 201, data to be classified in a target domain is determined.
In step 202, the data to be classified is classified by the classifier corresponding to the target domain, and the tag corresponding to the data to be classified is obtained.
After the classifier corresponding to the target domain is generated by the classifier generation method described above (constructing the maximum mean difference between the source domain and the target domain based on the samples of the source domain and of the target domain; separating that maximum mean difference to obtain the separated mean difference; decomposing the optimization objective according to the separated mean difference to obtain a transformation matrix for the source domain; predicting the samples of the target domain based on the samples of the source domain and their labels, the samples of the target domain, and the transformation matrix for the source domain to obtain the labels corresponding to the samples of the target domain; and generating the classifier corresponding to the target domain based on the samples of the source domain and the labels corresponding to the samples of the target domain), the data to be classified in the target domain can be classified with this classifier. Accordingly, the data to be classified in the target domain is determined first, and is then classified by the classifier corresponding to the target domain to obtain the tag corresponding to the data to be classified.
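As a usage sketch for steps 201-202 (illustrative only: it assumes clf is the classifier generated for the target domain, A the transformation matrix obtained above, and a feature dimension of 64):

```python
# Usage sketch: classify one item of target-domain data with the generated classifier.
import numpy as np

x_new = np.random.rand(64, 1)           # one item of data to be classified (w x 1)
z_new = A.T @ x_new                     # map it into the shared feature subspace
label = clf.predict(z_new.T)[0]         # tag corresponding to the data to be classified
print("predicted label:", label)
```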
In some embodiments, the source domain is a first financial scene, the target domain is a second financial scene, and the data to be classified in the target domain is user behavior features in the second financial scene; determining data to be classified in a target domain, comprising: determining user behavior features in a second financial scene in the target domain;
Classifying the data to be classified through a classifier corresponding to the target domain to obtain a label corresponding to the data to be classified, wherein the label comprises: and classifying the user behavior characteristics in the second financial scene through a classifier corresponding to the second financial scene to obtain the default labels corresponding to the user behavior characteristics in the second financial scene.
Having thus described the method for classifying data provided by the embodiment of the present invention, the following continues to describe a scheme in which each module in the apparatus 655 for classifying data provided by the embodiment of the present invention cooperates with each other to implement classification of data to be classified.
A determining module 6551 for determining data to be classified in the target domain;
And the classification module 6552 is configured to classify the data to be classified by using a classifier corresponding to the target domain, so as to obtain a label corresponding to the data to be classified.
It should be noted that the description of the device is similar to that of the method and has the same beneficial effects, which are therefore not repeated here; for details of the device not disclosed in the embodiments of the present invention, please refer to the description of the method embodiments of the present invention.
In the following, an exemplary application of the embodiment of the present invention in a practical application scenario will be described.
MMD describes the difference between the means of two sample sets after mapping, and is mathematically described by formula (1):

$$\mathrm{MMD}(X_s, X_t) = \left\lVert \frac{1}{n_s}\sum_{i=1}^{n_s}\phi(x_i) - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi(x_j) \right\rVert^{2} \tag{1}$$

where $n_s$ is the number of samples of the source domain $X_s$, $n_t$ is the number of samples of the target domain $X_t$, and $\phi(\cdot)$ is a feature mapping function, i.e., it maps sample features into another space (for example, into a reproducing kernel Hilbert space via a kernel matrix). When the squared distance in formula (1) is expanded, a cross term appears; by introducing a kernel matrix K and an MMD matrix M, the distance of formula (1) is transformed into $\mathrm{tr}(KL) - \lambda\,\mathrm{tr}(K)$ for solving, where M can be expressed as formula (2):

$$M_{ij} = \begin{cases} \dfrac{1}{n_s^{2}}, & x_i, x_j \in D_s \\[4pt] \dfrac{1}{n_t^{2}}, & x_i, x_j \in D_t \\[4pt] -\dfrac{1}{n_s n_t}, & \text{otherwise} \end{cases} \tag{2}$$
where $D_s$ denotes the source domain (including the samples of the source domain and their probability distribution) and $D_t$ denotes the target domain (including the samples of the target domain and their probability distribution).
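A small NumPy sketch of formula (2): the MMD matrix depends only on how many samples each domain contributes and on which domain the row and column samples come from (sample counts below are illustrative):

```python
# Build the marginal MMD matrix M of formula (2) for n_s source and n_t target samples.
import numpy as np

def mmd_matrix(n_s, n_t):
    N = n_s + n_t                          # samples ordered: source first, then target
    M = np.empty((N, N))
    M[:n_s, :n_s] = 1.0 / n_s**2           # both samples from D_s
    M[n_s:, n_s:] = 1.0 / n_t**2           # both samples from D_t
    M[:n_s, n_s:] = M[n_s:, :n_s] = -1.0 / (n_s * n_t)   # one sample from each domain
    return M

print(mmd_matrix(3, 2))
```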
Domain adaptation is one of the most widely studied problems in transfer learning and applies to scenarios in which the source domain has labeled samples while the target domain has only unlabeled samples. In practice, when labeled samples are lacking, labeled samples from other, similar scenarios are often the only substitute for modeling; if a model built from labeled samples of a similar scenario is used directly for target-domain prediction, the prediction results are frequently inaccurate because of the difference in data distribution between the two scenarios. The source-domain model therefore needs to be modified so that it can be used for target-domain prediction while maintaining good predictive performance. Joint Distribution Adaptation (JDA) is one of the typical domain-adaptation methods and was developed from the classical Transfer Component Analysis (TCA). JDA searches for a common feature space of the source domain and the target domain through a feature transformation, making the differences between the marginal distributions and between the conditional distributions of the two domains as small as possible, thereby completing the transfer of knowledge from the source domain to the target domain. The overall optimization objective of JDA is to minimize the sum of the MMD distances of the marginal distribution and the conditional distributions while keeping the variance unchanged, as shown in formula (3):

$$\min_{A}\ \mathrm{tr}\Big(A^{T} X \sum_{c=0}^{C} M_c X^{T} A\Big) + \lambda \lVert A \rVert_F^{2}\quad \text{s.t.}\quad A^{T} X H X^{T} A = I \tag{3}$$

where C is the total number of sample types in the source domain, tr is the trace of a matrix, A is the transformation matrix, $M_c$ is the maximum mean difference of the class-c samples, λ is the regularization term coefficient, H is the centering matrix, I is the identity matrix, and X is the feature matrix.
Because label information is lacking in the target domain, computing the conditional distribution of the target domain is a key technique in JDA. The class-conditional distribution can be used to approximate the conditional distribution, i.e., $P(Y_t \mid X_t) \approx P(X_t \mid Y_t)$; combined with a pseudo-label strategy, a classifier is trained on the source-domain data, its predictions on the target domain are used as target-domain labels, and the accuracy of the pseudo labels is improved through successive iterations.
The JDA solution process is similar to Principal Component Analysis (PCA): by introducing Lagrange multipliers, the minimization problem above is converted into a generalized eigenvalue decomposition that yields the transformation matrix A. The source-domain and target-domain data are mapped into a common feature subspace with the solved transformation matrix A, a machine-learning model is built in this common space, and prediction is completed. The overall JDA algorithm flow is shown in fig. 11: data acquisition obtains the samples of the source domain and their corresponding labels; data preprocessing normalizes the source-domain and target-domain data; X denotes the original features, A the matrix transformation, $Z_s$ the transformed features, and $Y_s$ the source-domain labels; pseudo labels are updated through target-domain prediction and fed back into the MMD terms for iteratively solving the transformation matrix A, thereby training the classifier and finally enabling classification on the target domain. The feature transformation found by JDA can be linear or nonlinear, depending on actual requirements.
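As a sketch of the preprocessing step in this flow (the patent only says the data are normalized; z-score normalization over the combined data is one common choice assumed here):

```python
# Normalize source- and target-domain features with statistics of the combined data.
import numpy as np

def normalize(Xs, Xt):
    X = np.hstack([Xs, Xt])                        # w x (n_s + n_t) combined features
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True) + 1e-12     # guard against zero variance
    return (Xs - mean) / std, (Xt - mean) / std
```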
To apply the JDA method in practice, the transformation matrix A must be solved by generalized eigenvalue decomposition, i.e., the first k eigenvalues (k being the dimension of the new feature space) are solved from formula (4):

$$\Big(X \sum_{c=0}^{C} M_c X^{T} + \lambda I\Big) A = X H X^{T} A \Phi \tag{4}$$

where Φ is the Lagrange multiplier and λ is the regularization term coefficient; $M_c$ can be calculated from formula (5):

$$(M_c)_{ij} = \begin{cases} \dfrac{1}{n_c^{2}}, & x_i, x_j \in D_s^{(c)} \\[4pt] \dfrac{1}{m_c^{2}}, & x_i, x_j \in D_t^{(c)} \\[4pt] -\dfrac{1}{n_c m_c}, & x_i \in D_s^{(c)}, x_j \in D_t^{(c)} \ \text{or}\ x_j \in D_s^{(c)}, x_i \in D_t^{(c)} \\[4pt] 0, & \text{otherwise} \end{cases} \tag{5}$$
where $n_c$ and $m_c$ denote the numbers of class-c samples in the source domain and the target domain, respectively, and M is normalized as $M = M/\lVert M\rVert$. The eigenvalue decomposition mainly involves computing the matrix $XMX^{T}$, where X is a w×n matrix (w is the feature dimension of the source and target domains, n is the total number of source-domain and target-domain samples), M is an n×n matrix, and the result of the continued multiplication is a w×w matrix. Typically the feature dimension is much smaller than the total number of samples, i.e., w ≪ n.
When computing the transformation matrix A, large-scale matrix multiplication over the combined source-domain and target-domain data is encountered, because the matrices X (w×n) and M (n×n) are large. Once the number of samples grows beyond a certain point, memory overflow occurs while computing $XMX^{T}$, the program reports an error, and the JDA algorithm cannot run normally.
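A back-of-the-envelope sketch of the memory issue, using the roughly 100,000-sample scenario described later and an assumed feature dimension of w = 200:

```python
# Memory of a dense n x n MMD matrix in float64 versus the w x n feature matrix.
n = 100_000                            # total source + target samples
w = 200                                # assumed feature dimension (w << n)

dense_M_gb = n * n * 8 / 1e9           # ~80 GB just for the n x n matrix M
features_gb = w * n * 8 / 1e9          # ~0.16 GB for the w x n feature matrix X
print(f"dense M: {dense_M_gb:.1f} GB, feature matrix: {features_gb:.2f} GB")
```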
In the embodiment of the present invention, the $M_c$ matrix is decomposed into the Cartesian product of two n×1 vectors, which avoids the memory overflow caused by directly multiplying two large matrices, increases the data scale that can be computed in a single-machine environment, and allows the JDA modeling and solving to proceed smoothly on a single machine even as the numbers of source-domain and target-domain samples grow.
Taking default prediction in a financial risk-control scenario as an example, the default-prediction scheme is generated as follows:
1) Objective of the scheme: establish an evaluation system to predict users' default probability in financial scene A. Labeled samples are lacking in financial scene A, where only a large number of unlabeled samples can be collected, together with labeled data from a similar financial scene B. The existing data are therefore used to train a model that predicts the default probability in financial scene A, and the model's performance must meet certain risk-control requirements.
2) Training: unlabeled samples from financial scene A and labeled samples from financial scene B are collected, about 100,000 samples in total across the two scenes. A mathematical model is built with the joint distribution adaptation method, and the accuracy of default-probability prediction in financial scene A is improved by reducing the data-distribution difference between the two scenes. In the generalized eigenvalue decomposition of the JDA solution, the computation of $XMX^{T}$ is optimized according to the embodiment of the present invention, ensuring that the solving process runs smoothly and yields the transformation matrix A. The source domain (financial scene B data) and the target domain (financial scene A data) are mapped into a common feature subspace with matrix A, and a logistic regression model is built in the new feature subspace to predict the default probability in financial scene A.
3) Effect evaluation: before the method of the embodiment of the present invention is used, memory overflow and program errors often occur in a single-machine environment once the total number of samples across financial scenes A and B exceeds a certain level. After the $XMX^{T}$ computation is optimized with the method provided by the embodiment of the present invention, the JDA-plus-classifier model can be trained normally even on larger-scale data; predicting the default probability in scene A on a test set shows that the model achieves higher KS and AUC values (metrics used to judge accuracy), as sketched in the evaluation example below.
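A sketch of the KS and AUC evaluation mentioned above, using scikit-learn; the labels and scores are toy values standing in for held-out test labels and predicted default probabilities:

```python
# Compute AUC and the KS statistic for a set of predicted default probabilities.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.3, 0.7, 0.2, 0.8, 0.65, 0.4, 0.9])

auc = roc_auc_score(y_true, y_score)
fpr, tpr, _ = roc_curve(y_true, y_score)
ks = np.max(tpr - fpr)                 # KS statistic: largest gap between TPR and FPR
print(f"AUC={auc:.3f}, KS={ks:.3f}")
```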
In the embodiment of the present invention, the special form of the MMD matrix $M_c$ is exploited to decompose $M_c$ into the Cartesian product of two n×1 vectors, as shown in formula (6):

$$M_c = e_c \cdot e_c^{T} \tag{6}$$

where

$$(e_c)_i = \begin{cases} \dfrac{1}{n_c}, & x_i \in D_s^{(c)} \\[4pt] -\dfrac{1}{m_c}, & x_i \in D_t^{(c)} \\[4pt] 0, & \text{otherwise} \end{cases}$$

and $n_c$ and $m_c$ denote the numbers of class-c samples in the source domain and the target domain, respectively.
Meanwhile, the M matrix and its norm can both be computed from the $e_c$ vectors, as shown in formulas (7) and (8):

$$M = \sum_{c=0}^{C} e_c\, e_c^{T} \tag{7}$$

$$\lVert M \rVert = \sqrt{\sum_{i=0}^{C}\sum_{j=0}^{C} \left(e_i^{T} e_j\right)^{2}} \tag{8}$$
The matrix $XMX^{T}$ is then transformed; the computation is shown in formula (9):

$$X M X^{T} = \sum_{c=0}^{C} \alpha_c \, X e_c e_c^{T} X^{T} \tag{9}$$

where $\alpha_c$ is the influence coefficient produced by norm normalization.
The specific steps for computing $XMX^{T}$ are as follows:
Let the source-domain data be $X_s \in \mathbb{R}^{w \times n_s}$ and the target-domain data be $X_t \in \mathbb{R}^{w \times n_t}$, where w is the feature dimension of the source- and target-domain inputs X and the total number of samples is $n = n_s + n_t$; the goal is to compute $XMX^{T}$.
1) The $M_c$ matrix is decomposed into the product of an n×1 vector and a 1×n vector, as shown in formula (10):

$$M_c = e_c \cdot e_c^{T} \tag{10}$$

where

$$(e_c)_i = \begin{cases} \dfrac{1}{n_c}, & x_i \in D_s^{(c)} \\[4pt] -\dfrac{1}{m_c}, & x_i \in D_t^{(c)} \\[4pt] 0, & \text{otherwise.} \end{cases}$$
2) The intermediate matrix $U_c = X \cdot e_c$ is computed ($U_c$ is a w×1 vector), thereby obtaining $X e_c e_c^{T} X^{T} = U_c U_c^{T}$.
3) The norm of the M matrix (a scalar result) is determined directly from the $e_c$ vectors, with $\alpha_{ij}$ denoting the corresponding influence coefficients. Here the property of the vectors $e_c$ is used, namely $\mathrm{tr}\!\left(e_i e_i^{T} e_j e_j^{T}\right) = \left(e_i^{T} e_j\right)^{2}$, which together with $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ and $\mathrm{tr}(mA + nB) = m\,\mathrm{tr}(A) + n\,\mathrm{tr}(B)$ reduces the norm computation to inner products of the $e_c$ vectors, so that the n×n matrix M never has to be formed.
4) The final result $XMX^{T}$ is obtained as shown in formula (11):

$$X M X^{T} = \sum_{c=0}^{C} \alpha_c\, U_c U_c^{T} \tag{11}$$

A numerical check of this identity is sketched below.
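A small numerical check of formula (11) on random data (illustrative dimensions; norm normalization omitted, i.e., the influence coefficient is taken as 1):

```python
# Verify that the decomposed form U_0 U_0^T equals the direct product X M X^T
# without ever forming the n x n matrix M explicitly in the decomposed path.
import numpy as np

rng = np.random.default_rng(0)
w, n_s, n_t = 5, 6, 4
X = rng.standard_normal((w, n_s + n_t))

# Marginal-distribution vector e_0 (the c = 0 term of the sum)
e0 = np.concatenate([np.full(n_s, 1.0 / n_s), np.full(n_t, -1.0 / n_t)])

M = np.outer(e0, e0)                   # dense n x n MMD matrix (what the method avoids)
direct = X @ M @ X.T                   # w x w result via the large-matrix product

U0 = X @ e0.reshape(-1, 1)             # w x 1 intermediate matrix U_0 = X · e_0
decomposed = U0 @ U0.T                 # w x w result from the vectors alone

print(np.allclose(direct, decomposed))  # True
```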
the large matrix continuous multiplication calculation optimization method based on vector decomposition in the embodiment of the invention is not only suitable for MMD matrix continuous multiplication calculation, but also suitable for other special form matrix calculation scenes which can be disassembled into two vectors for multiplication.
Under the original computation, the space complexity of $XMX^{T}$ is $O(n^{2})$; after the M matrix is decomposed into vectors, computing $X \cdot e \cdot e^{T} \cdot X^{T}$ has space complexity $O(w \times n)$, where n is the total number of source-domain and target-domain samples and w is the feature dimension, with w ≪ n. The embodiment of the present invention therefore reduces the space complexity of the JDA algorithm and, to a certain extent, alleviates the memory overflow problem in the computation.
In summary, the embodiment of the invention separates the maximum mean difference of the source domain and the target domain, and has the following beneficial effects:
Separating the maximum mean difference, which accounts for the large-scale computation, reduces the computational complexity and allows the optimization objective to be decomposed into the transformation matrix used for subsequent classification operations. The separation also resolves the continued-multiplication problem of the large maximum-mean-difference matrix and increases the data scale that can be computed in a given computing environment, so that transfer learning can proceed normally, remarkably improving the performance of transfer learning and making the method suitable for various classification application scenarios.
The foregoing is merely exemplary embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and scope of the present invention are included in the protection scope of the present invention.

Claims (12)

1. The generation method of the classifier is characterized in that a sample of a source domain is a first user behavior characteristic in a first financial scene, a sample of a target domain is a second user behavior characteristic in a second financial scene, the source domain is the first financial scene, the target domain is the second financial scene, and the sample of the source domain and the sample of the target domain are images;
The method comprises the following steps:
Constructing a maximum mean difference of a source domain and a target domain based on a sample of the source domain and a sample of the target domain, comprising:
Constructing the maximum mean difference between the class-c samples of the source domain and the class-c samples of the target domain based on the class-c samples of the source domain and the class-c samples of the target domain, wherein c is less than or equal to C, C is the total number of sample types of the source domain, and c and C are natural numbers;
separating the maximum mean difference of the source domain and the target domain, comprising:
Separating the maximum mean value difference of the class c samples of the source domain and the target domain to obtain the mean value difference of the class c samples after separation;
Decomposing an optimization target according to the mean value difference obtained by separation to obtain a transformation matrix aiming at the source domain, wherein the transformation matrix comprises the following components:
combining the sample of the source domain and the sample of the target domain to obtain a feature matrix;
Multiplying the feature matrix and the separated mean difference of the class-c samples to obtain an intermediate matrix $U_c = X \cdot e_c$, wherein X is the feature matrix and $e_c$ is the separated mean difference of the class-c samples;
Determining a representation of the optimization objective,

$$\min_{A}\ \mathrm{tr}\Big(A^{T} X \sum_{c=0}^{C} M_c X^{T} A\Big) + \lambda \lVert A \rVert_F^{2},$$

which satisfies $A^{T} X H X^{T} A = I$, where C is the total number of sample types in the source domain, tr is the trace of a matrix, A is the transformation matrix, $M_c$ is the maximum mean difference of the class-c samples, λ is the regularization term coefficient, H is the centering matrix, and I is the identity matrix;
Substituting each $X M_c X^{T}$ in the optimization objective with $\alpha_c U_c U_c^{T}$, wherein $\alpha_c$ is an influence coefficient generated after norm normalization;
Based on $\alpha_c U_c U_c^{T}$, carrying out generalized eigenvalue decomposition on the optimization objective according to the following formula to obtain the transformation matrix A for the source domain:

$$\Big(\sum_{c=0}^{C} \alpha_c U_c U_c^{T} + \lambda I\Big) A = X H X^{T} A \Phi,$$

wherein Φ is a Lagrange multiplier;
Predicting the sample of the target domain based on the sample of the source domain, the label corresponding to the sample, the sample of the target domain and the transformation matrix aiming at the source domain to obtain the label corresponding to the sample of the target domain;
And generating a classifier corresponding to the target domain based on the sample of the source domain and the label corresponding to the sample of the target domain.
2. The method of claim 1, wherein constructing the maximum mean difference of the source domain and the target domain based on the source domain samples and the target domain samples comprises:
mapping the source domain samples and the target domain samples through a mapping space respectively to obtain mapped source domain samples and mapped target domain samples correspondingly;
Respectively carrying out mean value processing on the mapped source domain sample and the mapped target domain sample to correspondingly obtain the mean value of the source domain sample and the mean value of the target domain sample, and
And determining the difference between the average value of the source domain samples and the average value of the target domain samples as the maximum average value difference between the source domain and the target domain.
3. The method of claim 1, wherein separating the maximum mean difference of the source domain and the target domain comprises:
And separating the maximum mean value difference in a matrix form to obtain the separated mean value difference in a vector form.
4. A method according to claim 1 or 3, wherein the maximum mean difference is a matrix of N x N, where N is the sum of the number of samples in the source domain and the number of samples in the target domain, and N is a natural number greater than 2;
the separating the maximum mean difference of the source domain and the target domain includes:
performing matrix decomposition on the maximum mean value difference to obtain Cartesian products of the two separated mean value differences;
wherein the separated mean difference is an N×1 vector.
5. The method according to claim 1, wherein predicting the sample of the target domain based on the sample of the source domain and the label corresponding to the sample, the sample of the target domain, and the transformation matrix for the source domain, to obtain the label corresponding to the sample of the target domain, includes:
obtaining a transformed source domain sample according to the source domain sample and the transformation matrix for the source domain;
constructing a classification model of the classifier according to the transformed source domain sample and the label corresponding to the source domain sample;
And predicting the sample of the target domain through the classification model to obtain a label corresponding to the sample of the target domain.
6. The method of claim 1, wherein the generating a classifier corresponding to the target domain based on the samples of the source domain and the labels corresponding to the samples of the target domain comprises:
Training the classifier based on the sample of the source domain and the label corresponding to the sample of the target domain to obtain the classifier corresponding to the target domain.
7. The method of claim 1, wherein:
The constructing the maximum mean difference between the source domain and the target domain based on the source domain sample and the target domain sample comprises the following steps:
Constructing a maximum mean difference of the first financial scene and the second financial scene based on a first user behavior feature in the first financial scene and a second user behavior feature in the second financial scene;
the predicting the sample of the target domain based on the sample of the source domain and the label corresponding to the sample, the sample of the target domain, and the transformation matrix for the source domain to obtain the label corresponding to the sample of the target domain includes:
Based on the first user behavior characteristics, the default labels corresponding to the first users, the second user behavior characteristics and the transformation matrix aiming at the first financial scene, performing default prediction on the second users of the second financial scene to obtain the default labels corresponding to the second users of the second financial scene;
The generating a classifier corresponding to the target domain based on the sample of the source domain and the label corresponding to the sample of the target domain includes:
and generating a classifier corresponding to the second financial scene based on the first user behavior characteristic of the first financial scene and the default label corresponding to the second user.
8. A classification method of data to be classified, characterized by being applied to a classifier of a target domain generated by the generation method of the classifier as claimed in any one of claims 1 to 7;
The method comprises the following steps:
determining data to be classified in the target domain;
And classifying the data to be classified through the classifier of the corresponding target domain to obtain the label corresponding to the data to be classified.
9. The generation device of the classifier is characterized in that a sample of a source domain is a first user behavior characteristic in a first financial scene, a sample of a target domain is a second user behavior characteristic in a second financial scene, the source domain is the first financial scene, the target domain is the second financial scene, and the sample of the source domain and the sample of the target domain are images;
the device comprises:
the construction module is used for constructing the maximum mean difference between the source domain and the target domain based on the sample of the source domain and the sample of the target domain;
The construction module is further configured to construct the maximum mean difference between the class-c samples of the source domain and the class-c samples of the target domain based on the class-c samples of the source domain and the class-c samples of the target domain, where c is less than or equal to C, C is the total number of sample types of the source domain, and c and C are natural numbers;
The separation module is used for separating the maximum mean difference between the source domain and the target domain;
The separation module is further used for separating the maximum mean value difference of the class c samples of the source domain and the target domain to obtain the mean value difference of the class c samples after separation;
the processing module is used for decomposing the optimization target according to the mean value difference obtained by separation to obtain a transformation matrix aiming at the source domain;
The processing module is further used for combining the samples of the source domain and the samples of the target domain to obtain a feature matrix;
Multiplying the feature matrix and the separated mean difference of the class-c samples to obtain an intermediate matrix $U_c = X \cdot e_c$, wherein X is the feature matrix and $e_c$ is the separated mean difference of the class-c samples;
Determining a representation of the optimization objective,

$$\min_{A}\ \mathrm{tr}\Big(A^{T} X \sum_{c=0}^{C} M_c X^{T} A\Big) + \lambda \lVert A \rVert_F^{2},$$

which satisfies $A^{T} X H X^{T} A = I$, where C is the total number of sample types in the source domain, tr is the trace of a matrix, A is the transformation matrix, $M_c$ is the maximum mean difference of the class-c samples, λ is the regularization term coefficient, H is the centering matrix, and I is the identity matrix;
Substituting each $X M_c X^{T}$ in the optimization objective with $\alpha_c U_c U_c^{T}$, wherein $\alpha_c$ is an influence coefficient generated after norm normalization;
Based on $\alpha_c U_c U_c^{T}$, carrying out generalized eigenvalue decomposition on the optimization objective according to the following formula to obtain the transformation matrix A for the source domain:

$$\Big(\sum_{c=0}^{C} \alpha_c U_c U_c^{T} + \lambda I\Big) A = X H X^{T} A \Phi,$$

wherein Φ is a Lagrange multiplier;
the prediction module is used for predicting the sample of the target domain based on the sample of the source domain, the label corresponding to the sample, the sample of the target domain and the transformation matrix aiming at the source domain to obtain the label corresponding to the sample of the target domain;
and the generation module is used for generating a classifier corresponding to the target domain based on the sample of the source domain and the label corresponding to the sample of the target domain.
10. A classification apparatus of data to be classified, characterized by applying a classifier of a target domain generated by the generation method of the classifier according to any one of claims 1 to 7; the device comprises:
the determining module is used for determining data to be classified in the target domain;
And the classification module is used for classifying the data to be classified through a classifier corresponding to the target domain to obtain a label corresponding to the data to be classified.
11. A classifier generation apparatus, the apparatus comprising:
a memory for storing executable instructions;
A processor for implementing the method of generating a classifier according to any one of claims 1 to 7, or the method of classifying data to be classified according to claim 8, when executing the executable instructions stored in the memory.
12. A storage medium storing executable instructions for causing a processor to implement the method of generating a classifier as claimed in any one of claims 1 to 7, or the method of classifying data to be classified as claimed in claim 8, when executed.
CN201911046719.8A 2019-10-30 2019-10-30 Classifier generation method, device, equipment and storage medium Active CN110781970B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911046719.8A CN110781970B (en) 2019-10-30 2019-10-30 Classifier generation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911046719.8A CN110781970B (en) 2019-10-30 2019-10-30 Classifier generation method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110781970A CN110781970A (en) 2020-02-11
CN110781970B true CN110781970B (en) 2024-04-26

Family

ID=69387885

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911046719.8A Active CN110781970B (en) 2019-10-30 2019-10-30 Classifier generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110781970B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724487B (en) * 2020-06-19 2023-05-16 广东浪潮大数据研究有限公司 Flow field data visualization method, device, equipment and storage medium
CN113011513B (en) * 2021-03-29 2023-03-24 华南理工大学 Image big data classification method based on general domain self-adaption
CN113822353B (en) * 2021-09-15 2024-05-17 南京邮电大学 Domain self-adaption method based on class correlation matrix eigenvalue decomposition
CN117151200A (en) * 2023-10-27 2023-12-01 成都合能创越软件有限公司 Method and system for improving YOLO detection model precision based on semi-supervised training

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034186A (en) * 2018-06-11 2018-12-18 东北大学秦皇岛分校 The method for establishing DA-RBM sorter model
CN110378366A (en) * 2019-06-04 2019-10-25 广东工业大学 A kind of cross-domain image classification method based on coupling knowledge migration

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160078359A1 (en) * 2014-09-12 2016-03-17 Xerox Corporation System for domain adaptation with a domain-specific class means classifier
US10296846B2 (en) * 2015-11-24 2019-05-21 Xerox Corporation Adapted domain specific class means classifier
US20170220951A1 (en) * 2016-02-02 2017-08-03 Xerox Corporation Adapting multiple source classifiers in a target domain

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034186A (en) * 2018-06-11 2018-12-18 东北大学秦皇岛分校 The method for establishing DA-RBM sorter model
CN110378366A (en) * 2019-06-04 2019-10-25 广东工业大学 A kind of cross-domain image classification method based on coupling knowledge migration

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multi-source domain transfer learning classification algorithm based on domain correlation and manifold constraints; Liu Zhen; Yang Jun'an; Liu Hui; Wang Wei; Application Research of Computers; 2017-02-28 (No. 02); pp. 351-356 *

Also Published As

Publication number Publication date
CN110781970A (en) 2020-02-11

Similar Documents

Publication Publication Date Title
CN110781970B (en) Classifier generation method, device, equipment and storage medium
EP3940591A1 (en) Image generating method, neural network compression method, and related apparatus and device
CN108776787B (en) Image processing method and device, electronic device and storage medium
EP3637310A1 (en) Method and apparatus for generating vehicle damage information
Sun et al. Fast object detection based on binary deep convolution neural networks
EP3992975A1 (en) Compound property analysis method and apparatus, compound property analysis model training method, and storage medium
CN110929802A (en) Information entropy-based subdivision identification model training and image identification method and device
CN111125406A (en) Visual relation detection method based on self-adaptive cluster learning
Pan et al. Towards better performance and more explainable uncertainty for 3d object detection of autonomous vehicles
CN111476138B (en) Construction method, identification method and related equipment for building drawing component identification model
CA3052846A1 (en) Character recognition method, device, electronic device and storage medium
US20210166058A1 (en) Image generation method and computing device
CN112395979A (en) Image-based health state identification method, device, equipment and storage medium
KR20220047228A (en) Method and apparatus for generating image classification model, electronic device, storage medium, computer program, roadside device and cloud control platform
US20240046067A1 (en) Data processing method and related device
CN111401473A (en) Infrared target classification method based on attention mechanism convolutional neural network
CN113496148A (en) Multi-source data fusion method and system
CN111444802A (en) Face recognition method and device and intelligent terminal
WO2022063076A1 (en) Adversarial example identification method and apparatus
US20210264300A1 (en) Systems and methods for labeling data
US20230342428A1 (en) System and method for labelling data for trigger identification
CN114445716B (en) Key point detection method, key point detection device, computer device, medium, and program product
Zhao et al. Salient and consensus representation learning based incomplete multiview clustering
CN114821190A (en) Image classification model training method, image classification method, device and equipment
CN114118526A (en) Enterprise risk prediction method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40021987

Country of ref document: HK

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant