WO2020232874A1 - Modeling method and apparatus based on transfer learning, computer device, and storage medium - Google Patents

Modeling method and apparatus based on transfer learning, computer device, and storage medium

Info

Publication number
WO2020232874A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
sample
label
dimensionality reduction
target
Prior art date
Application number
PCT/CN2019/102740
Other languages
English (en)
French (fr)
Inventor
马新俊
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020232874A1 publication Critical patent/WO2020232874A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Definitions

  • This application relates to a modeling method and apparatus based on transfer learning, a computer device, and a storage medium.
  • Transfer learning refers to transferring knowledge learned in one scenario to another scenario, so that a model can also make good predictions in a large number of new scenarios.
  • a modeling method, device, computer device, and storage medium based on transfer learning are provided.
  • a modeling method based on transfer learning including:
  • the first dimensionality reduction feature is input into the basic model corresponding to the target label sample for testing to obtain weight information corresponding to the first dimensionality reduction feature, and the first dimensionality reduction features whose weight information is higher than a preset weight threshold are taken as general row features;
  • the general column feature and the general row feature are input into a basic model corresponding to the target label sample for model training to obtain a target model.
  • a modeling device based on transfer learning including:
  • the sample acquisition module is used to acquire the label sample to be learned and the target label sample
  • the feature dimensionality reduction module is used to perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain the first dimensionality reduction feature corresponding to the label sample to be learned and the second dimensionality reduction feature corresponding to the target label sample
  • a column feature acquisition module configured to input the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features
  • the row feature acquisition module is used to input the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing, to obtain weight information corresponding to the first dimensionality reduction feature, and to take the first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features
  • the model training module is used to input the general column features and the general row features into a basic model corresponding to the target label sample for model training to obtain a target model.
  • A computer device includes a memory and one or more processors, the memory storing computer readable instructions.
  • When the computer readable instructions are executed by the one or more processors, the one or more processors perform the following steps:
  • the first dimensionality reduction feature is input into the basic model corresponding to the target label sample for testing to obtain weight information corresponding to the first dimensionality reduction feature, and the first dimensionality reduction features whose weight information is higher than a preset weight threshold are taken as general row features;
  • the general column feature and the general row feature are input into a basic model corresponding to the target label sample for model training to obtain a target model.
  • One or more non-volatile computer-readable storage media store computer-readable instructions.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • the first dimensionality reduction feature is input into the basic model corresponding to the target label sample for testing to obtain weight information corresponding to the first dimensionality reduction feature, and the first dimensionality reduction features whose weight information is higher than a preset weight threshold are taken as general row features;
  • the general column features and the general row features are input into the basic model corresponding to the target label sample for model training to obtain a target model.
  • Fig. 1 is an application environment diagram of a modeling method based on transfer learning in one or more embodiments.
  • Fig. 2 is a method flowchart of a modeling method based on transfer learning in one or more embodiments.
  • Fig. 3 is a flowchart of a method for updating a target model in a modeling method based on transfer learning in one or more embodiments.
  • Fig. 4 is a flowchart of a method for determining weight information in a modeling method based on transfer learning in one or more embodiments.
  • Fig. 5 is a block diagram of a modeling apparatus based on transfer learning in one or more embodiments.
  • Fig. 6 is a block diagram of a computer device according to one or more embodiments.
  • The transfer learning-based modeling method provided in the embodiments of the present application can be applied to the application environment shown in FIG. 1.
  • The server 120 obtains the label sample to be learned and the target label sample from the terminal 110, and performs kernel principal component analysis on the label sample to be learned and the target label sample to obtain the first dimensionality reduction feature corresponding to the label sample to be learned and the second dimensionality reduction feature corresponding to the target label sample.
  • The server 120 inputs the first dimensionality reduction feature and the second dimensionality reduction feature into the trained general feature acquisition model to obtain the general column features.
  • The server 120 inputs the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing to obtain the weight information corresponding to the first dimensionality reduction feature.
  • The server 120 takes the first dimensionality reduction features whose weight information is higher than the preset weight threshold as general row features, and inputs the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain the target model.
  • As shown in FIG. 2, which is a flowchart of a transfer learning-based modeling method in one of the embodiments, the method specifically includes the following steps:
  • Step 202 Obtain a label sample to be learned and a target label sample.
  • The label sample to be learned and the target label sample represent label samples of different business types.
  • For example, the label sample to be learned is a performance sample of business A, and the target label sample is a very small number of performance samples of business B.
  • It is understandable that both the label sample to be learned and the target label sample are samples with label information.
  • transfer learning refers to the transfer of knowledge learned in one scenario to another scenario.
  • the existing knowledge is called the source domain
  • the new knowledge to be learned is called the target domain.
  • the learned knowledge is transferred to the learning of another unknown knowledge, that is, from the source domain to the target domain.
  • the source domain may be a label sample to be learned
  • the target domain may be a target label sample.
  • For example, the source domain may be a label sample to be learned concerning a user's repayment ability for a car loan, and the target domain may be a target label sample concerning the user's repayment ability for a small loan.
  • The modeling method is then used to establish the target model for the small-loan business, so that the target model can evaluate a user's repayment ability when taking out a small loan.
  • the server can obtain the label sample to be learned and the target label sample from other servers, and can also obtain the label sample to be learned and the target label sample from the terminal.
  • a sample with label information refers to the pre-defined label information in the sample. For example, when the label sample to be learned is a picture of a puppy, the label information in the label sample to be learned is "puppy".
  • Step 204: Perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample.
  • Specifically, the server performs kernel principal component analysis on the label sample to be learned and the target label sample.
  • Kernel principal component analysis transforms non-linearly separable data into a new low-dimensional subspace suitable for linear classification; that is, the label samples to be learned and the target label samples are subjected to dimensionality reduction processing.
  • Kernel principal component analysis is a very effective dimensionality reduction method in machine learning.
  • The original high-dimensional data can be represented by a few representative dimensions; for example, 1000 dimensions can be represented by 100 dimensions without losing key information.
  • The server uses kernel principal component analysis to learn from the samples of the source domain (namely business A) and the target domain (namely business B) to obtain a common cross-data-domain subspace, and maps all samples into this subspace to obtain a new feature representation. That is, the server obtains the first dimensionality reduction feature corresponding to the label sample to be learned and the second dimensionality reduction feature corresponding to the target label sample.
  • The server then sorts the eigenvectors according to their eigenvalues to obtain a ranking result, establishes the cross-data-domain subspace from the eigenvectors whose ranking result is greater than a preset threshold, and then maps the label sample to be learned and the target label sample into the cross-data-domain subspace, so that the first dimensionality reduction feature corresponding to the label sample to be learned and the second dimensionality reduction feature corresponding to the target label sample can be accurately obtained.
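The dimensionality-reduction step described above can be sketched as follows, using scikit-learn's KernelPCA. The RBF kernel, sample shapes, and component count are illustrative assumptions, not values from the patent:

```python
# Hedged sketch: map source- and target-domain samples into one shared
# low-dimensional subspace with kernel PCA. All shapes and kernel
# parameters here are illustrative assumptions.
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
X_source = rng.normal(size=(200, 50))  # label samples to be learned (business A)
X_target = rng.normal(size=(20, 50))   # scarce target label samples (business B)

# Fit a single kernel PCA on the pooled data so both domains share
# the same cross-data-domain subspace.
kpca = KernelPCA(n_components=10, kernel="rbf", gamma=0.01)
kpca.fit(np.vstack([X_source, X_target]))

Z_source = kpca.transform(X_source)  # first dimensionality reduction feature
Z_target = kpca.transform(X_target)  # second dimensionality reduction feature
```

Internally, KernelPCA performs the eigenvalue sorting just described: it keeps the eigenvectors of the centered kernel matrix with the largest eigenvalues.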
  • Step 206 Input the first dimensionality reduction feature and the second dimensionality reduction feature into the trained general feature acquisition model to obtain general column features.
  • The general feature acquisition model is used to address the problem that the source domain and the target domain have different distributions in transfer learning.
  • The method of minimizing the maximum mean discrepancy is used to reduce the difference between the marginal probability distributions of the domains.
  • The minimization of the maximum mean discrepancy is extended to the conditional probability distribution, jointly matching the marginal probability distribution and the conditional probability distribution between domains.
  • Minimizing the maximum mean discrepancy refers to projecting each sample and summing, using the magnitude of the sum to express the distribution difference between the two datasets. It is understandable that the cross-data-domain subspace completes knowledge transfer by mapping the source domain and the target domain into the same space (or mapping one of them into the other's space) and minimizing the distance between the source domain and the target domain.
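A minimal sketch of the maximum mean discrepancy idea follows, using the simple linear-kernel estimator (the squared distance between the two empirical means). The patent does not specify a kernel, so this form is an assumption:

```python
# Hedged sketch: squared MMD with a linear kernel reduces to the
# squared Euclidean distance between the two sample means.
import numpy as np

def mmd_linear(X_s: np.ndarray, X_t: np.ndarray) -> float:
    """||mean(X_s) - mean(X_t)||^2 as a distribution-difference score."""
    delta = X_s.mean(axis=0) - X_t.mean(axis=0)
    return float(delta @ delta)

rng = np.random.default_rng(0)
same = mmd_linear(rng.normal(size=(500, 8)), rng.normal(size=(500, 8)))
shifted = mmd_linear(rng.normal(size=(500, 8)),
                     rng.normal(loc=1.0, size=(500, 8)))
assert shifted > same  # a distribution shift yields a larger discrepancy
```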
  • The joint probability is the probability that multiple conditions all hold at the same time, and the list of joint probabilities is called the joint distribution.
  • For two related random variables X and Y, P(Y|X) or P(X|Y) is the conditional probability, and the set of such probabilities forms the conditional probability distribution.
  • the server can accurately obtain the general column features by inputting the first dimensionality reduction feature and the second dimensionality reduction feature into the trained general feature acquisition model.
  • Step 208: Input the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing, obtain weight information corresponding to the first dimensionality reduction feature, and take the first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features.
  • Step 210 Input the general column feature and the general row feature into the basic model corresponding to the target label sample for model training to obtain the target model.
  • The server further obtains the general row features according to the contribution of each instance in the source domain to the training of the target-domain model, using the L2,1 norm to select relevant instances in the source domain for model training, so as to obtain a target model suitable for business B. It can be understood that the L2,1 norm induces row sparsity in feature selection and is used by the server to obtain the general row features.
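As a small illustration of the L2,1 norm mentioned above (the sum of the Euclidean norms of a matrix's rows, whose minimization drives whole rows to zero and thereby deselects whole instances), a sketch:

```python
# Hedged sketch: the L2,1 norm of a matrix is the sum of its row norms.
# Penalizing it zeroes out entire rows, i.e. deselects whole instances.
import numpy as np

def l21_norm(W: np.ndarray) -> float:
    return float(np.linalg.norm(W, axis=1).sum())

W = np.array([[3.0, 4.0],   # row norm 5.0
              [0.0, 0.0],   # a zeroed row contributes nothing
              [0.0, 2.0]])  # row norm 2.0
assert l21_norm(W) == 7.0
```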
  • The server inputs the obtained general column features and general row features into the basic model corresponding to the target label sample for model training, which completes the process of transfer learning and yields a better target model.
  • The server performs kernel principal component analysis on the label sample to be learned and the target label sample, obtaining the first dimensionality reduction feature and the second dimensionality reduction feature.
  • the original high-dimensional data can be represented by a few representative dimensions.
  • The basic model is tested to obtain the weight information corresponding to the first dimensionality reduction feature.
  • The first dimensionality reduction features whose weight information is higher than the preset weight threshold are taken as general row features.
  • With the general column features and the general row features, transfer learning from the source domain to the target domain is realized even when only a small number of labeled samples is available, thereby completing the establishment of the target model.
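The row-feature selection step (keep the first dimensionality reduction features whose tested weight exceeds the preset threshold) can be sketched as follows; the weight values and the threshold are illustrative assumptions:

```python
# Hedged sketch: filter features by their tested weight information.
import numpy as np

weights = np.array([0.9, 0.2, 0.75, 0.1, 0.6])  # one weight per reduced feature
preset_weight_threshold = 0.5

general_row_idx = np.flatnonzero(weights > preset_weight_threshold)
assert general_row_idx.tolist() == [0, 2, 4]  # indices of the general row features
```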
  • the method further includes the following steps:
  • Step 302 Obtain the sample to be evaluated, input the sample to be evaluated into the target model, and output sample label information corresponding to the sample to be evaluated.
  • the sample to be evaluated refers to a sample for verifying the target model.
  • the server inputs the sample to be evaluated into the target model, and can output sample label information corresponding to the sample to be evaluated. It is understandable that the sample to be evaluated is a sample without label information.
  • Step 304 Display the sample label information, and obtain label correction information corresponding to the sample label information.
  • Step 306: Adjust the weights in the target model according to the label correction information, and update the target model according to the weights after each adjustment to obtain an updated target model.
  • the way the server displays the sample label information includes but is not limited to online display and sending to the corresponding terminal for display. After the server displays the sample label information, it will obtain the label correction information corresponding to the sample label information.
  • For example, the sample to be evaluated is a user's vehicle-loan affordability sample, and the sample label information output for it is "medium level".
  • After the sample label information is displayed, if the server receives label correction information of "high level" returned by the terminal, the server adjusts the weights in the target model according to the label correction information and updates the target model according to the weights after each adjustment to obtain the updated target model.
  • In this way, online learning and real-time updating of the target model can be achieved through the intervention of the terminal, and the target model is further updated according to the label correction information returned by the terminal, so as to improve the target model's ability to process samples.
  • the corrected result is also incorporated into the training set, the model is trained again, the model is updated, and the next round of prediction is performed.
  • The server obtains the sample to be evaluated, inputs it into the target model, and outputs the sample label information corresponding to the sample to be evaluated; it then displays the sample label information, obtains the label correction information, adjusts the weights in the target model according to the label correction information, and updates the target model according to the weights after each adjustment to obtain the updated target model, realizing real-time updating of the target model.
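The correction-and-update loop above can be sketched as follows. The SGDClassifier stand-in for the target model and the synthetic data are assumptions; the patent does not name a specific model or update rule:

```python
# Hedged sketch: predict a label for a sample to be evaluated, receive a
# corrected label (here simulated), and update the model weights online.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
y_train = (X_train[:, 0] > 0).astype(int)

target_model = SGDClassifier(random_state=0)
target_model.partial_fit(X_train, y_train, classes=[0, 1])  # initial target model

x_eval = rng.normal(size=(1, 5))                  # sample to be evaluated
predicted = int(target_model.predict(x_eval)[0])  # displayed sample label info
corrected = 1 - predicted                         # label correction from terminal

# Fold the corrected sample back in and adjust the weights online.
target_model.partial_fit(x_eval, [corrected])
```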
  • The method further includes: performing feature comparison between the first dimensionality reduction feature and the second dimensionality reduction feature to obtain a feature similarity, and taking the first dimensionality reduction feature whose feature similarity is higher than a preset similarity threshold as a general column feature.
  • Specifically, the first dimensionality reduction feature corresponding to the source domain and the second dimensionality reduction feature corresponding to the target domain are compared for feature similarity, and the first dimensionality reduction feature whose feature similarity is higher than the preset similarity threshold is taken as a general column feature.
  • The general column feature is used, in combination with the general row feature, to train the basic model to obtain the target model.
  • The server performs feature comparison between the first dimensionality reduction feature and the second dimensionality reduction feature to obtain the feature similarity, and takes the first dimensionality reduction feature whose feature similarity is higher than the preset similarity threshold as a general column feature.
  • The general column features are used for transfer learning of the model to further obtain the target model.
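The similarity check might look like the following sketch; cosine similarity and the threshold value are assumed choices, since the patent does not fix a similarity measure:

```python
# Hedged sketch: keep first-dimensionality-reduction features whose
# similarity to the target-domain features exceeds a preset threshold.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

first_features = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
target_summary = np.array([1.0, 0.1])  # e.g. mean of second reduction features
preset_similarity_threshold = 0.9

general_column_features = [f for f in first_features
                           if cosine(f, target_summary) > preset_similarity_threshold]
assert len(general_column_features) == 1  # only the first feature passes
```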
  • the method further includes the following steps:
  • Step 402: Input the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing, and output sample label information corresponding to the label sample to be learned.
  • Step 404: Display the sample label information, and obtain label correctness information corresponding to the sample label information.
  • The sample label information refers to the label information corresponding to the label sample to be learned.
  • The label correctness information refers to the judgment of whether the sample label information is correct, made on the basis of the label information in the label sample to be learned.
  • The server inputs the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing, outputs the sample label information corresponding to the label sample to be learned, and displays the sample label information online or sends it to the corresponding terminal for display.
  • For example, if the label information in the label sample to be learned is "puppy", and the sample label information obtained when the server inputs the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing is "kitten", the server judges the sample label information to be wrong based on the label information in the label sample to be learned.
  • The label correctness information includes, but is not limited to, "correct" and "wrong".
  • Step 406: Evaluate the first dimensionality reduction feature according to the label correctness information, and obtain feature contribution degree information corresponding to the first dimensionality reduction feature.
  • Step 408 Determine the weight information of the first dimensionality reduction feature according to the feature contribution degree information.
  • the basic model is used to obtain the contribution of each instance in the source domain to the training of the target domain model, and the server further determines the weight information according to the obtained contribution.
  • If the weight is high, the feature has high applicability; if the weight is low, the feature has low applicability.
  • the features with high applicability are screened out to obtain general row features, which can be used to further establish the target model.
  • The server evaluates the contribution of the first dimensionality reduction feature to target-domain model training according to the label correctness information, obtains the feature contribution degree information, determines the weight information of the first dimensionality reduction feature according to the feature contribution degree information, and takes the first dimensionality reduction features whose weight information is higher than the preset weight threshold as general row features.
  • The server inputs the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing, outputs the sample label information corresponding to the label sample to be learned, displays the sample label information, and obtains the label correctness information corresponding to the sample label information. It then judges the contribution of the first dimensionality reduction feature to target-domain model training from the label correctness information; that is, it evaluates the first dimensionality reduction feature according to the label correctness information, obtains the feature contribution degree information corresponding to the first dimensionality reduction feature, determines the weight information of the first dimensionality reduction feature according to the feature contribution degree information, and can thereby further determine the general row features.
  • The server then performs transfer learning according to the general row features and the general column features to establish a target model suitable for the target domain.
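One way to picture steps 402-408 is the sketch below: each reduced feature is scored by how often a basic model using only that feature labels the samples correctly, and that correctness rate serves as the feature's weight. This single-feature scoring scheme is an illustrative assumption; the patent does not give a formula:

```python
# Hedged sketch: derive per-feature weight information from label
# correctness. Feature 0 is constructed to be informative, the rest noise.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.1 * rng.normal(size=300) > 0).astype(int)

weights = []
for j in range(X.shape[1]):
    basic_model = LogisticRegression().fit(X[:, [j]], y)
    correct = basic_model.predict(X[:, [j]]) == y  # label correctness info
    weights.append(float(correct.mean()))          # contribution -> weight

assert int(np.argmax(weights)) == 0  # the informative feature gets the top weight
```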
  • The method further includes: dividing the general column features and the general row features into a predetermined number of training feature sets; and sequentially inputting the training feature sets into the input variables of the basic model for training until all training feature sets have been used, thereby obtaining the trained target model.
  • A predetermined number of training feature sets are used to train the basic model to obtain the trained target model, realizing transfer learning from the source domain to the target domain.
  • The server divides the general column features and general row features into a predetermined number of training feature sets and sequentially inputs the training feature sets into the input variables of the basic model for training, until all training feature sets have been used and the trained target model is obtained.
  • In this way, an effective model can be built by transfer learning even with only a small number of labeled samples.
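The batched training loop above can be sketched as follows; splitting by sample rows and using SGDClassifier as the basic model are assumptions for illustration:

```python
# Hedged sketch: split the combined general features into a predetermined
# number of training feature sets and feed them to the basic model in turn.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
general_features = rng.normal(size=(90, 6))  # general column + row features
labels = (general_features[:, 0] > 0).astype(int)

predetermined_number = 3
index_sets = np.array_split(np.arange(len(labels)), predetermined_number)

basic_model = SGDClassifier(random_state=0)
for idx in index_sets:  # train until all training feature sets are used
    basic_model.partial_fit(general_features[idx], labels[idx], classes=[0, 1])

target_model = basic_model  # the trained target model
```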
  • FIG. 5 it is a schematic diagram of a modeling device based on transfer learning in an embodiment, and the device includes:
  • the sample obtaining module 502 is used to obtain the label sample to be learned and the target label sample;
  • the feature dimensionality reduction module 504 is configured to perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;
  • the column feature acquisition module 506 is configured to input the first dimensionality reduction feature and the second dimensionality reduction feature into the trained general feature acquisition model to obtain general column features;
  • the row feature acquisition module 508 is used to input the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing, obtain weight information corresponding to the first dimensionality reduction feature, and take the first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features;
  • the model training module 510 is configured to input the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain the target model.
  • The model training module includes: a label information output module, used to obtain a sample to be evaluated, input the sample to be evaluated into the target model, and output sample label information corresponding to the sample to be evaluated; a correction information obtaining module, used to display the sample label information and obtain label correction information corresponding to the sample label information; and a model update module, used to adjust the weights in the target model according to the label correction information and update the target model according to the weights after each adjustment to obtain an updated target model.
  • the column feature acquisition module includes: a feature comparison module, which is used to perform feature comparison between the first dimensionality reduction feature and the second dimensionality reduction feature to obtain feature similarity; and the similarity judgment module is used to compare the features When the similarity is higher than the preset similarity threshold, the first dimensionality reduction feature is used as the general column feature.
  • The row feature acquisition module is used to: input the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing, and output sample label information corresponding to the label sample to be learned; display the sample label information to obtain label correctness information corresponding to the sample label information; evaluate the first dimensionality reduction feature according to the label correctness information to obtain feature contribution degree information corresponding to the first dimensionality reduction feature; and determine the weight information of the first dimensionality reduction feature according to the feature contribution degree information.
  • The model training module is used to: divide the general column features and general row features into a predetermined number of training feature sets; and sequentially input the training feature sets into the input variables of the basic model for training until all training feature sets have been used, obtaining the trained target model.
  • The various modules in the above transfer learning-based modeling device can be implemented in whole or in part by software, hardware, or a combination thereof.
  • The above modules may be embedded in, or independent of, the processor of the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • the processor can be a central processing unit (CPU), a microprocessor, a single-chip microcomputer, etc.
  • the aforementioned modeling device based on transfer learning can be implemented in a form of computer readable instructions.
  • a computer device is provided.
  • the computer device may be a server or a terminal.
  • When the computer device is a terminal, its internal structure diagram can be as shown in FIG. 6.
  • the computer equipment includes a processor, a memory, and a network interface connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer readable instructions.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to realize a modeling method based on transfer learning.
  • FIG. 6 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • A specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • A computer device is provided, including a memory and one or more processors, the memory storing computer-readable instructions.
  • When the computer-readable instructions are executed, the one or more processors perform the following steps: obtain a label sample to be learned and a target label sample; perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample; input the first dimensionality reduction feature and the second dimensionality reduction feature into the trained general feature acquisition model to obtain general column features; input the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing to obtain weight information corresponding to the first dimensionality reduction feature, and take the first dimensionality reduction features whose weight information is higher than the preset weight threshold as general row features; and input the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain the target model.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions are provided.
  • When the computer-readable instructions are executed, one or more processors perform the following steps: obtain a label sample to be learned and a target label sample; perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample; input the first dimensionality reduction feature and the second dimensionality reduction feature into the trained general feature acquisition model to obtain general column features; input the first dimensionality reduction feature into the basic model corresponding to the target label sample for testing to obtain weight information corresponding to the first dimensionality reduction feature, and take the first dimensionality reduction features whose weight information is higher than the preset weight threshold as general row features; and input the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain the target model.
  • a person of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiment methods can be implemented by instructing relevant hardware through computer-readable instructions, which can be stored in a non-volatile computer.
  • a readable storage medium when the computer-readable instructions are executed, they may include the processes of the above-mentioned method embodiments.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Image Analysis (AREA)

Abstract

A modeling method based on transfer learning, including: performing kernel principal component analysis on a label sample to be learned and a target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample; inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features; testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.

Description

Modeling method, apparatus, computer device and storage medium based on transfer learning

Cross-reference to related applications

This application claims priority to the Chinese patent application No. 2019104188205, filed with the China National Intellectual Property Administration on May 20, 2019 and entitled "Modeling method, apparatus, computer device and storage medium based on transfer learning", the entire contents of which are incorporated herein by reference.

Technical field

This application relates to a modeling method, apparatus, computer device and storage medium based on transfer learning.

Background

With the rapid development of computer technology, the data collected in real life grows exponentially. How to process massive data quickly and effectively, and extract the valuable information users need, is a problem of general concern to researchers. With continual innovation in machine learning, researchers have proposed transfer learning, which transfers knowledge learned in one scenario to another so that a model can also make good predictions in a large number of new scenarios.

Traditionally, building a model requires a large number of samples with observed business outcomes, but some newly launched businesses may not have enough samples, making it difficult to build an effective model by traditional methods. If only a small amount of current business data is used for modeling, the server tends to overfit during training and the resulting model is unstable; if a model built from samples of other businesses is used, the model's performance may drop significantly because the customer bases of different businesses can differ considerably. An effective model thus cannot be built with only a small number of labeled samples.

Summary

According to various embodiments disclosed in this application, a modeling method, apparatus, computer device and storage medium based on transfer learning are provided.

A modeling method based on transfer learning includes:

obtaining a label sample to be learned and a target label sample;

performing kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;

inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;

testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and

inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.
A modeling apparatus based on transfer learning includes:

a sample obtaining module, configured to obtain a label sample to be learned and a target label sample;

a feature dimensionality reduction module, configured to perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;

a column feature obtaining module, configured to input the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;

a row feature obtaining module, configured to test the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and use first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and

a model training module, configured to input the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processor, cause the one or more processors to perform the following steps:

obtaining a label sample to be learned and a target label sample;

performing kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;

inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;

testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and

inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.
One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:

obtaining a label sample to be learned and a target label sample;

performing kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;

inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;

testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and

inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.
The details of one or more embodiments of this application are set forth in the accompanying drawings and the description below. Other features and advantages of this application will become apparent from the specification, the drawings and the claims.
Brief description of the drawings

To describe the technical solutions in the embodiments of this application more clearly, the following briefly introduces the drawings needed in the embodiments. Obviously, the drawings described below are only some embodiments of this application; a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is an application environment diagram of the transfer-learning-based modeling method according to one or more embodiments.

FIG. 2 is a flowchart of the transfer-learning-based modeling method according to one or more embodiments.

FIG. 3 is a flowchart of updating the target model in the transfer-learning-based modeling method according to one or more embodiments.

FIG. 4 is a flowchart of determining weight information in the transfer-learning-based modeling method according to one or more embodiments.

FIG. 5 is a block diagram of the transfer-learning-based modeling apparatus according to one or more embodiments.

FIG. 6 is a block diagram of a computer device according to one or more embodiments.
Detailed description

To make the technical solutions and advantages of this application clearer, this application is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.

The transfer-learning-based modeling method provided in the embodiments of the present invention can be applied in the application environment shown in FIG. 1. The server 120 obtains a label sample to be learned and a target label sample from the terminal 110. The server 120 performs kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample. The server 120 inputs the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features. The server 120 tests the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and uses first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features. The server 120 inputs the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.

The following embodiments take the application of the transfer-learning-based modeling method to the server of FIG. 1 as an example; it should be noted, however, that in practical applications the method is not limited to the above server.
FIG. 2 is a flowchart of the transfer-learning-based modeling method in one embodiment; the method specifically includes the following steps:

Step 202: obtain a label sample to be learned and a target label sample.

The label sample to be learned and the target label sample represent labeled samples of different business types: the label sample to be learned is an observed-outcome sample of business A, and the target label sample is one of a very small number of observed-outcome samples of business B. It can be understood that both the label sample to be learned and the target label sample carry label information.

Specifically, transfer learning refers to transferring knowledge learned in one scenario to another scenario. In transfer learning, the existing knowledge is called the source domain and the new knowledge to be learned is called the target domain; transferring already-learned knowledge to the learning of unknown knowledge goes from the source domain to the target domain. It can be understood that the source domain can be the label sample to be learned, and the target domain can be the target label sample.

For example, suppose there is already a model that distinguishes cats from dogs with high accuracy, and a target model that distinguishes different dog breeds is wanted. What is needed is not to train on data from scratch to obtain the target model, but to extract general row features and general column features and use them to train the last few layers of neurons, obtaining a target model that can distinguish dog breeds — this is transfer learning.

In one embodiment, the source domain can be label samples to be learned concerning users' repayment ability for vehicle loans, and the target domain can be target label samples concerning users' repayment ability for small loans. By transfer-learning the modeling method of the vehicle-loan business, the server builds a target model for the small-loan business, so that the target model can evaluate users' repayment ability when they take out small loans.

The server can obtain the label sample to be learned and the target label sample from other servers, or from a terminal. A labeled sample is one whose label information has been defined in advance. For example, when the label sample to be learned is a picture of a puppy, the label information in that sample is "puppy".
Step 204: perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample.

The server performs kernel principal component analysis on the label sample to be learned and the target label sample. Kernel principal component analysis transforms data that is not linearly separable onto a new low-dimensional subspace suitable for linear classification. That is, the samples undergo dimensionality reduction: kernel PCA is a very effective dimensionality reduction technique in machine learning that represents originally very high-dimensional data with a small number of representative dimensions — for example, representing more than 1000 dimensions with 100 — without losing key data information.

Specifically, the server uses kernel principal component analysis to learn a common cross-domain subspace for the samples of the source domain (business A) and the target domain (business B), and maps all samples into that subspace to obtain new feature representations. That is, the server obtains the first dimensionality reduction feature corresponding to the label sample to be learned and the second dimensionality reduction feature corresponding to the target label sample.

In one embodiment, when the label sample to be learned and the target label sample need to be reduced to k dimensions, the following steps can be performed: 1) subtract the mean (i.e. center the data): each feature minus its own mean; 2) compute the covariance matrix; 3) obtain the eigenvalues and eigenvectors of the covariance matrix by eigendecomposition; 4) sort the eigenvalues in descending order, select the largest k, and use the corresponding k eigenvectors as row vectors to form the eigenvector matrix P; 5) transform the data into the new space spanned by the k eigenvectors, i.e. Y = PX. That is, by obtaining the first feature mean and the second feature mean and centering, obtaining the eigenvalues and eigenvectors corresponding to the target features, sorting the eigenvectors by eigenvalue to obtain a ranking result, building the cross-domain subspace from the eigenvectors whose ranking result exceeds a preset threshold, and then mapping the label sample to be learned and the target label sample into the cross-domain subspace, the first dimensionality reduction feature corresponding to the label sample to be learned and the second dimensionality reduction feature corresponding to the target label sample can be obtained accurately.
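The five numbered steps above can be sketched as follows. This is a minimal plain (linear) PCA illustration; the kernel variant described in this application would operate on a centered kernel matrix instead of the covariance matrix. The function name `pca_reduce` and the random test data are illustrative assumptions, and the code uses the transposed but equivalent convention Y = XcP with column eigenvectors:

```python
import numpy as np

def pca_reduce(X, k):
    """Reduce samples X (n x d) to k dimensions via the steps above."""
    # 1) subtract the mean of each feature (centering)
    Xc = X - X.mean(axis=0)
    # 2) compute the covariance matrix
    cov = np.cov(Xc, rowvar=False)
    # 3) eigendecomposition of the covariance matrix
    vals, vecs = np.linalg.eigh(cov)
    # 4) sort eigenvalues descending and keep the top-k eigenvectors
    order = np.argsort(vals)[::-1][:k]
    P = vecs[:, order]              # d x k projection matrix
    # 5) map the centered data into the new k-dimensional space
    return Xc @ P

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))      # illustrative stand-in for the samples
Y = pca_reduce(X, 3)
```

The retained components come out ordered by explained variance, which mirrors step 4's descending sort of eigenvalues.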
Step 206: input the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features.

Specifically, the general feature acquisition model addresses the problem in transfer learning that the source domain and target domain have different distributions. In learning the new subspace mentioned above, the method of minimizing the maximum mean discrepancy is adopted to reduce the difference between the marginal probability distributions of the domains, and minimizing the maximum mean discrepancy is also extended to the conditional probability distributions between domains and to jointly matching the marginal and conditional probability distributions. Minimizing the maximum mean discrepancy means projecting each sample and summing, using the magnitude of the sum to express the distribution difference between two datasets. It can be understood that the cross-domain subspace accomplishes knowledge transfer by mapping the source domain and the target domain into the same space (or mapping one of them into the space of the other) and minimizing the distance between the source domain and the target domain.

It can be understood that, given random variables X and Y, P(X=a, Y=b) denotes the probability that X=a and Y=b; this kind of probability, involving multiple conditions that all hold simultaneously, is a joint probability, and the table of joint probabilities is called the joint distribution. Correspondingly, probabilities such as P(X=a) or P(Y=b) that involve only a single random variable are marginal probabilities, and the table of marginal probabilities is called the marginal distribution. The probability that X=a given that Y=b holds is written P(X=a|Y=b) or P(a|b). The distribution of conditional probabilities is the conditional probability distribution: given two related random variables X and Y, the conditional distribution of Y under the condition {X=x} is the probability distribution of Y when X is known to take the specific value x. By inputting the first dimensionality reduction feature and the second dimensionality reduction feature into the trained general feature acquisition model, the server can obtain the general column features accurately.
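The maximum mean discrepancy computation described above can be sketched as follows. The RBF kernel and the bandwidth `gamma` are illustrative assumptions — the application does not fix a kernel choice — and the shifted sample is a stand-in for a differently-distributed target domain:

```python
import numpy as np

def rbf(A, B, gamma=0.1):
    # pairwise squared distances between rows of A and B, passed through an RBF kernel
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(Xs, Xt, gamma=0.1):
    """Biased estimate of the squared maximum mean discrepancy
    between source samples Xs and target samples Xt."""
    return (rbf(Xs, Xs, gamma).mean()
            - 2.0 * rbf(Xs, Xt, gamma).mean()
            + rbf(Xt, Xt, gamma).mean())

rng = np.random.default_rng(1)
# two samples from the same distribution -> small discrepancy
same = mmd2(rng.normal(size=(50, 4)), rng.normal(size=(50, 4)))
# target shifted by 3 in every dimension -> large discrepancy
diff = mmd2(rng.normal(size=(50, 4)), rng.normal(3.0, 1.0, size=(50, 4)))
```

Minimizing this quantity over the learned subspace is what pulls the source and target distributions together.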
Step 208: test the first dimensionality reduction feature with the basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and use first dimensionality reduction features whose weight information is higher than the preset weight threshold as general row features.

Step 210: input the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain the target model.

Some label samples to be learned in the source domain are irrelevant to the target-domain samples; that is, each instance in the source domain contributes differently to target-domain model training. Source-domain instances that are highly applicable to the target-domain model receive a high weight, and those with low applicability receive a low weight. According to the contribution of each source-domain instance to target-domain model training, the server further obtains the general row features, using the L2,1 norm to select relevant instances in the source domain for model training and obtain a target model suitable for business B. It can be understood that the L2,1 norm selects features row-sparsely and is used by the server to obtain the general row features.
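The L2,1-norm row selection described above can be sketched as follows, assuming a coefficient matrix `W` whose rows correspond to source-domain instances; the threshold value and the matrix entries are illustrative:

```python
import numpy as np

def select_rows(W, threshold):
    """Keep source instances whose row weight (the L2 norm of W's row)
    exceeds the threshold. The L2,1 norm of W is the sum of these
    per-row L2 norms, so penalizing it drives whole rows to zero."""
    row_weights = np.linalg.norm(W, axis=1)   # per-row L2 norms
    l21_norm = row_weights.sum()              # the L2,1 norm itself
    keep = row_weights > threshold
    return keep, l21_norm

W = np.array([[3.0, 4.0],    # row weight 5.0 -> highly applicable instance
              [0.1, 0.0],    # row weight 0.1 -> barely applicable
              [0.6, 0.8]])   # row weight 1.0
keep, l21 = select_rows(W, 0.5)
```

Rows whose weight survives the threshold play the role of the general row features used for the subsequent training.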
Specifically, the server inputs the obtained general column features and general row features into the basic model corresponding to the target label sample for model training, thereby completing the transfer learning process, and can obtain a good target model.

In this embodiment, the server performs kernel principal component analysis on the label sample to be learned and the target label sample to obtain the first dimensionality reduction feature and the second dimensionality reduction feature, so that originally high-dimensional data can be represented by a few representative dimensions without losing key data information. It then inputs the first and second dimensionality reduction features into the trained general feature acquisition model to obtain general column features, tests the first dimensionality reduction feature with the basic model corresponding to the target label sample to obtain the weight information corresponding to the first dimensionality reduction feature, and uses first dimensionality reduction features whose weight information is higher than the preset weight threshold as general row features. Through the general column features and general row features, transfer learning from the source domain to the target domain can be achieved with only a small number of labeled samples, thereby completing the construction of the target model.
In one embodiment, as shown in FIG. 3, the method further includes the following steps:

Step 302: obtain a sample to be evaluated, input the sample to be evaluated into the target model, and output sample label information corresponding to the sample to be evaluated.

A sample to be evaluated is a sample used to validate the target model; when the server inputs it into the target model, the corresponding sample label information can be output. It can be understood that the sample to be evaluated carries no label information.

Step 304: display the sample label information and obtain label correction information corresponding to the sample label information.

Step 306: adjust the weights in the target model according to the label correction information, and update the target model according to the weights after each adjustment to obtain an updated target model.

The ways in which the server displays the sample label information include but are not limited to displaying it online and sending it to a corresponding terminal for display; after displaying the sample label information, the server obtains the label correction information corresponding to it.

For example, when the sample to be evaluated is a sample of a user's capacity for a vehicle loan and the sample label information for that capacity is "medium grade", that sample label information is displayed; if the server receives label correction information "high grade" returned by the terminal, the server adjusts the weights in the target model according to the label correction information and updates the target model according to the weights after each adjustment, obtaining the updated target model.

In this embodiment, through the terminal's involvement, online learning and real-time updating of the target model are achieved: the target model is further updated according to the label correction information returned by the terminal, improving the target model's ability to process samples. In actual use, after the user corrects a result, the corrected result is merged into the training set, the model is retrained and updated, and the next round of prediction proceeds. By obtaining a sample to be evaluated, inputting it into the target model, outputting the corresponding sample label information, displaying that information, obtaining the label correction information, adjusting the weights in the target model according to the correction, and updating the target model according to the weights after each adjustment, the target model can be updated in real time.
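The correct-and-retrain loop of this embodiment can be sketched as follows. `CentroidModel` is a deliberately simple stand-in (nearest class centroid) for the target model, not the basic model of this application; the sample values and the returned correction are illustrative:

```python
import numpy as np

class CentroidModel:
    """Stand-in for the target model: classifies by nearest class centroid."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, x):
        d = np.linalg.norm(self.centroids_ - x, axis=1)
        return self.classes_[d.argmin()]

X = np.array([[0.0], [1.0], [10.0]])
y = np.array([0, 0, 1])
model = CentroidModel().fit(X, y)

sample = np.array([4.0])
predicted = model.predict(sample)      # label shown to the user for review
corrected = 1                          # correction returned by the terminal
# merge the corrected sample into the training set and retrain (the update step)
X = np.vstack([X, sample])
y = np.append(y, corrected)
model.fit(X, y)
```

After the retrain the model's decision for the disputed region follows the user's correction, which is the behavior the embodiment describes.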
In one embodiment, the method further includes: comparing the first dimensionality reduction feature with the second dimensionality reduction feature to obtain a feature similarity; and using first dimensionality reduction features whose feature similarity is higher than a preset similarity threshold as general column features.

Specifically, the first dimensionality reduction feature corresponding to the source domain is compared for feature similarity with the second dimensionality reduction feature corresponding to the target domain, and first dimensionality reduction features whose feature similarity is higher than the preset similarity threshold are used as general column features; the general column features are combined with the general row features to train the basic model and obtain the target model.

In this embodiment, the server compares the first dimensionality reduction feature with the second dimensionality reduction feature to obtain a feature similarity, and uses first dimensionality reduction features whose feature similarity is higher than the preset similarity threshold as general column features; the general column features are used for transfer learning of the model, further obtaining the target model.
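The similarity comparison of this embodiment can be sketched as follows, assuming cosine similarity as the measure (the application does not specify one) and a pairwise matching of columns; the matrices and the threshold are illustrative:

```python
import numpy as np

def general_columns(F1, F2, threshold=0.9):
    """Keep columns of F1 whose cosine similarity with the matching
    column of F2 exceeds the threshold (similarity measure assumed)."""
    sims = np.array([
        f1 @ f2 / (np.linalg.norm(f1) * np.linalg.norm(f2))
        for f1, f2 in zip(F1.T, F2.T)
    ])
    return F1[:, sims > threshold], sims

F1 = np.array([[1.0, 1.0],
               [2.0, -1.0]])        # source-domain reduced features
F2 = np.array([[1.0, -1.0],
               [2.0, 1.0]])         # target-domain reduced features
cols, sims = general_columns(F1, F2, 0.9)
```

Only the first column survives the threshold here: it points in the same direction in both domains, while the second is anti-aligned.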
In one embodiment, as shown in FIG. 4, the method further includes the following steps:

Step 402: test the first dimensionality reduction feature with the basic model corresponding to the target label sample, and output sample label information corresponding to the label sample to be learned.

Step 404: display the sample label information and obtain label correctness information corresponding to the sample label information.

Specifically, the sample label information is the label information corresponding to the label sample to be learned, and the label correctness information is the right-or-wrong judgment made on that sample label information based on the label information carried by the label sample to be learned. The server tests the first dimensionality reduction feature with the basic model corresponding to the target label sample, outputs the sample label information corresponding to the label sample to be learned, and displays the sample label information online or sends it to a corresponding terminal for display.

For example, when the label sample to be learned is a picture of a puppy, the label information in that sample is "puppy"; when the server tests the first dimensionality reduction feature with the basic model corresponding to the target label sample and the resulting sample label information is "kitten", the server judges whether the sample label information is right or wrong based on the label information carried by the label sample to be learned. Label correctness information includes but is not limited to correct and incorrect.

Step 406: evaluate the first dimensionality reduction feature according to the label correctness information to obtain feature contribution information corresponding to the first dimensionality reduction feature.

Step 408: determine the weight information of the first dimensionality reduction feature according to the feature contribution information.

The basic model is used to obtain the contribution of each source-domain instance to target-domain model training, and the server then determines the weight information from the obtained contribution: a high weight means the feature has high applicability, and a low weight means low applicability. Features with high applicability are filtered out as general row features for subsequently building the target model.

Specifically, the server evaluates, according to the label correctness information, the feature contribution of the first dimensionality reduction feature to target-domain model training, obtains the feature contribution information, determines the weight information of the first dimensionality reduction feature according to the feature contribution information, and uses first dimensionality reduction features whose weight information is higher than the preset weight threshold as general row features.

In this embodiment, the server tests the first dimensionality reduction feature with the basic model corresponding to the target label sample, outputs the sample label information corresponding to the label sample to be learned, and displays it, so that the label correctness information corresponding to the sample label information can be obtained. From the label correctness information, the contribution of the first dimensionality reduction feature to target-domain model training is further judged: the first dimensionality reduction feature is evaluated according to the label correctness information to obtain the corresponding feature contribution information, and the weight information of the first dimensionality reduction feature is determined from the feature contribution information, so that the general row features can be further determined. The server then performs transfer learning based on the general row features and general column features, thereby building a target model suited to the target domain.
In one embodiment, the method further includes: dividing the general column features and the general row features into a predetermined number of training feature sets; and inputting the training feature sets one by one into the input variables of the basic model for training until all training feature sets have been trained, to obtain the trained target model.

The predetermined number of training feature sets are used to train the basic model to obtain the trained target model, achieving transfer learning from the source domain to the target domain.

In this embodiment, the server divides the general column features and general row features into a predetermined number of training feature sets and inputs them one by one into the input variables of the basic model for training until all training feature sets have been trained, obtaining the trained target model; based on transfer learning, an effective model can thus be built with only a small number of labeled samples.
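The split-and-train-in-turn procedure of this embodiment can be sketched as follows. The incremental running mean stands in for the unspecified basic model's training update, and the feature values are illustrative:

```python
import numpy as np

def split_feature_sets(features, n_parts):
    """Divide the combined general features into n_parts training feature sets."""
    return np.array_split(features, n_parts)

features = np.arange(10.0).reshape(10, 1)   # stand-in for column + row features
parts = split_feature_sets(features, 3)

# feed each training feature set into the "model" in turn;
# a running mean plays the role of the model's incremental update
running_mean, seen = 0.0, 0
for part in parts:
    seen += len(part)
    running_mean += (part.sum() - len(part) * running_mean) / seen
```

After the last set is consumed the incremental result matches training on all the data at once, which is the point of feeding the sets one by one.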
It should be understood that although the steps in the flowcharts of FIGS. 2-4 are displayed sequentially in the order indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on their execution, and the steps may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-4 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is likewise not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
FIG. 5 is a schematic diagram of the transfer-learning-based modeling apparatus in one embodiment; the apparatus includes:

a sample obtaining module 502, configured to obtain a label sample to be learned and a target label sample;

a feature dimensionality reduction module 504, configured to perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;

a column feature obtaining module 506, configured to input the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;

a row feature obtaining module 508, configured to test the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and use first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features;

a model training module 510, configured to input the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.

In one embodiment, the model training module includes: a label information output module, configured to obtain a sample to be evaluated, input it into the target model, and output the sample label information corresponding to the sample to be evaluated; a correction information obtaining module, configured to display the sample label information and obtain the label correction information corresponding to it; and a model updating module, configured to adjust the weights in the target model according to the label correction information and update the target model according to the weights after each adjustment to obtain an updated target model.

In one embodiment, the column feature obtaining module includes: a feature comparison module, configured to compare the first dimensionality reduction feature with the second dimensionality reduction feature to obtain a feature similarity; and a similarity judging module, configured to use first dimensionality reduction features whose feature similarity is higher than a preset similarity threshold as general column features.

In one embodiment, the row feature obtaining module includes: testing the first dimensionality reduction feature with the basic model corresponding to the target label sample and outputting the sample label information corresponding to the label sample to be learned; displaying the sample label information and obtaining the label correctness information corresponding to it; evaluating the first dimensionality reduction feature according to the label correctness information to obtain the feature contribution information corresponding to it; and determining the weight information of the first dimensionality reduction feature according to the feature contribution information.

In one embodiment, the model training module includes: dividing the general column features and the general row features into a predetermined number of training feature sets; and inputting the training feature sets one by one into the input variables of the basic model for training until all training feature sets have been trained, to obtain the trained target model.

For the specific limitations of the transfer-learning-based modeling apparatus, refer to the limitations of the transfer-learning-based modeling method above, which are not repeated here. Each module in the above apparatus can be implemented wholly or partly in software, hardware, or a combination thereof. The modules can be embedded in, or independent of, the processor of a computer device in hardware form, or stored in software form in the memory of a computer device, so that the processor can invoke and execute the operations corresponding to each module. The processor can be a central processing unit (CPU), a microprocessor, a single-chip microcomputer, etc. The above transfer-learning-based modeling apparatus can be implemented in the form of computer-readable instructions.
In one embodiment, a computer device is provided, which can be a server or a terminal. When the computer device is a terminal, its internal structure can be as shown in FIG. 6. The computer device includes a processor, a memory and a network interface connected via a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and computer-readable instructions. The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The network interface of the computer device communicates with external terminals via a network connection. The computer-readable instructions, when executed by the processor, implement a modeling method based on transfer learning. Those skilled in the art can understand that the structure shown in FIG. 6 is only a block diagram of part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processor, cause the one or more processors to perform the following steps: obtaining a label sample to be learned and a target label sample; performing kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample; inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features; testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.

For the limitations of the computer device, refer to the specific limitations of the transfer-learning-based modeling method above, which are not repeated here.

Continuing with FIG. 6, one or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: obtaining a label sample to be learned and a target label sample; performing kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample; inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features; testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.

For the limitations of the computer-readable storage medium, refer to the specific limitations of the transfer-learning-based modeling method above, which are not repeated here.
A person of ordinary skill in the art can understand that all or part of the processes in the above method embodiments can be accomplished by computer-readable instructions instructing the relevant hardware; the computer-readable instructions can be stored in a non-volatile computer-readable storage medium, and when executed may include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), etc.

The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be regarded as within the scope recorded in this specification.

The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that, for a person of ordinary skill in the art, several variations and improvements can be made without departing from the concept of the present invention, and these all belong to the protection scope of the present invention. Therefore, the protection scope of the patent of the present invention shall be subject to the appended claims.

Claims (20)

  1. A modeling method based on transfer learning, comprising:
    obtaining a label sample to be learned and a target label sample;
    performing kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;
    inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;
    testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and
    inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.
  2. The method according to claim 1, wherein after inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain the target model, the method further comprises:
    obtaining a sample to be evaluated, inputting the sample to be evaluated into the target model, and outputting sample label information corresponding to the sample to be evaluated;
    displaying the sample label information and obtaining label correction information corresponding to the sample label information; and
    adjusting weights in the target model according to the label correction information, and updating the target model according to the weights after each adjustment to obtain an updated target model.
  3. The method according to claim 1, wherein inputting the first dimensionality reduction feature and the second dimensionality reduction feature into the trained general feature acquisition model to obtain the general column features comprises:
    comparing the first dimensionality reduction feature with the second dimensionality reduction feature to obtain a feature similarity; and
    using first dimensionality reduction features whose feature similarity is higher than a preset similarity threshold as general column features.
  4. The method according to claim 1, wherein testing the first dimensionality reduction feature with the basic model corresponding to the target label sample to obtain the weight information corresponding to the first dimensionality reduction feature comprises:
    testing the first dimensionality reduction feature with the basic model corresponding to the target label sample, and outputting sample label information corresponding to the label sample to be learned;
    displaying the sample label information and obtaining label correctness information corresponding to the sample label information;
    evaluating the first dimensionality reduction feature according to the label correctness information to obtain feature contribution information corresponding to the first dimensionality reduction feature; and
    determining the weight information of the first dimensionality reduction feature according to the feature contribution information.
  5. The method according to claim 1, wherein inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain the target model comprises:
    dividing the general column features and the general row features into a predetermined number of training feature sets; and
    inputting the training feature sets one by one into input variables of the basic model for training until all training feature sets have been trained, to obtain the trained target model.
  6. A modeling apparatus based on transfer learning, comprising:
    a sample obtaining module, configured to obtain a label sample to be learned and a target label sample;
    a feature dimensionality reduction module, configured to perform kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;
    a column feature obtaining module, configured to input the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;
    a row feature obtaining module, configured to test the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and use first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and
    a model training module, configured to input the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.
  7. The apparatus according to claim 6, wherein the model training module comprises:
    a label information output module, configured to obtain a sample to be evaluated, input the sample to be evaluated into the target model, and output sample label information corresponding to the sample to be evaluated;
    a correction information obtaining module, configured to display the sample label information and obtain label correction information corresponding to the sample label information; and
    a model updating module, configured to adjust weights in the target model according to the label correction information, and update the target model according to the weights after each adjustment to obtain an updated target model.
  8. The apparatus according to claim 6, wherein the column feature obtaining module comprises:
    a feature comparison module, configured to compare the first dimensionality reduction feature with the second dimensionality reduction feature to obtain a feature similarity; and
    a similarity judging module, configured to use first dimensionality reduction features whose feature similarity is higher than a preset similarity threshold as general column features.
  9. The apparatus according to claim 6, wherein the row feature obtaining module is further configured to test the first dimensionality reduction feature with the basic model corresponding to the target label sample and output sample label information corresponding to the label sample to be learned; display the sample label information and obtain label correctness information corresponding to the sample label information; evaluate the first dimensionality reduction feature according to the label correctness information to obtain feature contribution information corresponding to the first dimensionality reduction feature; and determine the weight information of the first dimensionality reduction feature according to the feature contribution information.
  10. The apparatus according to claim 6, wherein the model training module is further configured to divide the general column features and the general row features into a predetermined number of training feature sets, and input the training feature sets one by one into input variables of the basic model for training until all training feature sets have been trained, to obtain the trained target model.
  11. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    obtaining a label sample to be learned and a target label sample;
    performing kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;
    inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;
    testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and
    inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.
  12. The computer device according to claim 11, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    obtaining a sample to be evaluated, inputting the sample to be evaluated into the target model, and outputting sample label information corresponding to the sample to be evaluated;
    displaying the sample label information and obtaining label correction information corresponding to the sample label information; and
    adjusting weights in the target model according to the label correction information, and updating the target model according to the weights after each adjustment to obtain an updated target model.
  13. The computer device according to claim 11, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    comparing the first dimensionality reduction feature with the second dimensionality reduction feature to obtain a feature similarity; and
    using first dimensionality reduction features whose feature similarity is higher than a preset similarity threshold as general column features.
  14. The computer device according to claim 11, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    testing the first dimensionality reduction feature with the basic model corresponding to the target label sample, and outputting sample label information corresponding to the label sample to be learned;
    displaying the sample label information and obtaining label correctness information corresponding to the sample label information;
    evaluating the first dimensionality reduction feature according to the label correctness information to obtain feature contribution information corresponding to the first dimensionality reduction feature; and
    determining the weight information of the first dimensionality reduction feature according to the feature contribution information.
  15. The computer device according to claim 11, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    dividing the general column features and the general row features into a predetermined number of training feature sets; and
    inputting the training feature sets one by one into input variables of the basic model for training until all training feature sets have been trained, to obtain the trained target model.
  16. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    obtaining a label sample to be learned and a target label sample;
    performing kernel principal component analysis on the label sample to be learned and the target label sample to obtain a first dimensionality reduction feature corresponding to the label sample to be learned and a second dimensionality reduction feature corresponding to the target label sample;
    inputting the first dimensionality reduction feature and the second dimensionality reduction feature into a trained general feature acquisition model to obtain general column features;
    testing the first dimensionality reduction feature with a basic model corresponding to the target label sample to obtain weight information corresponding to the first dimensionality reduction feature, and using first dimensionality reduction features whose weight information is higher than a preset weight threshold as general row features; and
    inputting the general column features and the general row features into the basic model corresponding to the target label sample for model training to obtain a target model.
  17. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    obtaining a sample to be evaluated, inputting the sample to be evaluated into the target model, and outputting sample label information corresponding to the sample to be evaluated;
    displaying the sample label information and obtaining label correction information corresponding to the sample label information; and
    adjusting weights in the target model according to the label correction information, and updating the target model according to the weights after each adjustment to obtain an updated target model.
  18. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    comparing the first dimensionality reduction feature with the second dimensionality reduction feature to obtain a feature similarity; and
    using first dimensionality reduction features whose feature similarity is higher than a preset similarity threshold as general column features.
  19. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    testing the first dimensionality reduction feature with the basic model corresponding to the target label sample, and outputting sample label information corresponding to the label sample to be learned;
    displaying the sample label information and obtaining label correctness information corresponding to the sample label information;
    evaluating the first dimensionality reduction feature according to the label correctness information to obtain feature contribution information corresponding to the first dimensionality reduction feature; and
    determining the weight information of the first dimensionality reduction feature according to the feature contribution information.
  20. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    dividing the general column features and the general row features into a predetermined number of training feature sets; and
    inputting the training feature sets one by one into input variables of the basic model for training until all training feature sets have been trained, to obtain the trained target model.
PCT/CN2019/102740 2019-05-20 2019-08-27 Modeling method and apparatus based on transfer learning, computer device and storage medium WO2020232874A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910418820.5 2019-05-20
CN201910418820.5A CN110210625B (zh) 2019-05-20 2019-05-20 基于迁移学习的建模方法、装置、计算机设备和存储介质

Publications (1)

Publication Number Publication Date
WO2020232874A1 true WO2020232874A1 (zh) 2020-11-26

Family

ID=67787850

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/102740 WO2020232874A1 (zh) 2019-05-20 2019-08-27 基于迁移学习的建模方法、装置、计算机设备和存储介质

Country Status (2)

Country Link
CN (1) CN110210625B (zh)
WO (1) WO2020232874A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159085A (zh) * 2020-12-30 2021-07-23 北京爱笔科技有限公司 分类模型的训练及基于图像的分类方法、相关装置
CN114021180A (zh) * 2021-10-11 2022-02-08 清华大学 一种电力系统动态安全域确定方法、装置、电子设备及可读介质
CN115396831A (zh) * 2021-05-08 2022-11-25 中国移动通信集团浙江有限公司 交互模型生成方法、装置、设备及存储介质
CN117708592A (zh) * 2023-12-12 2024-03-15 清新文化艺术有限公司 基于文化创意的艺术科技融合数字平台

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112581250B (zh) * 2019-09-30 2023-12-29 深圳无域科技技术有限公司 模型生成方法、装置、计算机设备和存储介质
CN110929877B (zh) * 2019-10-18 2023-09-15 平安科技(深圳)有限公司 基于迁移学习的模型建立方法、装置、设备及存储介质
CN114501515A (zh) * 2020-11-11 2022-05-13 中兴通讯股份有限公司 模型训练方法和装置、电子设备、计算机可读存储介质
CN116910573B (zh) * 2023-09-13 2023-12-05 中移(苏州)软件技术有限公司 异常诊断模型的训练方法及装置、电子设备和存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104523269A (zh) * 2015-01-15 2015-04-22 江南大学 一种面向癫痫脑电信号迁移环境的自适应识别方法
WO2015069824A2 (en) * 2013-11-06 2015-05-14 Lehigh University Diagnostic system and method for biological tissue analysis
CN106326214A (zh) * 2016-08-29 2017-01-11 中译语通科技(北京)有限公司 一种基于迁移学习的跨语言情感分析方法及装置
CN107292246A (zh) * 2017-06-05 2017-10-24 河海大学 基于hog‑pca和迁移学习的红外人体目标识别方法
CN107506775A (zh) * 2016-06-14 2017-12-22 北京陌上花科技有限公司 模型训练方法及装置
CN109710512A (zh) * 2018-12-06 2019-05-03 南京邮电大学 基于测地线流核的神经网络软件缺陷预测方法

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104091602A (zh) * 2014-07-11 2014-10-08 电子科技大学 一种基于模糊支持向量机的语音情感识别方法
US9710729B2 (en) * 2014-09-04 2017-07-18 Xerox Corporation Domain adaptation for image classification with class priors
JP2019527871A (ja) * 2017-06-13 2019-10-03 ベイジン ディディ インフィニティ テクノロジー アンド ディベロップメント カンパニー リミティッド 到着予定時刻を決定するシステム及び方法
CN107679859B (zh) * 2017-07-18 2020-08-25 中国银联股份有限公司 一种基于迁移深度学习的风险识别方法以及系统

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015069824A2 (en) * 2013-11-06 2015-05-14 Lehigh University Diagnostic system and method for biological tissue analysis
CN104523269A (zh) * 2015-01-15 2015-04-22 江南大学 一种面向癫痫脑电信号迁移环境的自适应识别方法
CN107506775A (zh) * 2016-06-14 2017-12-22 北京陌上花科技有限公司 模型训练方法及装置
CN106326214A (zh) * 2016-08-29 2017-01-11 中译语通科技(北京)有限公司 一种基于迁移学习的跨语言情感分析方法及装置
CN107292246A (zh) * 2017-06-05 2017-10-24 河海大学 基于hog‑pca和迁移学习的红外人体目标识别方法
CN109710512A (zh) * 2018-12-06 2019-05-03 南京邮电大学 基于测地线流核的神经网络软件缺陷预测方法

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159085A (zh) * 2020-12-30 2021-07-23 北京爱笔科技有限公司 分类模型的训练及基于图像的分类方法、相关装置
CN113159085B (zh) * 2020-12-30 2024-05-28 北京爱笔科技有限公司 分类模型的训练及基于图像的分类方法、相关装置
CN115396831A (zh) * 2021-05-08 2022-11-25 中国移动通信集团浙江有限公司 交互模型生成方法、装置、设备及存储介质
CN114021180A (zh) * 2021-10-11 2022-02-08 清华大学 一种电力系统动态安全域确定方法、装置、电子设备及可读介质
CN114021180B (zh) * 2021-10-11 2024-04-12 清华大学 一种电力系统动态安全域确定方法、装置、电子设备及可读介质
CN117708592A (zh) * 2023-12-12 2024-03-15 清新文化艺术有限公司 基于文化创意的艺术科技融合数字平台
CN117708592B (zh) * 2023-12-12 2024-05-07 清新文化艺术有限公司 基于文化创意的艺术科技融合数字平台

Also Published As

Publication number Publication date
CN110210625A (zh) 2019-09-06
CN110210625B (zh) 2023-04-07

Similar Documents

Publication Publication Date Title
WO2020232874A1 (zh) 基于迁移学习的建模方法、装置、计算机设备和存储介质
Mebane Jr et al. Genetic optimization using derivatives: the rgenoud package for R
Titsias et al. Spike and slab variational inference for multi-task and multiple kernel learning
JP6182242B1 (ja) データのラベリングモデルに係る機械学習方法、コンピュータおよびプログラム
WO2019015246A1 (zh) 图像特征获取
US20190065957A1 (en) Distance Metric Learning Using Proxies
US10387749B2 (en) Distance metric learning using proxies
CN112699215B (zh) 基于胶囊网络与交互注意力机制的评级预测方法及系统
CN114144770A (zh) 用于生成用于模型重新训练的数据集的系统和方法
US20230021551A1 (en) Using training images and scaled training images to train an image segmentation model
CN112420125A (zh) 分子属性预测方法、装置、智能设备和终端
CN111309823A (zh) 用于知识图谱的数据预处理方法及装置
WO2021147405A1 (zh) 客服语句质检方法及相关设备
CN113762005B (zh) 特征选择模型的训练、对象分类方法、装置、设备及介质
US20200051098A1 (en) Method and System for Predictive Modeling of Consumer Profiles
US11688175B2 (en) Methods and systems for the automated quality assurance of annotated images
Lim et al. More powerful selective kernel tests for feature selection
Westphal et al. Improving model selection by employing the test data
CN113010687B (zh) 一种习题标签预测方法、装置、存储介质以及计算机设备
Dessureault et al. Explainable global error weighted on feature importance: The xGEWFI metric to evaluate the error of data imputation and data augmentation
CN114937166A (zh) 图像分类模型构建方法、图像分类方法及装置、电子设备
CN113760407A (zh) 信息处理方法、装置、设备及存储介质
CN112884028A (zh) 一种系统资源调整方法、装置及设备
US11599783B1 (en) Function creation for database execution of deep learning model
Fredriksson et al. An Empirical Evaluation of Algorithms for Data Labeling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19929459

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19929459

Country of ref document: EP

Kind code of ref document: A1