WO2022174436A1 - Method and apparatus for implementing incremental learning of classification model, electronic device and medium - Google Patents

Method and apparatus for implementing incremental learning of classification model, electronic device and medium Download PDF

Info

Publication number
WO2022174436A1
WO2022174436A1 PCT/CN2021/077147 CN2021077147W WO2022174436A1 WO 2022174436 A1 WO2022174436 A1 WO 2022174436A1 CN 2021077147 W CN2021077147 W CN 2021077147W WO 2022174436 A1 WO2022174436 A1 WO 2022174436A1
Authority
WO
WIPO (PCT)
Prior art keywords
incremental
unlabeled
classification model
samples
sample
Prior art date
Application number
PCT/CN2021/077147
Other languages
English (en)
French (fr)
Inventor
何玉林
黄启航
Original Assignee
深圳大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳大学 filed Critical 深圳大学
Priority to PCT/CN2021/077147 priority Critical patent/WO2022174436A1/zh
Publication of WO2022174436A1 publication Critical patent/WO2022174436A1/zh

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology

Definitions

  • the present application relates to the field of computer technology, for example, to a method, apparatus, electronic device, and medium for implementing incremental learning of a classification model.
  • Semi-supervised learning, as a combination of supervised learning and unsupervised learning, uses a limited number of labeled samples and a large number of unlabeled samples for modeling and learning.
  • the general procedure of a semi-supervised learning method is to build a preliminary model from the labeled data and then train and optimize the model according to the distribution characteristics of the unlabeled data, thereby improving the accuracy of the model.
  • however, when new unlabeled samples are provided, the classification model must be retrained; it is difficult to learn incrementally on the basis of the existing classification model, or the learning cost is high. Therefore, how to perform incremental learning on a classification model under semi-supervised conditions becomes particularly important.
  • the present application provides a method, device, electronic device and medium for implementing incremental learning of a classification model, so as to achieve incremental learning based on a large number of unlabeled samples to improve model prediction accuracy.
  • an embodiment of the present application provides a method for implementing incremental learning of a classification model, and the method includes:
  • the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
  • an embodiment of the present application also provides a device for implementing incremental learning of a classification model, and the device includes:
  • a sample acquisition module for acquiring at least one unlabeled incremental sample
  • the sample prediction module is configured to input the unlabeled incremental samples one by one into the established classification model for category prediction; the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
  • the incremental learning module is configured to perform incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
  • the embodiments of the present application also provide an electronic device, including:
  • a storage device for storing one or more programs
  • one or more processing apparatuses; when the one or more programs are executed by the one or more processing apparatuses, the one or more processing apparatuses implement the method for implementing incremental learning of a classification model as provided in the embodiments of the present application.
  • an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processing device, the program implements the method for implementing incremental learning of a classification model as provided in any embodiment of the present application.
  • the embodiments of the present application provide a method for implementing incremental learning of a classification model, in which a classification model can be obtained at least partially by extreme learning machine modeling on labeled initial samples whose labels are incomplete, the pre-established classification model is used to perform category prediction on the unlabeled incremental samples one by one, and the established classification model is then incrementally learned and updated based on the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples.
  • FIG. 1 is a flowchart of a method for implementing incremental learning of a classification model provided in an embodiment of the present application
  • FIG. 2 is a flowchart of another method for implementing incremental learning of a classification model provided in an embodiment of the present application
  • FIG. 3 is an operation process diagram of classification model incremental learning provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of sample data set information provided in an embodiment of the present application.
  • FIG. 5a is a schematic diagram of an incremental learning comparison under one sample data set provided in an embodiment of the present application.
  • FIG. 5b is a schematic diagram of an incremental learning comparison under another sample data set provided in an embodiment of the present application.
  • FIG. 5c is a schematic diagram of an incremental learning comparison under yet another sample data set provided in an embodiment of the present application.
  • FIG. 6 is a structural block diagram of an apparatus for implementing incremental learning of a classification model provided in an embodiment of the present application
  • FIG. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
  • FIG. 1 is a flowchart of a method for implementing incremental learning of a classification model provided in an embodiment of the present application.
  • the embodiments of the present application may be applicable to the case of performing incremental learning on the category recognition model.
  • the method can be performed by an apparatus for implementing incremental learning of a classification model, and the apparatus can be implemented in software and/or hardware and integrated on any electronic device with a network communication function.
  • the method for implementing incremental learning of a classification model provided in the embodiment of the present application may include the following steps:
  • the sample category label can be used to represent the category information of the sample object.
  • the data sample set can include three parts: an initial sample data set D1 that has sample category labels but whose labels are incomplete, an unlabeled incremental sample data set D2 whose samples have no category labels but actually cover all categories, and a validation sample data set D3 with complete sample category labels.
  • a sample data set with five sample attributes and four sample categories is set up to illustrate the training and update of the classification model, where the sample attributes can be denoted A1, A2, A3, A4, A5 and the sample categories C1, C2, C3, C4.
  • S120: Input the unlabeled incremental samples one by one into the established classification model for category prediction; the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete.
  • the initial sample data with partially known labels can be modeled in advance with an extreme learning machine to obtain an extreme learning machine model.
  • the extreme learning machine is a neural network including a single input layer, a single hidden layer and a single output layer.
  • the numbers of nodes in the input layer and the output layer are known, and the number of hidden-layer nodes can be allocated through experience and trial, for example around a value determined by the data size (the exact expression appears as an image in the source).
  • the parameter calculation process of the classification model may be β = H⁺Y with H = σ(WX + B), where W and B are the input weights and biases respectively, each a matrix whose values are randomly initialized in [-1, 1], σ is the sigmoid activation function σ(x) = 1/(1 + e^(-x)), the input of the extreme learning machine is denoted X, and the output is denoted Y. Based on the initial sample data set D1, the extreme learning machine is used for data modeling to obtain a classification model.
  • the unlabeled incremental samples are successively put into the classification model for category prediction, and the category prediction results for the unlabeled incremental samples are obtained.
  • the category prediction result Y output by the classification model is a matrix transformed by one-hot encoding. For example, after the classification model is built with the extreme learning machine, predictions can be made on new data: given at least one new unlabeled incremental sample x, the classification model predicts the input unlabeled incremental samples one by one and returns the prediction result y′ = σ(Wx + B)β.
  • after the classification model has been trained, when new data samples are provided, the classification model is used to perform category prediction on the unlabeled incremental samples, thereby augmenting the labeled incremental samples required for incremental learning.
  • this overcomes the defect that the classification model must be retrained when it is updated and cannot learn incrementally on the basis of the model, even when the labels of the labeled data are incomplete.
  • the effect of improving the prediction accuracy of the model through incremental operations is achieved, and good learning accuracy is maintained even with incomplete labels, so the algorithm has good learnability; the model can be updated quickly when new sample data are given, and the model complexity is kept at a level similar to the complexity of the sample data, reducing the cost and complexity of model learning.
  • FIG. 2 is a flowchart of another method for implementing incremental learning of a classification model provided in an embodiment of the present application.
  • the embodiment of the present application is described on the basis of the foregoing embodiment, and may be combined with the alternative solutions in one or more of the foregoing embodiments.
  • the method for implementing incremental learning of a classification model provided in the embodiment of the present application may include the following steps:
  • S220: Input the unlabeled incremental samples one by one into the established classification model for category prediction; the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete.
  • the initial samples can be used for the preliminary training of the classification model, and the incremental samples can be used for incremental learning of the established classification model (if unlabeled, they cannot be used directly in semi-supervised learning).
  • a category prediction result can be obtained through the classification model for each unlabeled incremental sample, but not all category prediction results should be accepted; that is, some prediction results will be inaccurate.
  • if the category prediction result for an unlabeled incremental sample is accurate, the unlabeled incremental sample and the category prediction result form one labeled incremental sample.
  • incremental learning is performed on the established classification model based on the composed labeled incremental samples.
  • the output Y of the classification model is a one-hot encoded matrix.
  • the category prediction output of an unlabeled incremental sample after it is input to the classification model also obeys a similar rule: usually, the maximum element value in the matrix is taken as the real predicted result, that position is set to 1, and the other positions are set to 0, yielding the one-hot prediction.
  • based on the value of each element in the matrix output as the category prediction result for the unlabeled incremental sample, it can be determined whether the prediction is accurate. If the category prediction result of the unlabeled incremental sample is accurate, the sample category of the unlabeled incremental sample is known from the prediction result; the unlabeled incremental sample and its category prediction result can then be formed into a labeled datum and fed into the classification model for regular incremental learning to improve the prediction accuracy of the model.
  • if the category prediction result Y_C for the unlabeled incremental samples is accurate, Y_C is usually a category known to the classification model, and the classification model can use the composed labeled incremental samples.
  • a conventional incremental learning operation is performed on the classification model; after the conventional incremental update of the classification model, the output weights are recomputed, where W_C and B_C are generated in the same way as W and B.
  • performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples further includes the following operation:
  • the category prediction result output by the classification model is a matrix transformed by one-hot encoding, and the model prediction result of a single sample also obeys a similar rule.
  • if the maximum element value of the matrix corresponding to the category prediction result deviates greatly from the preset threshold (the preset threshold can be 1), the category prediction result of the unlabeled incremental sample is inaccurate; that is, the class of the sample cannot be accurately known from that category prediction result.
  • when the deviation of the maximum element value from the preset threshold is within ±0.05, the category prediction result for the unlabeled incremental sample can be accepted; that is, when the maximum element value in the matrix corresponding to the category prediction result is within [0.95, 1.05], the category prediction result of the unlabeled incremental sample is considered accurate; otherwise, it is considered inaccurate.
  • when the category prediction result of the unlabeled incremental sample is accurate, the unlabeled incremental sample and its category prediction result can be combined and the model updated according to the conventional incremental learning method.
  • when the prediction is inaccurate, the classification model's category prediction for the sample data is likely to differ from every known category label: the sample belongs to a new category label or is an abnormal datum. The unlabeled incremental sample data can then be moved into a pending set for storage, called here the abnormal set S_ab.
  • performing new-class label recognition on the stored unlabeled incremental sample set to obtain a new class label may include the following steps A1-A3:
  • Step A1: Perform new-class mining on the stored unlabeled incremental sample set, and screen out the new-class cluster with the highest density and a cluster size larger than a preset value.
  • a density-based new-cluster mining algorithm can be used to mine new-class clusters from the stored unlabeled incremental sample set and find the cluster with the highest density under a size constraint; the data of this new cluster are considered to belong to the same new class label.
  • a distance assumption is made for the unlabeled incremental samples in the abnormal set, i.e., it is assumed that samples that are closer together are more likely to share the same label.
  • the density-based new-cluster mining algorithm returns the new cluster c with the largest density and a size greater than ms.
  • the specific process is described as follows:
  • the density-based clustering method stops the search as soon as the first cluster that meets the conditions is found, and the search is a greedy process, which ensures cluster quality and search speed as far as possible.
  • Step A2: Input the validation samples with complete label categories one by one into the established classification model for category prediction.
  • Step A3: According to the category prediction results for the validation samples and the values of the new-class cluster, identify the true label category to which the new-class cluster belongs, and use it as the new class label of the stored unlabeled incremental samples.
  • the prediction results obtained by feeding the fully labeled validation sample data set into the classification model one by one can be denoted YV′, and the corresponding true results YV.
  • some of the results in YV′ carry new class labels. For the data predicted with these new labels, count which unknown class accounts for the largest proportion of their true labels; from a statistical point of view, that unknown true label most likely corresponds to the model's new class. If no corresponding unknown label can be found, the "new class" is actually a known class rather than a new one, and this part of the data is processed as known-class data.
  • the specific algorithm flow is described as follows:
  • (1) Input: validation sample data set {XV, YV} and T1; output: the true category C of T1. (2) Feed the validation inputs XV into the model to obtain the prediction results YV′. (3) Record in Pos all positions in YV′ whose value is T1. (4) Extract all values of YV at the positions in Pos and record them in Val. (5) Count the frequency of each label value in Val and take the most frequent one that is not a category already known to the model as C. (6) If such a C is found, return C; otherwise return None, indicating that no corresponding true category can be found.
  • S260 Form an incremental sample with a new class label from the identified new class label and the stored unlabeled incremental sample, and perform incremental learning on the established classification model to implement training and update of the classification model.
  • if the category prediction result Y_C for the unlabeled incremental samples is a category unknown to the classification model, new-class recognition is performed, and this group of samples sharing the same new label is incrementally updated. After the new class label increment, the newly identified new class label and the correspondingly stored unlabeled incremental samples can be formed into incremental samples with new class labels, and incremental learning is performed on the established classification model. After incremental learning, the output weights are updated, where W_C and B_C are generated in the same way as W and B, and λ ∈ (0, 1] is a confidence factor: the closer the value is to 1, the higher the confidence that the data belong to the new class; when it is 1, the values of Y_C′ are all 0. The advantage of this setting is that the loss of information is reduced.
  • Figure 4 shows a schematic diagram of the sample data set information.
  • Figures 5a, 5b and 5c respectively show comparisons of the incremental learning algorithms under different sample data sets.
  • the incremental learning algorithm is expanded by using the classification model to perform category prediction on the unlabeled incremental samples, thereby augmenting the labeled incremental samples required for incremental learning.
  • this overcomes the defect that the classification model must be retrained when it is updated and cannot learn incrementally on the basis of the model, even when the labels of the labeled data are incomplete.
  • the effect of improving the prediction accuracy of the model through incremental operations is achieved, and good learning accuracy is maintained even with incomplete labels, so the algorithm has good learnability; the model can be updated quickly when new sample data are given, and the model complexity is kept at a level similar to the complexity of the sample data, reducing the complexity of model learning.
  • compared with general semi-supervised enhanced learning, the semi-supervised incremental learning based on the extreme learning machine in this embodiment has new-class mining capability when the initial data labels are partially missing, and realizes incremental operations based on a large amount of unlabeled data to improve the prediction accuracy of the model; compared with other algorithms, its application scenarios are wider.
  • FIG. 6 is a structural block diagram of an apparatus for implementing incremental learning of a classification model provided in an embodiment of the present application.
  • the embodiments of the present application may be applicable to the case of performing incremental learning on the category recognition model.
  • the apparatus can be implemented in software and/or hardware, and can be integrated in any electronic device with a network communication function.
  • the apparatus for implementing incremental learning of a classification model may include: a sample acquisition module 610, a sample prediction module 620, and an incremental learning module 630, wherein:
  • a sample acquisition module 610 configured to acquire at least one unlabeled incremental sample
  • the sample prediction module 620 is configured to input the unlabeled incremental samples one by one into the established classification model for category prediction; the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
  • the incremental learning module 630 is configured to perform incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
  • the incremental learning module 630 includes:
  • if the category prediction result for an unlabeled incremental sample is determined to be accurate, the unlabeled incremental sample and the category prediction result form one labeled incremental sample, and incremental learning is performed on the established classification model based on the composed labeled incremental samples.
  • the incremental learning module 630 includes:
  • the recognized new class labels and the stored unlabeled incremental samples are formed into incremental samples with new class labels, and incremental learning is performed on the established classification model.
  • performing new-class label recognition on the stored unlabeled incremental sample set to obtain a new class label includes:
  • according to the category prediction results for the validation samples and the values of the new-class cluster, the true label category to which the new-class cluster belongs is identified and used as the new class label of the stored unlabeled incremental samples.
  • the category prediction result output by the classification model is a matrix transformed by one-hot encoding; the incremental learning module 630 further includes:
  • the apparatus for implementing incremental learning of a classification model provided in this embodiment of the present application can execute the method for implementing incremental learning of a classification model provided in any embodiment of the present application, and has the corresponding functions and effects; for the detailed process, refer to the relevant operations of the method in the foregoing embodiments.
  • FIG. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
  • the electronic device provided in this embodiment of the present application includes: one or more processors 710 and a storage device 720; the number of processors 710 in the electronic device may be one or more, and one processor 710 is taken as an example in FIG. 7; the storage device 720 is used to store one or more programs; the one or more programs are executed by the one or more processors 710, so that the one or more processors 710 implement the method for implementing incremental learning of a classification model according to any embodiment of the present application.
  • the electronic device may further include: an input device 730 and an output device 740 .
  • the processor 710, the storage device 720, the input device 730 and the output device 740 in the electronic device may be connected by a bus or in other ways; in FIG. 7, connection by a bus is taken as an example.
  • as a computer-readable storage medium, the storage device 720 in the electronic device can be used to store one or more programs, which may be software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the method for implementing incremental learning of a classification model provided in the embodiments of the present application.
  • the processor 710 executes various functional applications and data processing of the electronic device by running the software programs, instructions and modules stored in the storage device 720 , that is, implementing the method for implementing incremental learning of the classification model in the above method embodiments.
  • the storage device 720 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the electronic device, and the like. Additionally, storage device 720 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, storage device 720 may further include memory located remotely from processor 710, which remote memory may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • the input device 730 may be used to receive input numerical or character information, and generate key signal input related to user setting and function control of the electronic device.
  • the output device 740 may include a display device such as a display screen.
  • the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
  • the program can also perform the relevant operations in the method for implementing incremental learning of a classification model provided in any embodiment of the present application.
  • an embodiment of the present application provides a computer-readable medium on which a computer program is stored; when executed by a processor, the program is used to execute a method for implementing incremental learning of a classification model, the method including:
  • the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
  • when executed by the processor, the program may also be used to execute the method for implementing incremental learning of a classification model provided in any embodiment of the present application.
  • the computer storage medium of the embodiments of the present application may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • Examples of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., via the Internet using an Internet service provider).

Abstract

A method, apparatus, electronic device and medium for implementing incremental learning of a classification model. The method includes: acquiring at least one unlabeled incremental sample (S110); inputting the unlabeled incremental samples one by one into an established classification model for category prediction, the classification model being at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete (S120); and performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model (S130).

Description

Method, Apparatus, Electronic Device and Medium for Implementing Incremental Learning of a Classification Model
Technical Field
The present application relates to the field of computer technology, for example, to a method, apparatus, electronic device and medium for implementing incremental learning of a classification model.
Background
Semi-supervised learning, as a combination of supervised learning and unsupervised learning, uses a limited number of labeled samples and a large number of unlabeled samples for modeling and learning. The general procedure of a semi-supervised learning method is to build a preliminary model from the labeled data and then train and optimize the model according to the distribution characteristics of the unlabeled data, thereby improving the accuracy of the model. However, when new unlabeled samples are provided, the classification model must be retrained; it is difficult to learn incrementally on the basis of the existing classification model, or the learning cost is high. Therefore, how to perform incremental learning on a classification model under semi-supervised conditions becomes particularly important.
Summary
The present application provides a method, apparatus, electronic device and medium for implementing incremental learning of a classification model, so that incremental learning can be performed on the basis of a large number of unlabeled samples to improve model prediction accuracy.
In a first aspect, an embodiment of the present application provides a method for implementing incremental learning of a classification model, the method including:
acquiring at least one unlabeled incremental sample;
inputting the unlabeled incremental samples one by one into an established classification model for category prediction, wherein the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
In a second aspect, an embodiment of the present application further provides an apparatus for implementing incremental learning of a classification model, the apparatus including:
a sample acquisition module, configured to acquire at least one unlabeled incremental sample;
a sample prediction module, configured to input the unlabeled incremental samples one by one into the established classification model for category prediction, wherein the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
an incremental learning module, configured to perform incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
In a third aspect, an embodiment of the present application further provides an electronic device, including:
one or more processing apparatuses;
a storage apparatus, configured to store one or more programs;
when the one or more programs are executed by the one or more processing apparatuses, the one or more processing apparatuses implement the method for implementing incremental learning of a classification model provided in the embodiments of the present application.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processing apparatus, it implements the method for implementing incremental learning of a classification model provided in any embodiment of the present application.
The embodiments of the present application provide a method for implementing incremental learning of a classification model, in which a classification model can be obtained at least partially by extreme learning machine modeling on labeled initial samples whose labels are incomplete, the pre-established classification model is used to perform category prediction on unlabeled incremental samples one by one, and the established classification model is then incrementally learned and updated based on the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples. With this scheme, after the classification model has been trained, when new data samples are provided, the classification model is used to predict the categories of unlabeled incremental samples so as to augment the labeled incremental samples required for incremental learning. This overcomes the defect that the classification model must be retrained when updated and cannot learn incrementally on the basis of the model, achieves the effect of improving model prediction accuracy through incremental operations on a large amount of unlabeled data even when the labels of the labeled data are incomplete, and keeps the model complexity at a level similar to the complexity of the data samples, reducing the cost and complexity of model learning.
Brief Description of the Drawings
FIG. 1 is a flowchart of a method for implementing incremental learning of a classification model provided in an embodiment of the present application;
FIG. 2 is a flowchart of another method for implementing incremental learning of a classification model provided in an embodiment of the present application;
FIG. 3 is an operation process diagram of classification model incremental learning provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of sample data set information provided in an embodiment of the present application;
FIG. 5a is a schematic diagram of an incremental learning comparison under one sample data set provided in an embodiment of the present application;
FIG. 5b is a schematic diagram of an incremental learning comparison under another sample data set provided in an embodiment of the present application;
FIG. 5c is a schematic diagram of an incremental learning comparison under yet another sample data set provided in an embodiment of the present application;
FIG. 6 is a structural block diagram of an apparatus for implementing incremental learning of a classification model provided in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The present application is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the entire structure.
Before discussing the exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as sequential processing, many of the operations (or steps) can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations can be rearranged. The processing may be terminated when its operations are completed, but may also have additional steps not included in the drawings. The processing may correspond to a method, a function, a procedure, a subroutine, a subprogram, and so on.
The method, apparatus, electronic device and storage medium for implementing incremental learning of a classification model provided in the present application are described in detail below through the following embodiments and their alternatives.
FIG. 1 is a flowchart of a method for implementing incremental learning of a classification model provided in an embodiment of the present application. The embodiment of the present application is applicable to the case of performing incremental learning on a category recognition model. The method can be executed by an apparatus for implementing incremental learning of a classification model; the apparatus can be implemented in software and/or hardware and integrated on any electronic device with a network communication function. As shown in FIG. 1, the method for implementing incremental learning of a classification model provided in this embodiment may include the following steps:
S110: Acquire at least one unlabeled incremental sample.
A sample category label can be used to represent the category information of a sample object. When training and updating the classification model, the data sample set may include three parts: an initial sample data set D1 that has sample category labels but whose labels are incomplete, an unlabeled incremental sample data set D2 whose samples have no category labels but actually cover all categories, and a validation sample data set D3 with complete sample category labels.
To facilitate the description of the incremental learning scheme for the classification model, a sample data set with five sample attributes and four sample categories is set up to illustrate the training and update of the classification model, where the sample attributes are denoted A1, A2, A3, A4, A5 and the sample categories are denoted C1, C2, C3, C4; the label side of such a data set is one-hot encoded before it is fed to the model, as sketched below.
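The following small Python sketch shows this one-hot encoding under the five-attribute, four-category setup; the function and variable names are ours, introduced only for illustration.

    import numpy as np

    CATEGORIES = ["C1", "C2", "C3", "C4"]

    def one_hot(labels, categories=CATEGORIES):
        # Each label becomes a row with a single 1 at its category's index.
        index = {c: i for i, c in enumerate(categories)}
        Y = np.zeros((len(labels), len(categories)))
        for row, label in enumerate(labels):
            Y[row, index[label]] = 1.0
        return Y

    # Example: three labeled initial samples from D1.
    Y = one_hot(["C1", "C3", "C1"])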
S120: Input the unlabeled incremental samples one by one into the established classification model for category prediction; the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete.
Given the initial sample data set D1 with partial but incomplete category labels, since some of the sample category labels of D1 are known, the initial sample data with known labels can be modeled in advance with an extreme learning machine to obtain an extreme learning machine model.
Optionally, the extreme learning machine is a neural network containing a single input layer, a single hidden layer and a single output layer. Once the size of the initial sample data set D1 is given, the numbers of input-layer and output-layer nodes are known, and the number of hidden-layer nodes can be allocated through experience and trial, for example around a value determined by the data size (the exact expression appears as an image in the source).
Optionally, the parameter calculation process of the classification model may be β = H⁺Y with H = σ(WX + B), where W and B are the input weights and biases respectively, each a matrix whose values are randomly initialized in [-1, 1], and σ is the sigmoid activation function σ(x) = 1/(1 + e^(-x)). Denoting the input of the extreme learning machine X and the output Y, a classification model is obtained by modeling the initial sample data set D1 with the extreme learning machine.
When new unlabeled incremental samples arrive, they are fed one by one into the classification model for category prediction, and the category prediction results for the unlabeled incremental samples are obtained. Here, the category prediction result Y output by the classification model is a matrix transformed by one-hot encoding. For example, after the classification model is built with the extreme learning machine, predictions can be made on new data: given at least one new unlabeled incremental sample x, the classification model predicts the input unlabeled incremental samples one by one and returns the prediction result y′ = σ(Wx + B)β.
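As a concrete illustration of this modeling step, the following Python sketch builds such a classifier. It is a minimal sketch under two assumptions: the output weights are solved as β = H⁺Y (the patent's explicit formula is an image in the source), and samples are stored as rows of X, so the hidden-layer output is written σ(XW + B) rather than σ(WX + B). The class name ELMClassifier and all variable names are ours, not the patent's.

    import numpy as np

    def sigmoid(z):
        # Sigmoid activation: sigma(x) = 1 / (1 + e^(-x)).
        return 1.0 / (1.0 + np.exp(-z))

    class ELMClassifier:
        # Minimal extreme learning machine: single input, hidden and output layer.

        def __init__(self, n_hidden, seed=None):
            self.n_hidden = n_hidden
            self.rng = np.random.default_rng(seed)

        def fit(self, X, Y_onehot):
            # W and B are randomly initialized in [-1, 1] and never trained.
            n_features = X.shape[1]
            self.W = self.rng.uniform(-1.0, 1.0, size=(n_features, self.n_hidden))
            self.B = self.rng.uniform(-1.0, 1.0, size=(1, self.n_hidden))
            H = sigmoid(X @ self.W + self.B)           # hidden-layer output matrix
            self.beta = np.linalg.pinv(H) @ Y_onehot   # output weights via pseudo-inverse
            return self

        def predict_soft(self, X):
            # Soft output; rows are close to one-hot when the model is confident.
            return sigmoid(X @ self.W + self.B) @ self.beta

A model trained this way on D1 can then score each arriving unlabeled incremental sample with predict_soft, which is the quantity the acceptance test described below operates on.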
S130: Perform incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
According to the method for implementing incremental learning of a classification model provided in this embodiment, after the classification model has been trained, when new data samples are provided, the classification model is used to predict the categories of unlabeled incremental samples so as to augment the labeled incremental samples required for incremental learning. This overcomes the defect that the classification model must be retrained when updated and cannot learn incrementally on the basis of the model; it achieves the effect of improving model prediction accuracy through incremental operations on a large amount of unlabeled data even when the labels of the labeled data are incomplete, maintains good learning accuracy under incomplete labels so that the algorithm has good learnability, allows the model to be updated quickly when new sample data are given, and keeps the model complexity at a level similar to the complexity of the sample data, reducing the cost and complexity of model learning.
FIG. 2 is a flowchart of another method for implementing incremental learning of a classification model provided in an embodiment of the present application. This embodiment is described on the basis of the foregoing embodiment and can be combined with the alternatives in one or more of the above embodiments. As shown in FIG. 2, the method for implementing incremental learning of a classification model provided in this embodiment may include the following steps:
S210: Acquire at least one unlabeled incremental sample.
S220: Input the unlabeled incremental samples one by one into the established classification model for category prediction; the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete.
S230: Determine whether each category prediction result for the unlabeled incremental samples is accurate.
Referring to FIG. 3, the initial samples can be used for the preliminary training of the classification model, and the incremental samples can be used for incremental learning of the established classification model (if unlabeled, they cannot be used directly in semi-supervised learning). For each unlabeled incremental sample, a category prediction result can be obtained through the classification model, but not all category prediction results should be accepted; that is, some prediction results will be inaccurate. The accuracy of the obtained category prediction results therefore needs further screening to separate out category prediction results that may be abnormal. Then, for a new group of sample data S = (X_C, Y_C) determined for incremental learning and update, different update operations can be performed on the classification model according to the prediction accuracy of the category prediction result Y_C.
In an alternative of this embodiment, referring to FIG. 3, performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples may include the following:
if the category prediction result for an unlabeled incremental sample is determined to be accurate, the unlabeled incremental sample and the category prediction result form one labeled incremental sample, and incremental learning is performed on the established classification model based on the composed labeled incremental samples.
The output Y of the classification model is a one-hot encoded matrix, and the category prediction result output for an unlabeled incremental sample input to the classification model obeys a similar rule. For example, given the category prediction result for an unlabeled incremental sample (the example matrix appears as an image in the source), the maximum element value in the matrix is usually taken as the real predicted result: that position is set to 1 and the other positions to 0, yielding the one-hot prediction. From the value of each element in the matrix output as the category prediction result for the unlabeled incremental sample, it can be determined whether the prediction is accurate. If the category prediction result for the unlabeled incremental sample is determined to be accurate, the sample category of the unlabeled incremental sample is known from the prediction result; the unlabeled incremental sample and its category prediction result can then be formed into a labeled datum and fed into the classification model for conventional incremental learning, improving the prediction accuracy of the model.
Exemplarily, if the category prediction result Y_C for the unlabeled incremental samples is accurate, Y_C is usually a category already known to the classification model, and the classification model can use the composed labeled incremental samples to perform a conventional incremental learning operation on the classification model. After the conventional incremental update of the classification model, the output weights are recomputed (the explicit update formula appears as an image in the source), where W_C and B_C are generated in the same way as W and B.
In an alternative of this embodiment, performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples further includes the following operation:
determining whether the category prediction result for an unlabeled incremental sample is accurate based on the degree to which the maximum element value in the matrix corresponding to the category prediction result deviates from a preset threshold.
The category prediction result output by the classification model is a matrix transformed by one-hot encoding, and the model prediction result of a single sample obeys a similar rule. However, analysis shows that if the maximum element value of the matrix corresponding to the category prediction result for an unlabeled incremental sample deviates greatly from the preset threshold (the preset threshold can be 1), that category prediction result is inaccurate; that is, the category of the sample cannot be accurately known from it. For example, for two category prediction results whose corresponding matrices have maximum element values of 0.5 and 1.4 respectively, both clearly deviate from the theoretical optimum of 1, so both predictions are most likely inaccurate.
Optionally, analysis shows that when the deviation of the maximum element value from the preset threshold is within ±0.05, the category prediction result for the unlabeled incremental sample can be accepted; that is, when the maximum element value in the matrix corresponding to the category prediction result is within [0.95, 1.05], the prediction is considered accurate, and otherwise it is considered inaccurate. Thus, when the category prediction result for an unlabeled incremental sample is accurate, the sample and its category prediction result can be combined and the model updated according to the conventional incremental learning method.
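The acceptance rule just described can be sketched directly; the 0.95/1.05 bounds and the argmax-to-one-hot conversion come from the description above, while the function names are ours.

    import numpy as np

    def to_one_hot(y_soft):
        # Take the maximum element as the real predicted class:
        # set that position to 1 and all other positions to 0.
        y = np.zeros_like(y_soft)
        y[np.argmax(y_soft)] = 1.0
        return y

    def accept_prediction(y_soft, low=0.95, high=1.05):
        # Accept only if the maximum element lies within +/-0.05 of the
        # preset threshold 1; maxima such as 0.5 or 1.4 are rejected.
        return low <= float(np.max(y_soft)) <= high

Accepted samples join the group S = (X_C, Y_C) used for the conventional incremental update; rejected samples are moved into the abnormal set S_ab described next.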
S240: If the category prediction result for an unlabeled incremental sample is determined to be inaccurate, store the inaccurately predicted unlabeled incremental sample.
Referring to FIG. 3, if the category prediction result for an unlabeled incremental sample is inaccurate, it can be considered that the classification model's category prediction for the sample data is likely to differ from every known category label: the sample belongs to a new category label or is an abnormal datum. The unlabeled incremental sample data can then be moved into a pending set for storage, called here the abnormal set S_ab.
S250: When the stored unlabeled incremental samples reach a preset data amount, perform new-class label recognition on the stored unlabeled incremental sample set to obtain new class labels.
Referring to FIG. 3, when the amount p of unlabeled incremental samples stored in the abnormal set S_ab = {x1, x2, ..., xp} reaches a specified threshold, there is reason to believe that the abnormal set contains unlabeled incremental samples of new categories that the classification model cannot recognize. At this point, new-class label recognition can be performed on the stored unlabeled incremental sample set to obtain the new categories hidden in the unlabeled incremental samples and construct their new class labels.
In an alternative of this embodiment, referring to FIG. 3, performing new-class label recognition on the stored unlabeled incremental sample set to obtain new class labels may include the following steps A1 to A3:
Step A1: Perform new-class mining on the stored unlabeled incremental sample set and screen out the new-class cluster with the highest density and a cluster size larger than a preset value.
For the unlabeled incremental sample set in the abnormal set, a density-based new-cluster mining algorithm can be used to mine new-class clusters and find the cluster with the highest density under a size constraint; the data in this new-class cluster are considered to belong to the same new category label. When mining new-class clusters with the density-based algorithm, a distance assumption is made for the unlabeled incremental samples in the abnormal set: samples that are closer together are more likely to share the same label.
Optionally, given a minimum cluster size ms (ms < p) and a distance increment Δd, the density-based new-cluster mining algorithm returns the new-class cluster c with the highest density and a size greater than ms. The specific procedure is described as follows:
(1) Input: S_ab, ms, Δd; output: cluster c. (2) Treat each unlabeled incremental sample in the abnormal set as a cluster, giving {x1}, {x2}, ..., {xp}. (3) Compute the minimum distance d between unlabeled incremental samples. (4) Merge into one cluster the clusters containing a pair of unlabeled incremental samples whose distance is less than or equal to d and which do not belong to the same cluster; if no such pair can be found, set d = d + Δd and repeat this operation until one is found. (5) Check whether the size of the merged cluster exceeds ms; if not, go back to operation (4); if it does, return that cluster and terminate the algorithm.
Optionally, when d grows to the maximum inter-sample distance, all unlabeled incremental samples will be assigned to the same cluster, whose size then exceeds the given condition, so the termination condition is always reached. This density-based clustering method stops the search as soon as the first cluster satisfying the conditions is found, and the search is a greedy process, which ensures cluster quality and search speed as far as possible.
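A minimal sketch of steps (1) to (5), assuming Euclidean distance between samples; the function name and data layout (samples as rows of a NumPy array) are ours.

    import numpy as np
    from itertools import combinations

    def mine_densest_cluster(S_ab, ms, delta_d):
        # Greedy density-based new-cluster mining over the abnormal set S_ab.
        n = len(S_ab)
        clusters = [{i} for i in range(n)]        # (2) every sample starts as a cluster
        dist = np.linalg.norm(S_ab[:, None, :] - S_ab[None, :, :], axis=-1)
        d = dist[np.triu_indices(n, k=1)].min()   # (3) minimum pairwise distance

        while True:
            merged = False
            for a, b in combinations(range(len(clusters)), 2):
                close = any(dist[i, j] <= d for i in clusters[a] for j in clusters[b])
                if close:
                    clusters[a] |= clusters[b]    # (4) merge the two clusters
                    del clusters[b]               # safe: b > a, so index a is unchanged
                    if len(clusters[a]) > ms:     # (5) first qualifying cluster wins
                        return sorted(clusters[a])
                    merged = True
                    break
            if not merged:
                d += delta_d                      # (4) no mergeable pair: relax the bound

Because the merge loop restarts after every merge and returns the first cluster whose size exceeds ms, the sketch mirrors the greedy, early-stopping behavior described above.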
Step A2: Input the validation samples with complete label categories one by one into the established classification model for category prediction.
Step A3: According to the category prediction results for the validation samples and the values of the new-class cluster, identify the true label category to which the new-class cluster belongs, and use it as the new class label of the stored unlabeled incremental samples.
The mined new-class cluster must be matched to the true category label of its samples. Taking the sample categories C1, C2, C3, C4 as an example, suppose the labels present in the initial sample data set D1 are C1 and C2; a classification model built on D1 can initially recognize only C1 and C2, so it cannot tell whether the new-class cluster is actually C3 or C4. The obtained new-class cluster can first be provisionally denoted T1, and the correspondence between T1 and the true category label is then found, i.e., T1 = C3 or T1 = C4.
Referring to FIG. 3, the prediction results obtained by feeding the fully labeled validation sample data set into the classification model one by one can be denoted YV′, and the true results YV. Some of the results in YV′ are the new class label; for the data predicted with the new label, count which unknown category accounts for the largest proportion of their true labels. From a statistical point of view, that unknown true label most likely corresponds to the model's new class. If no corresponding unknown label can be found, the "new class" is actually some known category rather than a new one, and this part of the data is processed as known-category data. The specific algorithm flow is described as follows:
(1) Input: validation sample data set {XV, YV} and T1; output: the true category C of T1. (2) Feed the validation inputs XV into the model to obtain the prediction results YV′. (3) Record in Pos all positions in YV′ whose value is T1. (4) Extract all values of YV at the positions in Pos and record them in Val. (5) Count the frequency of each label value in Val and take the most frequent one that is not a category already known to the model as C. (6) If such a C is found, return C; otherwise return None, indicating that no corresponding true category can be found.
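The six steps translate almost line for line into Python. This sketch assumes a helper predict_labels that returns one label per validation sample (the patent does not name such a function); everything else follows the algorithm above.

    from collections import Counter

    def true_class_of_new_label(predict_labels, XV, YV, t1, known_classes):
        # Map the provisional new-class label t1 to a true category, steps (1)-(6).
        yv_pred = predict_labels(XV)                         # (2) model predictions YV'
        pos = [i for i, y in enumerate(yv_pred) if y == t1]  # (3) positions valued t1
        val = [YV[i] for i in pos]                           # (4) their true labels
        for label, _ in Counter(val).most_common():          # (5) most frequent first
            if label not in known_classes:
                return label                                 # (6) the true category C
        return None                                          # (6) no unknown label fits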
S260: Form incremental samples with new class labels from the identified new class labels and the stored unlabeled incremental samples, and perform incremental learning on the established classification model to train and update it.
Referring to FIG. 3, when the category prediction result for the unlabeled incremental samples is inaccurate, the category prediction result Y_C corresponds to a category unknown to the classification model. By identifying the new-class cluster hidden in the unlabeled incremental samples and performing new-class recognition, this group of samples sharing the same new label can have their labels incrementally updated. After the new class label increment, the identified new class labels and the correspondingly stored unlabeled incremental samples form incremental samples with new class labels, and incremental learning is performed on the established classification model. After incremental learning, the output weights are updated (the explicit formula appears as an image in the source), where W_C and B_C are generated in the same way as W and B, and λ ∈ (0, 1] is a confidence factor: the closer its value is to 1, the higher the confidence that the data belong to the new class; when it equals 1, the values of Y_C′ are all 0. The advantage of this setting is that the loss of information is reduced. FIG. 4 shows a schematic diagram of the sample data set information, and FIG. 5a, FIG. 5b and FIG. 5c show comparisons of the incremental learning algorithms under different sample data sets.
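The exact formula for Y_C′ is an image in the source; the sketch below encodes one reading that is consistent with the stated boundary case (the old-class part of the targets is all zeros when λ = 1), namely scaling the old-class targets by (1 - λ) and placing mass λ on the appended new-class column. Treat this construction as an assumption, not the patent's formula.

    import numpy as np

    def build_new_class_targets(y_soft_old, lam):
        # Hedged target construction for samples assigned to a newly mined class:
        # old-class entries are damped by (1 - lam) so they vanish when lam == 1,
        # and the appended column carries confidence lam for the new class.
        n = y_soft_old.shape[0]
        y_old_part = (1.0 - lam) * y_soft_old     # Y_C': all zeros when lam == 1
        new_col = np.full((n, 1), lam)            # mass on the new-class column
        return np.hstack([y_old_part, new_col])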
According to the method for implementing incremental learning of a classification model provided in this embodiment, after the classification model has been trained, when new data samples are provided, the classification model is used to predict the categories of unlabeled incremental samples so as to augment the labeled incremental samples required for incremental learning. This overcomes the defect that the classification model must be retrained when updated and cannot learn incrementally on the basis of the model; it achieves the effect of improving model prediction accuracy through incremental operations on a large amount of unlabeled data even when the labels of the labeled data are incomplete, maintains good learning accuracy under incomplete labels so that the algorithm has good learnability, allows the model to be updated quickly when new sample data are given, and keeps the model complexity at a level similar to the complexity of the sample data, reducing the complexity of model learning. Moreover, when the true label of unlabeled data does not belong to any known label, the method can roughly judge whether the prediction of the data is abnormal and can perform new-class mining on the abnormal data. Compared with general semi-supervised enhanced learning, the semi-supervised incremental learning based on the extreme learning machine in this embodiment has new-class mining capability when the initial data labels are partially missing, and realizes incremental operations on a large amount of unlabeled data to improve model prediction accuracy; compared with other algorithms, its application scenarios are wider.
On the basis of the above embodiments, optionally, since a matrix pseudo-inverse must be computed in every incremental calculation, the cost of the inversion grows as the matrix becomes larger. Considering that the process is incremental, part of the existing computation can be reused, saving a large amount of calculation.
Given a matrix A and its pseudo-inverse A⁺, the pseudo-inverse of its row-incremental matrix [A A_C] is
[A A_C]⁺ = [A⁺ - D C⁺; C⁺], where D = A⁺ A_C and C = A_C - A D.
Similarly, the pseudo-inverse of its column-incremental matrix [A; A_C] is
[A; A_C]⁺ = [A⁺ - C⁺ Dᵀ, C⁺], where Dᵀ = A_C A⁺ and C = A_C - Dᵀ A.
Substituting the above computation into the incremental process of the classification model yields the iterative formulas:
If the category prediction result Y_C for the unlabeled incremental samples is a category known to the classification model, the updated output weights are obtained by applying the incremental pseudo-inverse formulas above to the augmented hidden-layer output and target matrices (the fully expanded formula appears as an image in the source), where A = σ(WX + B), A_C = σ(W_C X_C + B_C), D = A⁺ A_C, and C = A_C - A D. If the category prediction result Y_C for the unlabeled incremental samples is not a category known to the classification model, the updated output weights are obtained analogously, with the target matrix extended by the new-class column (the fully expanded formula appears as an image in the source), where again A = σ(WX + B), A_C = σ(W_C X_C + B_C), and C = A_C - A D.
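The reuse of an existing pseudo-inverse can be checked numerically. The sketch below implements the identity for the augmented matrix [A A_C] as reconstructed above (D = A⁺A_C, C = A_C - AD) and compares it with a pseudo-inverse computed from scratch; the identity in this simplified form is assumed to hold when C has full column rank, which the random test matrices satisfy almost surely.

    import numpy as np

    def pinv_append_columns(A, A_pinv, A_c):
        # Pseudo-inverse of [A A_c], reusing the already known A_pinv:
        # D = A+ A_c, C = A_c - A D, [A A_c]+ = [A+ - D C+; C+].
        D = A_pinv @ A_c
        C = A_c - A @ D
        C_pinv = np.linalg.pinv(C)
        return np.vstack([A_pinv - D @ C_pinv, C_pinv])

    # Numerical check against a from-scratch pseudo-inverse.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 8))
    A_c = rng.standard_normal((50, 3))
    incremental = pinv_append_columns(A, np.linalg.pinv(A), A_c)
    direct = np.linalg.pinv(np.hstack([A, A_c]))
    assert np.allclose(incremental, direct, atol=1e-8)

Only the smaller matrix C needs a fresh pseudo-inverse, which is the saving the description refers to.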
FIG. 6 is a structural block diagram of an apparatus for implementing incremental learning of a classification model provided in an embodiment of the present application. The embodiment of the present application is applicable to the case of performing incremental learning on a category recognition model. The apparatus can be implemented in software and/or hardware and integrated on any electronic device with a network communication function.
As shown in FIG. 6, the apparatus for implementing incremental learning of a classification model provided in this embodiment may include: a sample acquisition module 610, a sample prediction module 620 and an incremental learning module 630, wherein:
the sample acquisition module 610 is configured to acquire at least one unlabeled incremental sample;
the sample prediction module 620 is configured to input the unlabeled incremental samples one by one into the established classification model for category prediction, the classification model being at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
the incremental learning module 630 is configured to perform incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
On the basis of the above embodiments, optionally, the incremental learning module 630 includes:
if the category prediction result for an unlabeled incremental sample is determined to be accurate, forming one labeled incremental sample from the unlabeled incremental sample and the category prediction result for the unlabeled incremental sample, and performing incremental learning on the established classification model based on the composed labeled incremental samples.
On the basis of the above embodiments, optionally, the incremental learning module 630 includes:
if the category prediction result for an unlabeled incremental sample is determined to be inaccurate, storing the inaccurately predicted unlabeled incremental sample;
when the stored unlabeled incremental samples reach a preset data amount, performing new-class label recognition on the stored unlabeled incremental sample set to obtain new class labels;
forming incremental samples with new class labels from the identified new class labels and the stored unlabeled incremental samples, and performing incremental learning on the established classification model.
On the basis of the above embodiments, optionally, performing new-class label recognition on the stored unlabeled incremental sample set to obtain new class labels includes:
performing new-class mining on the stored unlabeled incremental sample set, and screening out the new-class cluster with the highest density and a cluster size larger than a preset value;
inputting the validation samples with complete label categories one by one into the established classification model for category prediction;
identifying, according to the category prediction results for the validation samples and the values of the new-class cluster, the true label category to which the new-class cluster belongs, and using it as the new class label of the stored unlabeled incremental samples.
On the basis of the above embodiments, optionally, the category prediction result output by the classification model is a matrix transformed by one-hot encoding; the incremental learning module 630 further includes:
determining whether the category prediction result for an unlabeled incremental sample is accurate based on the degree to which the maximum element value in the matrix corresponding to the category prediction result deviates from a preset threshold.
The apparatus for implementing incremental learning of a classification model provided in this embodiment of the present application can execute the method for implementing incremental learning of a classification model provided in any embodiment of the present application, and has the corresponding functions and effects; for the detailed process, refer to the relevant operations of the method in the foregoing embodiments.
FIG. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. As shown in FIG. 7, the electronic device provided in this embodiment includes: one or more processors 710 and a storage device 720; there may be one or more processors 710 in the electronic device, and one processor 710 is taken as an example in FIG. 7; the storage device 720 is used to store one or more programs; the one or more programs are executed by the one or more processors 710, so that the one or more processors 710 implement the method for implementing incremental learning of a classification model according to any embodiment of the present application.
The electronic device may further include an input device 730 and an output device 740.
The processor 710, the storage device 720, the input device 730 and the output device 740 in the electronic device may be connected by a bus or in other ways; in FIG. 7, connection by a bus is taken as an example.
As a computer-readable storage medium, the storage device 720 in the electronic device can be used to store one or more programs, which may be software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the method for implementing incremental learning of a classification model provided in the embodiments of the present application. The processor 710 runs the software programs, instructions and modules stored in the storage device 720 to execute the various functional applications and data processing of the electronic device, that is, to implement the method for implementing incremental learning of a classification model in the above method embodiments.
The storage device 720 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application required by at least one function, and the data storage area may store data created according to the use of the electronic device, and the like. In addition, the storage device 720 may include high-speed random access memory, and may further include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some examples, the storage device 720 may further include memory located remotely from the processor 710, and such remote memory may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 730 can be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. The output device 740 may include a display device such as a display screen.
Furthermore, when the one or more programs included in the above electronic device are executed by the one or more processors 710, the programs perform the following operations:
acquiring at least one unlabeled incremental sample;
inputting the unlabeled incremental samples one by one into the established classification model for category prediction, wherein the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
Of course, those skilled in the art will understand that when the one or more programs included in the above electronic device are executed by the one or more processors 710, the programs can also perform the relevant operations in the method for implementing incremental learning of a classification model provided in any embodiment of the present application.
An embodiment of the present application provides a computer-readable medium on which a computer program is stored; when executed by a processor, the program is used to execute a method for implementing incremental learning of a classification model, the method including:
acquiring at least one unlabeled incremental sample;
inputting the unlabeled incremental samples one by one into the established classification model for category prediction, wherein the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
Optionally, when executed by the processor, the program can also be used to execute the method for implementing incremental learning of a classification model provided in any embodiment of the present application.
The computer storage medium of the embodiments of the present application may adopt any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. A computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device.
The program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), and the like, or any suitable combination of the above.
Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, via the Internet using an Internet service provider).
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" mean that the specific features, structures, materials or characteristics described in connection with the embodiment or example are included in at least one embodiment or example of the present application. In this specification, the schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.

Claims (12)

  1. A method for implementing incremental learning of a classification model, comprising:
    acquiring at least one unlabeled incremental sample;
    inputting the unlabeled incremental samples one by one into an established classification model for category prediction, wherein the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
    performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
  2. The method according to claim 1, wherein performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples comprises:
    if the category prediction result for an unlabeled incremental sample is determined to be accurate, forming one labeled incremental sample from the unlabeled incremental sample and the category prediction result for the unlabeled incremental sample, and performing incremental learning on the established classification model based on the composed labeled incremental samples.
  3. The method according to claim 1, wherein performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples comprises:
    if the category prediction result for an unlabeled incremental sample is determined to be inaccurate, storing the inaccurately predicted unlabeled incremental sample;
    when the stored unlabeled incremental samples reach a preset data amount, performing new-class label recognition on the stored unlabeled incremental sample set to obtain new class labels;
    forming incremental samples with new class labels from the identified new class labels and the stored unlabeled incremental samples, and performing incremental learning on the established classification model.
  4. The method according to claim 3, wherein performing new-class label recognition on the stored unlabeled incremental sample set to obtain new class labels comprises:
    performing new-class mining on the stored unlabeled incremental sample set, and screening out the new-class cluster with the highest density and a cluster size larger than a preset value;
    inputting validation samples with complete label categories one by one into the established classification model for category prediction;
    identifying, according to the category prediction results for the validation samples and the values of the new-class cluster, the true label category to which the new-class cluster belongs, and using it as the new class label of the stored unlabeled incremental samples.
  5. The method according to claim 1, wherein the category prediction result output by the classification model is a matrix transformed by one-hot encoding;
    correspondingly, performing incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples further comprises:
    determining whether the category prediction result for an unlabeled incremental sample is accurate based on the degree to which the maximum element value in the matrix corresponding to the category prediction result deviates from a preset threshold.
  6. An apparatus for implementing incremental learning of a classification model, comprising:
    a sample acquisition module, configured to acquire at least one unlabeled incremental sample;
    a sample prediction module, configured to input the unlabeled incremental samples one by one into an established classification model for category prediction, wherein the classification model is at least partially obtained by extreme learning machine modeling on labeled initial samples whose labels are incomplete;
    an incremental learning module, configured to perform incremental learning on the established classification model according to the category prediction results for the unlabeled incremental samples and the corresponding unlabeled incremental samples, so as to train and update the classification model.
  7. The apparatus according to claim 6, wherein the incremental learning module comprises:
    if the category prediction result for an unlabeled incremental sample is determined to be accurate, forming one labeled incremental sample from the unlabeled incremental sample and the category prediction result for the unlabeled incremental sample, and performing incremental learning on the established classification model based on the composed labeled incremental samples.
  8. The apparatus according to claim 6, wherein the incremental learning module comprises:
    if the category prediction result for an unlabeled incremental sample is determined to be inaccurate, storing the inaccurately predicted unlabeled incremental sample;
    when the stored unlabeled incremental samples reach a preset data amount, performing new-class label recognition on the stored unlabeled incremental sample set to obtain new class labels;
    forming incremental samples with new class labels from the identified new class labels and the stored unlabeled incremental samples, and performing incremental learning on the established classification model.
  9. The apparatus according to claim 8, wherein performing new-class label recognition on the stored unlabeled incremental sample set to obtain new class labels comprises:
    performing new-class mining on the stored unlabeled incremental sample set, and screening out the new-class cluster with the highest density and a cluster size larger than a preset value;
    inputting validation samples with complete label categories one by one into the established classification model for category prediction;
    identifying, according to the category prediction results for the validation samples and the values of the new-class cluster, the true label category to which the new-class cluster belongs, and using it as the new class label of the stored unlabeled incremental samples.
  10. The apparatus according to claim 6, wherein the category prediction result output by the classification model is a matrix transformed by one-hot encoding; the incremental learning module further comprises:
    determining whether the category prediction result for an unlabeled incremental sample is accurate based on the degree to which the maximum element value in the matrix corresponding to the category prediction result deviates from a preset threshold.
  11. An electronic device, comprising:
    one or more processing apparatuses;
    a storage apparatus, configured to store one or more programs;
    when the one or more programs are executed by the one or more processing apparatuses, the one or more processing apparatuses implement the method for implementing incremental learning of a classification model according to any one of claims 1 to 5.
  12. A computer-readable medium storing a computer program which, when executed by a processing apparatus, implements the method for implementing incremental learning of a classification model according to any one of claims 1 to 5.
PCT/CN2021/077147 2021-02-22 2021-02-22 Method and apparatus for implementing incremental learning of classification model, electronic device and medium WO2022174436A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/077147 WO2022174436A1 (zh) Method and apparatus for implementing incremental learning of classification model, electronic device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/077147 WO2022174436A1 (zh) Method and apparatus for implementing incremental learning of classification model, electronic device and medium

Publications (1)

Publication Number Publication Date
WO2022174436A1 true WO2022174436A1 (zh) 2022-08-25

Family

ID=82931912

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/077147 WO2022174436A1 (zh) Method and apparatus for implementing incremental learning of classification model, electronic device and medium

Country Status (1)

Country Link
WO (1) WO2022174436A1 (zh)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104820687A * 2015-04-22 2015-08-05 中国科学院信息工程研究所 Directed-link classifier construction method and classification method
CN108764281A * 2018-04-18 2018-11-06 华南理工大学 Image classification method based on a semi-supervised self-paced learning cross-task deep network
US20180322416A1 * 2016-08-30 2018-11-08 Soochow University Feature extraction and classification method based on support vector data description and system thereof
CN108920446A * 2018-04-25 2018-11-30 华中科技大学鄂州工业技术研究院 Method for processing engineering text
CN109034190A * 2018-06-15 2018-12-18 广州深域信息科技有限公司 Object detection system and method with active sample mining under a dynamic selection strategy
CN110244689A * 2019-06-11 2019-09-17 哈尔滨工程大学 AUV adaptive fault diagnosis method based on discriminative feature learning
CN112132179A * 2020-08-20 2020-12-25 中国人民解放军战略支援部队信息工程大学 Incremental learning method and system based on a small number of labeled samples


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117372819A (zh) * 2023-12-07 2024-01-09 神思电子技术股份有限公司 Incremental learning method, device and medium for object detection in a limited model space
CN117372819B (zh) * 2023-12-07 2024-02-20 神思电子技术股份有限公司 Incremental learning method, device and medium for object detection in a limited model space

Similar Documents

Publication Publication Date Title
US10719301B1 (en) Development environment for machine learning media models
EP3467723B1 (en) Machine learning based network model construction method and apparatus
US20230195845A1 (en) Fast annotation of samples for machine learning model development
US11537506B1 (en) System for visually diagnosing machine learning models
US20230267368A1 (en) System, device and method of detecting abnormal datapoints
US10732694B2 (en) Power state control of a mobile device
JP2021193615A (ja) 量子データの処理方法、量子デバイス、コンピューティングデバイス、記憶媒体、及びプログラム
CN115578248B Style-guided generalization-enhanced image classification algorithm
JP2023042582A (ja) サンプル分析の方法、電子装置、記憶媒体、及びプログラム製品
WO2022174436A1 (zh) Method and apparatus for implementing incremental learning of classification model, electronic device and medium
CN114492601A Training method and apparatus for resource classification model, electronic device and storage medium
US20190392331A1 (en) Automatic and self-optimized determination of execution parameters of a software application on an information processing platform
US11068779B2 (en) Statistical modeling techniques based neural network models for generating intelligence reports
WO2022198477A1 (zh) Method and apparatus for implementing incremental learning of classification model, electronic device and medium
WO2023078009A1 (zh) Model weight acquisition method and related system
US20230004870A1 (en) Machine learning model determination system and machine learning model determination method
US20230229570A1 (en) Graph machine learning for case similarity
US20220051077A1 (en) System and method for selecting components in designing machine learning models
Moreira et al. Prototype generation using self-organizing maps for informativeness-based classifier
CN114898184A Model training method, data processing method, apparatus and electronic device
CN115410250A Array-based facial beauty prediction method, device and storage medium
CN114428720A P-K-based software defect prediction method and apparatus, electronic device and medium
CN114154581A MPI-based distributed ADMM spam classification method
CN114020916A Text classification method and apparatus, storage medium and electronic device
WO2022031839A1 (en) Systems and methods for improved core sample analysis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21926154

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205AX DATED 14.12.2023)