CN111539474B - Classifier model transfer learning method - Google Patents

Classifier model transfer learning method

Info

Publication number
CN111539474B
CN111539474B (application CN202010329243.5A)
Authority
CN
China
Prior art keywords
sensor array
new
classifier model
layer
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010329243.5A
Other languages
Chinese (zh)
Other versions
CN111539474A (en)
Inventor
赵海堂
刘航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology filed Critical Dalian University of Technology
Priority to CN202010329243.5A priority Critical patent/CN111539474B/en
Publication of CN111539474A publication Critical patent/CN111539474A/en
Application granted granted Critical
Publication of CN111539474B publication Critical patent/CN111539474B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24133 Distances to prototypes
    • G06F 18/24137 Distances to cluster centroïds
    • G06F 18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Investigating Or Analyzing Materials By The Use Of Fluid Adsorption Or Reactions (AREA)

Abstract

The invention provides a classifier model transfer learning method, which comprises the following steps: inserting a new sensor array at the end of the life cycle of an original sensor array, and starting to acquire the response signals of the original sensor array and the new sensor array; predicting the category information of the new sensor array from the classification model information of the original sensor array, the category information being referred to as the pseudo label of the new sensor array; performing feature extraction on the response signals of the new sensor array, forming a training set together with the pseudo labels, and training to obtain a classifier model of the new sensor array; and optimizing the obtained classifier model parameters by layer-by-layer adjustment. The technical scheme of the invention simplifies the process of updating the classifier model after a sensor is replaced, does not interrupt the classification process, does not discard the previous classifier model, and also allows other standard models to be used as the core classifier.

Description

Classifier model transfer learning method
Technical Field
The invention relates to the technical field of intelligent instruments and meters, and in particular to a classifier model transfer learning method.
Background
In widespread use, damage or poisoning may render one or more sensors in the array unusable; such failure is permanent and the affected element can no longer operate. Sensors are therefore inevitably replaced during long-term use, and the whole sensor array is usually recalibrated after replacement, because differences between the sensitive layers make the signals of the replacement elements differ greatly from the input signals of the original system, so the classification model of the original system cannot be used directly. For these problems, the sensor system has to be re-modeled. In practical applications, however, the recognition accuracy of the classifier (namely, the pattern recognition system) in the detection instrument remains within its effective range over the instrument's service cycle, so the classification information of the original system can be fully exploited. The present method uses the classification model information of the original sensor array to calibrate the new sensor array, simplifying the traditional classifier modeling process: the pseudo labels predicted by the original classifier model and the output signals of the new sensor array together form the training samples that are used to update the classification model of the new sensor array.
The system structure of the intelligent instrument only needs a slight improvement: by adding a row of redundant sensor sockets, the method can be applied and the update of the classifier model in the system can finally be completed.
Disclosure of Invention
In view of the above technical problems, a classifier model transfer learning method is provided. The classifier model transfer learning method of the invention is characterized by comprising the following steps:
step S1: inserting a new sensor array at the end of the life cycle of an original sensor array, and starting to acquire response signals of the original sensor array and the new sensor array;
step S2: predicting the category information of the new sensor array from the classification model information of the original sensor array, the category information being referred to as the pseudo label of the new sensor array;
step S3: performing feature extraction on the response signal of the new sensor array, forming a training set together with the pseudo label, and training to obtain a classifier model of the new sensor array;
step S4: optimizing the classifier model parameters obtained in step S3 by layer-by-layer adjustment.
Further, the classifier model of the original sensor array is set to be a three-layer BP neural network, and the network parameters of the original classifier model, namely the weight matrices Weight and the threshold matrices Bias, are extracted. During the coexistence period of the two sensor arrays, the feature vectors of the data set acquired by the original sensor array are taken as the input vector X, and the hidden-layer output is computed as:
H = f1(X^T · Weight1 + Bias1);
the output of the output layer is:
O = f2(H^T · Weight2 + Bias2);
where Weight1 and Bias1 are the weight matrix and the threshold matrix between the input layer and the hidden layer, and Weight2 and Bias2 are the weight matrix and the threshold matrix between the hidden layer and the output layer. f1 is the excitation function of the hidden-layer neurons, and f2 is the excitation function of the output-layer neurons.
The network structure of the new classifier model is set to be completely identical to that of the original classifier model, and the pre-activation input of the hidden layer of the original classifier model (its input before excitation) is extracted:
sum1 = X^T · Weight1 + Bias1;
Bias1 is expressed as Weight0 with a constant input X0 = 1, i.e., the bias is absorbed into the weight matrix. The feature vectors of the data set acquired by the new sensor array are taken as the new input vector X_new. In the coexistence period, the gas information sensed by the two sensor arrays is the same; that is, although the input samples differ, the hidden layer and the output layer should yield the same classification result, so an overdetermined system of equations can be established:
X_new · W1 = sum1;
Solving this system gives the first-layer weight vector W1, which contains the bias terms; the new hidden-layer output is then computed as H_new = f1(X_new · W1). The pre-activation input of the output layer of the original classifier model, sum2 = H^T · Weight2 + Bias2, is extracted, and another overdetermined system of equations is established:
H_new · W2 = sum2;
Solving this system gives the second-layer weight vector W2, which contains the bias terms. The obtained weight parameters are assembled to form the optimized new classifier model.
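For illustration only, the forward pass described above can be written as a short NumPy sketch; the sigmoid activation and the names sigmoid and forward_original are assumptions made for this example and are not prescribed by the invention.

    import numpy as np

    def sigmoid(z):
        # assumed choice for the excitation functions f1 and f2
        return 1.0 / (1.0 + np.exp(-z))

    def forward_original(X, Weight1, Bias1, Weight2, Bias2):
        """Three-layer BP forward pass: returns the pre-activation inputs
        sum1, sum2 and the activations H, O for samples stored as rows of X."""
        sum1 = X @ Weight1 + Bias1   # hidden-layer input before excitation
        H = sigmoid(sum1)            # hidden-layer output, H = f1(sum1)
        sum2 = H @ Weight2 + Bias2   # output-layer input before excitation
        O = sigmoid(sum2)            # network output, O = f2(sum2)
        return sum1, H, sum2, O

The quantities sum1 and sum2 are exactly the pre-activation signals that the layer-by-layer adjustment described above fits against.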
Compared with the prior art, the invention has the following advantages:
the invention is mainly used for intelligent detection instrument equipment. When the sensors of these devices are replaced, since the signals of the sensor elements of the same type may also be greatly different from the signals of the original sensor elements, the recognition effect is reduced after the sensor elements are directly replaced, and the classifier model after the sensor elements are replaced needs to be updated. The technical scheme of the invention has the beneficial effects that: the process of updating the classifier model after the sensor is replaced is simplified, the classification process is not interrupted, the previous classifier model is not discarded, and other standard models can be used as core classifiers.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of the system of the present invention.
FIG. 2 is a flow chart of the classifier model transfer learning algorithm of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As shown in figs. 1-2, the present invention provides a classifier model transfer learning method, which comprises the following steps:
step S1: inserting a new sensor array at the end of the life cycle of the original sensor array, and starting to acquire response signals of the original sensor array and the new sensor array. The present application is implemented by means of the system structure in fig. 1, and a row of sensor array sockets is added on the basis of the original system.
S2: and predicting the class information of the new sensor array through the classification model information of the original sensor array, and calling the class information as the pseudo label of the new sensor array. The classification model information of the original sensor array is the pattern recognition system of the array, which can also be called a classifier, and since the gas information of the original sensor array and the gas information of the new sensor array in the coexistence period are the same, the classifier of the original sensor array can be used for predicting the label of the new sensor array, which is not a standard label and is called a pseudo label.
S3: and performing feature extraction on the response signal of the new sensor array, forming a training set together with the pseudo label, and training to obtain a classifier model of the new sensor array. In the present application, the training set refers to a set of data sets with category information, the response signal of the new sensor array is subjected to feature extraction to be an input sample, the pseudo label is the category information, and the two form the training set, which can be used for training the classifier.
S4: optimizing the classifier model parameters obtained in step S3 by layer-by-layer adjustment. The classifier model of the original sensor array is set to be a three-layer BP neural network, and the network parameters of the original classifier model, namely the weight matrices Weight and the threshold matrices Bias, are extracted. During the coexistence period of the two sensor arrays, the feature vectors of the data set acquired by the original sensor array are taken as the input vector X, and the hidden-layer output is computed as:
H = f1(X^T · Weight1 + Bias1);
the output of the output layer is:
O = f2(H^T · Weight2 + Bias2);
where Weight1 and Bias1 are the weight matrix and the threshold matrix between the input layer and the hidden layer, and Weight2 and Bias2 are the weight matrix and the threshold matrix between the hidden layer and the output layer. f1 is the excitation function of the hidden-layer neurons, and f2 is the excitation function of the output-layer neurons.
The network structure of the new classifier model is set to be identical to that of the original classifier model, and the pre-activation input of the hidden layer of the original classifier model (its input before excitation) is extracted:
sum1 = X^T · Weight1 + Bias1;
Bias1 is expressed as Weight0 with a constant input X0 = 1, i.e., the bias is absorbed into the weight matrix. The feature vectors of the data set acquired by the new sensor array are taken as the new input vector X_new. In the coexistence period, the gas information sensed by the two sensor arrays is the same; that is, although the input samples differ, the hidden layer and the output layer should yield the same classification result, so an overdetermined system of equations can be established:
X_new · W1 = sum1;
Solving this system gives the first-layer weight vector W1, which contains the bias terms; the new hidden-layer output is then computed as H_new = f1(X_new · W1). The pre-activation input of the output layer of the original classifier model, sum2 = H^T · Weight2 + Bias2, is extracted, and another overdetermined system of equations is established:
H_new · W2 = sum2;
Solving this system gives the second-layer weight vector W2, which contains the bias terms. The obtained weight parameters are assembled to form the optimized new classifier model. Similarly, other optimization methods, such as genetic algorithms, may also be used here as a preferred way of optimizing the model parameters.
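As a rough sketch of the layer-by-layer adjustment, the two overdetermined systems can be solved by least squares; absorbing the bias into the weight matrix via a constant column of ones follows the X0 = 1 form above, while the sigmoid activation and the function names are assumptions for the example, not prescriptions of the invention.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def refit_layerwise(X_new, sum1, sum2):
        """X_new: features of the new sensor array (one sample per row);
        sum1, sum2: pre-activation inputs recorded from the original classifier
        for the same gas samples during the coexistence period."""
        ones = np.ones((X_new.shape[0], 1))
        X_aug = np.hstack([X_new, ones])                   # bias absorbed into W1
        W1, *_ = np.linalg.lstsq(X_aug, sum1, rcond=None)  # solve X_new · W1 = sum1
        H_new = sigmoid(X_aug @ W1)                        # H_new = f1(X_new · W1)
        H_aug = np.hstack([H_new, ones])                   # bias absorbed into W2
        W2, *_ = np.linalg.lstsq(H_aug, sum2, rcond=None)  # solve H_new · W2 = sum2
        return W1, W2                                      # weights of the adjusted model

In the same spirit, the least-squares solver could be replaced by another optimizer, such as the genetic algorithm mentioned above.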
Example:
as a preferred embodiment of the present application, an electronic nose is taken as an example. The system architecture inside the electronic nose is based on an array of independent gas sensors to drive the electronics and appropriate algorithms to achieve gas identification. In practice, however, the sensors operate for a long time, and this ideal situation is still largely unfeasible today. Due to the limited lifetime of the sensors due to unknown dynamic processes in the sensor system (e.g. sensor poisoning or aging), replacement of sensor elements is inevitably required over a long period of time. The working principle of the gas sensor for sensing the gas to be measured mainly depends on the characteristics of the gas sensitive film, and even though the sensors of the same type have different sensitive films, different response values can be generated when the same sample is measured. Therefore, after replacement of the sensor elements, a new sensor array also needs to be re-modeled.
The traditional way to obtain a sensor classification model is to place the sensor array in a laboratory environment, supply it with a number of different chemicals, and build a library of training samples with which to train the pattern recognition system. The goal of this training process is to configure the recognition system to produce a unique classification for each chemical, so that automatic recognition is achieved. Because the sensor array produces a large amount of highly complex data, the traditional method is difficult to automate.
After the present method is applied, obtaining the model is quick and easy to operate: the classifier model of the new sensor array is obtained by model transfer learning, which reduces the experimental cost. Without any laboratory measurement, a new sensor array is inserted at the end of the service cycle of the original sensor array, the classification model of the original sensor array is transferred to the new sensor array, and on this basis the model parameters are adjusted to optimize the obtained classifier model so that it better suits the detection environment. For an ordinary user the operation is simple.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (1)

1. A classifier model transfer learning method is characterized by comprising the following steps:
s1: inserting a new sensor array at the end of the life cycle of an original sensor array, and starting to acquire response signals of the original sensor array and the new sensor array;
s2: predicting the category information of the new sensor array from the classification model information of the original sensor array, the category information being referred to as the pseudo label of the new sensor array;
s3: performing feature extraction on the response signal of the new sensor array, forming a training set together with the pseudo label, and training to obtain a classifier model of the new sensor array;
s4: optimizing the classifier model parameters obtained in step S3 by layer-by-layer adjustment;
setting the classifier model of the original sensor array to be a three-layer BP neural network, and extracting the network parameters of the original classifier model, namely the weight matrices Weight and the threshold matrices Bias; in the coexistence period of the two sensor arrays, taking the feature vectors of the data set acquired by the original sensor array as the input vector X, and computing the hidden-layer output as:
H = f1(X^T · Weight1 + Bias1);
the output of the output layer is:
O = f2(H^T · Weight2 + Bias2);
wherein Weight1 and Bias1 are the weight matrix and the threshold matrix between the input layer and the hidden layer, and Weight2 and Bias2 are the weight matrix and the threshold matrix between the hidden layer and the output layer; f1 is the excitation function of the hidden-layer neurons, and f2 is the excitation function of the output-layer neurons;
setting the network structure of the new classifier model to be identical to that of the original classifier model, and extracting the input information before the hidden layer of the original classifier model is not excited
sum1 = X^T · Weight1 + Bias1;
wherein Bias1 is expressed as Weight0 with a constant input X0 = 1; the feature vectors of the data set acquired by the new sensor array are taken as the new input vector X_new; since the gas information sensed by the two sensor arrays in the coexistence period is the same, i.e., although the input samples differ, the hidden layer and the output layer yield the same class result, an overdetermined system of equations can be established:
X_new · W1 = sum1;
obtaining a first-layer weight vector W1 containing the bias terms; then computing the new hidden-layer output H_new = f1(X_new · W1); extracting the pre-activation input of the output layer of the original classifier model, sum2 = H^T · Weight2 + Bias2, and establishing an overdetermined system of equations:
H_new · W2 = sum2;
obtaining a second-layer weight vector W2 containing the bias terms; and assembling the obtained weight parameters to obtain the optimized new classifier model.
CN202010329243.5A 2020-04-23 2020-04-23 Classifier model transfer learning method Active CN111539474B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010329243.5A CN111539474B (en) 2020-04-23 2020-04-23 Classifier model transfer learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010329243.5A CN111539474B (en) 2020-04-23 2020-04-23 Classifier model transfer learning method

Publications (2)

Publication Number Publication Date
CN111539474A CN111539474A (en) 2020-08-14
CN111539474B true CN111539474B (en) 2022-05-10

Family

ID=71975456

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010329243.5A Active CN111539474B (en) 2020-04-23 2020-04-23 Classifier model transfer learning method

Country Status (1)

Country Link
CN (1) CN111539474B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6935837B1 (en) * 2020-08-20 2021-09-15 トヨタ自動車株式会社 Machine learning device and machine learning system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268423B (en) * 2014-10-11 2018-03-27 武汉大学 Large scale dynamic evolution Sand-dust type aerosol inversion method
CN105891422B (en) * 2016-04-08 2017-08-25 重庆大学 The electronic nose Gas Distinguishing Method that the limit learns drift compensation is migrated based on source domain
CN107657313B (en) * 2017-09-26 2021-05-18 上海数眼科技发展有限公司 System and method for transfer learning of natural language processing task based on field adaptation
CN108346058A (en) * 2018-01-16 2018-07-31 湖南省中医药研究院 A kind of Manufacture of medicinal slices of TCM Dynamic and Multi dimensional Quality Monitoring Control System and method
CN109829479B (en) * 2019-01-02 2022-06-21 大连理工大学 Automatic classifier model information updating system for sensor and sensor replacing method
CN110263844B (en) * 2019-06-18 2021-04-06 北京中科原动力科技有限公司 Method for online learning and real-time estimation of road surface state
CN110489661B (en) * 2019-07-24 2022-04-26 武汉大学 Social relationship prediction method based on generation of confrontation network and transfer learning

Also Published As

Publication number Publication date
CN111539474A (en) 2020-08-14

Similar Documents

Publication Publication Date Title
EP3696736A1 (en) Device and method for deep learning-based image comparison, and computer program stored in computer-readable recording medium
CN114818579B (en) Analog circuit fault diagnosis method based on one-dimensional convolution long-short-term memory network
CN109583506B (en) Unsupervised image identification method based on parameter transfer learning
CN111699499A (en) Inspection system, image recognition system, recognizer generation system, and learning data generation device
CN111239137B (en) Grain quality detection method based on transfer learning and adaptive deep convolution neural network
CN111539474B (en) Classifier model transfer learning method
CN113516650A (en) Circuit board hole plugging defect detection method and device based on deep learning
CN117074925B (en) 3D chip test analysis method and system
CN113469119A (en) Cervical cell image classification method based on visual converter and graph convolution network
WO2022265292A1 (en) Method and device for detecting abnormal data
CN115797694A (en) Display panel microdefect classification method based on multi-scale twin neural network
CN113052295A (en) Neural network training method, object detection method, device and equipment
CN109829479B (en) Automatic classifier model information updating system for sensor and sensor replacing method
CN114357372A (en) Aircraft fault diagnosis model generation method based on multi-sensor data driving
CN116402777B (en) Power equipment detection method and system based on machine vision
CN116777892B (en) Method and system for detecting dispensing quality based on visual detection
Leng et al. Multi-layer parallel transformer model for detecting product quality issues and locating anomalies based on multiple time‑series process data in Industry 4.0
CN113066049B (en) MEMS sensor defect type identification method and system
CN115688924A (en) Multi-sample combined multivariate self-adaptive regression spline model
CN115496291A (en) Clustering type data augmented meteorological temperature prediction method based on high-precision residual defect value
CN114494211A (en) Wafer-level gas sensor chip detection method based on infrared thermal analysis
CN113033697A (en) Automatic model evaluation method and device based on batch normalization layer
CN113723431A (en) Image recognition method, image recognition device and computer-readable storage medium
CN112268901A (en) Solution concentration identification method based on k-means algorithm
US20210279561A1 (en) Computational processing system, sensor system, computational processing method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant