CN116451083A - Unsupervised carbon dioxide emission monitoring method based on deep transfer learning - Google Patents

Unsupervised carbon dioxide emission monitoring method based on deep transfer learning

Info

Publication number
CN116451083A
CN116451083A · Application CN202310454552.9A
Authority
CN
China
Prior art keywords
data
domain
discriminator
loss
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310454552.9A
Other languages
Chinese (zh)
Inventor
Chen Lei (陈磊)
Yang Ling (杨玲)
Xu Wei (徐炜)
Wang Jian (王健)
Guo Cheng (郭诚)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wenergy Maanshan Electric Power Generation Co ltd
Original Assignee
Wenergy Maanshan Electric Power Generation Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wenergy Maanshan Electric Power Generation Co ltd filed Critical Wenergy Maanshan Electric Power Generation Co ltd
Priority to CN202310454552.9A priority Critical patent/CN116451083A/en
Publication of CN116451083A publication Critical patent/CN116451083A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N 33/00 Investigating or analysing materials by specific methods not covered by groups G01N 1/00 - G01N 31/00
    • G01N 33/0004 Gaseous mixtures, e.g. polluted air
    • G01N 33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G01N 33/0027 General constructional details of gas analysers, e.g. portable test equipment concerning the detector
    • G01N 33/0036 General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
    • G01N 33/004 CO or CO2
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N 33/00 Investigating or analysing materials by specific methods not covered by groups G01N 1/00 - G01N 31/00
    • G01N 33/0004 Gaseous mixtures, e.g. polluted air
    • G01N 33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G01N 33/0062 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method or the display, e.g. intermittent measurement or digital display
    • G01N 33/0067 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method or the display, e.g. intermittent measurement or digital display by measuring the rate of variation of the concentration
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N 33/00 Investigating or analysing materials by specific methods not covered by groups G01N 1/00 - G01N 31/00
    • G01N 33/0004 Gaseous mixtures, e.g. polluted air
    • G01N 33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G01N 33/0062 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method or the display, e.g. intermittent measurement or digital display
    • G01N 33/0068 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method or the display, e.g. intermittent measurement or digital display using a computer specifically programmed
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/096 Transfer learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/80 Management or planning
    • Y02P 90/84 Greenhouse gas [GHG] management systems


Abstract

The invention discloses an unsupervised carbon dioxide emission monitoring method based on deep transfer learning, and relates to the technical field of carbon dioxide emission monitoring. The method has the following advantages and effects: in transfer learning, labels for target-domain data are often difficult or very costly to obtain, and a model trained only on labeled source-domain data with an ordinary non-transfer method and applied directly to the target domain usually performs poorly. The deep unsupervised transfer learning method disclosed here can train a model even when target-domain labels are unavailable. Compared with conventional adversarial domain adaptation methods, it adopts a dual-stream structure that accounts for both marginal and conditional distribution discrepancies, and expresses the relative importance of feature transferability and separability through a balance factor.

Description

Unsupervised carbon dioxide emission monitoring method based on deep transfer learning
Technical Field
The invention relates to the technical field of carbon dioxide emission monitoring, in particular to an unsupervised carbon dioxide emission monitoring method based on deep transfer learning.
Background
Carbon trading is an effective path to carbon neutrality, and it presupposes accurate carbon monitoring. For carbon emission devices of different models, the distribution of the carbon emission data can vary greatly, so a carbon dioxide concentration prediction model built directly from randomly collected training samples generalizes poorly: the training sample set and the prediction sample set can differ substantially in data distribution, which degrades the accuracy of carbon dioxide concentration prediction.
In transfer learning, labels for target-domain data are often difficult or very costly to obtain, and a model trained only on labeled source-domain data with an ordinary non-transfer method and applied directly to the target domain usually performs poorly. A deep unsupervised transfer learning method can train a model without target-domain labels; compared with conventional adversarial domain adaptation, the method described below adopts a dual-stream structure, attends to both marginal and conditional distribution discrepancies, and expresses the relative importance of feature transferability and separability through a balance factor.
Disclosure of Invention
(I) Technical problem to be solved
Aiming at the shortcomings of the prior art, the invention provides an unsupervised carbon dioxide emission monitoring method based on deep transfer learning, which solves the technical problems described in the background above.
(II) Technical solution
To achieve the above purpose, the invention is realized by the following technical solution. An unsupervised carbon dioxide emission monitoring method based on deep transfer learning comprises the following steps:
S1: collecting data;
S2: preprocessing data;
S3: building the model;
S4: training the model;
S5: testing the model;
In S1, first carbon emission data corresponding to a first carbon emission device and second carbon emission data corresponding to a second carbon emission device are obtained, the two devices being of different models; the first device's data serve as the source-domain data and the second device's data as the target-domain data. Collecting source-domain and target-domain data yields labeled source-domain data ⟨X_s, Y_s⟩ and unlabeled target-domain data X_t, where X denotes the data and Y its corresponding labels. Taking a power plant's carbon emission dataset as an example, measurements such as temperature, humidity, and coal consumption at a given sampling time form a feature vector; one feature vector is one sample x, a d-dimensional row vector, where d is the number of collected features, and y is a scalar label corresponding to the sample, representing the carbon dioxide concentration. Collecting data over a period of time yields a labeled sample set {(x_i, y_i)}; distinguishing boilers of different models yields a labeled sample set for each boiler model;
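As a minimal illustration of the data collection in S1 (the CSV layout, file names, and column names below are assumptions for the sketch, not part of the patent), the source and target datasets might be assembled like this:

```python
import numpy as np
import pandas as pd

FEATURES = ["temperature", "humidity", "coal_consumption"]  # assumed sensor columns
LABEL = "co2_concentration"                                  # assumed label column

def load_domain(csv_path, labeled=True):
    """Load one boiler model's log into (X, Y); Y is None for the unlabeled target domain."""
    df = pd.read_csv(csv_path)
    X = df[FEATURES].to_numpy(dtype=np.float32)   # each row is one d-dimensional sample
    Y = df[LABEL].to_numpy(dtype=np.float32) if labeled else None
    return X, Y

X_s, Y_s = load_domain("boiler_model_A.csv", labeled=True)   # labeled source domain
X_t, _   = load_domain("boiler_model_B.csv", labeled=False)  # unlabeled target domain
```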
S2 comprises normalizing the source-domain and target-domain data, with the aim of eliminating the influence on model training of problems in the raw data samples such as differing orders of magnitude, differing value ranges, and weak data trends, while also improving model accuracy and training speed;
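A minimal normalization sketch for S2, continuing the loading sketch above (the patent does not fix the scheme; min-max scaling fitted on the source domain is an assumption):

```python
def minmax_normalize(X, lo=None, hi=None):
    """Scale each feature to [0, 1]; reuse the source domain's lo/hi for the target domain."""
    lo = X.min(axis=0) if lo is None else lo
    hi = X.max(axis=0) if hi is None else hi
    return (X - lo) / (hi - lo + 1e-8), lo, hi

X_s_norm, lo, hi = minmax_normalize(X_s)
X_t_norm, _, _ = minmax_normalize(X_t, lo, hi)  # same scaling applied to the target domain
```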
S3 comprises a model network adopting a dual-stream structure, which contains two feature-extraction neural networks G1 and G2; two label classifiers C1 and C2, where C1 is a preliminary classifier and C2 the final classifier; an adversarial domain discriminator D, where D comprises a global discriminator G_d and local discriminators D_k^l, k = 1, 2, …, K, with K the number of data classes; and an explicit distribution-discrepancy measurement module. The method for building the model comprises the following steps:
S31: Select a suitable network as the feature extractor, and input the labeled source-domain data and unlabeled target-domain data into G1 and G2; the output features through G1 and G2 are fs1, ft1, fs2, ft2, where fs1 and ft1 denote the output features of X_s and X_t through G1 respectively, and fs2 and ft2 denote the output features of X_s and X_t through G2 respectively. G1 and G2 may adopt one of ResNet, VGG, or a multi-layer CNN;
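A minimal PyTorch sketch of the two streams in S31, continuing the sketches above (the MLP width and feature dimension are assumptions; the patent only requires two parallel feature networks such as ResNet, VGG, or a CNN):

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """One stream: maps a d-dimensional sample to a feature vector."""
    def __init__(self, d_in, d_feat=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, 128), nn.ReLU(),
            nn.Linear(128, d_feat), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

d = len(FEATURES)
G1, G2 = FeatureExtractor(d), FeatureExtractor(d)            # the two streams
fs1, ft1 = G1(torch.tensor(X_s_norm)), G1(torch.tensor(X_t_norm))
fs2, ft2 = G2(torch.tensor(X_s_norm)), G2(torch.tensor(X_t_norm))
```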
S32: C1 and C2 are conventional label classifiers, such as neural networks or support vector machines, used to classify the data. The label classifier is trained with the labeled source-domain data; if it is trained with a cross-entropy loss, the general expression of the label classifier loss is as follows:
where D_s denotes the source-domain data, n_s the number of source-domain samples, ŷ_i^k the predicted probability that x_i belongs to class k, C_y the label classifier, and G_f the feature extractor;
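The formula image itself is not reproduced in this text; a standard cross-entropy form consistent with the symbols above (a reconstruction, not the patent's exact typography) is:

L_c(\theta_f, \theta_y) = -\frac{1}{n_s} \sum_{x_i \in D_s} \sum_{k=1}^{K} \mathbb{1}[y_i = k] \, \log C_y\big(G_f(x_i)\big)_k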
S33: Take fs1 and ft1 as the input of the adversarial domain discriminator D; by discriminating whether the input features come from the source domain or the target domain, D can reduce the marginal distribution difference between source-domain and target-domain data. A common domain discriminator consists of a multi-layer perceptron and a Softmax function. The source-domain data are labeled 1 and the target-domain data 0; given an input sample, the discriminator outputs whether the sample comes from the source domain or the target domain, and the loss value of the domain discriminator is computed from the actual and predicted values. If trained with a cross-entropy loss function, the loss of the adversarial domain discriminator can be expressed as:
where x ∈ X_s ∪ X_t, m denotes the number of samples in one batch, d_i the domain label of the i-th sample, D(G1(x_i)) the output of the i-th sample through D, and θ_G1, θ_d the parameters of G1 and D respectively;
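The loss image is likewise omitted; the standard binary cross-entropy domain-adversarial form matching these definitions (a reconstruction under that assumption) is:

L_d(\theta_{G1}, \theta_d) = -\frac{1}{m} \sum_{i=1}^{m} \Big[ d_i \log D\big(G1(x_i)\big) + (1 - d_i) \log\big(1 - D(G1(x_i))\big) \Big]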
The loss of the global domain discriminator G_d can be expressed as:
where D_s denotes the source-domain data, D_t the target-domain data, n_s and n_t the numbers of samples in D_s and D_t respectively, and L_ce the cross-entropy loss serving as the loss function of the domain classifier;
The local domain discriminator is subdivided into K class-wise domain discriminators D_k^l, k = 1, 2, …, K; each class discriminator is responsible for matching the source-domain data associated with class k with the corresponding target-domain data, the partition over the target domain being based on the pseudo-labels generated by the label classifier. The loss function of the local domain discriminator can be calculated as:
where D_k^l is the k-th domain discriminator, L_ce^k the cross-entropy loss of the domain discriminator corresponding to class k, and ŷ_i^k the predicted probability that x_i belongs to class k;
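The global and local loss images are omitted; standard global and class-weighted local forms consistent with the definitions above (a reconstruction) are:

L_g = \frac{1}{n_s + n_t} \sum_{x_i \in D_s \cup D_t} L_{ce}\big(G_d(G1(x_i)), d_i\big)

L_l = \frac{1}{K} \sum_{k=1}^{K} \frac{1}{n_s + n_t} \sum_{x_i \in D_s \cup D_t} \hat{y}_i^k \, L_{ce}^k\big(D_k^l(G1(x_i)), d_i\big)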
A distance estimate derived from the discriminator losses is used to measure the importance of the domain discriminators; for the global domain discriminator it is expressed as:
and for the local domain discriminators as:
where D_s^k and D_t^k denote the samples of class k in the source domain and the target domain respectively, and L_l^k denotes the loss of the local sub-domain discriminator on class k. Finally, the dynamic adversarial factor ω is expressed as:
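One standard formulation of these quantities, following the proxy A-distance and dynamic adversarial factor used in dynamic adversarial adaptation networks (an assumption about the intended form; the patent's formula images are not reproduced), is:

d_{A,g}(D_s, D_t) = 2\,(1 - 2 L_g), \qquad d_{A,k}(D_s^k, D_t^k) = 2\,(1 - 2 L_l^k)

\omega = \frac{d_{A,g}(D_s, D_t)}{d_{A,g}(D_s, D_t) + \frac{1}{K} \sum_{k=1}^{K} d_{A,k}(D_s^k, D_t^k)}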
In the adversarial domain-adaptation structure described above, the final learning objective can be expressed as:
where θ_f, θ_y, θ_d, θ_d^k denote the parameters of G1, C1, G_d, and D_k^l respectively, and the value of ω is computed by the network itself;
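A common form of this minimax objective consistent with the notation above (a reconstruction; λ is an assumed trade-off coefficient) is:

\min_{\theta_f, \theta_y} \max_{\theta_d, \theta_d^k} \; L_c(\theta_f, \theta_y) - \lambda \big[ (1 - \omega) L_g + \omega L_l \big]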
S34: The domain discriminators ensure the transferability contained in the features, but paying too much attention to the transferability of the data can reduce the separability of the classes in the data; a balance factor is introduced to balance transferability against separability:
The maximum mean discrepancy MMD(D_s, D_t) is a common estimator of the degree of alignment between the data distributions of two domains and is used here to measure domain transferability; the separability of the classes within a domain is measured with a discriminability criterion max J(W) based on linear discriminant analysis, defined as follows:
where S_b is the between-class scatter matrix and S_w the within-class scatter matrix; clearly, a larger max J(W) means better separability;
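The criterion image is omitted; the standard Fisher criterion from linear discriminant analysis, matching the scatter-matrix definitions above, is:

J(W) = \frac{\lvert W^{\top} S_b W \rvert}{\lvert W^{\top} S_w W \rvert}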
Since the estimates given by the two criteria are usually not on the same order of magnitude, they need to be further normalized;
The balance factor is then defined as follows:
where a smaller normalized MMD estimate indicates better domain alignment, and a smaller normalized inverse criterion indicates better class separability;
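A minimal RBF-kernel MMD estimator that could serve as the transferability measure in S34, continuing the earlier sketches (a biased estimate with a fixed bandwidth; both choices are assumptions, since the patent does not fix the kernel):

```python
import torch

def rbf_mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between samples X and Y under an RBF kernel."""
    def k(a, b):
        d2 = torch.cdist(a, b) ** 2               # pairwise squared Euclidean distances
        return torch.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

mmd_hat = rbf_mmd2(fs1, ft1)  # domain-alignment measure on G1's features
```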
S35: Combining S31, S32, S33, and S34 above, the loss of the final upper-layer structure is defined as:
where τ and ω are both parameters computed by the network itself;
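A plausible form of this upper-layer loss, obtained by replacing the fixed trade-off λ above with the learned balance factor τ (a reconstruction, not confirmed by the patent text), is:

L_{upper} = L_c - \tau \big[ (1 - \omega) L_g + \omega L_l \big]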
S36: In the lower-layer structure, drawing on the advantages of the maximum-mean-discrepancy method, the Hilbert-space embedding of the joint distribution is chosen to measure the difference between two joint distributions P and Q; the distributions are mapped into a reproducing kernel Hilbert space (RKHS), and the joint probability distribution loss is obtained by directly computing the MMD distance between the source domain and the target domain in the RKHS:
where P_S(x^s, y^s), P_T(x^t, y^t) denote the joint probability distributions of the source and target domains respectively; φ(x_i^s), φ(x_i^t) denote the RKHS features corresponding to the i-th data of D_s, D_t, and y_i^s, ŷ_j^t the class labels corresponding to the i-th and j-th data of D_s, D_t;
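The loss image is omitted; a joint-distribution MMD of the standard form (a reconstruction, with φ and ψ denoting assumed kernel feature maps for inputs and labels) is:

L_{jmmd} = \Big\lVert \frac{1}{n_s} \sum_{i=1}^{n_s} \phi(x_i^s) \otimes \psi(y_i^s) - \frac{1}{n_t} \sum_{j=1}^{n_t} \phi(x_j^t) \otimes \psi(\hat{y}_j^t) \Big\rVert_{\mathcal{H}}^2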
S4 includes step S41: in the upper-layer structure, X_s and X_t are taken as the input of G1, and G1 and D are trained with adversarial training to obtain the optimal parameters. Because the target domain contains no labels, C1 is trained with the source-domain data only; the trained C1 is used to predict the target-domain data categories, and the output of C1 is taken as the pseudo-labels Ŷ_t of the target-domain data. The training loss of C1 is as follows:
Combining S35 yields the loss of the upper-layer structure:
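A compact sketch of the S41 adversarial step using a gradient reversal layer, continuing the earlier sketches (the gradient reversal layer is one standard way to realize the minimax objective; the loop details are assumptions):

```python
import torch
import torch.nn as nn
from torch.autograd import Function

class GradReverse(Function):
    """Identity on the forward pass; multiplies the gradient by -lam on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

D_disc = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))  # global discriminator
opt = torch.optim.Adam([*G1.parameters(), *D_disc.parameters()], lr=1e-3)
bce = nn.BCEWithLogitsLoss()

feats = torch.cat([G1(torch.tensor(X_s_norm)), G1(torch.tensor(X_t_norm))])
dom = torch.cat([torch.ones(len(X_s_norm), 1), torch.zeros(len(X_t_norm), 1)])  # 1 = source, 0 = target
loss_d = bce(D_disc(GradReverse.apply(feats, 1.0)), dom)  # adversarial domain loss
opt.zero_grad(); loss_d.backward(); opt.step()
```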
S42: In the lower-layer structure, X_s and X_t are taken as the input of G2 to obtain the features Z_s and Z_t extracted by G2, where Z_s and Z_t are the output features of X_s and X_t through G2 respectively; L_jmmd is then computed with ⟨Z_s, Y_s⟩ and ⟨Z_t, Ŷ_t⟩;
S43: to integrate the migration ability of G1, G2 after training, X is calculated s The outputs of G1 and G2 are fused, and the fused features are used as the input of C2 for training, and the training loss of C2 is expressed as follows:
S44: Based on the network losses discussed in S41, S42, and S43, the optimization objective of the proposed model can be expressed as:
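The objective image is omitted; a plausible combined objective summing the upper-layer, joint-MMD, and C2 losses defined above (a reconstruction, not confirmed by the patent text) is:

\min \; L_{upper} + L_{jmmd} + L_{C2}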
S5: after the model training is finished at S4, the test data is predicted using the feature extractors G1, G2 and the classification network C2.
(III) Beneficial effects
The invention provides an unsupervised carbon dioxide emission monitoring method based on deep transfer learning, with the following beneficial effects:
The method can train a model without target-domain labels. Compared with conventional adversarial domain adaptation methods, it adopts a dual-stream structure, attends to both marginal and conditional distribution discrepancies, and expresses the relative importance of feature transferability and separability through a balance factor.
Drawings
Figure 1 is a schematic diagram of the model building of the present invention.
Detailed Description
The following description of the embodiments of the present invention is made clearly and completely with reference to the accompanying drawings. The embodiments described are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
The invention discloses an unsupervised carbon dioxide emission monitoring method based on deep transfer learning, which comprises the following steps:
1) Collecting data;
2) Preprocessing data;
3) Building the model;
4) Training the model;
5) Testing the model;
In step 1), first carbon emission data corresponding to a first carbon emission device and second carbon emission data corresponding to a second carbon emission device are obtained, the two devices being of different models; the first device's data serve as the source-domain data and the second device's data as the target-domain data. Collecting source-domain and target-domain data yields labeled source-domain data ⟨X_s, Y_s⟩ and unlabeled target-domain data X_t, where X denotes the data and Y its corresponding labels. Taking a power plant's carbon emission dataset as an example, measurements such as temperature, humidity, and coal consumption at a given sampling time form a feature vector; one feature vector is one sample x, a d-dimensional row vector, where d is the number of collected features, and y is a scalar label representing the carbon dioxide concentration. Collecting data over a period of time yields a labeled sample set {(x_i, y_i)}; distinguishing boilers of different models yields a labeled sample set for each boiler model.
Step 2) comprises normalizing the source-domain and target-domain data, so as to eliminate the influence on model training of problems in the raw data samples such as differing orders of magnitude, differing value ranges, and weak data trends, and to improve model accuracy and training speed.
The step 3) comprises the following parts:
The model network adopts a dual-stream structure and mainly comprises two feature-extraction neural networks G1 and G2; two label classifiers C1 and C2, where C1 is a preliminary classifier and C2 the final classifier; an adversarial domain discriminator D, where D comprises a global discriminator G_d and local discriminators D_k^l, k = 1, 2, …, K, with K the number of data classes; and an explicit distribution-discrepancy measurement module. The method for building the model comprises the following steps:
31) Select a suitable network as the feature extractor, and input the labeled source-domain data and unlabeled target-domain data into G1 and G2; the output features through G1 and G2 are fs1, ft1, fs2, ft2, where fs1 and ft1 denote the output features of X_s and X_t through G1 respectively, and fs2 and ft2 denote the output features of X_s and X_t through G2 respectively. G1 and G2 may adopt ResNet, VGG, a multi-layer CNN, or the like.
32) C1 and C2 are conventional label classifiers, such as neural networks or support vector machines, used to classify the data. The label classifier is trained with the labeled source-domain data; if it is trained with a cross-entropy loss, the general expression of the label classifier loss is as follows:
where D_s denotes the source-domain data, n_s the number of source-domain samples, ŷ_i^k the predicted probability that x_i belongs to class k, C_y the label classifier, and G_f the feature extractor.
33) Take fs1 and ft1 as the input of the adversarial domain discriminator D; by discriminating whether the input features come from the source domain or the target domain, D can reduce the marginal distribution difference between source-domain and target-domain data. A common domain discriminator consists of a multi-layer perceptron and a Softmax function. The source-domain data are labeled 1 and the target-domain data 0; given an input sample, the discriminator outputs whether the sample comes from the source domain or the target domain, and the loss value of the domain discriminator is computed from the actual and predicted values. If trained with a cross-entropy loss function, the loss of the adversarial domain discriminator can be expressed as:
where x ∈ X_s ∪ X_t, m denotes the number of samples in one batch, d_i the domain label of the i-th sample, D(G1(x_i)) the output of the i-th sample through D, and θ_G1, θ_d the parameters of G1 and D respectively.
Wherein the global domain discriminator G d The loss can be expressed as:
D s representing source domain data, D t Representing target domain data, n s ,n t Respectively represent D s ,D t Data, L ce Representing cross entropy loss as a loss function of the domain classifier,
The local domain discriminator is subdivided into K class-wise domain discriminators D_k^l, k = 1, 2, …, K; each class discriminator is responsible for matching the source-domain data associated with class k with the corresponding target-domain data, the partition over the target domain being based on the pseudo-labels generated by the label classifier. The loss function of the local domain discriminator can be calculated as:
where D_k^l is the k-th domain discriminator, L_ce^k the cross-entropy loss of the domain discriminator corresponding to class k, and ŷ_i^k the predicted probability that x_i belongs to class k.
A distance estimate derived from the discriminator losses is used to measure the importance of the domain discriminators; for the global domain discriminator it is expressed as:
and for the local domain discriminators as:
where D_s^k and D_t^k denote the samples of class k in the source domain and the target domain respectively, and L_l^k denotes the loss of the local sub-domain discriminator on class k. Finally, the dynamic adversarial factor ω is expressed as:
In the adversarial domain-adaptation structure described above, the final learning objective can be expressed as:
where θ_f, θ_y, θ_d, θ_d^k denote the parameters of G1, C1, G_d, and D_k^l respectively, and the value of ω is computed by the network itself.
34) The domain discriminators ensure the transferability contained in the features, but paying too much attention to the transferability of the data can reduce the separability of the classes in the data; a balance factor is introduced to balance transferability against separability:
The maximum mean discrepancy MMD(D_s, D_t) is a common estimator of the degree of alignment between the data distributions of two domains and is used here to measure domain transferability; the separability of the classes within a domain is measured with a discriminability criterion max J(W) based on linear discriminant analysis, defined as follows:
where S_b is the between-class scatter matrix and S_w the within-class scatter matrix. Clearly, a larger max J(W) means better separability.
Since the estimates given by the two criteria are usually not on the same order of magnitude, they need to be further normalized.
The balance factor is then defined as follows:
where a smaller normalized MMD estimate indicates better domain alignment, and a smaller normalized inverse criterion indicates better class separability.
35) Combining 31), 32), 33), and 34) above, the loss of the final upper-layer structure is defined as:
where τ and ω are both parameters computed by the network itself.
36) In the lower-layer structure, we choose the Hilbert-space embedding of the joint distribution to measure the difference between two joint distributions P and Q; the distributions are mapped into a reproducing kernel Hilbert space (RKHS), and the joint probability distribution loss is obtained by directly computing the MMD distance between the source and target domains in the RKHS:
where P_S(x^s, y^s), P_T(x^t, y^t) denote the joint probability distributions of the source and target domains respectively; φ(x_i^s), φ(x_i^t) denote the RKHS features corresponding to the i-th data of D_s, D_t, and y_i^s, ŷ_j^t the class labels corresponding to the i-th and j-th data of D_s, D_t.
The step 4) includes:
41) In the upper-layer structure, X_s and X_t are taken as the input of G1, and G1 and D are trained with adversarial training to obtain the optimal parameters; because the target domain contains no labels, C1 is trained with the source-domain data only, the trained C1 is used to predict the target-domain data categories, and the output of C1 is taken as the pseudo-labels Ŷ_t of the target-domain data.
The training loss for C1 is as follows:
Combining 35) yields the loss of the upper-layer structure:
42) In the lower-layer structure, X_s and X_t are taken as the input of G2 to obtain the features Z_s and Z_t extracted by G2, where Z_s and Z_t are the output features of X_s and X_t through G2 respectively; L_jmmd is then computed with ⟨Z_s, Y_s⟩ and ⟨Z_t, Ŷ_t⟩.
43) To integrate the transfer ability of the trained G1 and G2, the outputs of G1 and G2 for X_s are fused, and the fused features are used as the input of C2 for training; the training loss of C2 is expressed as follows:
44) Based on the network losses discussed in 41), 42), and 43) above, the optimization objective of the proposed model can be expressed as
In step 5), after training is finished, the feature extractors G1 and G2 and the classification network C2 are used to predict the test data.
It should be noted that, in the description of the present invention, orientational or positional relations indicated by terms such as "upper", "lower", "left", "right", "front", and "rear" are based on the orientations shown in the drawings and are used merely for convenience of description; they do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation, and therefore should not be construed as limiting the invention.
The terms "first" and "second" in this solution merely distinguish identical or similar structures, or corresponding structures performing similar functions; they do not rank the importance of these structures, order them, or compare their sizes, and carry no other meaning.
In addition, unless explicitly stated and limited otherwise, the terms "mounted", "connected", and "coupled" are to be construed broadly: a connection may be fixed, detachable, or integral; mechanical or electrical; direct or indirect through an intermediate medium; or an internal communication between two structures. The specific meanings of these terms in this application can be understood by those skilled in the art in light of the general inventive concept.

Claims (1)

1. An unsupervised carbon dioxide emission monitoring method based on deep transfer learning comprises the following steps:
S1: collecting data;
S2: preprocessing data;
S3: building the model;
S4: training the model;
S5: testing the model;
the method is characterized in that:
S1 comprises obtaining first carbon emission data corresponding to a first carbon emission device and second carbon emission data corresponding to a second carbon emission device, the two devices being of different models; the first device's data serve as the source-domain data and the second device's data as the target-domain data; collecting source-domain and target-domain data yields labeled source-domain data ⟨X_s, Y_s⟩ and unlabeled target-domain data X_t, where X denotes the data and Y its corresponding labels; taking a power plant's carbon emission dataset as an example, measurements such as temperature, humidity, and coal consumption at a given sampling time form a feature vector, one feature vector being one sample x, a d-dimensional row vector, where d is the number of collected features, and y being a scalar label representing the carbon dioxide concentration; collecting data over a period of time yields a labeled sample set {(x_i, y_i)}, and distinguishing boilers of different models yields a labeled sample set for each boiler model;
S2 comprises normalizing the source-domain and target-domain data, with the aim of eliminating the influence on model training of problems in the raw data samples such as differing orders of magnitude, differing value ranges, and weak data trends, while also improving model accuracy and training speed;
S3 comprises a model network adopting a dual-stream structure, which contains two feature-extraction neural networks G1 and G2; two label classifiers C1 and C2, where C1 is a preliminary classifier and C2 the final classifier; an adversarial domain discriminator D, where D comprises a global discriminator G_d and local discriminators D_k^l, k = 1, 2, …, K, with K the number of data classes; and an explicit distribution-discrepancy measurement module. The method for building the model comprises the following steps:
S31: Select a suitable network as the feature extractor, and input the labeled source-domain data and unlabeled target-domain data into G1 and G2; the output features through G1 and G2 are fs1, ft1, fs2, ft2, where fs1 and ft1 denote the output features of X_s and X_t through G1 respectively, and fs2 and ft2 denote the output features of X_s and X_t through G2 respectively; G1 and G2 may adopt one of ResNet, VGG, or a multi-layer CNN;
S32: C1 and C2 are conventional label classifiers, such as neural networks or support vector machines, used to classify the data. The label classifier is trained with the labeled source-domain data; if it is trained with a cross-entropy loss, the general expression of the label classifier loss is as follows:
where D_s denotes the source-domain data, n_s the number of source-domain samples, ŷ_i^k the predicted probability that x_i belongs to class k, C_y the label classifier, and G_f the feature extractor;
S33: Take fs1 and ft1 as the input of the adversarial domain discriminator D; by discriminating whether the input features come from the source domain or the target domain, D can reduce the marginal distribution difference between source-domain and target-domain data. A common domain discriminator consists of a multi-layer perceptron and a Softmax function. The source-domain data are labeled 1 and the target-domain data 0; given an input sample, the discriminator outputs whether the sample comes from the source domain or the target domain, and the loss value of the domain discriminator is computed from the actual and predicted values. If trained with a cross-entropy loss function, the loss of the adversarial domain discriminator can be expressed as:
where x ∈ X_s ∪ X_t, m denotes the number of samples in one batch, d_i the domain label of the i-th sample, D(G1(x_i)) the output of the i-th sample through D, and θ_G1, θ_d the parameters of G1 and D respectively;
The loss of the global domain discriminator G_d can be expressed as:
where D_s denotes the source-domain data, D_t the target-domain data, n_s and n_t the numbers of samples in D_s and D_t respectively, and L_ce the cross-entropy loss serving as the loss function of the domain classifier;
The local domain discriminator is subdivided into K class-wise domain discriminators D_k^l, k = 1, 2, …, K; each class discriminator is responsible for matching the source-domain data associated with class k with the corresponding target-domain data, the partition over the target domain being based on the pseudo-labels generated by the label classifier. The loss function of the local domain discriminator can be calculated as:
where D_k^l is the k-th domain discriminator, L_ce^k the cross-entropy loss of the domain discriminator corresponding to class k, and ŷ_i^k the predicted probability that x_i belongs to class k;
A distance estimate derived from the discriminator losses is used to measure the importance of the domain discriminators; for the global domain discriminator it is expressed as:
and for the local domain discriminators as:
where D_s^k and D_t^k denote the samples of class k in the source domain and the target domain respectively, and L_l^k denotes the loss of the local sub-domain discriminator on class k; finally, the dynamic adversarial factor ω is expressed as:
In the adversarial domain-adaptation structure described above, the final learning objective can be expressed as:
where θ_f, θ_y, θ_d, θ_d^k denote the parameters of G1, C1, G_d, and D_k^l respectively, and the value of ω is computed by the network itself;
S34: The domain discriminators ensure the transferability contained in the features, but paying too much attention to the transferability of the data can reduce the separability of the classes in the data; a balance factor is introduced to balance transferability against separability:
The maximum mean discrepancy MMD(D_s, D_t) is a common estimator of the degree of alignment between the data distributions of two domains and is used here to measure domain transferability; the separability of the classes within a domain is measured with a discriminability criterion max J(W) based on linear discriminant analysis, defined as follows:
where S_b is the between-class scatter matrix and S_w the within-class scatter matrix; clearly, a larger max J(W) means better separability;
Since the estimates given by the two criteria are usually not on the same order of magnitude, they need to be further normalized;
The balance factor is then defined as follows:
where a smaller normalized MMD estimate indicates better domain alignment, and a smaller normalized inverse criterion indicates better class separability;
S35: Combining S31, S32, S33, and S34 above, the loss of the final upper-layer structure is defined as:
where τ and ω are both parameters computed by the network itself;
S36: In the lower-layer structure, drawing on the advantages of the maximum-mean-discrepancy method, the Hilbert-space embedding of the joint distribution is chosen to measure the difference between two joint distributions P and Q; the distributions are mapped into a reproducing kernel Hilbert space (RKHS), and the joint probability distribution loss is obtained by directly computing the MMD distance between the source domain and the target domain in the RKHS:
where P_S(x^s, y^s), P_T(x^t, y^t) denote the joint probability distributions of the source and target domains respectively; φ(x_i^s), φ(x_i^t) denote the RKHS features corresponding to the i-th data of D_s, D_t, and y_i^s, ŷ_j^t the class labels corresponding to the i-th and j-th data of D_s, D_t;
S4 includes step S41: in the upper-layer structure, X_s and X_t are taken as the input of G1, and G1 and D are trained with adversarial training to obtain the optimal parameters; because the target domain contains no labels, C1 is trained with the source-domain data only, the trained C1 is used to predict the target-domain data categories, and the output of C1 is taken as the pseudo-labels Ŷ_t of the target-domain data; the training loss of C1 is as follows:
Combining S35 yields the loss of the upper-layer structure:
S42: In the lower-layer structure, X_s and X_t are taken as the input of G2 to obtain the features Z_s and Z_t extracted by G2, where Z_s and Z_t are the output features of X_s and X_t through G2 respectively; L_jmmd is then computed with ⟨Z_s, Y_s⟩ and ⟨Z_t, Ŷ_t⟩;
S43: to integrate the migration ability of G1, G2 after training, X is calculated s The outputs of G1 and G2 are fused, and the fused features are used as the input of C2 for training, and the training loss of C2 is expressed as follows:
S44: Based on the network losses discussed in S41, S42, and S43, the optimization objective of the proposed model can be expressed as:
S5: after the model training is finished at S4, the test data is predicted using the feature extractors G1, G2 and the classification network C2.
CN202310454552.9A 2023-04-25 2023-04-25 Unsupervised carbon dioxide emission monitoring method based on deep transfer learning Pending CN116451083A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310454552.9A CN116451083A (en) 2023-04-25 2023-04-25 Unsupervised carbon dioxide emission monitoring method based on deep transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310454552.9A CN116451083A (en) 2023-04-25 2023-04-25 Unsupervised carbon dioxide emission monitoring method based on deep transfer learning

Publications (1)

Publication Number Publication Date
CN116451083A true CN116451083A (en) 2023-07-18

Family

ID=87125450

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310454552.9A Pending CN116451083A (en) 2023-04-25 2023-04-25 Unsupervised carbon dioxide emission monitoring method based on deep transfer learning

Country Status (1)

Country Link
CN (1) CN116451083A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination