CN117542432A - Aerosol component detection method based on machine learning - Google Patents


Info

Publication number
CN117542432A
Authority
CN
China
Prior art keywords
aerosol
machine learning
training
data
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311301561.0A
Other languages
Chinese (zh)
Inventor
李莹
白杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology
Priority to CN202311301561.0A
Publication of CN117542432A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20 Identification of molecular entities, parts thereof or of chemical compositions
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004 Gaseous mixtures, e.g. polluted air
    • G01N33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0062 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method, e.g. intermittent, or the display, e.g. digital
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004 Gaseous mixtures, e.g. polluted air
    • G01N33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0062 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method, e.g. intermittent, or the display, e.g. digital
    • G01N2033/0068 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method, e.g. intermittent, or the display, e.g. digital using a computer specifically programmed
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/20 Air quality improvement or preservation, e.g. vehicle emission control or emission reduction by using catalytic converters

Abstract

The invention discloses an aerosol component detection method based on machine learning, which comprises the following steps: acquiring sampling film aerosol data based on the sampling film from which the aerosol is collected; acquiring the absorption spectrum slope and training component data of the aerosol according to the sampling film aerosol data; establishing a training data set and a test data set based on the absorption spectrum slope and the training component data; training a machine learning model based on the training data set, and performing performance evaluation on the machine learning model based on the test data set to obtain an aerosol component detection model; inputting the aerosol data of the sampling film into the aerosol component detection model to obtain components of the aerosol. By linearly fitting the absorption spectrum slope of the aerosol with the aerosol components, the aerosol components can be rapidly and accurately measured without damaging the sampling film, and the method has the advantages of high efficiency and low cost, and is beneficial to large-scale popularization.

Description

Aerosol component detection method based on machine learning
Technical Field
The invention relates to the field of aerosol component measurement, in particular to an aerosol component detection method based on machine learning.
Background
Carbonaceous aerosols are the main component of fine particulate pollutants (PM2.5) and can significantly affect air quality, human health and global climate change. Elemental carbon (EC) and organic carbon (OC) are the two broad classes of atmospheric carbonaceous aerosols. Because carbonaceous aerosols absorb light over the entire wavelength region, the EC and OC content of the atmosphere is a central driver of climate change. Accurate measurement of the content of carbonaceous aerosols (EC and OC) is therefore critical for assessing health and climate effects.
In the prior art, aerosol composition is generally determined by sampling the atmosphere with a sampling film and then measuring the EC and OC content on the film by thermo-optical analysis or solvent extraction. However, these methods destroy the sample on the sampling film, so the film must be replaced for each test, and the analysis must be performed in a specialized laboratory. Detecting aerosol components is therefore costly and time-consuming, which prevents widespread adoption.
Accordingly, there remains a need for improvements and developments in the art.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present invention aims to provide a machine learning-based aerosol component detection method that solves the problems of high cost, long turnaround time and limited applicability that arise because prior-art methods must destroy the sampling film sample when detecting aerosol components.
The technical scheme of the invention is as follows:
a machine learning based aerosol composition detection method, the method comprising:
acquiring sampling film aerosol data based on the sampling film from which the aerosol is collected;
acquiring the absorption spectrum slope and training component data of the aerosol according to the sampling film aerosol data;
establishing a training data set and a test data set based on the absorption spectrum slope and the training component data;
training a machine learning model based on the training data set, and performing performance evaluation on the machine learning model based on the test data set to obtain an aerosol component detection model;
inputting the aerosol data of the sampling film into the aerosol component detection model to obtain components of the aerosol.
In one embodiment, acquiring sampling film aerosol data based on a sampling film from which aerosols were collected, comprises:
selecting a plurality of sampling films to sample the aerosol;
and respectively characterizing the sampling films based on the characterization experiment requirements to obtain sampling film aerosol data, wherein the sampling film aerosol data comprise estimated aerosol component data and aerosol absorption spectrum.
In one embodiment, acquiring the absorption spectrum slope and training component data of the aerosol from the sampling film aerosol data comprises:
taking the estimated aerosol component data as the training component data;
dividing the aerosol absorption spectrum into a plurality of segments;
and calculating the slope of a line connecting the starting point and the end point in each segment to obtain the absorption spectrum slope of the aerosol.
In one embodiment, when calculating the slope of the line connecting the start and end points in each of the segments, the following formula is applied:
$$\text{slope} = \frac{A(\lambda_2) - A(\lambda_1)}{\lambda_2 - \lambda_1}$$
wherein $\lambda_1$ is the wavelength at the start of the segment, $\lambda_2$ is the wavelength at the end of the segment, and $A$ is the absorption coefficient at the corresponding wavelength.
In one embodiment, the characterizing the sampling films based on the characterizing experimental requirements, respectively, to obtain the sampling film aerosol data, includes:
weighing the sampling film before and after sampling under the condition of fixed temperature and humidity to obtain the mass components of the aerosol;
detecting the sampling film by a thermo-optical transmission analyzer to obtain the concentration of the aerosol;
detecting the sampling film through a digital colorimeter to obtain color space components of the sampling film;
fitting to obtain the estimated aerosol component data based on the mass component, the concentration and the color space component;
and detecting the sampling film by a solid ultraviolet visible spectrophotometer to obtain the absorption spectrum of the aerosol.
In one embodiment, creating a training data set and a test data set based on the absorption spectrum slope and the training component data comprises:
pairing each absorption spectrum slope with the corresponding training component data to obtain a training database;
and randomly dividing the data in the training database according to the proportion of 9:1 to form the training data set and the test data set.
In one embodiment, training a machine learning model based on the training data set and performing performance evaluation on the machine learning model based on the test data set to obtain an aerosol composition detection model, comprising:
based on the training data set, applying ten-fold cross-validation, repeated 10 times, to the machine learning model to obtain a trained machine learning model;
and evaluating the performance of the trained machine learning model based on the test data set, and obtaining the aerosol component detection model when the trained machine learning model meets a preset index.
In one embodiment, the aerosol is a carbonaceous aerosol comprising elemental carbon aerosols and organic carbon aerosols.
Embodiments of the present invention also provide a computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps in the machine learning-based aerosol composition detection method as set forth in any one of the above.
The embodiment of the invention also provides a terminal device, which comprises: a processor, a memory, and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps in the machine learning based aerosol composition detection method as set forth in any one of the above.
In summary, the invention discloses a machine learning-based aerosol component detection method, which comprises the following steps: acquiring sampling film aerosol data based on the sampling film from which the aerosol is collected; acquiring the absorption spectrum slope and training component data of the aerosol according to the sampling film aerosol data; establishing a training data set and a test data set based on the absorption spectrum slope and the training component data; training a machine learning model based on the training data set, and performing performance evaluation on the machine learning model based on the test data set to obtain an aerosol component detection model; inputting the aerosol data of the sampling film into the aerosol component detection model to obtain components of the aerosol. By linearly fitting the absorption spectrum slope of the aerosol with the aerosol components, the aerosol components can be rapidly and accurately measured without damaging the sampling film, and the method has the advantages of high efficiency and low cost, and is beneficial to large-scale popularization.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without creative effort for a person of ordinary skill in the art.
Fig. 1 is a flowchart of a machine learning-based aerosol component detection method according to the present invention.
Fig. 2 is a flow chart of a method for detecting aerosol components based on machine learning according to the present invention.
Fig. 3 is a schematic diagram of an aerosol component detection apparatus based on machine learning according to the present invention.
Detailed Description
The present application provides a method for detecting aerosol components based on machine learning, and for making the purposes, technical solutions and effects of the present application clearer and more specific, the present application will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
It should be understood that the sequence number and the size of each step in this embodiment do not mean the sequence of execution, and the execution sequence of each process is determined by the function and the internal logic of each process, and should not constitute any limitation on the implementation process of the embodiment of the present application.
In the prior art, to determine the aerosol components in air, the content of organic carbon (OC) and elemental carbon (EC) is generally measured by thermo-optical reflectance (TOR) analysis, and the content of OC is also measured by solvent extraction. However, both TOR and solvent extraction destroy the sample on the sampling film during analysis, and TOR further requires the filter to be stored and transported to a specialized laboratory for analysis, which is costly and time-consuming. As a result, these methods for detecting carbonaceous aerosols have poor applicability and are ill-suited to large-scale deployment.
With the spread of color sensor technology, researchers have in the past few years begun to propose non-destructive optical methods, using devices such as scanners, cameras and digital colorimeters, for quantitative measurement of OC and EC content. Color sensing based on digital colorimetry and digital photography is used to estimate XYZ or CIELab color values; the EC and OC loads on different sampling films are then estimated from the color components, yielding a multi-parameter reference calibration system from which the EC and OC content is determined.
Devices such as scanners, cameras and digital colorimeters allow the EC and OC content to be measured non-destructively and at low cost, but digital cameras and scanners generate noise during use, which interferes with the exact color of the filter load. Moreover, summarizing the full color with a parametric model does not capture all the information embedded in the true color, is usually limited to the visible wavelength region, and can provide additional information only by combining different parameters. In this case, the detection accuracy for the EC and OC content in air is low, the measurement range is limited to the visible wavelength region, and strong absorption that may exist at short wavelengths and in the ultraviolet spectrum cannot be adequately taken into account.
The embodiment of the invention discloses an aerosol component detection method based on machine learning, which aims to solve the problems that in the prior art, a sampling film sample needs to be destroyed when aerosol components are detected, so that the process cost for detecting the aerosol components is high, the time is long, the precision cannot be ensured, and popularization and application cannot be realized.
As shown in fig. 1, the method comprises the steps of:
S100, acquiring sampling film aerosol data based on the sampling film on which the aerosol has been collected.
Specifically, a plurality of sampling films are selected to sample the aerosol, the material of each sampling film being chosen according to the requirements of the specific characterization experiment so that sampling film aerosol data can be obtained by several characterization methods. The sampling film aerosol data comprise estimated aerosol component data and an aerosol absorption spectrum. Further, the sampling film is weighed before and after sampling at fixed temperature and humidity to obtain the mass component of the aerosol; the sampling film is analyzed with a thermo-optical transmission analyzer to obtain the concentration of the aerosol; and the sampling film is measured with a digital colorimeter to obtain the color space components of the sampling film. Based on the mass component, the concentration and the color space components, the estimated aerosol component data, which correspond to the content of each component in the aerosol, are obtained by fitting and estimation. Further, the sampling film is measured with a solid-state ultraviolet-visible spectrophotometer to obtain the absorption spectrum of the aerosol, completing the acquisition of the sampling film aerosol data.
In one embodiment, the sampling films comprise a polytetrafluoroethylene sampling film and a quartz sampling film. Each sampling film is first loaded into an air sampler to sample the air in a specific area and is retrieved after a preset time, yielding a sampling film on which aerosol has been collected. Under controlled temperature and humidity, the mass component of the aerosol is obtained by weighing the polytetrafluoroethylene sampling film on a high-precision (±0.001 mg) balance before and after sampling, and the absorption coefficient is obtained by analyzing the quartz sampling film with a solid-state ultraviolet-visible spectrophotometer, which gives the absorption spectrum of the aerosol. Specifically, the air sampler is a low-volume PM2.5 air sampler that collects PM2.5 fine particulate matter from the atmosphere; the solid-state ultraviolet-visible spectrophotometer uses a double integrating sphere arrangement to measure the intensity of radiation reflected from or transmitted through the quartz sampling film over the wavelength range 200-2000 nm.
Further, obtaining the aerosol composition on the sampling film also comprises measuring the aerosol concentration with a thermo-optical transmission analyzer according to a standard analysis method. Optionally, the concentration of the aerosol is obtained by quantitatively analyzing the OC and EC on the sampling film with a Sunset thermo-optical carbon analyzer following the NIOSH method. Further, the color space components of the sampling film, namely the L*, a* and b* values (CIELab color space), are measured with a digital colorimeter, and the measured color components are then fitted with a multiple linear regression model to estimate the content of the aerosols (such as EC aerosols and OC aerosols) collected on the sampling film, giving the component information of the aerosol on the sampling film. The concentration of the aerosol, the mass component of the aerosol and the fitted component information are combined to obtain the estimated aerosol component data.
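By way of a non-limiting illustration, the multiple linear regression of a component loading on the CIELab color components can be sketched as follows. The sketch uses only the Python standard library; the function names (`solve_linear_system`, `fit_mlr`, `predict`), the sample readings and the linear rule used to generate the EC values are all hypothetical and are not taken from the embodiment.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_mlr(features: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Least-squares coefficients [intercept, b_L, b_a, b_b] via the normal equations."""
    rows = [[1.0, *f] for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs: Sequence[float], feature: Sequence[float]) -> float:
    """Evaluate the fitted linear model for one (L*, a*, b*) reading."""
    return coeffs[0] + sum(c * f for c, f in zip(coeffs[1:], feature))


# Made-up (L*, a*, b*) readings; EC values follow a known linear rule so the
# fit can be checked against it. Real data would come from the colorimeter
# and the thermo-optical analysis.
lab = [(90, 1, 5), (80, 2, 9), (70, 2, 12), (60, 3, 14), (50, 5, 15), (40, 4, 20)]
ec = [20.0 - 0.2 * L + 0.5 * a + 0.1 * b for L, a, b in lab]
coeffs = fit_mlr(lab, ec)
print([round(c, 6) for c in coeffs])
print(round(predict(coeffs, (65, 3, 13)), 6))
```

Solving the normal equations directly is adequate for a handful of predictors; for larger or ill-conditioned problems a dedicated least-squares routine would be a more robust choice.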
Aerosol is collected on the sampling film, and the resulting changes in the sampling film correspond to different aerosol components, so that the component information of the aerosol can be detected. In particular, because carbonaceous aerosols absorb light throughout the wavelength region, the content of elemental carbon (EC) and organic carbon (OC) in the atmosphere is a core driver of climate change. Accurate measurement of the content of carbonaceous aerosols (EC aerosols and OC aerosols) is therefore critical for assessing health and climate effects. In one embodiment, the aerosols collected on the sampling film are carbonaceous aerosols, including EC aerosols and OC aerosols.
Further, as shown in fig. 2, the sampling film is placed in a spectrometer to obtain the absorption spectrum of the sample on the sampling film, and the absorption spectrum slope corresponding to that spectrum is calculated. The absorption spectrum slope (SAAS) of the aerosol is the derivative of the absorption coefficient with respect to wavelength. Because it reflects how quickly the absorption intensity of the aerosol on the sampling film changes from one wavelength to another, the SAAS can be used to determine the composition of the aerosol. The absorption spectrum may be an infrared spectrum, an ultraviolet spectrum, a Raman spectrum or the like, and the spectrum type is chosen according to the aerosol component to be detected. After the absorption spectrum is measured, the corresponding absorption spectrum slope is calculated. Specifically, as shown in the upper left panel of fig. 2, the higher the concentration of aerosol collected on the sampling film, the darker its color and the stronger its absorbance, so the corresponding absorption spectrum lies higher over the whole wavelength range; the data measured by the spectrometer are shown in the lower left spectrum of fig. 2.
In one embodiment of the invention, a UV-3600i Plus ultraviolet-visible-near-infrared spectrophotometer is used as the solid-state UV/VIS spectrometer to measure the absorption spectra of the OC and EC collected on the sampling film. The parameters of the UV-3600i Plus are listed in Table 1. The instrument has a double-grating monochromator and three detectors covering the ultraviolet, visible and near-infrared regions, which ensures sensitivity over the full wavelength range, and a wide choice of optional accessories allows it to measure many kinds of samples. The spectrometer has the following features:
(1) Three detectors are provided: a PMT detector for the ultraviolet and visible regions, and InGaAs and PbS detectors for the near-infrared region. The InGaAs detector compensates for the low sensitivity in the wavelength region where detection switches between the PMT and PbS detectors, ensuring high-sensitivity measurement over the whole wavelength range. At a detection wavelength of 1500 nm the noise is below 0.00003 Abs, an ultra-low noise level.
(2) High-performance double-grating monochromator is adopted to realize high resolution (the resolution is up to 0.1 nm) and ultra-low stray light (the stray light at 340nm is below 0.00005%). The measuring wavelength range is 185nm-3300nm, and can be used for measuring in the wide-band range of ultraviolet, visible and near infrared, so as to meet the measuring requirements of different fields.
(3) Solid samples are measured using a multi-purpose large sample compartment and an integrating sphere accessory, and high-precision absolute reflectance measurements are made with the ASR series absolute reflectance measurement devices, which ensure measurement precision. In addition, accessories such as a thermoelectrically temperature-controlled cell holder and an ultra-micro cell holder are available to suit a wide range of applications.
(4) The accompanying LabSolutions UV-Vis software includes spectrum, photometric, kinetics and report-editing modules. It provides automatic spectrum evaluation, automatic data export to Excel and automatic sample testing, and can be upgraded to the DB or CS version for stronger data management that safeguards data integrity and reliability.
Table 1. UV-3600i Plus ultraviolet-visible-near-infrared spectrophotometer specifications
It should be noted that the uv-vis-nir spectrophotometer is only an example of the present invention as an explanation, and those skilled in the art can select a specific spectrometer for detection according to the required spectral data.
Further, as shown in fig. 1, after step S100, the method includes:
S200, acquiring the absorption spectrum slope and training component data of the aerosol according to the sampling film aerosol data.
Specifically, the estimated aerosol component data is first used as the training component data to train in a machine learning model. Further, after detecting the absorption spectrum of the aerosol by a spectrometer, calculating an absorption spectrum slope includes:
dividing the absorption spectrum into a plurality of segments;
and calculating the slope of a line connecting the starting point and the end point in each segment to obtain the absorption spectrum slope of the aerosol.
Specifically, the slope of the absorption spectrum is calculated on separate segments in order to approximate the true absorption spectrum slope. Each segment contains the same number of wavelengths; that is, the absorption spectrum is divided uniformly into segments. A segment contains at least 2 wavelengths, so that the rate of change of the absorption coefficient with wavelength can be calculated within it, and at most 20 wavelengths, so that calculation accuracy is maintained and the error in the absorption spectrum slope does not become too large.
Specifically, when calculating the slope of the line connecting the start point and the end point in each of the segments, the following formula is applied:
$$\text{slope} = \frac{A(\lambda_2) - A(\lambda_1)}{\lambda_2 - \lambda_1}$$
wherein $\lambda_1$ is the wavelength at the start of the segment, $\lambda_2$ is the wavelength at the end of the segment, and $A$ is the absorption coefficient at the corresponding wavelength. Specifically, $A(\lambda_1)$ is the absorption coefficient at the start of the segment and $A(\lambda_2)$ is the absorption coefficient at the end of the segment. The number of slopes obtained depends on the number of segments into which the absorption spectrum is divided; together these slopes represent the absorption spectrum slope and are used to fit a linear regression model.
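As a minimal sketch of the segmentation step, the following Python function divides a spectrum into uniform segments and returns, for each segment, the slope of the line joining its first and last points, that is, the change in absorption coefficient divided by the change in wavelength. The function name `segment_slopes`, the choice of non-overlapping segments, the dropping of an incomplete trailing segment and the synthetic spectrum are assumptions made for illustration only.

```python
from typing import List, Sequence


def segment_slopes(
    wavelengths: Sequence[float],
    absorption: Sequence[float],
    points_per_segment: int,
) -> List[float]:
    """Return the end-point slope of each uniform segment of a spectrum."""
    if len(wavelengths) != len(absorption):
        raise ValueError("wavelengths and absorption must have equal length")
    if not 2 <= points_per_segment <= 20:
        raise ValueError("each segment must hold between 2 and 20 wavelengths")

    slopes = []
    # Step through the spectrum in non-overlapping blocks; an incomplete
    # trailing block is dropped so that all segments have the same size.
    last_start = len(wavelengths) - points_per_segment
    for start in range(0, last_start + 1, points_per_segment):
        end = start + points_per_segment - 1
        d_lambda = wavelengths[end] - wavelengths[start]
        d_absorption = absorption[end] - absorption[start]
        slopes.append(d_absorption / d_lambda)
    return slopes


# Synthetic spectrum (made-up numbers): absorption falls linearly with wavelength.
wl = [200.0 + 10.0 * i for i in range(12)]  # 200, 210, ..., 310 nm
ab = [1.0 - 0.002 * (w - 200.0) for w in wl]
slopes = segment_slopes(wl, ab, points_per_segment=4)
print(slopes)  # three segments, each with a slope close to -0.002
```

The resulting list of slopes is the feature vector that would be paired with the component data of the same sampling film.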
Further, as shown in fig. 1, after step S200, the method includes:
S300, establishing a training data set and a test data set based on the absorption spectrum slope and the training component data.
Specifically, each absorption spectrum slope is paired with the corresponding training component data to obtain a training database, and the data in the training database are then randomly divided in a 9:1 ratio to form the training data set and the test data set. In this way, the absorption spectrum slope is associated with the aerosol composition in both the training data set and the test data set, which simplifies the nonlinear fitting relationship between different environmental parameters and yields a sufficiently accurate optical non-destructive method.
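The random 9:1 division can be illustrated with the following sketch, which shuffles the paired samples with a seeded generator and holds out one tenth of them as the test data set. The `Sample` layout, the function name `split_nine_to_one`, the seed and the database contents are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (absorption spectrum slopes, component value)


def split_nine_to_one(
    database: Sequence[Sample], seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle the samples and return (training set, test set) in a 9:1 ratio."""
    shuffled = list(database)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) / 10))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical database of 50 samples, each with three slopes and one EC value.
database = [([0.001 * i, 0.002 * i, 0.003 * i], 0.5 * i) for i in range(50)]
train_set, test_set = split_nine_to_one(database, seed=42)
print(len(train_set), len(test_set))  # prints "45 5"
```

Fixing the seed makes the split reproducible, which matters when several candidate algorithms are to be compared on the same test data set.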
Further, as shown in fig. 1, after step S300, the method includes:
S400, training a machine learning model based on the training data set, and performing performance evaluation on the machine learning model based on the test data set to obtain an aerosol component detection model.
Specifically, the machine learning model is trained on the training data set and evaluated on the test data set to obtain a fitted linear relationship between the absorption spectrum slope and the aerosol (EC/OC) components, which gives the corresponding aerosol component detection model. The relevant data of the aerosol to be detected are then input into the aerosol component detection model to obtain the components of that aerosol. Component detection is thus completed quickly and accurately, at low cost and with high efficiency, which facilitates large-scale adoption.
Specifically, the machine learning model may include several different machine learning algorithms for estimating the content of EC aerosol and OC aerosol in the aerosol. Optionally, the algorithms applied in the machine learning model include a random forest (RF) algorithm, a support vector machine (SVM) algorithm, a multiple linear regression (MLR) algorithm with elimination, and the like. The algorithm actually applied can be selected as needed and is not limited herein.
Further, training the machine learning model based on the training data set specifically includes: applying ten-fold cross-validation, repeated 10 times, to the machine learning model based on the training data set to obtain a trained machine learning model. The data of the training data set are divided into ten parts; each part in turn serves as the validation set while the remaining parts serve as the training set. Finally, the optimal hyperparameters are obtained, all data in the training data set are used as the training set, and the model is trained with the optimal hyperparameters to obtain the trained machine learning model. Optionally, the number of repetitions of the ten-fold cross-validation is selected as needed.
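As a rough sketch of this procedure, the following example runs repeated ten-fold cross-validation to choose a hyperparameter and then refits on the whole training data set. A one-dimensional ridge regression stands in for the RF, SVM, or MLR models named above; that stand-in, the candidate `alphas`, and the synthetic data are assumptions for illustration, not part of the application.

```python
import random

def fit_ridge_1d(xs, ys, alpha):
    """Fit y = a*x + b with an L2 penalty `alpha` on the slope a."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / (sxx + alpha)
    return a, my - a * mx

def rmse(model, xs, ys):
    """Root mean square error of the linear model (a, b) on the given data."""
    a, b = model
    return (sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def repeated_kfold_score(xs, ys, alpha, k=10, repeats=10, seed=0):
    """Mean validation RMSE over `repeats` independent shuffles of k-fold CV."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]    # k near-equal parts
        for val in folds:                        # each part is the validation set once
            held_out = set(val)
            train = [i for i in idx if i not in held_out]
            model = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], alpha)
            scores.append(rmse(model, [xs[i] for i in val], [ys[i] for i in val]))
    return sum(scores) / len(scores)

def select_and_refit(xs, ys, alphas):
    """Pick the alpha with the lowest CV error, then refit on all training data."""
    best_alpha = min(alphas, key=lambda a: repeated_kfold_score(xs, ys, a))
    return best_alpha, fit_ridge_1d(xs, ys, best_alpha)

# Synthetic training data: y = 2x + 1 plus small Gaussian noise.
noise = random.Random(1)
xs = [i / 10 for i in range(90)]
ys = [2 * x + 1 + noise.gauss(0, 0.1) for x in xs]
best_alpha, model = select_and_refit(xs, ys, alphas=[0.0, 1.0, 100.0, 10000.0])
print(best_alpha, model)
```

Using the same seed for every candidate means all hyperparameter values are scored on identical fold assignments, so differences in the score reflect the hyperparameter rather than the shuffle.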
Further, after the trained machine learning model is obtained, its performance is evaluated based on the test data set, and the aerosol component detection model is obtained when the trained machine learning model meets a preset index. Specifically, the preset index includes a mean coefficient of determination (R²) and a root mean square error (RMSE). Optionally, the preset index may be determined as needed and according to the machine learning algorithm used, which is not limited herein.
Specifically, as shown in fig. 2, after the aerosol is sampled with a graphite sampling film, the aerosol component detection model obtained by training the machine learning model comprises a lossless EC estimation model and a lossless OC estimation model. The light-colored points are the data in the training data set, and the solid line is the fit obtained from the training data set. The dark-colored points are the data in the test data set, which are not used in training the corresponding aerosol component detection model, and the broken line is the fit obtained from the test data set. The fits of the training data set and the test data set are similar in both the lossless EC estimation model and the lossless OC estimation model, which indicates that the aerosol component detection model performs well.
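The two indices named here can be computed on the test data set as in the sketch below. The threshold values `min_r2` and `max_rmse` and the sample numbers are hypothetical; the application leaves the preset index to be chosen as needed.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared difference between truth and prediction."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def meets_preset_index(y_true, y_pred, min_r2, max_rmse):
    """True when both indices satisfy the (hypothetical) preset thresholds."""
    return (r_squared(y_true, y_pred) >= min_r2
            and root_mean_square_error(y_true, y_pred) <= max_rmse)

# Hypothetical measured vs. predicted EC values on the test data set.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
accepted = meets_preset_index(measured, predicted, min_r2=0.9, max_rmse=0.2)
print(accepted)
```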
Further, as shown in fig. 1, after step S400, the method includes:
S500, inputting the aerosol data of the sampling film into the aerosol component detection model to obtain components of the aerosol.
It should be noted that carbonaceous aerosols such as EC aerosols and OC aerosols are used as examples of aerosols in the embodiments of the present invention, but other aerosol contaminant components may likewise be tested using the methods of the present invention to determine the components of the corresponding aerosols.
Based on the above-mentioned aerosol component detection method based on machine learning, the present embodiment provides an aerosol component detection apparatus, as shown in fig. 3, including:
an acquisition module 100, configured to acquire sampling film aerosol data based on the sampling film from which the aerosol was collected;
a measurement module 200, configured to acquire the absorption spectrum slope and training component data of the aerosol according to the sampling film aerosol data;
a data screening and classifying module 300, configured to establish a training data set and a test data set based on the absorption spectrum slope and the training component data;
a model building module 400, configured to train a machine learning model based on the training data set, and perform performance evaluation on the machine learning model based on the test data set to obtain an aerosol component detection model;
and an aerosol component detection execution module 500, configured to input the aerosol data of the sampling film into the aerosol component detection model to obtain components of the aerosol.
Based on the above-mentioned aerosol component detection method based on machine learning, the embodiment of the present invention further provides a computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps in the aerosol component detection method based on machine learning as described in the above-mentioned embodiment.
Based on the above-mentioned aerosol component detection method based on machine learning, the present application further provides a terminal device, which comprises a processor, a memory and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus enables communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps in the machine learning based aerosol composition detection method as described above.
In addition, the specific processes by which the instructions in the storage medium and in the terminal device are loaded and executed by the processor have been described in detail in the above method and are not repeated here.
In summary, the invention discloses a method, a device, a storage medium and a terminal device for detecting aerosol components based on machine learning, wherein the method comprises the following steps: acquiring sampling film aerosol data based on the sampling film from which the aerosol is collected; acquiring the absorption spectrum slope and training component data of the aerosol according to the sampling film aerosol data; establishing a training data set and a test data set based on the absorption spectrum slope and the training component data; training a machine learning model based on the training data set, and performing performance evaluation on the machine learning model based on the test data set to obtain an aerosol component detection model; inputting the aerosol data of the sampling film into the aerosol component detection model to obtain components of the aerosol. By linearly fitting the absorption spectrum slope of the aerosol with the aerosol components, the aerosol components can be rapidly and accurately measured without damaging the sampling film, and the method has the advantages of high efficiency and low cost, and is beneficial to large-scale popularization.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (10)

1. A machine learning based aerosol composition detection method, the method comprising:
acquiring sampling film aerosol data based on the sampling film from which the aerosol is collected;
acquiring the absorption spectrum slope and training component data of the aerosol according to the sampling film aerosol data;
establishing a training data set and a test data set based on the absorption spectrum slope and the training component data;
training a machine learning model based on the training data set, and performing performance evaluation on the machine learning model based on the test data set to obtain an aerosol component detection model;
inputting the aerosol data of the sampling film into the aerosol component detection model to obtain components of the aerosol.
2. The machine learning based aerosol composition detection method of claim 1, wherein acquiring sampling film aerosol data based on the sampling film from which the aerosol is collected comprises:
selecting a plurality of sampling films to sample the aerosol;
and respectively characterizing the sampling films based on the characterization experiment requirements to obtain sampling film aerosol data, wherein the sampling film aerosol data comprise estimated aerosol component data and aerosol absorption spectrum.
3. The machine learning based aerosol composition detection method of claim 2, wherein obtaining the absorption spectrum slope and training composition data of the aerosol from the sampled film aerosol data comprises:
taking the estimated aerosol component data as the training component data;
dividing the aerosol absorption spectrum into a plurality of segments;
and calculating the slope of a line connecting the starting point and the end point in each segment to obtain the absorption spectrum slope of the aerosol.
4. A machine learning based aerosol composition detection method according to claim 3, wherein in calculating the slope of the line connecting the start point and the end point in each of the segments, the formula is applied:
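$$\text{slope} = \frac{A(\lambda_2) - A(\lambda_1)}{\lambda_2 - \lambda_1}$$
wherein $\lambda_1$ is the wavelength at the start point of the segment, $\lambda_2$ is the wavelength at the end point of the segment, and $A$ is the absorption coefficient of the segment at the corresponding wavelength.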
5. The machine learning based aerosol component detection method of claim 2, wherein characterizing the sampling films based on characterization experimental requirements, respectively, results in the sampling film aerosol data, comprising:
weighing the sampling film before and after sampling under the condition of fixed temperature and humidity to obtain the mass components of the aerosol;
detecting the sampling film by a thermo-optical transmission analyzer to obtain the concentration of the aerosol;
detecting the sampling film through a digital colorimeter to obtain color space components of the sampling film;
fitting to obtain the estimated aerosol component data based on the mass component, the concentration and the color space component;
and detecting the sampling film by a solid ultraviolet visible spectrophotometer to obtain the absorption spectrum of the aerosol.
6. The machine learning based aerosol composition detection method of claim 2, wherein establishing a training data set and a test data set based on the absorption spectrum slope and the training composition data comprises:
the absorption spectrum slope corresponds to the training component data to obtain a training database;
and randomly dividing the data in the training database according to the proportion of 9:1 to form the training data set and the test data set.
7. The machine learning based aerosol composition detection method of claim 6, wherein training a machine learning model based on the training dataset and performing performance evaluation on the machine learning model based on the test dataset to obtain an aerosol composition detection model comprises:
based on the training data set, applying ten-fold cross-validation, repeated 10 times, to the machine learning model to obtain a trained machine learning model;
and evaluating the performance of the trained machine learning model based on the test data set, and obtaining the aerosol component detection model when the trained machine learning model meets a preset index.
8. The machine learning based aerosol composition detection method of claim 1, wherein the aerosol is a carbonaceous aerosol comprising elemental carbon aerosols and organic carbon aerosols.
9. A computer-readable storage medium storing one or more programs executable by one or more processors to perform the steps in the machine-learning-based aerosol composition detection method of any of claims 1-8.
10. A terminal device, comprising: a processor, a memory, and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps in the machine learning based aerosol composition detection method of any of claims 1-8.
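
For illustration only, the segment slope calculation recited in claims 3 and 4 could be sketched as below. The claims do not state how the spectrum is segmented, so the equal-width split by sample index, the function name `segment_slopes`, and the sample spectrum are assumptions of this sketch.

```python
def segment_slopes(wavelengths, absorption, n_segments):
    """Slope of the line joining the start and end point of each segment."""
    n = len(wavelengths)
    if n != len(absorption):
        raise ValueError("wavelengths and absorption must have equal length")
    if n_segments < 1 or n < n_segments + 1:
        raise ValueError("need at least n_segments + 1 spectral points")
    # Assumed segmentation: near-equal index ranges; neighbours share an endpoint.
    bounds = [i * (n - 1) // n_segments for i in range(n_segments + 1)]
    slopes = []
    for start, end in zip(bounds, bounds[1:]):
        rise = absorption[end] - absorption[start]
        run = wavelengths[end] - wavelengths[start]
        slopes.append(rise / run)
    return slopes

# Hypothetical spectrum: wavelength in nm, absorption coefficient in arbitrary units.
wavelengths = [300, 400, 500, 600, 700]
absorption = [1.0, 0.8, 0.5, 0.4, 0.2]
slopes = segment_slopes(wavelengths, absorption, n_segments=2)
print(slopes)
```

The resulting list of slopes is the feature vector that would be paired with the training component data when the training database is built.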
CN202311301561.0A 2023-10-08 2023-10-08 Aerosol component detection method based on machine learning Pending CN117542432A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311301561.0A CN117542432A (en) 2023-10-08 2023-10-08 Aerosol component detection method based on machine learning

Publications (1)

Publication Number Publication Date
CN117542432A true CN117542432A (en) 2024-02-09

Family

ID=89781393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311301561.0A Pending CN117542432A (en) 2023-10-08 2023-10-08 Aerosol component detection method based on machine learning

Country Status (1)

Country Link
CN (1) CN117542432A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination