WO2024130805A1 - Identification method and system for polyethylene terephthalate recycled material - Google Patents


Publication number
WO2024130805A1
Authority
WO
WIPO (PCT)
Prior art keywords
pet
data
sample
decision tree
recycled material
Prior art date
Application number
PCT/CN2023/071537
Other languages
French (fr)
Chinese (zh)
Inventor
李建军
杨茜
吴博
庞承焕
谢晓琼
李卫领
陈平绪
叶南飚
吴昊
Original Assignee
金发科技股份有限公司
国高材高分子材料产业创新中心有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 金发科技股份有限公司, 国高材高分子材料产业创新中心有限公司
Publication of WO2024130805A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/44 Resins; Plastics; Rubber; Leather
    • G01N33/442 Resins; Plastics
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N11/00 Investigating flow properties of materials, e.g. viscosity, plasticity; Analysing materials by determining flow properties
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N21/3563 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing solids; Preparation of samples therefor
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/64 Fluorescence; Phosphorescence
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N23/00 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00
    • G01N23/22 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by measuring secondary emission from the material
    • G01N23/223 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by measuring secondary emission from the material by irradiating the sample with X-rays or gamma-rays and by measuring X-ray fluorescence
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/88 Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N5/00 Analysing materials by weighing, e.g. weighing small particles separated from a gas or liquid
    • G01N5/04 Analysing materials by weighing, e.g. weighing small particles separated from a gas or liquid by removing a component, e.g. by evaporation, and weighing the remainder
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02W CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO WASTEWATER TREATMENT OR WASTE MANAGEMENT
    • Y02W30/00 Technologies for solid waste management
    • Y02W30/50 Reuse, recycling or recovery technologies
    • Y02W30/62 Plastics recycling; Rubber recycling

Definitions

  • the invention relates to the technical field of recycled material identification, and in particular to a method and system for identifying recycled polyethylene terephthalate materials.
  • Recycled plastics have low carbon properties, and using recycled materials to improve the recycling rate of plastics has become an effective means of sustainable development.
  • recycled materials are not suitable for all fields. Therefore, in order to trace the source of the product, it is necessary to identify whether the product contains recycled materials.
  • PET: polyethylene terephthalate
  • the existing identification methods of recycled materials mainly rely on instrumental analysis combined with manual identification.
  • the oligomers in polyester are physically recovered in advance by dissolution-mixed precipitant precipitation-filtration method, dissolution-single precipitant precipitation filtration method or extraction method, and the oligomer peak information of polyester is detected by high performance liquid chromatography, and then the identification result is obtained manually based on the oligomer peak information;
  • the detection signal of methyl benzoate para-substituent is obtained by high performance liquid chromatography, and then the identification result is obtained manually based on the detection signal;
  • a sulfolane-dichloromethane dissolution precipitation extraction chemically recovers the first sequence of cyclic oligomers of recycled polyester, and then uses high performance liquid chromatography for identification.
  • the purpose of the present invention is to overcome the deficiencies of the above-mentioned prior art and provide a method and system for identifying polyethylene terephthalate recycled materials, so as to reduce the subjective influence of the user's personal experience on the identification results of PET recycled materials and improve the accuracy of the identification results.
  • the present invention provides a method for identifying polyethylene terephthalate recycled materials, comprising:
  • PET samples include PET recycled material samples and PET virgin material samples;
  • characteristic data includes but is not limited to chain extender data and depolymerization product data
  • sample analysis includes sample analysis based on thermal cracking and sample analysis based on alcohol depolymerization.
  • performing sample analysis on the PET sample to obtain characteristic data of the PET sample includes:
  • the PET sample is detected by using a pyrolysis gas chromatography-mass spectrometer to obtain pyrolysis gas chromatography-mass spectrometer data, wherein the pyrolysis gas chromatography-mass spectrometer data includes chain extender data and depolymerization product data, wherein the thermal desorption temperature of the detection process is 250-300° C., maintained for 20 min, and the thermal cracking temperature of the detection process is 560-600° C., maintained for 0.2 min.
  • performing sample analysis on the PET sample to obtain characteristic data of the PET sample includes:
  • the PET sample is subjected to a depolymerization reaction using a stainless steel reactor or a microwave high-pressure synthesis reactor at a reaction temperature of 180-250° C. and a reaction time of 2-3 hours to obtain a PET sample solution;
  • the material components in the PET sample solution are detected by GCMS, LCMS or Py-GCMS to obtain spectrum data;
  • the spectrum data is searched for fragment ion peaks to obtain the chain extender data;
  • the spectral data are searched for fragment ion peaks using a preset depolymerization product GCMS database, LCMS database or Py-GCMS database to obtain the depolymerization product data.
  • the chain extender data includes chain extender data of epoxy resins, isocyanates, anhydrides, oxazolines, and amides related to the chain extender.
  • the feature data is used to train a preset decision tree model until the preset decision tree model reaches a preset convergence condition to obtain a PET recycled material identification model, including:
  • the preset decision tree model is trained, and the Gini coefficient of each feature in the training sample set at all possible split values is calculated;
  • the original decision tree is pruned to obtain the PET recycled material identification model.
  • reducing the dimension of the feature data to generate a training sample set includes:
  • a covariance matrix is calculated, and eigenvalues and corresponding eigenvectors of the covariance matrix are determined;
  • a number of corresponding eigenvectors are selected to form an eigenvector matrix
  • the feature data is reduced in dimension based on the feature vector matrix to generate the training sample set.
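  • the calculation function of the Gini coefficient is: $\mathrm{Gini}(D, A_i) = \frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2)$, where:
  • Gini(D,Ai) is the Gini coefficient of feature Ai in training sample set D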
  • D1 is the data set composed of feature data when the cut value in training sample set D is not greater than ai
  • D2 is the data set composed of feature data when the cut value in training sample set D is not less than ai
  • Gini(D1) is the Gini coefficient of data set D1
  • Gini(D2) is the Gini coefficient of data set D2.
  • generating child nodes of a decision tree according to the left node data set and the right node data set until no more child nodes can be generated to obtain an original decision tree includes:
  • the child node data set corresponding to the new child node is used as the next root node data set, and new child nodes are continuously generated until no more child nodes can be generated, thereby obtaining the original decision tree.
  • pruning the original decision tree to obtain the PET recycled material identification model includes:
  • the original decision tree is optimized until the original decision tree reaches the preset convergence condition, thereby obtaining the PET recycled material identification model, wherein the preset loss function is: $C_\alpha(T_t) = C(T_t) + \alpha\,|T_t|$, where:
  • $T_t$ represents any node in the original decision tree
  • $C(T_t)$ represents the prediction error of node $T_t$
  • $|T_t|$ represents the number of nodes
  • $\alpha$ is a preset training parameter.
  • the step of using the PET recycled material identification model to identify the PET sample to be identified and determining the recycled material status of the PET sample to be identified includes:
  • if the decision value satisfies the preset decision value interval, it is determined that the PET sample to be identified contains recycled material.
  • the present invention further provides a system for identifying recycled polyethylene terephthalate materials, comprising:
  • An acquisition module is used to acquire PET samples, wherein the PET samples include PET recycled material samples and PET virgin material samples;
  • An analysis module used for performing sample analysis on the PET sample to obtain characteristic data of the PET sample, wherein the characteristic data includes but is not limited to chain extender data and depolymerization product data;
  • a modeling module used to train a preset decision tree model using the feature data until the preset decision tree model reaches a preset convergence condition, thereby obtaining a PET recycled material identification model
  • the identification module is used to identify the PET sample to be identified by using the PET recycled material identification model, and determine the recycled material situation of the PET sample to be identified.
  • the present invention has at least the following beneficial effects:
  • the present invention uses the determination of chain extenders and depolymerization products to trace back to the molecular structure level of PET plastics, thereby improving the accuracy and efficiency of identification.
  • a decision tree model is used as an identification model, and the model makes decisions independently, reducing the influence of subjective factors caused by human decision-making, thereby improving the accuracy of PET recycled material identification, especially more reliable identification of modified PET recycled material.
  • the present invention reduces the number of features of the model through dimensionality reduction to improve the model operation efficiency; through pruning optimization operations, it reduces the problem of model overfitting and enhances the generalization ability of the model.
  • FIG. 1 is a schematic flow diagram of a method for identifying polyethylene terephthalate recycled materials according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a characteristic infrared spectrum shown in an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a thermal weight loss curve shown in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of an element distribution spectrum shown in an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a thermal desorption Py-GCMS test spectrum shown in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a GCMS test spectrum of an alcoholysis product shown in an embodiment of the present invention.
  • FIG. 7 is a structural block diagram of a polyethylene terephthalate recycled material identification system according to an embodiment of the present invention.
  • the present invention provides a method for identifying polyethylene terephthalate recycled materials, which can be applied to an identification system for polyethylene terephthalate recycled materials.
  • the identification system can be integrated into a computer device, and the computer device includes but is not limited to tablet computers, laptop computers, desktop computers, physical servers, cloud servers and other devices.
  • the computer device can be connected to a gas chromatograph-mass spectrometer GCMS, a liquid chromatograph-mass spectrometer LCMS, a pyrolysis gas chromatograph-mass spectrometer Py-GCMS and other instruments.
  • the method includes steps S101 to S104, which are described in detail as follows:
  • Step S101 obtaining PET samples, wherein the PET samples include PET recycled material samples and PET virgin material samples;
  • Step S102 performing sample analysis on the PET sample to obtain characteristic data of the PET sample, wherein the characteristic data includes but is not limited to chain extender data and depolymerization product data;
  • Step S103 using the characteristic data, training a preset decision tree model until the preset decision tree model reaches a preset convergence condition, thereby obtaining a PET recycled material identification model;
  • Step S104 using the PET recycled material identification model to identify the PET sample to be identified, and determining the recycled material status of the PET sample to be identified.
  • since the PET recycled material is affected by light, oxygen, heat, humidity, stress, and environmental substances during the service stage, and side reactions such as hydrolysis, pyrolysis, and oxidative degradation may occur during the recycling process, its molecular chain, molecular structure, performance, and the characteristics of the pollutants it contains differ to a certain degree from those of virgin material. Under normal circumstances, the molecular weight of PET decreases after recycling, which degrades performance, so a bifunctional chain extender is usually added to the PET recycled material to increase its molecular weight or intrinsic viscosity. Therefore, the present embodiment separates the chain extender and the residual pollutants, organic impurities, degradation products, oligomers, etc.
  • the PET samples include PET recycled materials and PET virgin materials from various manufacturers.
  • Sample analysis includes, but is not limited to, spectral analysis, chromatographic analysis, mass spectrometry analysis, infrared spectrum analysis, and thermal analysis.
  • Characteristic data also includes, but is not limited to, characteristic infrared spectrum data, thermogravimetric data, elemental data, melt flow rate data, morphology analysis data, fluorescent brightener data, and pyrolysis gas chromatography-mass spectrometry data.
  • an infrared spectrometer is used to measure the characteristic infrared spectrum of the PET sample.
  • the infrared detection parameters of the infrared spectrometer include 32 scans, with the characteristic spectrum band taken as the average of the 32 scans; the scanning range is 4000-400 cm⁻¹, the data point interval is 0.5 cm⁻¹, the room temperature and humidity during scanning are controlled at 25°C and 50% RH respectively, and each sample is scanned once.
  • for the thermogravimetric data, a TGA thermogravimetric curve of the PET sample is measured using a thermogravimetric analyzer.
  • the test conditions of the thermogravimetric analyzer include: in a nitrogen atmosphere, after being kept at 30°C for 5 minutes, the temperature of the environment in which the PET sample is located is increased from 30°C to an aging temperature (330-380°C, preferably 340°C) at a heating rate of 20°C/min; or in an air atmosphere, the temperature is kept at the aging temperature for 60 minutes.
  • an X-ray fluorescence spectrometer is used to scan the sample to obtain an element distribution spectrum of the sample, and the types and total number of detected elements such as antimony, bromine, titanium, etc. are analyzed and recorded based on the element distribution spectrum.
  • a melt indexer is used to measure the melt flow rate data of the PET sample.
  • the test conditions of the melt flow rate are a temperature of 265° C., a weight of 2.165 kg, a cutting time of 5 s, and drying at 120° C. for 15 h, and the melt flow rate MFR is obtained by testing.
  • an optical microscope or a scanning electron microscope is used to record the number of variegated spots per 50 mm² of the PET sample according to a five-point area method.
  • ultraviolet spectrophotometry and fluorescence spectrophotometry are used to perform qualitative and quantitative analysis on the fluorescent substances.
  • the PET sample is detected by pyrolysis gas chromatography-mass spectrometry to obtain the pyrolysis gas chromatography-mass spectrometry data, wherein the thermal desorption temperature of the detection process is 250-300°C, maintained for 20 minutes, and the thermal cracking temperature of the detection process is 560-600°C, maintained for 0.2 minutes.
  • the characteristic fragment ion peaks include fragment ion peaks of substances such as benzene 3-(4-(tert-butyl)phenyl)-2-methylpropanal, biphenyl, straight-chain hydrocarbons (such as pentadiene, 1-hexene, 1-heptene and 3-eicosene), 3-phenyl-2H-chromene, 2-methoxyethanol, 1,2-dimethoxyethane, 1,2-ethylene glycol and diethylene glycol monoethyl ether.
  • a stainless steel reactor or a microwave high-pressure synthesis reactor is used to perform a depolymerization reaction on the PET sample, the reaction temperature is 180-250°C, and the reaction time is 2-3h to obtain a PET sample solution; after filtering and diluting the PET sample solution, GCMS, LCMS or Py-GCMS is used to detect the material components in the PET sample solution to obtain spectrum data; using a preset chain extender GCMS database, LCMS database or Py-GCMS database, a fragment ion peak search is performed on the spectrum data to obtain the chain extender data; using a preset depolymerization product GCMS database, LCMS database or Py-GCMS database, a fragment ion peak search is performed on the spectrum data to obtain the depolymerization product data.
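The fragment ion peak search described above can be illustrated with a minimal sketch. The database contents, the m/z values, the tolerance, and the function name `search_fragment_peaks` are hypothetical placeholders chosen for illustration; they are not values or names from the patent or from any real spectral library.

```python
from typing import Dict, List

# Hypothetical reference database: entry name -> characteristic m/z values.
# The numbers are placeholders, not real library spectra.
CHAIN_EXTENDER_DB: Dict[str, List[float]] = {
    "class_a": [100.0, 150.0],
    "class_b": [200.0, 250.0, 300.0],
}


def search_fragment_peaks(peaks, database, tol=0.5, min_fraction=1.0):
    """Return {entry name: matched fraction} for database entries whose
    reference peaks are found among the measured peaks within `tol`."""
    hits = {}
    for name, ref_peaks in database.items():
        # Count reference peaks that have at least one measured peak nearby.
        matched = sum(
            any(abs(p - ref) <= tol for p in peaks) for ref in ref_peaks
        )
        fraction = matched / len(ref_peaks)
        if fraction >= min_fraction:
            hits[name] = fraction
    return hits


measured = [100.2, 149.8, 250.1]  # illustrative measured m/z values
print(search_fragment_peaks(measured, CHAIN_EXTENDER_DB))
```

In this sketch an entry is reported only when all of its reference peaks are matched; lowering `min_fraction` would allow partial matches.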
  • the chain extender data include but are not limited to chain extender data of epoxy resins, isocyanates, anhydrides, oxazolines and amides.
  • the alcohol solvent used in the depolymerization process can be one or more of methanol, ethanol, ethylene glycol, propylene glycol, 1,4-butanediol, etc.
  • the catalyst can be one or more of zinc acetate, magnesium acetate, manganese acetate, cobalt acetate, antimony acetate, titanium acetate, tin acetate, germanium acetate, lead acetate, ionic liquid and the like.
  • the step S103 includes:
  • the preset decision tree model is trained, and the Gini coefficient of each feature in the training sample set at all possible split values is calculated;
  • the original decision tree is pruned to obtain the PET recycled material identification model.
  • PCA is used to reduce the number of features of the model and improve the model operation efficiency.
  • the established decision tree model is pruned and optimized to reduce the overfitting problem of the model and enhance the generalization ability of the model.
  • reducing the dimension of the feature data to generate a training sample set includes:
  • a covariance matrix is calculated, and eigenvalues and corresponding eigenvectors of the covariance matrix are determined;
  • a number of corresponding eigenvectors are selected to form an eigenvector matrix
  • the feature data is reduced in dimension based on the feature vector matrix to generate the training sample set.
  • PCA: principal component analysis
  • the eigenvalues and corresponding eigenvectors of the covariance matrix are calculated, the eigenvalues are sorted from large to small, the first k eigenvalues are selected, and then the k eigenvectors corresponding to the selected eigenvalues are used as column vectors to form an eigenvector matrix.
  • the original feature data is reduced in dimension according to the eigenvector matrix, and new feature data are calculated as training samples.
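The dimension reduction steps above (zero-mean the features, compute the covariance matrix, keep the eigenvectors of the k largest eigenvalues, project) can be sketched with NumPy as follows. The function name `pca_reduce` and the sample matrix are assumptions made for illustration; the patent does not specify an implementation.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the rows of X onto the k eigenvectors of the covariance
    matrix that have the largest eigenvalues."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                            # zero-mean each feature
    cov = np.cov(Xc, rowvar=False)           # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    W = eigvecs[:, order]                    # eigenvector matrix, one column per component
    return Xc @ W, W, mean


# Illustrative data: 4 samples, 2 features (made-up numbers).
X = [[2.0, 0.0], [4.0, 0.1], [6.0, -0.1], [8.0, 0.0]]
Z, W, mean = pca_reduce(X, k=1)
print(Z.shape)
```

A new sample would be reduced with the same `mean` and `W`, i.e. `(x - mean) @ W`, so that training and identification use one projection.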
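  • the calculation function of the Gini coefficient is: $\mathrm{Gini}(D, A_i) = \frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2)$, where:
  • Gini(D,Ai) is the Gini coefficient of the feature data Ai in the training sample set D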
  • D1 is the data set composed of the feature data when the cut value in the training sample set D is not greater than ai
  • D2 is the data set composed of the feature data when the cut value in the training sample set D is not less than ai
  • Gini(D1) is the Gini coefficient of the data set D1
  • Gini(D2) is the Gini coefficient of the data set D2.
  • the decision tree model includes but is not limited to ID3 algorithm, C4.5 algorithm, classification and regression tree (CART) algorithm and random forest (RF) algorithm.
  • taking the CART decision tree model as an example, the Gini coefficient is selected as the feature selection criterion, and the model is constructed as follows: samples with missing values are deleted from the training sample set D, and the preprocessed training sample set D is used for training. Assuming that the feature data in the training sample set D has K categories and the probability of the k-th category is $p_k$, the Gini coefficient of the feature data is calculated as $\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} p_k^2$. For each feature data Ai, every possible classification value (i.e., target segmentation value) ai is obtained.
  • each feature data is divided into two parts, wherein all feature data with a segmentation value not greater than ai form a data set D1, and all feature data with a segmentation value not less than ai form a data set D2.
  • the Gini coefficient corresponding to the feature data Ai is obtained by the above-mentioned Gini coefficient calculation function.
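As a minimal sketch of the Gini calculation described above, the following computes the Gini coefficient of a label set and the weighted Gini coefficient of a binary split. It assumes the usual CART convention that the split sends values `<= a` to D1 and values `> a` to D2, so the two subsets partition D; the labels and feature values are made up for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini(D) = 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_split(values, labels, a):
    """Weighted Gini(D, A_i) when the feature values are split at a."""
    left = [y for x, y in zip(values, labels) if x <= a]   # D1
    right = [y for x, y in zip(values, labels) if x > a]   # D2
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Illustrative feature values and labels (not measured data).
values = [1.0, 2.0, 3.0, 4.0]
labels = ["virgin", "virgin", "recycled", "recycled"]
print(gini(labels))                   # 0.5
print(gini_split(values, labels, 2))  # 0.0, a perfect split
```

The split value with the smallest weighted Gini coefficient separates the classes best, which is why the method selects the minimum.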
  • generating child nodes of a decision tree according to the left node data set and the right node data set until no more child nodes can be generated, thereby obtaining an original decision tree, includes:
  • the child node data set corresponding to the new child node is used as the next root node data set, and new child nodes are continuously generated until no more child nodes can be generated, thereby obtaining the original decision tree.
  • each feature data Ai in the training sample set D is traversed, and the Gini coefficients of all possible split values ai under the feature data are calculated.
  • the feature data A' and split value a' that minimize the Gini coefficient are selected as the optimal feature node and the optimal subspace, respectively.
  • the feature data A' and split value a' divide the training sample set D into two parts, and the left and right nodes are established with the current feature node as the root node, where the left node holds the data set D1' and the right node holds the data set D2'; the root node is the data set containing all sample features.
  • the generated child node data set (left node data set and right node data set) is used as the next root node data set, and new child nodes are continuously generated until no child nodes can be generated, and the original decision tree T 0 constructed by multiple feature nodes is obtained.
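The recursive construction of the original decision tree can be sketched in plain Python as below. This is an illustrative toy under stated assumptions, not the patent's implementation: the dictionary-based tree layout, the function names, the `<=` / `>` split convention, and the sample data are all hypothetical.

```python
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def best_split(rows, labels):
    """Return (weighted Gini, feature index, split value) with the smallest
    weighted Gini, or None when no feature can be split."""
    best, n = None, len(rows)
    for j in range(len(rows[0])):
        # Splitting at the largest value would leave the right side empty.
        for a in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[j] <= a]
            right = [y for r, y in zip(rows, labels) if r[j] > a]
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[0]:
                best = (score, j, a)
    return best


def build_tree(rows, labels):
    """Keep generating child nodes until a node is pure or cannot be split."""
    split = best_split(rows, labels) if len(set(labels)) > 1 else None
    if split is None:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    _, j, a = split
    left = [i for i, r in enumerate(rows) if r[j] <= a]
    right = [i for i, r in enumerate(rows) if r[j] > a]
    return {
        "feature": j,
        "value": a,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left]),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right]),
    }


def predict(tree, row):
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["value"] else tree["right"]
    return tree["leaf"]


# Made-up two-feature samples for illustration.
rows = [[0.1, 5.0], [0.2, 4.0], [0.8, 5.5], [0.9, 4.5]]
labels = ["virgin", "virgin", "recycled", "recycled"]
tree = build_tree(rows, labels)
print(predict(tree, [0.85, 5.0]))  # recycled
```

Each child data set becomes the root of the next recursive call, which mirrors the "child node data set is used as the next root node data set" step.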
  • five-fold cross validation is used to evaluate the model performance of the original decision tree model.
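A five-fold cross validation loop can be sketched as follows. The shuffling seed, the function names, and the majority-class baseline used in the example are assumptions for illustration; any `fit` and `predict` pair with the same call shape could be substituted.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


def cross_val_accuracy(fit, predict, X, y, k=5):
    """Mean accuracy over k folds (requires at least k samples)."""
    scores = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / k


# Hypothetical baseline model: always predict the most frequent training label.
def fit_majority(X, y):
    return max(set(y), key=y.count)


def predict_majority(model, x):
    return model


X = [[float(i)] for i in range(10)]
y = ["recycled"] * 10
print(cross_val_accuracy(fit_majority, predict_majority, X, y))  # 1.0
```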
  • pruning the original decision tree to obtain the PET recycled material identification model includes:
  • the original decision tree is optimized until the original decision tree reaches the preset convergence condition, thereby obtaining the PET recycled material identification model, wherein the preset loss function is: $C_\alpha(T_t) = C(T_t) + \alpha\,|T_t|$, where:
  • $T_t$ represents any node in the original decision tree
  • $C(T_t)$ represents the prediction error of node $T_t$
  • $|T_t|$ represents the number of nodes
  • $\alpha$ is a preset training parameter.
  • the pruning principle is that the loss function of the child node is less than the loss function of the root node, and this condition determines the minimum value of α.
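For orientation, the following sketch shows how the threshold value of α is obtained in standard CART cost-complexity pruning, assuming the loss takes the usual form C_α(T) = C(T) + α|T|. This is the textbook formulation, offered as an assumption about what the definitions above describe; the function names and the numeric errors are hypothetical.

```python
def subtree_loss(error, n_nodes, alpha):
    """C_alpha(T) = C(T) + alpha * |T| for a (sub)tree with n_nodes nodes."""
    return error + alpha * n_nodes


def critical_alpha(node_error, subtree_error, n_nodes):
    """Value of alpha at which collapsing a subtree into a single node gives
    the same loss as keeping it: (C(t) - C(T_t)) / (|T_t| - 1)."""
    return (node_error - subtree_error) / (n_nodes - 1)


# Made-up errors: the collapsed node errs 0.5, the 3-node subtree errs 0.25.
alpha = critical_alpha(node_error=0.5, subtree_error=0.25, n_nodes=3)
print(alpha)                            # 0.125
print(subtree_loss(0.25, 3, alpha))     # 0.625, loss when keeping the subtree
print(subtree_loss(0.5, 1, alpha))      # 0.625, loss after collapsing to one node
```

Below this threshold the subtree has the lower loss and is kept; above it, the single node is preferred and the subtree is pruned.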
  • the method of using the PET recycled material identification model to identify the PET sample to be identified and determining the recycled material status of the PET sample to be identified includes:
  • if the decision value satisfies the preset decision value interval, it is determined that the PET sample to be identified contains recycled material.
  • the sample to be identified is subjected to spectral analysis, chromatographic analysis, mass spectrometry analysis, infrared analysis and thermal analysis, and the spectrum obtained by the analysis is digitally converted to obtain target feature data reflecting the characteristics of the sample spectrum.
  • PCA is used to reduce the dimension of the target feature data, and then the trained PET recycled material identification model is used for identification, and a decision value is output.
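The final decision step amounts to checking the model's output against a preset interval. A minimal sketch follows; the interval bounds and the function name are hypothetical, since the source does not state the interval.

```python
def contains_recycled(decision_value, interval=(0.5, 1.0)):
    """Return True when the decision value falls inside the preset interval.

    The default interval is a placeholder, not a value from the patent.
    """
    low, high = interval
    return low <= decision_value <= high


print(contains_recycled(0.8))  # True
print(contains_recycled(0.2))  # False
```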
  • the analysis process of whether an unknown sample is PET recycled material is as follows:
  • step 1
  • the TGA thermal gravimetric curves of PET samples from different manufacturers were measured: the TGA test conditions were as follows: in a nitrogen atmosphere, the temperature was kept constant at 30°C for 5 minutes, and then the temperature was increased from 30°C to an aging temperature of 340°C at a rate of 20°C/min; in an air atmosphere, the temperature was kept constant at the aging temperature for 60 minutes, and the TGA weight loss 5% data of the PET recycled material was obtained, and the data was normalized using the z-score standardization method. The results are shown in Figure 3.
  • XRF elemental analysis of PET samples from different manufacturers: XRF was used to scan the samples to obtain the element distribution spectrum of the samples, and the types and total number of elements detected, such as antimony, bromine, titanium, etc., were recorded. The data were normalized using the z-score standardization method. The results are shown in Figure 4.
  • melt flow rate data of PET samples from different manufacturers were measured: the melt flow rate test conditions included a temperature of 265°C, a weight of 2.165 kg, a cutting time of 5 s, and drying at 120°C-15 h, and the data were normalized using the z-score standardization method.
  • Morphological analysis of PET samples from different manufacturers: using an optical microscope or scanning electron microscope, the number of variegated spots per 50 mm² was recorded according to the five-point area method, and the data were normalized using the z-score standardization method.
  • UV spectrophotometry and fluorescence spectrophotometry were used to qualitatively and quantitatively analyze fluorescent substances, and the z-score standardization method was used to normalize the data.
  • the pyrolysis gas chromatography-mass spectrometry (Py-GCMS) data of PET samples from different manufacturers were measured: thermal desorption temperature 250-300°C, maintained for 20 min, thermal cracking temperature 560-600°C, maintained for 0.2 min.
  • a mass spectrum database of thermal desorption substances and thermal cracking substances of PET recycled materials was established.
  • the characteristic fragment ion peaks of PET recycled materials include the fragment ion peaks of benzene 3-(4-(tert-butyl)phenyl)-2-methylpropanal, biphenyl, straight-chain hydrocarbons (such as pentadiene, 1-hexene, 1-heptene and 3-eicosene), 3-phenyl-2H-chromene, 2-methoxyethanol, 1,2-dimethoxyethane, 1,2-ethylene glycol and diethylene glycol monoethyl ether, etc.; the data were normalized using the z-score standardization method, and the results are shown in Figure 5.
  • GCMS and Py-GCMS were used to detect mixed mass spectrometry data of ion peaks related to the depolymerization products and chain extenders.
  • the ion peaks related to the depolymerization products included linear oligomers, cyclic oligomers, p-methylbenzene end groups, isophthalic acid, diethylene glycol, and bisphenol A.
  • the obtained mass spectrometry data were normalized by the z-score standardization method, and the results are shown in Figure 6.
  • the normalized data is reduced in dimension using the PCA algorithm and used as the input of the decision tree model.
  • the PCA dimension reduction process is as follows:
  • the features are zero-meaned, that is, the feature mean is first calculated, and then the mean is subtracted;
  • the original data is reduced in dimension and the new observation matrix is calculated.
  • the calculation formula is
  • the decision tree model uses the CART decision tree algorithm.
  • the model is constructed as follows: missing values are deleted from the data set D, and the model is trained on the preprocessed data set D. Assuming that there are K categories and that the probability of the k-th category is p_k, the standard Gini coefficient of the feature data is calculated. For each feature Ai and each candidate split value ai, the feature data are divided into two parts: all feature data with a value not greater than ai form data set D1, and all feature data with a value not less than ai form data set D2. The Gini coefficient corresponding to feature Ai is obtained from the above-mentioned Gini coefficient calculation function.
  • for each feature Ai in the data set D, the Gini coefficient is calculated at all possible split values ai; the feature A' and split value a' that minimize the Gini coefficient are selected as the optimal feature node and optimal subspace; using feature A' and split value a', the original data set is divided into two parts, and the left and right nodes of the current feature node are established, where the left node is the data set D1' and the right node is the data set D2'.
  • the root node is the data set containing all sample features.
  • the generated child node data set is used as the next root node data set to continue generating child nodes until no child nodes can be generated, thereby generating the original decision tree T_0.
  • a PET recycled material identification model based on the decision tree algorithm was constructed, and the model performance was evaluated using a five-fold cross validation.
  • the training samples were divided into five parts, four of which were used as training sets each time, and the remaining one was used as a validation set.
  • the five subsets were used as validation sets in turn, and the crossover was repeated five times to obtain five results, and the average of the five results was used as the performance indicator of the classifier or model.
  • the original decision tree T_0 is used to calculate, from bottom to top, the sub-node loss function and root node loss function of every internal node t, and the regularization parameter α is obtained.
  • the minimum α' below this value is selected for pruning, and the new tree obtained is pruned in the same way until the root node of the original decision tree is reached.
  • the sub-tree sequence obtained after pruning is cross-validated to obtain the best one, and the complexity of the tree model is reduced.
  • the PET samples to be identified were analyzed to obtain the spectrum, chromatography, mass spectrum, infrared, thermal analysis and other data of the samples to be tested.
  • the spectra were digitally converted to obtain the recycled material data reflecting the characteristics of the sample spectra.
  • PCA was then used to reduce the data dimension, and the trained PET recycled material identification model was used for identification.
  • the number of features after dimensionality reduction was 30, the number of CART decision trees was 500, and the average accuracy was 88.29%.
  • the accuracy of the decision tree model after pruning optimization was increased by 3%.
  • (1) the present invention proposes for the first time a method for identifying PET recycled materials by comprehensively utilizing the detection of depolymerization products and chain extenders in PET recycled materials; (2) it combines multiple recycled material identification methods with artificial intelligence algorithms for the first time, not only comprehensively considering the impact of each test method on the results, but also using machine learning methods to weight each method; (3) a recycled material performance database is constructed, and on the basis of big data comparison, the constructed decision tree artificial intelligence algorithm model is used to determine the results, avoiding the influence of subjective factors; (4) as the database expands and the number of training runs increases, the accuracy of the identification results will be further improved; (5) the method is highly transferable, and professionals do not need rich technical experience in this field.
  • FIG. 7 shows a structural block diagram of a polyethylene terephthalate recycled material identification system provided by an embodiment of the present invention. For the sake of convenience, only the parts related to this embodiment are shown.
  • the polyethylene terephthalate recycled material identification system provided by the embodiment of the present invention includes:
  • An acquisition module 701 is used to acquire PET samples, wherein the PET samples include PET recycled material samples and PET virgin material samples;
  • An analysis module 702 is used to perform sample analysis on the PET sample to obtain characteristic data of the PET sample, wherein the characteristic data includes but is not limited to chain extender data and depolymerization product data;
  • a modeling module 703 is used to train a preset decision tree model using the feature data until the preset decision tree model reaches a preset convergence condition, thereby obtaining a PET recycled material identification model;
  • the identification module 704 is used to identify the PET sample to be identified by using the PET recycled material identification model, and determine the recycled material status of the PET sample to be identified.
  • the analysis module 702 is specifically used to:
  • the PET sample is subjected to a depolymerization reaction using a stainless steel reactor or a microwave high-pressure synthesis reactor at a reaction temperature of 180-250° C. and a reaction time of 2-3 hours to obtain a PET sample solution;
  • the material components in the PET sample solution are detected by GCMS, LCMS or Py-GCMS to obtain spectrum data;
  • the spectrum data is searched for fragment ion peaks to obtain the chain extender data;
  • the spectral data are searched for fragment ion peaks using a preset depolymerization product GCMS database, LCMS database or Py-GCMS database to obtain the depolymerization product data.
  • the chain extender data includes data on epoxy resin, isocyanate, anhydride, oxazoline, and amide chain extenders.
  • the modeling module 703 includes:
  • a generating unit used for reducing the dimension of the feature data to generate a training sample set
  • a training unit used to train the preset decision tree model using the training sample set, and calculate the Gini coefficient of each feature in the training sample set at all possible split values
  • An establishing unit is used to use the feature data based on the minimum Gini coefficient as the optimal feature node, use the segmentation value based on the minimum Gini coefficient as the optimal subspace, and establish a left node data set and a right node data set when the optimal feature node is used as the root node;
  • An iteration unit used to generate child nodes of a decision tree according to the left node data set and the right node data set, until no more child nodes can be generated, thereby obtaining an original decision tree;
  • a pruning unit is used to prune the original decision tree to obtain the PET recycled material identification model.
  • the generating unit is specifically configured to:
  • a covariance matrix is calculated, and eigenvalues and corresponding eigenvectors of the covariance matrix are determined;
  • a number of corresponding eigenvectors are selected to form an eigenvector matrix
  • the feature data is reduced in dimension based on the feature vector matrix to generate the training sample set.
  • the calculation function of the Gini coefficient is:
  • Gini(D,Ai) is the Gini coefficient of feature Ai in training sample set D
  • D1 is the data set composed of feature data when the cut value in training sample set D is not greater than ai
  • D2 is the data set composed of feature data when the cut value in training sample set D is not less than ai
  • Gini(D1) is the Gini coefficient of data set D1
  • Gini(D2) is the Gini coefficient of data set D2.
  • the iteration unit is specifically configured to:
  • the child node data set corresponding to the new child node is used as the next root node data set, and new child nodes are continuously generated until no more child nodes can be generated, thereby obtaining the original decision tree.
  • the pruning unit is specifically used to:
  • the original decision tree is optimized based on a preset loss function until the original decision tree reaches the preset convergence condition to obtain the PET recycled material identification model, wherein the preset loss function is $C_\alpha(T_t) = C(T_t) + \alpha\,|T_t|$, in which:
  • $T_t$ represents any node in the original decision tree
  • $C(T_t)$ represents the prediction error of node $T_t$
  • $|T_t|$ represents the number of nodes
  • $\alpha$ is a preset training parameter.
  • the identification module 704 is specifically used to:
  • if the decision value falls within the preset decision value interval, it is determined that the PET sample to be identified contains recycled material.
  • the above-mentioned polyethylene terephthalate recycled material identification system can implement the polyethylene terephthalate recycled material identification method of the above-mentioned method embodiment.
  • the options in the above-mentioned method embodiment are also applicable to this embodiment and will not be described in detail here.
  • the rest of the contents of the embodiment of the present invention can refer to the contents of the above-mentioned method embodiment and will not be described in detail in this embodiment.
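The z-score standardization and five-fold cross validation mentioned in the list above can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: the function names, the use of the population standard deviation, and the shuffling seed are all assumptions made for the example.

```python
import math
import random


def z_score(values):
    """Standardize one feature column to zero mean and unit (population) std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return [0.0 for _ in values]  # a constant column carries no information
    return [(v - mean) / std for v in values]


def five_fold_indices(n_samples, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold serves as validation set once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::5] for i in range(5)]
    for k in range(5):
        val = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, val


def cross_validate(n_samples, score_fn, seed=0):
    """Average score_fn(train_idx, val_idx) over the five folds."""
    scores = [score_fn(tr, va) for tr, va in five_fold_indices(n_samples, seed)]
    return sum(scores) / len(scores)
```

Here `score_fn` stands in for "train on four parts, score on the fifth"; the averaged value plays the role of the performance indicator described above.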

Landscapes

  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medicinal Chemistry (AREA)
  • Food Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

An identification method and system for a polyethylene terephthalate (PET) recycled material. The method comprises: acquiring a PET sample; analyzing the PET sample to obtain feature data of the PET sample, the feature data comprising chain extender data and depolymerization product data; using the feature data to train a preset decision tree model until the preset decision tree model reaches a preset convergence condition, so as to obtain a PET recycled material identification model; and using the PET recycled material identification model to identify a PET sample to be identified, so as to determine the recycled material condition of said sample. Using chain extenders and depolymerization products to trace the molecular structure, and letting the model make the decision autonomously, improves the accuracy of identification results for PET recycled materials.

Description

一种聚对苯二甲酸乙二醇酯再生料的鉴别方法及系统A method and system for identifying polyethylene terephthalate recycled materials
技术领域Technical Field
本发明涉及再生料鉴别技术领域,尤其涉及一种聚对苯二甲酸乙二醇酯再生料的鉴别方法及系统。The invention relates to the technical field of recycled material identification, and in particular to a method and system for identifying recycled polyethylene terephthalate materials.
背景技术Background Art
再生塑料(或称再生料)具有低碳属性,采用再生料制品提高塑料的循环利用率成为可持续发展的有效手段。然而考虑到再生料的材料可靠性和品质质量有不同程度的劣化,并非所有领域均适用再生料,所以为对产品来源进行溯源,需要对产品当中是否含有再生料进行鉴别。其中,聚对苯二甲酸乙二醇酯(PET)的回收量占所有塑料回收量的32%,是主要的再生料鉴别对象。Recycled plastics (or recycled materials) have low carbon properties, and using recycled materials to improve the recycling rate of plastics has become an effective means of sustainable development. However, considering that the material reliability and quality of recycled materials have different degrees of deterioration, recycled materials are not suitable for all fields. Therefore, in order to trace the source of the product, it is necessary to identify whether the product contains recycled materials. Among them, the recycling volume of polyethylene terephthalate (PET) accounts for 32% of all plastic recycling volume, and is the main object of recycled material identification.
目前,现有的再生料鉴别方法主要借助仪器分析手段结合人工进行鉴别。例如采用溶解-混合沉淀剂沉淀-过滤法、溶解-单一沉淀剂沉淀过滤法或萃取法提前物理回收涤纶中的低聚物,使用高效液相色谱法检测涤纶的低聚物出峰信息,再人工根据低聚物出峰信息得出鉴别结果;又例如,使用高效液相色谱法获取苯甲酸甲酯对位取代物的检测信号,再人工根据检测信号得出鉴别结果;再例如,一种环丁砜-二氯甲烷溶解沉淀提取化学回收再生涤纶的第一序列环状低聚物,再利用高效液相色谱法进行鉴别。以上再生料鉴别方法主要以一种或几种不同测试方法进行简单组合,其鉴别结果的准确性受个人经验影响较大。而且再生料的材料差异导致不同方法的测定结果在最终结果判断时的权重必然不同,现有方法难以对此进行合理量化,从而导致鉴别结果存在一定的主观性和不准确性。At present, the existing identification methods of recycled materials mainly rely on instrumental analysis combined with manual identification. For example, the oligomers in polyester are physically recovered in advance by dissolution-mixed precipitant precipitation-filtration method, dissolution-single precipitant precipitation filtration method or extraction method, and the oligomer peak information of polyester is detected by high performance liquid chromatography, and then the identification result is obtained manually based on the oligomer peak information; for another example, the detection signal of methyl benzoate para-substituent is obtained by high performance liquid chromatography, and then the identification result is obtained manually based on the detection signal; for another example, a sulfolane-dichloromethane dissolution precipitation extraction chemically recovers the first sequence of cyclic oligomers of recycled polyester, and then uses high performance liquid chromatography for identification. The above identification methods of recycled materials are mainly simple combinations of one or several different test methods, and the accuracy of the identification results is greatly affected by personal experience. Moreover, the material differences of recycled materials lead to different weights of the measurement results of different methods in the final result judgment, and it is difficult for the existing methods to reasonably quantify this, resulting in a certain subjectivity and inaccuracy in the identification results.
发明内容Summary of the invention
本发明的目的在于克服上述现有技术的不足之处而提供一种聚对苯二甲酸乙二醇酯再生料的鉴别方法及系统,以降低PET再生料的鉴别结果受用户个人经验的主观影响,提高鉴别结果的准确性。The purpose of the present invention is to overcome the deficiencies of the above-mentioned prior art and provide a method and system for identifying polyethylene terephthalate recycled materials, so as to reduce the subjective influence of the user's personal experience on the identification results of PET recycled materials and improve the accuracy of the identification results.
为了解决上述技术问题,第一方面,本发明提供了一种聚对苯二甲酸乙二醇酯再生料的鉴别方法,包括:In order to solve the above technical problems, in a first aspect, the present invention provides a method for identifying polyethylene terephthalate recycled materials, comprising:
获取PET样品,所述PET样品包括PET再生料样品和PET原生料样品;Obtaining PET samples, wherein the PET samples include PET recycled material samples and PET virgin material samples;
对所述PET样品进行样品分析,得到所述PET样品的特征数据,所述特征数据包括但不限于扩链剂数据和解聚产物数据;Performing sample analysis on the PET sample to obtain characteristic data of the PET sample, wherein the characteristic data includes but is not limited to chain extender data and depolymerization product data;
利用所述特征数据,对预设决策树模型进行训练,直至所述预设决策树模型达到预设收敛条件,得到PET再生料鉴别模型;Using the characteristic data, training a preset decision tree model until the preset decision tree model reaches a preset convergence condition, thereby obtaining a PET recycled material identification model;
利用所述PET再生料鉴别模型,对待鉴别PET样品进行鉴别,确定所述待鉴别PET样品的再生料情况。The PET recycled material identification model is used to identify the PET sample to be identified, and the recycled material situation of the PET sample to be identified is determined.
在一些实现方式中,对于所述扩链剂数据和解聚产物数据,样品分析包括基于热裂解的样品分析和基于醇解聚的样品分析。In some implementations, for the chain extender data and the depolymerization product data, sample analysis includes sample analysis based on thermal cracking and sample analysis based on alcohol depolymerization.
在一些实现方式中,对于基于热裂解的样品分析,所述对所述PET样品进行样品分析,得到所述PET样品的特征数据,包括:In some implementations, for the sample analysis based on thermal pyrolysis, performing sample analysis on the PET sample to obtain characteristic data of the PET sample includes:
采用裂解气相色谱质谱联用仪,对所述PET样品进行检测,得到裂解气相色谱质谱联用仪数据,裂解气相色谱质谱联用仪数据包括扩链剂数据和解聚产物数据,其中检测过程的热脱附温度为250-300℃,保持20min,检测过程的热裂解温度560-600℃,保持0.2min。The PET sample is detected by using a pyrolysis gas chromatography-mass spectrometer to obtain pyrolysis gas chromatography-mass spectrometer data, wherein the pyrolysis gas chromatography-mass spectrometer data includes chain extender data and depolymerization product data, wherein the thermal desorption temperature of the detection process is 250-300° C., maintained for 20 min, and the thermal cracking temperature of the detection process is 560-600° C., maintained for 0.2 min.
在一些实现方式中,对于基于醇解聚的样品分析,所述对所述PET样品进行样品分析,得到所述PET样品的特征数据,包括:In some implementations, for sample analysis based on alcohol depolymerization, performing sample analysis on the PET sample to obtain characteristic data of the PET sample includes:
在醇溶剂和催化剂存在下,利用不锈钢反应釜或微波高压合成釜,对所述PET样品进行解聚反应,反应温度为180-250℃,反应时间为2-3h,得到PET样品溶液;In the presence of an alcohol solvent and a catalyst, the PET sample is subjected to a depolymerization reaction using a stainless steel reactor or a microwave high-pressure synthesis reactor at a reaction temperature of 180-250° C. and a reaction time of 2-3 hours to obtain a PET sample solution;
对所述PET样品溶液进行过滤和稀释后,利用GCMS、LCMS或Py-GCMS检测所述PET样品溶液中的物质成分,得到图谱数据;After filtering and diluting the PET sample solution, the material components in the PET sample solution are detected by GCMS, LCMS or Py-GCMS to obtain spectrum data;
利用预设扩链剂GCMS数据库、LCMS数据库或Py-GCMS数据库,对所述图谱数据进行碎片离子峰检索,得到所述扩链剂数据;Using a preset chain extender GCMS database, LCMS database or Py-GCMS database, the spectrum data is searched for fragment ion peaks to obtain the chain extender data;
利用预设解聚产物GCMS数据库、LCMS数据库或Py-GCMS数据库,对所述图谱数据进行碎片离子峰检索,得到所述解聚产物数据。The spectral data are searched for fragment ion peaks using a preset depolymerization product GCMS database, LCMS database or Py-GCMS database to obtain the depolymerization product data.
在一些实现方式中,所述扩链剂数据包括与扩链剂有关的环氧树脂类、异氰酸酯类、酸酐类、噁唑啉类和酰胺类的扩链剂数据。In some implementations, the chain extender data includes chain extender data of epoxy resins, isocyanates, anhydrides, oxazolines, and amides related to the chain extender.
在一些实现方式中,所述利用所述特征数据,对预设决策树模型进行训练,直至所述预设决策树模型达到预设收敛条件,得到PET再生料鉴别模型,包括:In some implementations, the feature data is used to train a preset decision tree model until the preset decision tree model reaches a preset convergence condition to obtain a PET recycled material identification model, including:
对所述特征数据进行降维,生成训练样本集;Performing dimension reduction on the feature data to generate a training sample set;
利用所述训练样本集,对所述预设决策树模型进行训练,并计算所述训练样本集中每个特征在所有可能的切分值时的基尼系数;Using the training sample set, the preset decision tree model is trained, and the Gini coefficient of each feature in the training sample set at all possible split values is calculated;
以基于基尼系数最小时的特征数据作为最优特征节点,以基于基尼系数最小时的切分值作为最优子空间,建立以所述最优特征节点作为根节点时的左节点数据集和右节点数据集;Taking the feature data based on the minimum Gini coefficient as the optimal feature node, taking the segmentation value based on the minimum Gini coefficient as the optimal subspace, and establishing the left node data set and the right node data set when the optimal feature node is used as the root node;
根据所述左节点数据集和右节点数据集,生成决策树的子节点,直至无法继续生成子节点,得到原始决策树;Generate child nodes of the decision tree according to the left node data set and the right node data set until no more child nodes can be generated, thereby obtaining an original decision tree;
对所述原始决策树进行剪枝,得到所述PET再生料鉴别模型。The original decision tree is pruned to obtain the PET recycled material identification model.
在一些实现方式中,所述对所述特征数据进行降维,生成训练样本集,包括:In some implementations, reducing the dimension of the feature data to generate a training sample set includes:
基于所述特征数据,建立特征数据集;Based on the feature data, establish a feature data set;
对所述特征数据集进行归一化和零均值化,得到目标特征数据集;Normalizing and zero-meaning the feature data set to obtain a target feature data set;
基于所述目标特征数据集,计算协方差矩阵,并确定所述协方差矩阵的特征值和对应的特征向量;Based on the target feature data set, a covariance matrix is calculated, and eigenvalues and corresponding eigenvectors of the covariance matrix are determined;
基于所述特征值的排序结果,选取对应的若干个特征向量组成特征向量矩阵;Based on the sorting results of the eigenvalues, a number of corresponding eigenvectors are selected to form an eigenvector matrix;
基于所述特征向量矩阵对所述特征数据进行降维,生成所述训练样本集。The feature data is reduced in dimension based on the feature vector matrix to generate the training sample set.
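Written out under common conventions, the dimension reduction steps above correspond to standard principal component analysis. The symbols below are assumptions introduced for illustration and are not the patent's own notation: $X$ is the normalized $n \times d$ feature matrix (one row per sample), $\mu$ is the vector of feature means, and $k$ is the number of retained components.

```latex
\begin{aligned}
\tilde{X} &= X - \mathbf{1}\,\mu^{\top}, \qquad \mu = \tfrac{1}{n}\,X^{\top}\mathbf{1} \\
C &= \tfrac{1}{n-1}\,\tilde{X}^{\top}\tilde{X} \\
C\,v_j &= \lambda_j\,v_j, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \\
W &= [\,v_1 \;\; v_2 \;\; \cdots \;\; v_k\,] \\
Y &= \tilde{X}\,W
\end{aligned}
```

The $n \times k$ matrix $Y$ is the training sample set after dimension reduction; the eigenvectors are kept in order of decreasing eigenvalue, so the first columns of $Y$ carry the most variance.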
在一些实现方式中,所述基尼系数的计算函数为:In some implementations, the calculation function of the Gini coefficient is:
Figure PCTCN2023071537-appb-000001
其中,Gini(D,Ai)为训练样本集D中特征Ai的基尼系数,D1为训练样本集D中切分值不大于ai时的特征数据所组成的数据集,D2为训练样本集D中切分值不小于ai时的特征数据所组成的数据集,Gini(D1)为数据集D1的基尼系数,Gini(D2)为数据集D2的基尼系数。where Gini(D,Ai) is the Gini coefficient of feature Ai in training sample set D, D1 is the data set composed of the feature data in training sample set D whose value is not greater than the split value ai, D2 is the data set composed of the feature data in training sample set D whose value is not less than the split value ai, Gini(D1) is the Gini coefficient of data set D1, and Gini(D2) is the Gini coefficient of data set D2.
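The formula itself survives here only as a figure reference. Given the symbols defined above, it presumably matches the usual CART weighted Gini, $\mathrm{Gini}(D, A_i) = \frac{|D_1|}{|D|}\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\mathrm{Gini}(D_2)$ with $\mathrm{Gini}(D) = 1 - \sum_k p_k^2$; that reading is an assumption. The sketch below implements it in plain Python. It also assumes that D2 holds values strictly greater than the split value so that the two subsets do not overlap, and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a label list: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gini(values, labels, a):
    """Weighted Gini of splitting one feature at threshold a.

    Assumption: D1 takes values <= a and D2 takes values > a,
    so the two subsets are disjoint.
    """
    d1 = [y for x, y in zip(values, labels) if x <= a]
    d2 = [y for x, y in zip(values, labels) if x > a]
    n = len(labels)
    return len(d1) / n * gini(d1) + len(d2) / n * gini(d2)


def best_split(values, labels):
    """Return (threshold, weighted_gini) with the smallest Gini, or None."""
    candidates = sorted(set(values))[:-1]  # the largest value would leave D2 empty
    if not candidates:
        return None
    return min(((a, split_gini(values, labels, a)) for a in candidates),
               key=lambda pair: pair[1])
```

For a toy feature with values `[0.1, 0.2, 0.8, 0.9]` and labels virgin, virgin, recycled, recycled, the search picks the threshold 0.2, which separates the two classes completely.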
在一些实现方式中,根据所述左节点数据集和右节点数据集,生成决策树的子节点,直至无法继续生成子节点,得到原始决策树,包括:In some implementations, generating child nodes of a decision tree according to the left node data set and the right node data set until no more child nodes can be generated to obtain an original decision tree includes:
根据所述左节点数据集和右节点数据集作为下一个根节点数据,生成新的子节点;Generate a new child node according to the left node data set and the right node data set as the next root node data;
将新的子节点对应的子节点数据集作为下一个根节点数据集,继续生成新的子节点,直至无法继续生成子节点,得到所述原始决策树。The child node data set corresponding to the new child node is used as the next root node data set, and new child nodes are continuously generated until no more child nodes can be generated, thereby obtaining the original decision tree.
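The recursive generation of child nodes described above can be sketched as follows. This is an illustrative plain-Python version, not the patent's code: the dictionary layout of a node, the stopping rule (a node is pure or has no usable split value), and the majority-vote leaf label are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def find_split(rows, labels):
    """Search every feature and threshold; return (gini, feature, threshold) or None."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for a in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= a]
            right = [y for r, y in zip(rows, labels) if r[f] > a]
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[0]:
                best = (score, f, a)
    return best


def build_tree(rows, labels):
    """Grow the unpruned tree until no further child nodes can be generated."""
    majority = Counter(labels).most_common(1)[0][0]
    split = find_split(rows, labels) if gini(labels) > 0.0 else None
    if split is None:
        return {"leaf": True, "label": majority}
    _, f, a = split
    left_idx = [i for i, r in enumerate(rows) if r[f] <= a]
    right_idx = [i for i, r in enumerate(rows) if r[f] > a]
    return {
        "leaf": False, "feature": f, "threshold": a,
        "left": build_tree([rows[i] for i in left_idx], [labels[i] for i in left_idx]),
        "right": build_tree([rows[i] for i in right_idx], [labels[i] for i in right_idx]),
    }


def predict(tree, row):
    """Follow the splits from the root down to a leaf label."""
    while not tree["leaf"]:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["label"]
```

Each child data set becomes the root of its own subtree, which mirrors the "next root node data set" wording above; every split leaves both sides non-empty, so the recursion always terminates.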
在一些实现方式中,所述对所述原始决策树进行剪枝,得到所述PET再生料鉴别模型,包括:In some implementations, pruning the original decision tree to obtain the PET recycled material identification model includes:
基于预设损失函数,对所述原始决策树进行优化,直至原始决策树达到预设收敛条件,得到所述PET再生料鉴别模型,其中所述预设损失函数为:Based on the preset loss function, the original decision tree is optimized until the original decision tree reaches the preset convergence condition, thereby obtaining the PET recycled material identification model, wherein the preset loss function is:
$C_\alpha(T_t) = C(T_t) + \alpha\,|T_t|$;
其中,$T_t$表示原始决策树中的任意一个节点,$C(T_t)$表示节点$T_t$的预测误差,$|T_t|$表示节点数量,$\alpha$为预设训练参数。where $T_t$ denotes any node in the original decision tree, $C(T_t)$ denotes the prediction error of node $T_t$, $|T_t|$ denotes the number of nodes, and $\alpha$ is a preset training parameter.
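One way to read this loss function is as standard CART cost-complexity (weakest-link) pruning. The sketch below follows that reading. It is an assumption rather than the patent's stated procedure: it counts leaf nodes for the size term, takes the error of a node collapsed to a leaf from a hypothetical `error` field, and represents the tree as nested dictionaries.

```python
def leaves(node):
    """Number of leaf nodes in the subtree rooted at node."""
    if "left" not in node:
        return 1
    return leaves(node["left"]) + leaves(node["right"])


def subtree_error(node):
    """Summed error of the leaves below node (the C(T_t) term)."""
    if "left" not in node:
        return node["error"]
    return subtree_error(node["left"]) + subtree_error(node["right"])


def cost(node, alpha):
    """Penalized loss: subtree error plus alpha times the subtree size."""
    return subtree_error(node) + alpha * leaves(node)


def weakest_link(node):
    """Return (alpha, node) for the internal node with the smallest critical alpha."""
    if "left" not in node:
        return None
    # Critical alpha: collapsing this node costs as much as keeping its subtree.
    g = (node["error"] - subtree_error(node)) / (leaves(node) - 1)
    best = (g, node)
    for child in (node["left"], node["right"]):
        cand = weakest_link(child)
        if cand is not None and cand[0] < best[0]:
            best = cand
    return best


def prune_once(root):
    """Collapse the weakest internal node into a leaf; root must be internal."""
    alpha, node = weakest_link(root)
    del node["left"], node["right"]
    return alpha
```

Calling `prune_once` repeatedly until only the root remains yields a nested sequence of subtrees, from which the best one can then be chosen by cross validation.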
在一些实现方式中,所述利用所述PET再生料鉴别模型,对待鉴别PET样品进行鉴别,确定所述待鉴别PET样品的再生料情况,包括:In some implementations, the step of using the PET recycled material identification model to identify the PET sample to be identified and determining the recycled material status of the PET sample to be identified includes:
对所述待鉴别PET样品进行样品分析,得到所述待鉴别PET样本的目标特征数据;Performing sample analysis on the PET sample to be identified to obtain target feature data of the PET sample to be identified;
将所述目标特征数据输入到所述PET再生料鉴别模型,输出决策值;Inputting the target feature data into the PET recycled material identification model and outputting a decision value;
若所述决策值满足预设决策值区间,则判定所述待鉴别PET样本含有再生料。If the decision value satisfies the preset decision value interval, it is determined that the PET sample to be identified contains recycled material.
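The final decision rule can be sketched as a simple interval check. The text does not state the interval bounds, so the default values below are placeholders chosen only for illustration, and the function name is hypothetical.

```python
def contains_recycled(decision_value, interval=(0.5, 1.0)):
    """Return True if the model's decision value lies in the preset interval.

    The default interval is a placeholder; the actual bounds are not given
    in the text and would be set when the model is deployed.
    """
    low, high = interval
    return low <= decision_value <= high
```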
第二方面,本发明还提供一种聚对苯二甲酸乙二醇酯再生料的鉴别系统,包括:In a second aspect, the present invention further provides a system for identifying recycled polyethylene terephthalate materials, comprising:
获取模块,用于获取PET样品,所述PET样品包括PET再生料样品和PET原生料样品;An acquisition module is used to acquire PET samples, wherein the PET samples include PET recycled material samples and PET virgin material samples;
分析模块,用于对所述PET样品进行样品分析,得到所述PET样品的特征数据,所述特征数据包括但不限于扩链剂数据和解聚产物数据;An analysis module, used for performing sample analysis on the PET sample to obtain characteristic data of the PET sample, wherein the characteristic data includes but is not limited to chain extender data and depolymerization product data;
建模模块,用于利用所述特征数据,对预设决策树模型进行训练,直至所述预设决策树模型达到预设收敛条件,得到PET再生料鉴别模型;A modeling module, used to train a preset decision tree model using the feature data until the preset decision tree model reaches a preset convergence condition, thereby obtaining a PET recycled material identification model;
鉴别模块,用于利用所述PET再生料鉴别模型,对待鉴别PET样品进行鉴别,确定所述待鉴别PET样品的再生料情况。The identification module is used to identify the PET sample to be identified by using the PET recycled material identification model, and determine the recycled material situation of the PET sample to be identified.
与现有技术相比,本发明至少具备以下有益效果:Compared with the prior art, the present invention has at least the following beneficial effects:
本发明通过利用扩链剂和解聚产物的测定,追溯到PET塑料的分子结构层面,提高了鉴别的准确度和效率。利用决策树模型作为鉴别模型,由模型自主决策,降低人为决策带来的主观因素影响,从而能够提高PET再生料鉴别的准确性,特别是鉴别改性后的PET再生料更加可靠。The present invention uses the determination of chain extenders and depolymerization products to trace back to the molecular structure level of PET plastics, thereby improving the accuracy and efficiency of identification. A decision tree model is used as an identification model, and the model makes decisions independently, reducing the influence of subjective factors caused by human decision-making, thereby improving the accuracy of PET recycled material identification, especially more reliable identification of modified PET recycled material.
此外,本发明通过降维方式降低模型的特征数量,提高模型运算效率;通过剪枝优化操作,减少模型过拟合的问题,增强模型的泛化能力。In addition, the present invention reduces the number of features of the model through dimensionality reduction to improve the model operation efficiency; through pruning optimization operations, it reduces the problem of model overfitting and enhances the generalization ability of the model.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
图1为本发明实施例示出的聚对苯二甲酸乙二醇酯再生料的鉴别方法的流程示意图;FIG1 is a schematic flow diagram of a method for identifying polyethylene terephthalate recycled materials according to an embodiment of the present invention;
图2为本发明实施例示出的特征红外光谱示意图;FIG2 is a schematic diagram of a characteristic infrared spectrum shown in an embodiment of the present invention;
图3为本发明实施例示出的热失重曲线示意图;FIG3 is a schematic diagram of a thermal weight loss curve shown in an embodiment of the present invention;
图4为本发明实施例示出的元素分布谱示意图;FIG4 is a schematic diagram of an element distribution spectrum shown in an embodiment of the present invention;
图5为本发明实施例示出的热脱附Py-GCMS测试谱示意图;FIG5 is a schematic diagram of a thermal desorption Py-GCMS test spectrum shown in an embodiment of the present invention;
图6为本发明实施例示出的醇解产物的GCMS测试谱示意图;FIG6 is a schematic diagram of a GCMS test spectrum of an alcoholysis product shown in an embodiment of the present invention;
图7为本发明实施例示出的聚对苯二甲酸乙二醇酯再生料的鉴别系统的结构框图。FIG. 7 is a structural block diagram of a polyethylene terephthalate recycled material identification system according to an embodiment of the present invention.
具体实施方式Detailed Description
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The following will be combined with the drawings in the embodiments of the present invention to clearly and completely describe the technical solutions in the embodiments of the present invention. Obviously, the described embodiments are only part of the embodiments of the present invention, not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by ordinary technicians in this field without creative work are within the scope of protection of the present invention.
请参见图1,本发明提供一种聚对苯二甲酸乙二醇酯再生料的鉴别方法,可应用于聚对苯二甲酸乙二醇酯再生料的鉴别系统,该鉴别系统可集成于计算机设备,所述计算机设备包括但不限于平板电脑、笔记本电脑、桌上型计算机、物理服务器和云服务器等设备,计算机设备可以连接有气相色谱质谱联用仪GCMS、液相色谱质谱联用仪LCMS、热裂解气相色谱质谱联用仪Py-GCMS等仪器。所述方法包括步骤S101至步骤S104,详述如下:Please refer to Figure 1. The present invention provides a method for identifying polyethylene terephthalate recycled materials, which can be applied to an identification system for polyethylene terephthalate recycled materials. The identification system can be integrated into a computer device, and the computer device includes but is not limited to tablet computers, laptop computers, desktop computers, physical servers, cloud servers and other devices. The computer device can be connected to a gas chromatograph-mass spectrometer GCMS, a liquid chromatograph-mass spectrometer LCMS, a pyrolysis gas chromatograph-mass spectrometer Py-GCMS and other instruments. The method includes steps S101 to S104, which are described in detail as follows:
步骤S101,获取PET样品,所述PET样品包括PET再生料样品和PET原生料样品;Step S101, obtaining PET samples, wherein the PET samples include PET recycled material samples and PET virgin material samples;
步骤S102,对所述PET样品进行样品分析,得到所述PET样品的特征数据,所述特征数据包括但不限于扩链剂数据和解聚产物数据;Step S102, performing sample analysis on the PET sample to obtain characteristic data of the PET sample, wherein the characteristic data includes but is not limited to chain extender data and depolymerization product data;
步骤S103,利用所述特征数据,对预设决策树模型进行训练,直至所述预设决策树模型达到预设收敛条件,得到PET再生料鉴别模型;Step S103, using the characteristic data, training a preset decision tree model until the preset decision tree model reaches a preset convergence condition, thereby obtaining a PET recycled material identification model;
步骤S104,利用所述PET再生料鉴别模型,对待鉴别PET样品进行鉴别,确定所述待鉴别PET样品的再生料情况。Step S104, using the PET recycled material identification model to identify the PET sample to be identified, and determining the recycled material status of the PET sample to be identified.
在本实施例中,由于PET再生料在服役阶段会受光,氧,热,湿,应力,环境物质的作用,在回收再生过程中可能会发生水解、热解、氧化降解等副反应,导致其分子链、分子结构、性能和所含污染物等特性将与全新料之间存在一定程度的不同。通常情况下,PET回收后分子量会下降并导致性能衰减,所以PET再生料通常通过添加双官能团扩链剂以提高PET再生料的分子量或特性黏度。因此,本实施例针对扩链剂和残留在PET中的残留污染物、有机杂质、降解产物、低聚物等,通过解聚反应分离出来,再通过检测PET再生料中的这些材料组成、结构和性能的细微变化,建立PET再生料鉴别模型,并通过PET再生料鉴别模型鉴定未知样品是否为PET再生料。In this embodiment, because PET recycled material is exposed to light, oxygen, heat, moisture, stress, and environmental substances during service, and because side reactions such as hydrolysis, pyrolysis, and oxidative degradation may occur during recycling, its molecular chains, molecular structure, performance, and contaminant content differ to some degree from those of virgin material. PET typically loses molecular weight after recycling, which degrades its performance, so a bifunctional chain extender is usually added to PET recycled material to raise its molecular weight or intrinsic viscosity. This embodiment therefore targets the chain extender together with the residual contaminants, organic impurities, degradation products, oligomers, and the like remaining in the PET: these are separated out by a depolymerization reaction, the subtle changes in the composition, structure, and performance of these materials in the PET recycled material are detected to establish a PET recycled material identification model, and that model is then used to determine whether an unknown sample is PET recycled material.
Optionally, the PET samples include PET recycled material and PET virgin material from multiple manufacturers. Sample analysis includes, but is not limited to, spectroscopic analysis, chromatographic analysis, mass spectrometry, infrared spectroscopy, and thermal analysis. The characteristic data further include, but are not limited to, characteristic infrared spectrum data, thermogravimetric data, elemental data, melt flow rate data, morphology analysis data, fluorescent whitening agent data, and pyrolysis gas chromatography-mass spectrometry (Py-GCMS) data.
Optionally, for the characteristic infrared spectrum data, an infrared spectrometer is used to measure the characteristic infrared spectrum of each PET sample. The acquisition parameters are: 32 scans, with the characteristic spectral bands taken as the average of the 32 scans; a scan range of 4000-400 cm⁻¹; a data point interval of 0.5 cm⁻¹; room temperature and humidity held at 25 °C and 50% RH during scanning; and one scan acquisition per sample.
Optionally, for the thermogravimetric data, a thermogravimetric analyzer is used to measure the TGA weight-loss curve of each PET sample. The test conditions include: under a nitrogen atmosphere, holding at 30 °C for 5 min and then heating the sample environment from 30 °C to the aging temperature (330-380 °C, preferably 340 °C) at 20 °C/min; or, under an air atmosphere, holding at the aging temperature for 60 min.
Optionally, for the elemental data, an X-ray fluorescence (XRF) spectrometer scans the sample under test to obtain its elemental distribution spectrum, from which the types of detected elements (such as antimony, bromine, and titanium) and the total number of elements are analyzed and recorded.
Optionally, for the melt flow rate data, a melt flow indexer is used to measure the melt flow rate (MFR) of each PET sample. The test conditions are a temperature of 265 °C, a load of 2.165 kg, and a cut-off time of 5 s, with the sample dried at 120 °C for 15 h.
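As a rough illustration of how an MFR value could be derived from the cut-off interval given above, the sketch below assumes the usual mass-based calculation (as in ISO 1133, procedure A), in which MFR in g/10 min equals 600 times the average cut-off mass in grams divided by the cut-off interval in seconds. The function name and the 0.25 g example mass are hypothetical and are not taken from the source.

```python
def melt_flow_rate(cut_mass_g: float, cut_interval_s: float) -> float:
    """Return MFR in g/10 min from the average cut-off mass and the cut-off interval.

    Assumes the mass-based formula MFR = 600 * m / t (600 s = 10 min).
    """
    if cut_interval_s <= 0:
        raise ValueError("cut-off interval must be positive")
    return 600.0 * cut_mass_g / cut_interval_s


# Hypothetical example: 0.25 g average extrudate mass per 5 s cut.
mfr = melt_flow_rate(0.25, 5.0)
print(f"MFR = {mfr:.1f} g/10 min")
```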
Optionally, for the morphology analysis data, an optical microscope or a scanning electron microscope is used to record the number of off-color specks per 50 mm² of each PET sample according to the five-point area method.
Optionally, for the fluorescent whitening agent data, ultraviolet spectrophotometry and fluorescence spectrophotometry are used for qualitative and quantitative analysis of the fluorescent substances.
Optionally, for the Py-GCMS data, the PET sample is analyzed with a pyrolysis gas chromatograph-mass spectrometer, using a thermal desorption temperature of 250-300 °C held for 20 min and a pyrolysis temperature of 560-600 °C held for 0.2 min.
Further, a mass spectral database of the thermal desorption products and pyrolysis products of PET recycled material is established so that the thermal desorption and pyrolysis products of a sample under test can be analyzed at the application stage. The characteristic fragment ion peaks include those of benzene 3-(4-(tert-butyl)phenyl)-2-methylpropanal, biphenyl, straight-chain hydrocarbons (such as pentadiene, 1-hexene, 1-heptene, and 3-eicosene), 3-phenyl-2H-chromene, 2-methoxyethanol, 1,2-dimethoxyethane, 1,2-ethanediol, and diethylene glycol monoethyl ether.
Optionally, the chain extender data and the depolymerization product data are obtained as follows. In the presence of an alcohol solvent and a catalyst, the PET sample is depolymerized in a stainless steel reactor or a microwave high-pressure synthesis reactor at 180-250 °C for 2-3 h to obtain a PET sample solution. After the PET sample solution is filtered and diluted, its components are detected by GCMS, LCMS, or Py-GCMS to obtain spectral data. The spectral data are searched for fragment ion peaks against a preset chain extender GCMS, LCMS, or Py-GCMS database to obtain the chain extender data, and against a preset depolymerization product GCMS, LCMS, or Py-GCMS database to obtain the depolymerization product data.
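The source does not say how the fragment ion peak search is implemented. One simple possibility, sketched below purely as an assumption, is to count how many of a library entry's reference m/z peaks appear in the sample's peak list within a tolerance. The function name, the tolerance, the match threshold, and the library entries `extender_X` and `extender_Y` with their m/z values are all made up for illustration and are not real reference data.

```python
from typing import Dict, List


def match_library(observed_mz: List[float],
                  library: Dict[str, List[float]],
                  tol: float = 0.5,
                  min_fraction: float = 0.8) -> Dict[str, float]:
    """Return library entries whose reference peaks are mostly found in the sample.

    The value for each hit is the fraction of its reference peaks that matched.
    """
    hits: Dict[str, float] = {}
    for name, ref_peaks in library.items():
        if not ref_peaks:
            continue  # nothing to match against
        found = sum(
            any(abs(mz - ref) <= tol for mz in observed_mz) for ref in ref_peaks
        )
        fraction = found / len(ref_peaks)
        if fraction >= min_fraction:
            hits[name] = fraction
    return hits


# Made-up library and sample peak list (illustrative values only).
library = {
    "extender_X": [57.0, 85.0, 142.0],
    "extender_Y": [77.0, 105.0, 210.0],
}
observed = [57.1, 77.0, 85.2, 104.8, 142.3, 163.0]
hits = match_library(observed, library)
print(hits)
```

A production search would more likely score intensity-weighted spectral similarity, but the presence check keeps the idea visible.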
The chain extender data include, but are not limited to, data on epoxy resin, isocyanate, acid anhydride, oxazoline, and amide chain extenders.
Optionally, the alcohol solvent used for depolymerization may be one or more of methanol, ethanol, ethylene glycol, glycerol, 1,4-butanediol, and the like, and the catalyst may be one or more of zinc acetate, magnesium acetate, manganese acetate, cobalt acetate, antimony acetate, titanium acetate, tin acetate, germanium acetate, lead acetate, ionic liquids, and the like.
In some embodiments, step S103 includes:
reducing the dimensionality of the characteristic data to generate a training sample set;
training the preset decision tree model with the training sample set, and calculating the Gini coefficient of each feature in the training sample set at every possible split value;
taking the feature data at which the Gini coefficient is smallest as the optimal feature node and the split value at which the Gini coefficient is smallest as the optimal subspace, and establishing the left-node data set and the right-node data set with the optimal feature node as the root node;
generating child nodes of the decision tree from the left-node data set and the right-node data set until no further child nodes can be generated, thereby obtaining an original decision tree;
pruning the original decision tree to obtain the PET recycled material identification model.
In this embodiment, PCA reduces the number of features used by the model, which improves computational efficiency. The resulting decision tree model is also pruned, which reduces overfitting and improves the model's ability to generalize.
Optionally, reducing the dimensionality of the characteristic data to generate the training sample set includes:
establishing a feature data set based on the characteristic data;
normalizing and zero-meaning the feature data set to obtain a target feature data set;
calculating a covariance matrix based on the target feature data set, and determining the eigenvalues and corresponding eigenvectors of the covariance matrix;
selecting, based on the sorted eigenvalues, several corresponding eigenvectors to form an eigenvector matrix;
reducing the dimensionality of the characteristic data with the eigenvector matrix to generate the training sample set.
In this optional embodiment, principal component analysis (PCA) is used to reduce the dimensionality of the collected input data set. The feature data of the PET recycled material are combined into a feature data set $Y = [y_1, y_2, \ldots, y_N] \in \mathbb{R}^{N \times M}$, where $N$ is the number of data items and $M$ is the number of features. The data are normalized with the z-score standardization method. The normalized feature data are then zero-meaned, that is, the mean of each feature is calculated and subtracted, which gives the target feature data set, and the covariance matrix is calculated. The eigenvalues and corresponding eigenvectors of the covariance matrix are found, the eigenvalues are sorted in descending order, the first $k$ eigenvalues are selected, and the $k$ eigenvectors corresponding to them are used as column vectors to form the eigenvector matrix. The original feature data are reduced in dimension with the eigenvector matrix, and the resulting new feature data serve as the training samples.
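The PCA steps described above can be sketched with NumPy as follows. This is a minimal illustration under assumptions, not the applicant's implementation: the function name `pca_reduce`, the random input, and `k=3` are hypothetical, and in this sketch the projection matrix has shape M x k (features by retained components).

```python
import numpy as np


def pca_reduce(Y: np.ndarray, k: int):
    """Z-score the columns of Y (N samples x M features) and project onto the
    k eigenvectors of the covariance matrix with the largest eigenvalues."""
    mean = Y.mean(axis=0)
    std = Y.std(axis=0)
    std[std == 0] = 1.0                    # constant features: avoid division by zero
    Z = (Y - mean) / std                   # z-score; each column is now zero-mean
    C = np.cov(Z, rowvar=False)            # M x M covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh suits symmetric matrices (ascending order)
    order = np.argsort(eigvals)[::-1][:k]  # indices of the k largest eigenvalues
    P = eigvecs[:, order]                  # eigenvectors as columns, shape M x k
    return Z @ P, P, eigvals[order]


# Hypothetical data: 20 samples with 6 features, reduced to 3 components.
rng = np.random.default_rng(0)
Y = rng.normal(size=(20, 6))
X_new, P, top = pca_reduce(Y, k=3)
print(X_new.shape)
```

Because z-score standardization already centers each feature, the separate zero-mean step in the text is a no-op here and is folded into the same line.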
Optionally, the Gini coefficient is calculated by the following function:
[Formula image: Figure PCTCN2023071537-appb-000002]
where Gini(D, Ai) is the Gini coefficient of feature data Ai in the training sample set D, D1 is the data set formed by the feature data in D whose value is not greater than the split value ai, D2 is the data set formed by the feature data in D whose value is not less than the split value ai, Gini(D1) is the Gini coefficient of data set D1, and Gini(D2) is the Gini coefficient of data set D2.
In this optional embodiment, the decision tree model includes, but is not limited to, the ID3 algorithm, the C4.5 algorithm, the classification and regression tree (CART) algorithm, and the random forest (RF) algorithm.
For example, taking the CART decision tree as the decision tree model and the Gini coefficient as the feature selection criterion, the model is built as follows. Missing values are deleted from the training sample set D, and the preprocessed training sample set D is used for training. Assuming the feature data in the training sample set D fall into K categories and the probability of the k-th category is $p_k$, the Gini coefficient of the feature data is calculated as
[Formula image: Figure PCTCN2023071537-appb-000003]
For each feature data Ai and each value ai that it can take (the target split value), the feature data are divided into two parts according to ai: all feature data whose value is not greater than ai form data set D1, and all feature data whose value is not less than ai form data set D2. The Gini coefficient corresponding to feature data Ai is then obtained from the Gini coefficient calculation function above.
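The two Gini formulas referenced above survive only as image placeholders. Assuming they follow the standard CART definitions, which match the symbols defined in the text ($p_k$, $D_1$, $D_2$, $A_i$), they would read as follows. This is a reconstruction under that assumption, not a transcription of the images.

```latex
% Gini coefficient of a data set D with K categories (assumed standard form)
\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} p_k^{2}

% Gini coefficient of D after splitting feature A_i at value a_i into D_1 and D_2
\mathrm{Gini}(D, A_i) = \frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2)
```

Here $|D|$, $|D_1|$, and $|D_2|$ denote the number of samples in each data set, so the split score is the size-weighted average impurity of the two parts.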
Optionally, generating child nodes of the decision tree from the left-node data set and the right-node data set until no further child nodes can be generated, thereby obtaining the original decision tree, includes:
generating new child nodes with the left-node data set and the right-node data set as the next root node data;
taking the child node data set of each new child node as the next root node data set and continuing to generate new child nodes until no further child nodes can be generated, thereby obtaining the original decision tree.
In this optional embodiment, each feature data Ai in the training sample set D is traversed, and the Gini coefficient is calculated for every possible split value ai of that feature. The feature data A' and split value a' that minimize the Gini coefficient are selected as the optimal feature node and the optimal subspace, respectively. Feature data A' and split value a' divide the training sample set D into two parts, and left and right nodes are established with the current feature node as the root node: the left node holds data set D1' and the right node holds data set D2', while the root node is the data set containing all sample features. Starting from the root node, each generated child node data set (the left-node data set and the right-node data set) is taken as the next root node data set and new child nodes are generated, until no further child nodes can be generated. The result is the original decision tree $T_0$, built from multiple feature nodes.
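The exhaustive search for the feature and split value with the smallest Gini coefficient can be sketched in plain Python as below. It is an illustrative sketch under assumptions: the function names and the toy data are hypothetical, and the right-hand part uses a strict inequality so that the two parts do not overlap, which is the usual CART convention rather than the wording of the text.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X: List[List[float]], y: List[int]) -> Tuple[int, float, float]:
    """Return (feature index, split value, weighted Gini) of the best binary split."""
    n = len(y)
    best = (-1, float("nan"), float("inf"))
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Try every observed value except the largest so both parts are non-empty.
        for a in values[:-1]:
            left = [y[i] for i in range(n) if X[i][j] <= a]
            right = [y[i] for i in range(n) if X[i][j] > a]
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if score < best[2]:
                best = (j, a, score)
    return best


# Toy data: two features, labels 0 = virgin and 1 = recycled (made-up values).
X = [[0.2, 5.0], [0.4, 1.0], [0.6, 4.0], [0.8, 2.0]]
y = [0, 0, 1, 1]
print(best_split(X, y))  # (0, 0.4, 0.0): feature 0 split at 0.4 separates the classes
```

Growing the full tree amounts to calling `best_split` recursively on the left and right subsets until a node is pure or cannot be split further.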
Optionally, five-fold cross-validation is used to evaluate the performance of the original decision tree model.
Optionally, pruning the original decision tree to obtain the PET recycled material identification model includes:
optimizing the original decision tree based on a preset loss function until the original decision tree reaches the preset convergence condition, thereby obtaining the PET recycled material identification model, where the preset loss function is:
$C_\alpha(T_t) = C(T_t) + \alpha |T_t|$;
where $T_t$ denotes any node in the original decision tree, $C(T_t)$ is the prediction error of node $T_t$, $|T_t|$ is the number of nodes, and $\alpha$ is a preset training parameter.
In this optional embodiment, the original decision tree is pruned based on the loss function $C_\alpha(T_t) = C(T_t) + \alpha |T_t|$. The pruning principle is that the loss function of the child nodes is less than the loss function of the root node, that is,
[Formula image: Figure PCTCN2023071537-appb-000004]
in which case the minimum value of $\alpha$ is
[Formula image: Figure PCTCN2023071537-appb-000005]
[Formula image: Figure PCTCN2023071537-appb-000006]
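The formula images for the pruning condition are not available as text. As a hedged reconstruction, assuming the standard CART cost-complexity (weakest-link) argument and reading $|T_t|$ as the number of leaf nodes of the subtree rooted at node $t$, the threshold value of $\alpha$ follows from comparing the loss of keeping the subtree with the loss of collapsing it to the single node $t$:

```latex
% Loss of the subtree rooted at t, and of the single node t after pruning
C_\alpha(T_t) = C(T_t) + \alpha\,|T_t|, \qquad C_\alpha(t) = C(t) + \alpha

% Keeping the subtree is preferable while C_\alpha(T_t) < C_\alpha(t);
% the two losses are equal at
\alpha = \frac{C(t) - C(T_t)}{|T_t| - 1}
```

The symbols $C(t)$ and $C_\alpha(t)$ for the collapsed node are introduced here for the derivation and do not appear in the surviving text.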
In some embodiments, using the PET recycled material identification model to identify the PET sample to be identified and determine its recycled material status includes:
performing sample analysis on the PET sample to be identified to obtain target feature data of that sample;
inputting the target feature data into the PET recycled material identification model, which outputs a decision value;
if the decision value falls within a preset decision value interval, determining that the PET sample to be identified contains recycled material.
In this embodiment, the sample to be identified undergoes sample analysis such as spectroscopic, chromatographic, mass spectrometric, infrared, and thermal analysis. The resulting spectra are digitized to obtain target feature data that reflect the spectral characteristics of the sample, PCA reduces the dimensionality of the target feature data, and the trained PET recycled material identification model then performs the identification and outputs a decision value.
By way of example and not limitation, the procedure for determining whether an unknown sample is PET recycled material is as follows:
Step 1:
Infrared spectroscopy of PET samples from different manufacturers: the KBr salt plate method is used with 32 scans, the characteristic spectral bands being the average of the 32 scans. The scan range is 4000-400 cm⁻¹ and the data point interval is 0.5 cm⁻¹. Room temperature and humidity are held at 25 °C and 50% RH during acquisition, and one spectrum is acquired per sample. The results are shown in Figure 2.
TGA weight-loss curves of PET samples from different manufacturers: under a nitrogen atmosphere, the sample is held at 30 °C for 5 min and then heated from 30 °C to the aging temperature of 340 °C at 20 °C/min; under an air atmosphere, it is held at the aging temperature for 60 min. The TGA 5% weight-loss data of the PET recycled material are obtained and normalized with the z-score standardization method. The results are shown in Figure 3.
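The text does not specify how the 5% weight-loss value is read from the TGA curve. One plausible approach, shown here only as an assumption, is linear interpolation to the point where the remaining mass first drops to 95%. The function name and the toy curve are hypothetical, and the x axis could equally be temperature or isothermal hold time.

```python
from typing import Sequence


def x_at_mass_loss(x: Sequence[float],
                   mass_pct: Sequence[float],
                   loss_pct: float = 5.0) -> float:
    """Interpolate the x value (temperature or time) at which the remaining
    mass first falls to 100 - loss_pct percent."""
    target = 100.0 - loss_pct
    for i in range(1, len(x)):
        m0, m1 = mass_pct[i - 1], mass_pct[i]
        if m0 >= target >= m1 and m0 != m1:
            frac = (m0 - target) / (m0 - m1)  # position between the two points
            return x[i - 1] + frac * (x[i] - x[i - 1])
    raise ValueError("curve never reaches the requested mass loss")


# Made-up isothermal curve: hold time in minutes versus remaining mass in percent.
time_min = [0, 10, 20, 30, 40]
mass = [100.0, 99.0, 97.0, 94.0, 90.0]
t5 = x_at_mass_loss(time_min, mass)
print(round(t5, 2))
```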
XRF elemental analysis of PET samples from different manufacturers: the sample under test is scanned by XRF to obtain its elemental distribution spectrum, and the types of detected elements (such as antimony, bromine, and titanium) and the total number of elements are recorded. The data are normalized with the z-score standardization method. The results are shown in Figure 4.
Melt flow rate of PET samples from different manufacturers: the test conditions are a temperature of 265 °C, a load of 2.165 kg, and a cut-off time of 5 s, with the sample dried at 120 °C for 15 h. The data are normalized with the z-score standardization method.
Morphology analysis of PET samples from different manufacturers: an optical microscope or a scanning electron microscope is used to record the number of off-color specks per 50 mm² according to the five-point area method, and the data are normalized with the z-score standardization method.
Fluorescent whitening agents in PET samples from different manufacturers: ultraviolet spectrophotometry and fluorescence spectrophotometry are used for qualitative and quantitative analysis of the fluorescent substances, and the data are normalized with the z-score standardization method.
Py-GCMS data of PET samples from different manufacturers: the thermal desorption temperature is 250-300 °C, held for 20 min, and the pyrolysis temperature is 560-600 °C, held for 0.2 min. A mass spectral database of the thermal desorption products and pyrolysis products of PET recycled material is established. The characteristic fragment ion peaks of PET recycled material include those of benzene 3-(4-(tert-butyl)phenyl)-2-methylpropanal, biphenyl, straight-chain hydrocarbons (such as pentadiene, 1-hexene, 1-heptene, and 3-eicosene), 3-phenyl-2H-chromene, 2-methoxyethanol, 1,2-dimethoxyethane, 1,2-ethanediol, and diethylene glycol monoethyl ether. The data are normalized with the z-score standardization method. The results are shown in Figure 5.
Chain extender data and depolymerization product data of PET samples from different manufacturers: 1 g of sample is weighed into a stainless steel reactor, 50 mL of a 0.02-0.04 mg/mL solution of zinc acetate in methanol is added, and the mixture is heated at 200-220 °C for 2-3 h. After cooling to room temperature, the reactor is opened and the liquid is filtered through a 0.45 μm organic filter membrane into a 2 mL sample vial. GCMS and Py-GCMS detection gives mixed mass spectral data containing the ion peaks related to the depolymerization products and the chain extenders. The ion peaks related to chain extenders include those of epoxy resins, isocyanates, acid anhydrides, oxazolines, and amides. The mass spectral data are normalized with the z-score standardization method.
In the same mixed mass spectral data obtained by GCMS and Py-GCMS detection, the ion peaks related to the depolymerization products include those of linear oligomers, cyclic oligomers, p-methylbenzene end groups, isophthalic acid, diethylene glycol, and bisphenol A. The mass spectral data are normalized with the z-score standardization method. The results are shown in Figure 6.
Step 2:
The normalized data are reduced in dimension with the PCA algorithm and used as the input of the decision tree model. The PCA procedure is as follows:
First, the features are zero-meaned, that is, the mean of each feature is calculated and then subtracted;
The covariance matrix C is calculated by the formula
[Formula image: Figure PCTCN2023071537-appb-000007]
The eigenvalues and corresponding eigenvectors of the covariance matrix are found, the eigenvalues are sorted in descending order, the first $k$ are selected, and the corresponding $k$ eigenvectors are used as column vectors to form the eigenvector matrix $P \in \mathbb{R}^{N \times M}$.
The original data are reduced in dimension according to P, and the new observation matrix
[Formula image: Figure PCTCN2023071537-appb-000008]
is calculated as the training samples by the formula
[Formula image: Figure PCTCN2023071537-appb-000009]
Features are randomly selected from the training samples to obtain multiple training subsets, and a decision rule model based on the decision tree algorithm is built. The criteria serve as the model's input factors, and the output is verified against training labels assigned by manual determination of whether each sample is recycled material.
The decision tree model uses the CART decision tree algorithm and is built as follows. Missing values are deleted from data set D, and the preprocessed data set D is used for training. Assuming there are K categories and the probability of the k-th category is $p_k$, the standard Gini coefficient of the feature data is calculated as
[Formula image: Figure PCTCN2023071537-appb-000010]
[Formula image: Figure PCTCN2023071537-appb-000011]
For each feature data Ai and each target split value ai that it can take, the feature data are divided into two parts: all feature data whose value is not greater than ai form data set D1, and all feature data whose value is not less than ai form data set D2. The Gini coefficient corresponding to feature data Ai is then obtained from the Gini coefficient calculation function above.
Each feature data Ai in data set D is traversed, and the Gini coefficient is calculated for every possible split value ai of that feature. The feature A' and split value a' that minimize the Gini coefficient are selected as the optimal feature node and the optimal subspace, respectively. Feature data A' and split value a' divide the original data set into two parts, and the left and right nodes of the current feature node are established: the left node is data set D1' and the right node is data set D2'. At this point the root node is the data set containing all sample features.
Starting from the root node, each generated child node data set is taken as the next root node data set and child nodes continue to be generated until no further child nodes can be generated, which yields the original decision tree $T_0$.
The steps above build the PET recycled material identification model based on the decision tree algorithm, and five-fold cross-validation is used to evaluate its performance. Five-fold cross-validation divides the training samples into five parts; each time, four parts form the training set and the remaining part forms the validation set. Each of the five subsets serves as the validation set in turn, giving five results, and the average of the five results is used as the performance metric of the classifier or model.
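The five-fold procedure described above can be sketched in plain Python as follows. The helper names, the fixed shuffle seed, and the majority-class placeholder model are assumptions made for illustration; any classifier with a fit-then-predict interface could be substituted.

```python
import random
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle indices 0..n-1 and return k (train, validation) index pairs."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal parts
    splits = []
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, val))
    return splits


def cross_val_accuracy(fit_predict: Callable, X: Sequence, y: Sequence[int], k: int = 5) -> float:
    """Average validation accuracy over k folds."""
    scores = []
    for train, val in k_fold_indices(len(y), k):
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in val])
        correct = sum(p == y[i] for p, i in zip(preds, val))
        scores.append(correct / len(val))
    return sum(scores) / k


def majority_classifier(X_train, y_train, X_val):
    """Placeholder model: always predicts the most common training label."""
    label = max(set(y_train), key=list(y_train).count)
    return [label] * len(X_val)


# Toy run: 10 samples that all carry the same label.
X = [[float(i)] for i in range(10)]
y = [1] * 10
score = cross_val_accuracy(majority_classifier, X, y)
print(score)
```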
The original decision tree $T_0$ is traversed from the bottom up. For every internal node t, the loss function of its child nodes and the loss function of the node taken as a root are calculated, and the regularization parameter α is obtained. The smallest of these values, α', is selected for pruning, and the resulting new tree is pruned in the same way, and so on up to the root node of the original decision tree. The sequence of subtrees obtained by pruning is cross-validated to select the best one, which reduces the complexity of the tree model.
Step 3:
The PET sample to be identified undergoes sample analysis to obtain its spectroscopic, chromatographic, mass spectrometric, infrared, and thermal analysis data. The spectra are digitized to obtain recycled material data that reflect the spectral characteristics of the sample, PCA reduces the dimensionality of the data, and the trained PET recycled material identification model performs the identification. The number of features after dimensionality reduction is 30 and the number of CART decision trees is 500, giving an average accuracy of 88.29%. After pruning, the accuracy of the decision tree model improves by 3%.
It should be noted that: (1) the present invention is the first to propose identifying PET recycled material by jointly detecting the depolymerization products and chain extenders in it; (2) it is the first to combine multiple recycled material identification methods with an artificial intelligence algorithm, which both accounts for the effect of each test method on the result and uses machine learning to weight each method; (3) a recycled material property database is built, and results are determined by the decision tree model on the basis of large-scale data comparison, which avoids the influence of subjective factors; (4) as the database grows and the number of training runs increases, the accuracy of the identification results will improve further; (5) the method is easy to hand over, since practitioners do not need extensive technical experience in this field.
Figure 7 shows a structural block diagram of a polyethylene terephthalate recycled material identification system provided by an embodiment of the present invention, which carries out the identification method of the above method embodiment and achieves the corresponding functions and technical effects. For ease of description, only the parts relevant to this embodiment are shown. The identification system includes:
an acquisition module 701, configured to acquire PET samples, the PET samples including PET recycled material samples and PET virgin material samples;
an analysis module 702, configured to perform sample analysis on the PET samples to obtain characteristic data of the PET samples, the characteristic data including, but not limited to, chain extender data and depolymerization product data;
a modeling module 703, configured to train a preset decision tree model with the characteristic data until the preset decision tree model reaches a preset convergence condition, thereby obtaining a PET recycled material identification model;
an identification module 704, configured to use the PET recycled material identification model to identify a PET sample to be identified and determine the recycled material status of that sample.
In some embodiments, the analysis module 702 is specifically configured to:
depolymerize the PET sample in a stainless steel reactor or a microwave high-pressure synthesis reactor, in the presence of an alcohol solvent and a catalyst, at a reaction temperature of 180-250°C for a reaction time of 2-3 h, to obtain a PET sample solution;
filter and dilute the PET sample solution, then detect the substances in the PET sample solution by GCMS, LCMS or Py-GCMS to obtain spectrum data;
search the spectrum data for fragment ion peaks against a preset chain extender GCMS database, LCMS database or Py-GCMS database to obtain the chain extender data;
search the spectrum data for fragment ion peaks against a preset depolymerization product GCMS database, LCMS database or Py-GCMS database to obtain the depolymerization product data.
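The fragment ion peak search described above can be sketched as a simple library lookup. This is only an illustration under stated assumptions: the function name `search_library`, the tolerance `mz_tol`, the intensity cut-off `min_intensity`, the rule that every reference ion must be present, and all m/z values are hypothetical placeholders, not values from the preset databases of the invention.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, relative intensity in percent)


def search_library(
    spectrum: List[Peak],
    library: Dict[str, List[float]],
    mz_tol: float = 0.5,
    min_intensity: float = 5.0,
) -> List[str]:
    """Return library entries whose reference fragment ions all appear in the spectrum.

    Assumed matching rule: a reference ion counts as found when an observed
    peak above `min_intensity` lies within `mz_tol` of it.
    """
    observed = [mz for mz, intensity in spectrum if intensity >= min_intensity]
    hits = []
    for name, reference_ions in library.items():
        if all(any(abs(mz - ref) <= mz_tol for mz in observed) for ref in reference_ions):
            hits.append(name)
    return sorted(hits)


# Placeholder library: names and m/z values are invented for illustration only.
CHAIN_EXTENDER_LIBRARY = {
    "extender_A": [100.0, 150.0],
    "extender_B": [120.0, 200.0],
}

# Placeholder spectrum: the peak at 120.0 is below the intensity cut-off.
spectrum = [(100.1, 80.0), (149.8, 40.0), (120.0, 2.0), (200.0, 60.0)]
hits = search_library(spectrum, CHAIN_EXTENDER_LIBRARY)
```

The same function could be run a second time against a depolymerization product library, so that each sample yields one list of hits per database.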
In some embodiments, the chain extender data includes data on epoxy resin, isocyanate, acid anhydride, oxazoline and amide chain extenders.
In some embodiments, the modeling module 703 includes:
a generation unit, configured to reduce the dimensionality of the characteristic data to generate a training sample set;
a training unit, configured to train the preset decision tree model with the training sample set, and to calculate the Gini coefficient of each feature in the training sample set at every possible split value;
an establishing unit, configured to take the feature with the minimum Gini coefficient as the optimal feature node and the split value with the minimum Gini coefficient as the optimal subspace, and to establish the left node data set and the right node data set obtained when the optimal feature node is used as the root node;
an iteration unit, configured to generate child nodes of the decision tree from the left node data set and the right node data set until no further child nodes can be generated, to obtain an original decision tree;
a pruning unit, configured to prune the original decision tree to obtain the PET recycled material identification model.
In some embodiments, the generation unit is specifically configured to:
establish a characteristic data set based on the characteristic data;
normalize and zero-center the characteristic data set to obtain a target characteristic data set;
calculate a covariance matrix based on the target characteristic data set, and determine the eigenvalues and corresponding eigenvectors of the covariance matrix;
select, based on the ranking of the eigenvalues, a number of corresponding eigenvectors to form an eigenvector matrix;
reduce the dimensionality of the characteristic data based on the eigenvector matrix to generate the training sample set.
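The steps above follow the usual outline of principal component analysis. A minimal pure-Python sketch is given below, assuming min-max scaling for the normalization step and power iteration with deflation for the eigenvectors; neither choice is stated in the source, and all function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def normalise_and_centre(rows: Matrix) -> Matrix:
    """Min-max scale each column to [0, 1] (assumed), then subtract the column mean."""
    columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # constant column: avoid division by zero
        scaled = [(v - lo) / span for v in col]
        mean = sum(scaled) / len(scaled)
        columns.append([v - mean for v in scaled])
    return [list(r) for r in zip(*columns)]


def covariance(x: Matrix) -> Matrix:
    """Sample covariance matrix of zero-mean data (rows are samples)."""
    n, d = len(x), len(x[0])
    return [[sum(row[i] * row[j] for row in x) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvectors(cov: Matrix, k: int, iters: int = 500) -> Matrix:
    """Leading k eigenvectors, largest eigenvalue first (power iteration + deflation)."""
    d = len(cov)
    a = [row[:] for row in cov]
    vectors = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(d)]  # arbitrary non-zero start vector
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:
                break
            v = [c / norm for c in w]
        eigval = sum(v[i] * sum(a[i][j] * v[j] for j in range(d)) for i in range(d))
        vectors.append(v)
        # Remove the component just found so the next pass finds the next one.
        a = [[a[i][j] - eigval * v[i] * v[j] for j in range(d)] for i in range(d)]
    return vectors


def pca_reduce(rows: Matrix, k: int) -> Matrix:
    """Project the normalised, zero-mean data onto its k leading eigenvectors."""
    x = normalise_and_centre(rows)
    basis = top_eigenvectors(covariance(x), k)
    return [[sum(r[j] * b[j] for j in range(len(r))) for b in basis] for r in x]


# Placeholder feature table: three features per sample, values invented.
features = [[1.0, 10.0, 5.0], [2.0, 20.0, 5.0], [3.0, 30.0, 5.0], [4.0, 40.0, 5.0]]
reduced = pca_reduce(features, k=1)
```

In this placeholder table the first two features are perfectly correlated and the third is constant, so a single component keeps all of the variance.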
In some embodiments, the Gini coefficient is calculated as:
Figure PCTCN2023071537-appb-000012
where Gini(D, Ai) is the Gini coefficient of feature Ai in the training sample set D; D1 is the data set formed by the feature data in D whose split value is not greater than ai; D2 is the data set formed by the feature data in D whose split value is not less than ai; Gini(D1) is the Gini coefficient of data set D1; and Gini(D2) is the Gini coefficient of data set D2.
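The formula itself is only available as an image. The sketch below assumes it has the standard CART form, Gini(D, Ai) = |D1|/|D| · Gini(D1) + |D2|/|D| · Gini(D2), with Gini(D) = 1 − Σ p_k² over the class proportions p_k. It also assumes that D2 takes values strictly greater than the split value, so that each sample falls into exactly one subset. The function names and sample values are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity of a label set: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(values: Sequence[float], labels: Sequence[str], threshold: float) -> float:
    """Weighted Gini of the two subsets produced by splitting one feature at `threshold`."""
    d1 = [y for x, y in zip(values, labels) if x <= threshold]
    d2 = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(d1) / n * gini(d1) + len(d2) / n * gini(d2)


def best_threshold(values: Sequence[float], labels: Sequence[str]) -> Tuple[float, float]:
    """Return (minimum weighted Gini, split value) over all candidate split values."""
    candidates: List[float] = sorted(set(values))[:-1]  # the maximum would leave D2 empty
    if not candidates:
        raise ValueError("feature is constant; no split is possible")
    return min((gini_split(values, labels, t), t) for t in candidates)


# Placeholder data: one feature and two classes, values invented for illustration.
values = [0.1, 0.2, 0.8, 0.9]
labels = ["virgin", "virgin", "recycled", "recycled"]
score, threshold = best_threshold(values, labels)
```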
In some embodiments, the iteration unit is specifically configured to:
generate new child nodes by taking the left node data set and the right node data set as the next root node data;
take the child node data set corresponding to each new child node as the next root node data set and continue generating new child nodes until no further child nodes can be generated, to obtain the original decision tree.
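The iteration described above amounts to recursive binary splitting. A minimal sketch follows, assuming the standard CART Gini criterion and a stopping rule of "node is pure or no split value remains"; the dictionary-based tree layout, the function names and the sample values are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity of a label set."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows: List[List[float]], labels: List[str]) -> Optional[Tuple[float, int, float]]:
    """Return (weighted Gini, feature index, split value), or None if nothing can be split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows})[:-1]:  # skip the maximum: right side would be empty
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def grow(rows: List[List[float]], labels: List[str]) -> Dict:
    """Recursively treat each child data set as the next root until no split remains."""
    split = best_split(rows, labels) if gini(labels) > 0.0 else None
    if split is None:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left]),
        "right": grow([r for r, _ in right], [y for _, y in right]),
    }


def predict(tree: Dict, row: Sequence[float]) -> str:
    """Walk from the root to a leaf and return the leaf's class."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]


# Placeholder training set: two reduced features per sample, values invented.
rows = [[0.1, 5.0], [0.2, 1.0], [0.8, 5.0], [0.9, 1.0]]
labels = ["virgin", "virgin", "recycled", "recycled"]
tree = grow(rows, labels)
```

Because every accepted split leaves both child data sets non-empty and smaller than the parent, the recursion always terminates.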
In some embodiments, the pruning unit is specifically configured to:
optimize the original decision tree based on a preset loss function until the original decision tree reaches the preset convergence condition, to obtain the PET recycled material identification model, wherein the preset loss function is:
$C_\alpha(T_t) = C(T_t) + \alpha\,|T_t|$;
where $T_t$ denotes any node in the original decision tree, $C(T_t)$ denotes the prediction error of node $T_t$, $|T_t|$ denotes the number of nodes, and $\alpha$ is a preset training parameter.
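The loss function above can be evaluated directly. The sketch below additionally assumes the usual CART cost-complexity convention, which the source does not spell out: |T_t| is read as the number of leaves of the subtree rooted at the node, and the subtree is collapsed into a single leaf when doing so does not increase the loss. The function names and numeric values are hypothetical.

```python
def cost_complexity(error: float, n_leaves: int, alpha: float) -> float:
    """C_alpha(T_t) = C(T_t) + alpha * |T_t|."""
    return error + alpha * n_leaves


def should_prune(subtree_error: float, subtree_leaves: int,
                 collapsed_error: float, alpha: float) -> bool:
    """Assumed rule: prune when a single leaf costs no more than the whole subtree."""
    kept = cost_complexity(subtree_error, subtree_leaves, alpha)
    collapsed = cost_complexity(collapsed_error, 1, alpha)
    return collapsed <= kept


# Placeholder numbers: a 4-leaf subtree with error 0.10 versus one leaf with error 0.15.
prune_with_large_alpha = should_prune(0.10, 4, 0.15, alpha=0.05)
prune_with_small_alpha = should_prune(0.10, 4, 0.15, alpha=0.01)
```

A larger α penalizes tree size more heavily, so the same subtree that survives at α = 0.01 is pruned at α = 0.05.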
In some embodiments, the identification module 704 is specifically configured to:
perform sample analysis on the PET sample to be identified to obtain target characteristic data of the PET sample to be identified;
input the target characteristic data into the PET recycled material identification model and output a decision value;
determine that the PET sample to be identified contains recycled material if the decision value falls within a preset decision value interval.
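The final decision step reduces to an interval check on the model output. In the sketch below the interval bounds (0.5, 1.0) and the function name are hypothetical; the source only states that a preset decision value interval exists.

```python
from typing import Tuple


def contains_recycled(decision_value: float,
                      interval: Tuple[float, float] = (0.5, 1.0)) -> bool:
    """Return True when the decision value lies inside the preset (assumed closed) interval."""
    low, high = interval
    return low <= decision_value <= high


# Placeholder decision values, invented for illustration.
flagged = contains_recycled(0.8)
not_flagged = contains_recycled(0.2)
```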
The identification system for polyethylene terephthalate recycled material described above can carry out the identification method for polyethylene terephthalate recycled material of the above method embodiment. The options described in the method embodiment also apply to this embodiment and are not detailed again here. For the remaining content of this embodiment, refer to the method embodiment; it is not repeated here.
The specific embodiments described above further explain the purpose, technical solutions and beneficial effects of the present invention. It should be understood that they are only specific embodiments of the present invention and are not intended to limit its scope of protection. In particular, for those skilled in the art, any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

  1. A method for identifying polyethylene terephthalate recycled material, characterized by comprising:
    acquiring PET samples, the PET samples including PET recycled material samples and PET virgin material samples;
    performing sample analysis on the PET samples to obtain characteristic data of the PET samples, the characteristic data including chain extender data and depolymerization product data;
    training a preset decision tree model with the characteristic data until the preset decision tree model reaches a preset convergence condition, to obtain a PET recycled material identification model;
    identifying a PET sample to be identified by using the PET recycled material identification model, and determining the recycled material status of the PET sample to be identified.
  2. The method for identifying polyethylene terephthalate recycled material according to claim 1, characterized in that performing sample analysis on the PET samples to obtain characteristic data of the PET samples comprises:
    depolymerizing the PET sample in a stainless steel reactor or a microwave high-pressure synthesis reactor, in the presence of an alcohol solvent and a catalyst, at a reaction temperature of 180-250°C for a reaction time of 2-3 h, to obtain a PET sample solution;
    filtering and diluting the PET sample solution, then detecting the substances in the PET sample solution by GCMS, LCMS or Py-GCMS to obtain spectrum data;
    searching the spectrum data for fragment ion peaks against a preset chain extender GCMS database, LCMS database or Py-GCMS database to obtain the chain extender data;
    searching the spectrum data for fragment ion peaks against a preset depolymerization product GCMS database, LCMS database or Py-GCMS database to obtain the depolymerization product data.
  3. The method for identifying polyethylene terephthalate recycled material according to claim 2, characterized in that the chain extender data includes data on epoxy resin, isocyanate, acid anhydride, oxazoline and amide chain extenders.
  4. The method for identifying polyethylene terephthalate recycled material according to claim 1, characterized in that training the preset decision tree model with the characteristic data until the preset decision tree model reaches the preset convergence condition, to obtain the PET recycled material identification model, comprises:
    reducing the dimensionality of the characteristic data to generate a training sample set;
    training the preset decision tree model with the training sample set, and calculating the Gini coefficient of each item of feature data in the training sample set at every possible split value;
    taking the feature data with the minimum Gini coefficient as the optimal feature node and the split value with the minimum Gini coefficient as the optimal subspace, and establishing the left node data set and the right node data set obtained when the optimal feature node is used as the root node;
    generating child nodes of the decision tree from the left node data set and the right node data set until no further child nodes can be generated, to obtain an original decision tree;
    pruning the original decision tree to obtain the PET recycled material identification model.
  5. The method for identifying polyethylene terephthalate recycled material according to claim 4, characterized in that reducing the dimensionality of the characteristic data to generate the training sample set comprises:
    establishing a characteristic data set based on the characteristic data;
    normalizing and zero-centering the characteristic data set to obtain a target characteristic data set;
    calculating a covariance matrix based on the target characteristic data set, and determining the eigenvalues and corresponding eigenvectors of the covariance matrix;
    selecting, based on the ranking of the eigenvalues, a number of corresponding eigenvectors to form an eigenvector matrix;
    reducing the dimensionality of the characteristic data based on the eigenvector matrix to generate the training sample set.
  6. The method for identifying polyethylene terephthalate recycled material according to claim 4, characterized in that the Gini coefficient is calculated as:
    Figure PCTCN2023071537-appb-100001
    where Gini(D, Ai) is the Gini coefficient of feature data Ai in the training sample set D; D1 is the data set formed by the feature data in D whose split value is not greater than the target split value; D2 is the data set formed by the feature data in D whose split value is not less than the target split value; Gini(D1) is the Gini coefficient of data set D1; and Gini(D2) is the Gini coefficient of data set D2.
  7. The method for identifying polyethylene terephthalate recycled material according to claim 4, characterized in that generating child nodes of the decision tree from the left node data set and the right node data set until no further child nodes can be generated, to obtain the original decision tree, comprises:
    generating new child nodes by taking the left node data set and the right node data set as the next root node data;
    taking the child node data set corresponding to each new child node as the next root node data set and continuing to generate new child nodes until no further child nodes can be generated, to obtain the original decision tree.
  8. The method for identifying polyethylene terephthalate recycled material according to claim 4, characterized in that pruning the original decision tree to obtain the PET recycled material identification model comprises:
    optimizing the original decision tree based on a preset loss function until the original decision tree reaches the preset convergence condition, to obtain the PET recycled material identification model, wherein the preset loss function is:
    $C_\alpha(T_t) = C(T_t) + \alpha\,|T_t|$;
    where $T_t$ denotes any node in the original decision tree, $C(T_t)$ denotes the prediction error of node $T_t$, $|T_t|$ denotes the number of nodes, and $\alpha$ is a preset training parameter.
  9. The method for identifying polyethylene terephthalate recycled material according to claim 1, characterized in that identifying the PET sample to be identified by using the PET recycled material identification model, and determining the recycled material status of the PET sample to be identified, comprises:
    performing sample analysis on the PET sample to be identified to obtain target characteristic data of the PET sample to be identified;
    inputting the target characteristic data into the PET recycled material identification model and outputting a decision value;
    determining that the PET sample to be identified contains recycled material if the decision value falls within a preset decision value interval.
  10. An identification system for polyethylene terephthalate recycled material, characterized by comprising:
    an acquisition module, configured to acquire PET samples, the PET samples including PET recycled material samples and PET virgin material samples;
    an analysis module, configured to perform sample analysis on the PET samples to obtain characteristic data of the PET samples, the characteristic data including but not limited to chain extender data and depolymerization product data;
    a modeling module, configured to train a preset decision tree model with the characteristic data until the preset decision tree model reaches a preset convergence condition, to obtain a PET recycled material identification model;
    an identification module, configured to identify a PET sample to be identified by using the PET recycled material identification model, and to determine the recycled material status of the PET sample to be identified.
PCT/CN2023/071537 2022-12-21 2023-01-10 Identification method and system for polyethylene terephthalate recycled material WO2024130805A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211646074.3A CN115616204A (en) 2022-12-21 2022-12-21 Method and system for identifying polyethylene terephthalate reclaimed materials
CN202211646074.3 2022-12-21

Publications (1)

Publication Number Publication Date
WO2024130805A1 true WO2024130805A1 (en) 2024-06-27

Family

ID=84880443

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/071537 WO2024130805A1 (en) 2022-12-21 2023-01-10 Identification method and system for polyethylene terephthalate recycled material

Country Status (2)

Country Link
CN (1) CN115616204A (en)
WO (1) WO2024130805A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7352758B1 (en) 2023-03-08 2023-09-28 住友化学株式会社 Composition proposal system

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19609916A1 (en) * 1996-03-14 1997-09-18 Robert Prof Dr Ing Massen Optical process for identifying materials, especially recycled plastics
CN109870558A (en) * 2017-12-04 2019-06-11 金发科技股份有限公司 A kind of discrimination method of polycarbonate plastic reworked material
US20230314314A1 (en) * 2020-05-11 2023-10-05 Woods Hole Oceanographic Institution Optical system and method to identify plastic
CN112070131A (en) * 2020-08-25 2020-12-11 天津大学 Intrusion detection method based on partial deep learning theory
CN112831017B (en) * 2021-01-05 2022-04-01 美瑞新材料股份有限公司 Method for preparing PLA-PPC-PU copolymer alloy by using PLA reclaimed materials, product and application thereof
WO2022170262A1 (en) * 2021-02-08 2022-08-11 Sortera Alloys, Inc. Sorting of plastics
CN115356230A (en) * 2022-03-28 2022-11-18 广州海关技术中心 Method for identifying polyvinyl chloride reclaimed material
CN114965973A (en) * 2022-05-12 2022-08-30 知里科技(广东)有限公司 Method for identifying recycled plastic based on instrument detection and analysis technology combined with multiple chemometrics methods and/or machine learning algorithm
CN114923869A (en) * 2022-05-12 2022-08-19 知里科技(广东)有限公司 Method for identifying recycled plastic based on combination of spectroscopy, thermal analysis and data fusion strategy and chemometric method
CN115147092A (en) * 2022-07-29 2022-10-04 京东科技信息技术有限公司 Resource approval method and training method and device of random forest model

Also Published As

Publication number Publication date
CN115616204A (en) 2023-01-17

Similar Documents

Publication Publication Date Title
WO2024130805A1 (en) Identification method and system for polyethylene terephthalate recycled material
Tarrío-Saavedra et al. Functional nonparametric classification of wood species from thermal data
Cozzolino et al. Varietal differentiation of grape juice based on the analysis of near-and mid-infrared spectral data
US11630057B2 (en) Deformulation techniques for deducing the composition of a material from a spectrogram
CN108760789A (en) A kind of crude oil fast evaluation method
JP4581039B2 (en) Grade identification method for polymer materials
US11929153B2 (en) Chemometric characterization of refinery hydrocarbon streams
CN114965973A (en) Method for identifying recycled plastic based on instrument detection and analysis technology combined with multiple chemometrics methods and/or machine learning algorithm
CN114202645A (en) Plastic near infrared spectrum classification and identification precision verification method
Lucchi et al. Tire classification by elemental signatures using laser-induced breakdown spectroscopy
Chen et al. Spectroscopic identification of environmental microplastics
CN115406852A (en) Fabric fiber component qualitative method based on multi-label convolutional neural network
CN114971259A (en) Method for analyzing quality consistency of formula product by using near infrared spectrum
CN113567417A (en) Method for identifying peanut oil production place based on Raman spectrum fingerprint analysis technology
Wilcken et al. Quality control of paints: Pyrolysis-mass spectrometry and chemometrics
Li et al. Genetic algorithms (GAs) and evolutionary strategy to optimize electronic nose sensor selection
WO2024130803A1 (en) Method for identifying recycled polypropylene material
WO2024011687A1 (en) Method and apparatus for establishing oil product physical property fast evaluation model
Musu et al. Quantitative Evaluation for the Peak Selectionin the Raman Spectroscopy Classification of Plastics Based on the Support Vector Machine
WO2024130802A1 (en) Method and system for identifying high-impact polystyrene recycled material
Rammal et al. Features' selection based on weighted distance minimization, application to biodegradation process evaluation
CN118032708A (en) Method for rapidly detecting authenticity of degradable plastic by near infrared technology
CN109406421B (en) Method for predicting ferulic acid content in wolfberry fruit based on hyperspectral imaging technology
US11990327B2 (en) Method, system and program for processing mass spectrometry data
CN109406419B (en) Method for predicting content of p-hydroxybenzoic acid in wolfberry based on hyperspectral imaging technology