CN117708719A - Near infrared hyperspectral plastic sorting system and method based on band quantity constraint - Google Patents

Info

Publication number
CN117708719A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311700390.9A
Other languages
Chinese (zh)
Other versions
CN117708719B (en)
Inventor
袁琨
王坚
吴咏薇
程彦
张洋
王洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Caipu Technology Zhejiang Co ltd
Original Assignee
Caipu Technology Zhejiang Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Caipu Technology Zhejiang Co ltd
Priority to CN202311700390.9A
Publication of CN117708719A
Application granted
Publication of CN117708719B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02W: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO WASTEWATER TREATMENT OR WASTE MANAGEMENT
    • Y02W30/00: Technologies for solid waste management
    • Y02W30/50: Reuse, recycling or recovery technologies
    • Y02W30/62: Plastics recycling; Rubber recycling

Landscapes

  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention discloses a near infrared hyperspectral plastic sorting system and method based on band quantity constraint. A waste plastic sample library is constructed by collecting waste plastic samples and their hyperspectral data, the target plastics are adaptively marked one by one with a cosine similarity algorithm, and classification is performed with the PLS-DA method. In addition, the SPASA algorithm, which constrains the range of the number of characteristic wavelengths, is used to limit the number of extracted characteristic variables: within the limited range, 8 characteristic wavelengths are optimally selected from plastic near infrared full-spectrum data with 256 wavelengths. When the resulting SPASA-PLS-DA model is applied to plastic sorting, the time required to process one frame of data improves several-fold while sorting accuracy is maintained, which further increases the plastic sorting speed.

Description

Near infrared hyperspectral plastic sorting system and method based on band quantity constraint
Technical Field
The invention belongs to the technical field of near infrared spectrum analysis, and particularly relates to a near infrared hyperspectral plastic sorting system and method based on band quantity constraint.
Background
With the wide application of plastic products, pollution from waste plastics has become a major problem that urgently needs to be solved. A near infrared hyperspectral line-scan camera has favorable characteristics: it is insensitive to sample color, measures samples quickly, and can measure solid samples directly, rapidly and nondestructively, which makes it particularly suitable for waste plastic recycling systems. A near infrared hyperspectral imaging system uses a linear array near infrared hyperspectral camera to image objects on a conveyor belt, maps the spectral characteristics of the plastic samples one-to-one to pixel positions for visual classification, and, after an identification algorithm has analyzed the sampled data, controls a sorting mechanism to reject or grab the target objects.
In practical application, plastic fragments on the conveyor belt must be sorted in real time, which places high demands on the speed of model discrimination. For single plastic fragments with a size of at least 3 mm² and a conveyor belt speed of 3 m/s, detection and calculation for one frame of near infrared spectral data must be completed within 5 ms. When a linear array sensor samples the conveyor belt, the acquisition speed meets the requirement only if it reaches at least 1000 frames/s. Current linear array near infrared hyperspectral measurement systems fall short of this sampling speed, which limits the application of near infrared hyperspectral systems in plastic sorting.
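The relation between belt speed and line-scan rate can be checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the 235 and 1800 frames/s figures are the sensor rates quoted elsewhere in this document. It computes how far the belt travels between two consecutive line scans.

```python
def belt_travel_per_frame_mm(belt_speed_m_per_s, frames_per_s):
    """Distance the conveyor moves between two consecutive line scans, in mm."""
    return belt_speed_m_per_s * 1000.0 / frames_per_s


# Belt speed of 3 m/s as stated in the text; frame rates taken from this document.
for fps in (235, 1000, 1800):
    travel = belt_travel_per_frame_mm(3.0, fps)
    print(f"{fps} frames/s -> {travel:.2f} mm of belt travel per frame")
```

At 1000 frames/s the belt advances 3 mm per frame, so a lower frame rate leaves gaps between scan lines that are longer than the smallest fragments.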
In the prior art, one approach extracts characteristic wavelengths with a continuous projection algorithm coupled with mean impact values, sets and optimizes a range for the number of characteristic wavelengths, and finally obtains 37 characteristic wavelengths, from which a neural network model classifies corn seed vitality into five levels with a prediction accuracy of up to 99.1%. However, the number of characteristic wavelengths is relatively high, the computation is not run on a graphics processor, and the calculation time can still be optimized. In another approach, a genetic algorithm and a simulated annealing algorithm are used to reduce the computational load of a support vector machine: 20 key wavelengths are screened out, and waste plastic spectra are then classified and predicted after genetic algorithm optimization. However, the number of screened wavelengths is random, so the calculation time cannot be guaranteed. In a third approach, characteristic wavelengths are selected with a competitive adaptive reweighted sampling algorithm and a model is built with an SVM algorithm to classify four plastics (polyvinyl chloride, nylon, polypropylene and polystyrene) with an average accuracy of up to 96.67%. However, as many as 68 wavelengths are selected, so the amount of calculation cannot be reduced and the calculation speed cannot be improved. The above prior art shows that waste plastic sorting can be implemented in the field of near infrared spectral analysis, but the number of characteristic wavelengths has not been studied with the aim of improving calculation speed while ensuring model accuracy.
Disclosure of Invention
In order to overcome the defects of the prior art and to achieve high-speed near infrared spectrum acquisition and data analysis by constructing a characteristic wavelength optimization and classification model, the invention adopts the following technical scheme:
The near infrared hyperspectral plastic sorting system based on the wave band quantity constraint comprises a near infrared hyperspectral data acquisition module, a spectrum analysis module and a sorting control mechanism.
The near infrared hyperspectral data acquisition module is used for acquiring near infrared spectral data of plastic articles.
The spectrum analysis module screens wavelengths from the near infrared spectral data using a continuous projection algorithm constrained by the number of wave bands. A value range [M, N] is preset for the optimal number of wavelengths H, where M is the minimum number of wavelengths and N is the maximum number of wavelengths. When H = M, wavelength selection based on the continuous projection algorithm is performed for the first time to obtain a wavelength set within the range; by the time H = N, the selection has been performed N-M times, yielding N-M wavelength sets. PLS cross-validation is used to calculate, for each wavelength set, the minimum cross-validation root mean square error RMSECV_i (1 ≤ i ≤ N-M) and the corresponding wavelength combination; the smallest RMSECV_i and its wavelength combination are then selected, so that the result is optimal within the preset range. According to the bands and the number of bands of that wavelength combination, a partial least squares discriminant analysis (PLS-DA) model is constructed to classify and discriminate plastics. PLS-DA is a classification algorithm based on partial least squares regression, which performs regression and dimensionality reduction at the same time and is therefore suited to high-dimensional data such as spectral data. Because fewer wavelengths mean faster calculation, and the gain in speed outweighs the loss in accuracy, the correction coefficient is 1 when N_cur takes the minimum number of wavelengths M; when N_cur > M, ACC is corrected downward, which rewards solutions with fewer wavelengths. When the difference between M and N is too small, however, overcorrection may occur; therefore, when N-M > 100 and M > 2, the weight-corrected accuracy ACC is used as the model evaluation standard:
where N_cur denotes the current number of wavelengths.
The sorting control mechanism sorts the plastic articles according to the classification result of the spectrum analysis module.
Further, the near infrared hyperspectral data acquisition module applies region of interest (ROI) processing to the sensor: all row pixels are selected in the spatial dimension, and specific column pixels are selected in the spectral dimension to pick the required bands, which significantly increases the sampling speed of the sensor. For the InGaAs area array sensor IMX990, each selected range of column pixels must contain at least 8 columns; when 8 mutually non-overlapping ranges are selected, 1280×8×8 pixels are actually read out, and the sampling speed increases from 235 frames/second to 1800 frames/second.
Further, in the spectrum analysis module, the continuous projection algorithm operates on a spectrum matrix X_{n×m} with n samples and m wavelengths. Assuming that the required number of wavelengths is H, the procedure is as follows:
1. Arbitrarily select the j-th column x_j of the original spectrum matrix; when the iteration number t = 1, the corresponding x_j is denoted x_{k(1)}.
2. The set of column vector positions that have not been selected is denoted as:
3. Calculate the projection of x_j onto each of the remaining column vectors:
where I denotes the identity matrix and T denotes the matrix transpose.
4. Extract the spectral wavelength with the largest projection vector:
5. If t < H, let t = t + 1 and return to step 2 for the next iteration; if t = H, the loop ends. When the loop ends, the resulting wavelength set {x_{k(1)}, x_{k(2)}, ..., x_{k(H)}} is the characteristic wavelength set selected by the SPA.
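The formulas for steps 2 to 4 are not reproduced in this text. For orientation only, the commonly published form of the successive projections algorithm is sketched below; it is consistent with the mention of the identity matrix I and the transpose T, but it is an assumption that the patent uses exactly this notation.

```latex
% Step 2: positions of the columns that have not yet been selected
s = \{\, j : 1 \le j \le m,\ j \notin \{k(1), \dots, k(t)\} \,\}

% Step 3: projection of each remaining column onto the subspace
% orthogonal to the most recently selected column x_{k(t)}
P x_j = \left( I - \frac{x_{k(t)}\, x_{k(t)}^{T}}{x_{k(t)}^{T}\, x_{k(t)}} \right) x_j ,
\qquad j \in s

% Step 4: the next wavelength is the one with the largest projection norm
k(t+1) = \arg\max_{j \in s} \lVert P x_j \rVert
```

In this form each remaining column is replaced by its projection before the next iteration, so every newly selected wavelength is the one least collinear with those already chosen.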
Further, before the partial least squares discriminant analysis model performs classification, the near infrared spectral data are classified and marked by an unsupervised clustering algorithm based on the plastic sample library. A cosine similarity algorithm with a preset threshold is selected, and the target plastic samples are classified by adaptive marking:
where P and Q are any two different spectra.
When the cosine value is larger than the threshold, the two spectra involved in the calculation are judged to come from the same type of plastic sample, and the corresponding label matrix is calculated.
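The cosine similarity formula itself is not reproduced in this text. The standard definition for two spectra P and Q sampled at the same wavelengths is given below as a reference; the index i over wavelengths is notation introduced here.

```latex
\cos(P, Q) = \frac{P \cdot Q}{\lVert P \rVert \, \lVert Q \rVert}
           = \frac{\sum_{i} P_i\, Q_i}{\sqrt{\sum_{i} P_i^{2}}\ \sqrt{\sum_{i} Q_i^{2}}}
```

Because the value depends only on the angle between the two spectra, it is unaffected by an overall intensity scaling of either spectrum.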
Further, the partial least squares discriminant analysis model uses k-fold cross-validation to determine the number of latent variables used for the partial least squares regression. Specifically, the training set is randomly divided into k subsets; each time, one subset serves as the cross-validation set and the remaining k-1 subsets form the training set. The parameters are varied, training and prediction are repeated k times, and the model is evaluated by the average of the k prediction results on the cross-validation sets to find the best model parameters. The number of principal components is chosen as the one giving the highest scores on the training set and the validation set, subject to the validation score not exceeding the training score.
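The k-fold splitting described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name, the fixed seed, and the sample count of 126 are assumptions made for the example, and the model fitting step itself is omitted.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random division into subsets
    folds = [indices[i::k] for i in range(k)]   # k disjoint subsets
    for i in range(k):
        validation = folds[i]                   # one subset is held out
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


# Example with a hypothetical training set of 126 spectra and k = 10.
splits = list(k_fold_indices(126, 10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each candidate number of latent variables would be scored by averaging the model's results over the k held-out subsets.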
The near infrared hyperspectral plastic sorting method based on the wave band quantity constraint comprises the following steps:
Step S1: collecting near infrared spectral data of the plastic articles;
Step S2: spectral analysis. Wavelengths are screened from the near infrared spectral data using a continuous projection algorithm constrained by the number of wave bands, and a partial least squares discriminant analysis (PLS-DA) model is constructed from the screened bands and the number of bands to classify and discriminate plastics. PLS-DA is a classification algorithm based on partial least squares regression, which performs regression and dimensionality reduction at the same time and is therefore suited to high-dimensional data such as spectral data. The wavelength screening comprises the following steps:
Step S2.1.1: presetting a value range [M, N] for the optimal number of wavelengths H, where M is the minimum number of wavelengths and N is the maximum number of wavelengths;
Step S2.1.2: when H = M, performing wavelength selection based on the continuous projection algorithm for the first time to obtain a wavelength set within the range;
Step S2.1.3: by the time H = N, wavelength selection based on the continuous projection algorithm has been performed N-M times, yielding N-M wavelength sets. PLS cross-validation is used to calculate, for each wavelength set, the minimum cross-validation root mean square error RMSECV_i (1 ≤ i ≤ N-M) and the corresponding wavelength combination; the smallest RMSECV_i and its wavelength combination are selected, so that the result is optimal within the preset range, and the corresponding bands and number of bands are obtained from that wavelength combination. Because fewer wavelengths mean faster calculation, and the gain in speed outweighs the loss in accuracy, the correction coefficient is 1 when N_cur takes the minimum number of wavelengths M; when N_cur > M, ACC is corrected downward, which rewards solutions with fewer wavelengths. When the difference between M and N is too small, however, overcorrection may occur; therefore, when N-M > 100 and M > 2, the weight-corrected accuracy ACC is used as the model evaluation standard:
where N_cur denotes the current number of wavelengths;
Step S3: sorting the plastic articles according to the classification result.
Further, in the near infrared hyperspectral data collection of step S1, region of interest (ROI) processing is applied to the sensor: all row pixels are selected in the spatial dimension, and specific column pixels are selected in the spectral dimension to pick the required bands, which significantly increases the sampling speed of the sensor. For the InGaAs area array sensor IMX990, each selected range of column pixels must contain at least 8 columns; when 8 mutually non-overlapping ranges are selected, 1280×8×8 pixels are actually read out, and the sampling speed increases from 235 frames/second to 1800 frames/second.
Further, in the spectral analysis of step S2, the continuous projection algorithm operates on a spectrum matrix X_{n×m} with n samples and m wavelengths. Assuming that the required number of wavelengths is H, the procedure is as follows:
Step S2.2.1: arbitrarily select the j-th column x_j of the original spectrum matrix; when the iteration number t = 1, the corresponding x_j is denoted x_{k(1)}.
Step S2.2.2: the set of column vector positions that have not been selected is denoted as:
Step S2.2.3: calculate the projection of x_j onto each of the remaining column vectors:
where I denotes the identity matrix and T denotes the matrix transpose;
Step S2.2.4: extract the spectral wavelength with the largest projection vector:
Step S2.2.5: if t < H, let t = t + 1 and return to step S2.2.2 for the next iteration; if t = H, the loop ends. When the loop ends, the resulting wavelength set {x_{k(1)}, x_{k(2)}, ..., x_{k(H)}} is the characteristic wavelength set selected by the SPA.
Further, in step S2, before the partial least squares discriminant analysis model performs classification, the near infrared spectral data are classified and marked by an unsupervised clustering algorithm based on the plastic sample library. A cosine similarity algorithm with a preset threshold is selected, and the target plastic samples are classified by adaptive marking:
where P and Q are any two different spectra.
When the cosine value is larger than the threshold, the two spectra involved in the calculation are judged to come from the same type of plastic sample, and the corresponding label matrix is calculated.
Further, the partial least squares discriminant analysis model in step S2 uses k-fold cross-validation to determine the number of latent variables used for the partial least squares regression. Specifically, the training set is randomly divided into k subsets; each time, one subset serves as the cross-validation set and the remaining k-1 subsets form the training set. The parameters are varied, training and prediction are repeated k times, and the model is evaluated by the average of the k prediction results on the cross-validation sets to find the best model parameters. The number of principal components is chosen as the one giving the highest scores on the training set and the validation set, subject to the validation score not exceeding the training score.
The invention has the advantages that:
In the near infrared hyperspectral plastic sorting system and method based on the band quantity constraint, ROI processing is applied to the hyperspectral acquisition module, which increases the sampling speed of the sensor. The number of wavelengths is then constrained: waste plastics from daily life are collected, their near infrared spectra are analyzed, a hyperspectral sample library covering multiple plastic types is established, spectral data are collected, and the optimized SPA algorithm is used to limit the number of wavelengths and screen out a wavelength combination, which greatly reduces the amount of spectral data to be processed and increases the operating speed of the system. In addition, a cosine similarity algorithm is selected, a label matrix is created by adaptively labeling the target plastic samples, and a SPASA-PLS-DA model is built for plastic classification. Comparing the model of the invention with a full-spectrum PLS-DA model and an SPA-PLS-DA model, all three models reach a classification accuracy above 97% on the various plastics. When the three models are applied in a sorting system to classify ABS plastic and the overall speed is tested and compared, the system using the SPASA-PLS-DA model is markedly faster than the systems using the other models, with a sorting accuracy of 100% at 2 ms per frame.
Drawings
FIG. 1 is a schematic diagram of a system in an embodiment of the invention.
FIG. 2 is a graph of the average spectra of 12 plastics in an example of the present invention.
FIG. 3a is a graph showing the variation of RMSECV with the preset number of variables in the SPASA algorithm according to the embodiment of the present invention.
FIG. 3b is a graph of the characteristic wavelength selection results of the SPASA algorithm in an embodiment of the present invention.
Fig. 4a is a schematic diagram of an adaptive plastic article marking based on a cosine similarity algorithm according to an embodiment of the present invention.
Fig. 4b is a schematic diagram of separation of an adaptive plastic article based on a cosine similarity algorithm according to an embodiment of the present invention.
FIG. 5 is a schematic representation of the number of principal components of different scores based on cross-validation in an embodiment of the invention.
FIG. 6 is a graph comparing the spectrum of PP, HDPE, TPE, POM, PLA in the example of the present invention.
FIG. 7 is a graph comparing the spectrum of PP, PET, PBT in the example of the present invention.
FIG. 8 is a graph comparing the spectrum of PPO, PPS, ABS, SAN in the example of the present invention.
FIG. 9 is a flow chart of a method in an embodiment of the invention.
Detailed Description
The following describes specific embodiments of the present invention in detail with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the invention, are not intended to limit the invention.
As shown in fig. 1, the near infrared hyperspectral plastic sorting system based on band number constraint comprises a hyperspectral data acquisition module, a spectrum analysis module and a sorting control mechanism, wherein hyperspectral data acquisition and spectrum analysis play a decisive role in the overall speed of the system.
The hyperspectral data acquisition module is a linear array near infrared hyperspectral camera. Its working spectral range is 900-1700 nm, its spectral resolution is 5 nm, its spatial resolution is 1280 pixels, and it has 1024 spectral bands. A single acquisition yields spectral signals in 1024 bands within the 900-1700 nm range. The sensor is the InGaAs area array sensor IMX990, which reaches an imaging speed of 235 frames/second. Since the actual analysis does not need all 1024 bands, the number of bands actually used can be reduced: applying region of interest (ROI) processing to the sensor to select only the bands actually needed significantly increases its sampling speed. With recognition accuracy ensured, the ROI selects all row pixels in the spatial dimension and specific column pixels in the spectral dimension. When the IMX990 selects specific column pixels, each selected range must contain at least 8 columns. When 8 mutually non-overlapping ranges are selected, 1280×8×8 pixels are actually read out and the sampling speed can be increased to 1800 frames/second, which meets the actual sampling speed requirement of the system.
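The effect of the ROI on the amount of data read per frame follows directly from the figures above. The short sketch below only restates that arithmetic; the constant names are introduced here for illustration.

```python
# Sensor geometry and frame rates as quoted in the text.
SPATIAL_PIXELS = 1280        # row pixels, all kept by the ROI
SPECTRAL_BANDS_FULL = 1024   # column pixels in a full readout
COLUMNS_PER_RANGE = 8        # minimum width of one selected range
NUM_RANGES = 8               # non-overlapping ranges selected

full_pixels = SPATIAL_PIXELS * SPECTRAL_BANDS_FULL
roi_pixels = SPATIAL_PIXELS * COLUMNS_PER_RANGE * NUM_RANGES

readout_reduction = full_pixels / roi_pixels   # how much less data per frame
frame_rate_gain = 1800 / 235                   # reported speed-up

print(roi_pixels, readout_reduction, round(frame_rate_gain, 2))
```

The readout shrinks by a factor of 16 while the reported frame rate rises by a factor of about 7.7, so the speed gain is smaller than the reduction in pixel count.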
The spectrum analysis module analyzes the acquired near infrared spectral data of the sample on a computer and determines its type. In practical application, screening wavelengths with a wavelength selection algorithm further increases the speed; once a suitable number of bands and suitable spectral wavelengths have been selected, the PLS-DA algorithm is applied to build an analysis model that determines the plastic material from the acquired spectral data.
The continuous projection algorithm (Successive Projections Algorithm, SPA) is a wavelength selection algorithm widely used in spectral analysis. The algorithm works on the variables themselves, reducing dimensionality by variable selection, and uses vector projection analysis to find the variable combination that contains the least redundant information. For a spectrum matrix X_{n×m} with n samples and m wavelengths, assuming that the required number of wavelengths is H, SPA is performed as follows:
1. When the iteration number t = 1, one column (the j-th column) of the original spectrum matrix is selected arbitrarily, denoted x_j, and recorded as x_{k(1)}.
2. The set of column vector positions that have not been selected is denoted s.
3. The projection of x_j onto each of the remaining column vectors is calculated:
where I is the identity matrix.
4. The spectral wavelength with the largest projection vector is extracted.
5. If t < H, let t = t + 1 and return to step 2 for the next iteration; if t = H, the loop terminates.
When the loop terminates, the resulting wavelength set {x_{k(1)}, x_{k(2)}, ..., x_{k(H)}} is the characteristic wavelength set selected by SPA. Although the target number of wavelengths H is set in the algorithm, the fixed number of characteristic wavelengths obtained in this way is not guaranteed to be the optimal solution.
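The SPA loop above can be sketched in a few lines of plain Python. This is an illustrative implementation of the commonly published form of the algorithm, not code from the patent: the function names, the column-list data layout, and the toy data are assumptions, and the sketch assumes the selected columns never have zero norm.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def spa(columns, start, n_select):
    """Select n_select column indices by successive projections.

    columns: list of m column vectors (one per wavelength), each of length n.
    start:   index of the arbitrarily chosen first column.
    """
    residual = [list(c) for c in columns]   # working copy, updated in place
    selected = [start]
    while len(selected) < n_select:
        ref = residual[selected[-1]]        # most recently selected column
        ref_norm2 = dot(ref, ref)
        best_index, best_norm2 = None, -1.0
        for j, col in enumerate(residual):
            if j in selected:
                continue
            # Project column j onto the subspace orthogonal to ref.
            coef = dot(col, ref) / ref_norm2
            projected = [a - coef * b for a, b in zip(col, ref)]
            residual[j] = projected
            norm2 = dot(projected, projected)
            if norm2 > best_norm2:
                best_index, best_norm2 = j, norm2
        selected.append(best_index)         # largest projection wins
    return selected


# Toy data: column 1 is nearly collinear with column 0 and is picked last.
columns = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
selected = spa(columns, start=0, n_select=3)
print(selected)
```

Because nearly collinear columns shrink to short projections, they are chosen late or not at all, which is how the algorithm keeps redundancy low.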
For the practical problem of plastic sorting, the number of characteristic wavelengths meets the requirement as long as it is below a specific number. Moreover, to increase the sampling speed, a scheme with fewer wavelengths has a clear advantage in both acquisition and computation speed. To ensure that an optimal solution is obtained with fewer than a given number of characteristic wavelengths, the SPA is modified into an SPA based on the constraint of the number of bands (SPASA), which finds the optimal number of wavelengths H and the optimal wavelength set {x_{k(1)}, x_{k(2)}, ..., x_{k(H)}} below that number. The SPASA algorithm modifies the SPA algorithm as follows:
1. When H = M, the SPA wavelength selection described above is performed for the first time.
2. By the time H = N, the SPA wavelength selection has been performed (N-M) times, yielding (N-M) wavelength sets. PLS cross-validation is used to calculate, for each wavelength set, the minimum cross-validation root mean square error RMSECV_i (1 ≤ i ≤ N-M) and the corresponding wavelength combination. The smallest RMSECV_i and its wavelength combination are selected, so that the result is optimal within the preset range.
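The band-number-constrained search can be sketched as a loop over the allowed values of H. In this minimal illustration `select_wavelengths` and `rmsecv_of` are hypothetical callables standing in for the SPA selection and the PLS cross-validation, the toy error values are invented purely to exercise the loop, and evaluating every H from M to N inclusive is an assumption about how the range is traversed.

```python
def constrained_search(m_min, n_max, select_wavelengths, rmsecv_of):
    """Return (rmsecv, H, wavelengths) with the lowest RMSECV for H in [M, N]."""
    best = None
    for h in range(m_min, n_max + 1):
        wavelengths = select_wavelengths(h)   # e.g. SPA with H = h
        error = rmsecv_of(wavelengths)        # e.g. PLS cross-validation error
        if best is None or error < best[0]:
            best = (error, h, wavelengths)
    return best


# Toy stand-ins (not measured data): the error depends only on the set size.
toy_errors = {2: 0.20, 3: 0.05, 4: 0.07}
result = constrained_search(
    2, 4,
    select_wavelengths=lambda h: list(range(h)),
    rmsecv_of=lambda ws: toy_errors[len(ws)],
)
print(result)
```

Keeping the two steps as injected callables lets the same loop be reused with any wavelength selector or error estimate.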
In the embodiment of the present invention, the samples used are waste plastic fragments of 12 different polymers and structural components: PP, PET, HDPE, TPE, PLA, PBT, TPU, POM, PPO, PPS, ABS and SAN. Sources of the waste plastics include household items, medical instrument accessories and automotive parts, as shown in Table 1.
Table 1. Waste plastic sample information
Hyperspectral images of all plastic samples are taken by type, and a sample library is established for the subsequent PLS-DA modeling. After the hyperspectral images have been taken, spectral curves are extracted from each sample area, giving 180 groups of hyperspectral data. The spectral data are averaged by type and plotted as spectral curves, as shown in FIG. 2. These data are used in the SPASA algorithm to screen wavelengths and determine the optimal number of wavelengths.
As can be seen from FIG. 2, most plastic samples have absorption peaks around 1200 nm, 1400 nm and 1650 nm, which may be, respectively, the second overtone of the C-H bond, the combination band of the C-H bond, and the first overtone of the C-H stretching vibration. The near infrared spectrum of SAN plastic differs markedly from those of the other 11 plastics: it has noticeably fewer absorption peaks in the 900-1700 nm range, and the most significant difference is that SAN has no absorption peak around 1400 nm.
The SPASA algorithm is used to select wavelength variables from the 180 groups of spectral data, with the range of H set to [0, 20] and the samples randomly split into a training set and a test set at a ratio of 7:3. FIG. 3a plots RMSECV (ordinate) against the number of selected variables (abscissa) as H varies. RMSECV decreases gradually as the number of selected variables increases and reaches its minimum of 0.0292 when 8 variables are selected; as shown in FIG. 3b, the selected bands are 203, 244, 93, 231, 249, 22, 68 and 252. As the number of variables increases further, RMSECV levels off.
A partial least squares discriminant analysis (PLS-DA) model is then constructed from the selected bands and the number of bands. PLS-DA is a classification algorithm based on partial least squares regression, which performs regression and dimensionality reduction at the same time and is therefore suited to high-dimensional data such as spectral data.
The invention carries out classification and discrimination through PLS regression, and assigns a value of 1 or 0 to a label matrix Y to represent the category of each sample. Y has 12 columns, each column corresponding to a plastic, and the Y value of the class 1 training set is (1,0,0,0,0,0,0,0,0,0,0,0); if the class 2, Y is assigned (0,1,0,0,0,0,0,0,0,0,0,0), and the like, then a prediction function between the spectrum data X and the expression class Y is established for classification discrimination. Therefore, before PLS-DA algorithm modeling is carried out, the collected spectrum data is required to be classified and marked through an unsupervised clustering algorithm, a cosine similarity algorithm is selected to set a threshold value, a target plastic sample is classified through an adaptive marking mode, and if the cosine value is larger than the threshold value, two spectrum data which participate in calculation are judged to be derived from the same type of plastic sample, so that a corresponding label matrix Y is calculated, and then the label matrix Y is used for PLS-DA modeling.
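The cosine value is the standard cosine similarity of two spectra:
$$\cos(P, Q) = \frac{P \cdot Q}{\lVert P \rVert \, \lVert Q \rVert}$$
where P and Q are any two different spectra.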
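The labeling step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes that one reference spectrum per plastic is available from the sample library, and the function names `cosine_similarity`, `label_spectra` and `one_hot`, as well as the convention of marking unmatched spectra with -1, are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine of the angle between two spectra (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def label_spectra(spectra: Sequence[Sequence[float]],
                  references: Sequence[Sequence[float]],
                  threshold: float) -> List[int]:
    """Assign each spectrum the index of its most similar reference spectrum.

    Assumption: a spectrum whose best cosine value does not exceed the
    threshold is left unlabeled and marked with -1.
    """
    labels = []
    for spectrum in spectra:
        sims = [cosine_similarity(spectrum, ref) for ref in references]
        best = max(range(len(references)), key=sims.__getitem__)
        labels.append(best if sims[best] > threshold else -1)
    return labels


def one_hot(labels: Sequence[int], n_classes: int) -> List[List[int]]:
    """Build the 0/1 label matrix Y; unlabeled samples (-1) give an all-zero row."""
    return [[1 if k == lab else 0 for k in range(n_classes)] for lab in labels]
```

With 12 plastics, `one_hot(labels, 12)` yields rows of the form (1,0,...,0), (0,1,...,0) and so on, matching the layout of Y described in the text.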
The hyperspectral images of the target plastic samples all come from the established sample library. As shown in FIG. 4a, the plastic sample on the left is ABS and the one on the right is PP, and the target sample PP is adaptively labeled on the original image. In FIG. 4b, the target plastic, the other plastics and the background are each classified as a whole by means of a mask. Proceeding in the same way, a label database of the 12 plastic samples is established.
The number of latent variables used for PLS regression in the PLS-DA classification model is determined by k-fold cross-validation. Specifically, the training set is randomly divided into k subsets; each time, one subset serves as the cross-validation set and the remaining k-1 subsets form the training set. The parameters are varied, the training and prediction process is repeated k times, and the model is evaluated on the average of the k prediction results on the cross-validation set to find the optimal model parameters. In the embodiment of the invention, k is 10, that is, 10-fold cross-validation is performed. The number of principal components giving the highest scores on the training set and the validation set is selected, subject to the validation score not exceeding the training score.
The sample data are randomly divided into a training set and a validation set at a ratio of 7:3, and the weight-corrected accuracy (ACC) is used as the model evaluation criterion, calculated by the formula:
where the preset range of H is [M, N], M is the minimum number of wavelengths, N is the maximum number of wavelengths, and N_cur is the current number of wavelengths.
In practical use, the fewer the wavelengths, the faster the calculation, and the larger the gain relative to the loss of accuracy. When N_cur equals the minimum number of wavelengths M, the correction coefficient is 1; when N_cur > M, the ACC is scaled down, so that results obtained with fewer wavelengths are rewarded. However, when the difference between M and N is too small, overcorrection may occur. In practical tests, the method is suitable when N-M > 100 and M is greater than 2.
As shown in fig. 5, the maximum ACC value is obtained when n=4 is taken.
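The k-fold selection procedure described above can be sketched as follows. This is an illustrative sketch only: `k_fold_indices` and `pick_components` are hypothetical names, and `evaluate` is an assumed caller-supplied function that would train a PLS-DA model with a given number of components on the training indices and return a (training score, validation score) pair. The rule "highest validation score among candidates whose validation score does not exceed the training score" is one reading of the selection principle stated in the text.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Split = Tuple[List[int], List[int]]


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[Split]:
    """Randomly partition sample indices into k (train, validation) splits."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        # The remaining k-1 folds together form the training set.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, folds[i]))
    return splits


def pick_components(
    candidates: Sequence[int],
    n_samples: int,
    k: int,
    evaluate: Callable[[int, List[int], List[int]], Tuple[float, float]],
) -> Optional[Tuple[int, float]]:
    """Return (n_components, mean validation score) of the best candidate.

    Candidates whose mean validation score exceeds the mean training score
    are skipped, following the constraint described in the text.
    """
    splits = k_fold_indices(n_samples, k)
    best = None
    for n_comp in candidates:
        results = [evaluate(n_comp, train, val) for train, val in splits]
        train_score = sum(r[0] for r in results) / k
        val_score = sum(r[1] for r in results) / k
        if val_score <= train_score and (best is None or val_score > best[1]):
            best = (n_comp, val_score)
    return best
```

In the embodiment k is 10, so `pick_components(range(1, 21), n_samples, 10, evaluate)` would scan 1 to 20 components; the upper limit of 20 is an arbitrary example value.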
To compare the near-infrared spectra of the plastics, the 12 types of plastics are roughly divided into three classes according to differences in the number of absorption peaks and the characteristic wavelengths of their spectral curves, and the plausibility of the PLS-DA classification results is analyzed. The first class comprises PP, HDPE, TPE, POM and PLA. As shown in FIG. 6, their spectral curves have essentially the same number of absorption peaks in the 900-1700 nm range and broadly similar line shapes, but the absorption peaks near 1200 nm and 1400 nm are shifted to different degrees, so these plastics can be distinguished by their characteristic wavelengths, which is also borne out by the PLS-DA classification results.
The second class comprises PET and PBT, whose spectral curves are compared in FIG. 7. Under the same light source they show different reflection intensities: PET reflects more strongly and PBT more weakly. The two curves are very similar in shape and have the same number of absorption peaks in the 900-1700 nm range, but the absorption peak near 1200 nm is clearly shifted, with the peak near 1198 nm displaced toward shorter wavelengths. The two plastics can be distinguished by this feature, which indirectly corroborates the accuracy of the algorithm.
The third class comprises PPO, PPS, ABS and SAN, whose spectral curves are compared in FIG. 8. SAN has one fewer absorption peak near 1400 nm and can therefore be distinguished from PPO, ABS and PPS. PPO, PPS and ABS are harder to tell apart, but the absorption peaks near 1200 nm and 1400 nm are shifted relative to the other two, so they can be distinguished by their characteristic wavelengths. The near-infrared spectra of ABS and PPS are very similar: the number of absorption peaks in the 900-1700 nm range is identical, the peak positions are very close, and the absorption peaks near 1150 nm and 1200 nm overlap, making the two difficult to distinguish, which is also reflected in the PLS-DA classification results.
The accuracy of the full-spectrum PLS-DA model, the SPA-PLS-DA model and the SPASA-PLS-DA classification model is verified separately. The original data set is randomly divided at a ratio of 7:3 into a training set and a test set, the classification models are built with the same parameters, and the prediction results are reported as ACC values and average calculation times, as shown in Table 2.
Table 2 evaluation of effect of randomly dividing each classification model
After ROI processing on the sensor, the three models were applied to the sorting system for an overall test. A computer equipped with an NVIDIA Tesla M40 graphics card, a 10th-generation i7 CPU and 16 GB of memory can process 500 frames of data per second. At this speed, the sorting system discriminates the target ABS plastic from other plastics; the separation accuracy is compared and the computation time required to process one frame of data is recorded, as shown in Table 3.
TABLE 3 Overall test time for each model
Tables 2 and 3 together show that performing ROI processing on the sensor reduces the number of spectral bands to be processed and that, combined with the wavelength-count constraint imposed on the spectral data by the SPASA algorithm, this markedly reduces the computation time: one frame of data is processed within 2 ms without a significant loss of classification accuracy, and the target ABS plastic is accurately separated during real-time sorting.
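As a rough consistency check on these figures (a worked conversion, not an additional measurement), a throughput of 500 frames per second corresponds to a time budget per frame of

$$t_{\text{frame}} = \frac{1\ \text{s}}{500} = 2\ \text{ms},$$

so a model whose per-frame computation finishes within 2 ms can keep pace with that frame rate.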
The sorting control mechanism comprises a conveyor belt, a nozzle array, an air pump and a target plastic bin. A near-infrared hyperspectral camera is arranged at one end of the conveyor belt to image the plastic on the belt, and the nozzle array is arranged at the other end. According to the classification result, the sorting control mechanism issues instructions to control the nozzle array and the air pump so that the air nozzles blow at the proper moment and the target plastic is blown from the conveyor belt into the designated bin, thereby removing the specific material.
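How the "proper moment" might be derived can be illustrated with a short sketch. The text does not specify the timing logic, so everything here is an assumption: a constant belt speed, a known camera-to-nozzle distance, image columns that span the belt width, and evenly spaced nozzles. The function name `blow_schedule` and all of its parameters are hypothetical.

```python
from typing import Tuple


def blow_schedule(
    detect_time_s: float,
    camera_to_nozzle_m: float,
    belt_speed_m_s: float,
    pixel_col: int,
    image_width_px: int,
    n_nozzles: int,
) -> Tuple[float, int]:
    """Return (time at which to blow, index of the nozzle to fire).

    Assumes the object keeps its lateral position on the belt and travels
    from the camera line to the nozzle array at constant speed.
    """
    if belt_speed_m_s <= 0:
        raise ValueError("belt speed must be positive")
    travel_time_s = camera_to_nozzle_m / belt_speed_m_s
    # Map the detected image column to one of the evenly spaced nozzles.
    nozzle = min(n_nozzles - 1, pixel_col * n_nozzles // image_width_px)
    return detect_time_s + travel_time_s, nozzle
```

For example, with a 1 m camera-to-nozzle distance and a belt moving at 2 m/s, an object detected at t = 0 in the middle column of a 640-pixel-wide image would be scheduled for nozzle 8 of 16 at t = 0.5 s.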
As shown in fig. 9, the near infrared hyperspectral plastic sorting method based on band number constraint comprises the following steps:
step S1: collecting near infrared spectrum data of the plastic articles;
step S2: spectral analysis: screening wavelengths from the near infrared spectrum data with a continuous projection algorithm constrained by the number of wave bands, and constructing a partial least squares discriminant analysis model from the screened wave bands and the number of wave bands to classify and discriminate the plastics, wherein the model is a classification algorithm based on partial least squares regression, which reduces the dimensionality of the data while performing regression and is therefore suited to high-dimensional data such as spectrum data, and wherein the wavelength screening comprises the following steps:
step S2.1.1: presetting a value range [ M, N ] of the optimal wavelength number H, wherein M is the minimum wavelength number, and N is the maximum wavelength number;
step S2.1.2: when H = M, performing the first wavelength selection based on the continuous projection algorithm to obtain a wavelength set within a certain range;
step S2.1.3: when H = N, N-M wavelength selections based on the continuous projection algorithm are performed to obtain N-M sets of wavelengths; the minimum cross-validation root mean square error RMSECV_i (1 ≤ i ≤ N-M) of each set and the corresponding wavelength combination are calculated using PLS cross-validation, and the smallest RMSECV_i and its wavelength combination are selected, so that the result is optimal within the given range; obtaining the corresponding wave bands and the number of wave bands according to the wavelength combination;
step S3: and sorting the plastic articles according to the sorting and distinguishing results.
This part of the implementation is similar to the implementation of the system embodiment described above, and will not be repeated here.
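The wavelength screening of steps S2.1.1 to S2.1.3 can be sketched in plain Python. This is a hedged illustration rather than the patented implementation: `spa_select` follows the usual successive projections algorithm (SPA, rendered in this text as the continuous projection algorithm), `sweep_band_count` and the `score` callback are hypothetical names, and `score` stands in for the RMSECV that PLS cross-validation would return for a given band set. Including both endpoints M and N in the sweep is also an assumption, since the text counts N-M selections.

```python
from typing import Callable, List, Sequence, Tuple


def spa_select(columns: Sequence[Sequence[float]], start: int, h: int) -> List[int]:
    """Pick h column indices by successive projections, starting from `start`.

    `columns[j]` is the j-th wavelength column of the spectral matrix.
    """
    if not 1 <= h <= len(columns):
        raise ValueError("h must be between 1 and the number of columns")
    work = [list(c) for c in columns]
    selected = [start]
    while len(selected) < h:
        ref = work[selected[-1]]
        ref_sq = sum(v * v for v in ref)
        best_j, best_norm = -1, -1.0
        for j, col in enumerate(work):
            if j in selected:
                continue
            if ref_sq > 0:
                # Project onto the subspace orthogonal to the last selected column.
                coef = sum(a * b for a, b in zip(col, ref)) / ref_sq
                work[j] = [a - coef * b for a, b in zip(col, ref)]
            norm = sum(v * v for v in work[j])
            if norm > best_norm:
                best_j, best_norm = j, norm
        selected.append(best_j)
    return selected


def sweep_band_count(
    columns: Sequence[Sequence[float]],
    start: int,
    m: int,
    n: int,
    score: Callable[[List[int]], float],
) -> Tuple[float, List[int]]:
    """Try every band count H in [m, n] and keep the band set with the lowest score."""
    best = None
    for h in range(m, n + 1):
        bands = spa_select(columns, start, h)
        err = score(bands)
        if best is None or err < best[0]:
            best = (err, bands)
    return best
```

Because the selection is greedy from a fixed starting column, the band set for H is a prefix of the set for H+1, so a production version could reuse earlier projections instead of restarting for each H.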
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced with equivalents; such modifications and substitutions do not depart from the spirit of the technical solutions according to the embodiments of the present invention.

Claims (10)

1. A near-infrared hyperspectral plastic sorting system based on band number constraint, comprising a near-infrared hyperspectral data acquisition module, a spectral analysis module and a sorting control mechanism, characterized in that:
the near infrared hyperspectral data acquisition module is used for acquiring near infrared spectral data of plastic articles;
the spectrum analysis module screens wavelengths according to near infrared spectrum data based on a continuous projection algorithm constrained by the number of wave bands, and presets a value range [M, N] of the optimal number of wavelengths H, M being the minimum number of wavelengths and N the maximum number of wavelengths; when H = M, the first wavelength selection based on the continuous projection algorithm is performed; when H = N, N-M wavelength selections based on the continuous projection algorithm are performed to obtain N-M sets of wavelengths; the minimum cross-validation root mean square error RMSECV_i (1 ≤ i ≤ N-M) of each set and the corresponding wavelength combination are calculated using cross-validation, and the smallest RMSECV_i and its wavelength combination are selected; a partial least squares discriminant analysis model is constructed according to the wave bands and the number of wave bands of the wavelength combination to carry out plastic classification and discrimination; when N-M > 100 and M is greater than 2, the weight-corrected accuracy ACC is taken as the model evaluation standard:
wherein N_cur represents the current number of wavelengths;
and the sorting control mechanism sorts the plastic articles according to the sorting discrimination result of the spectrum analysis module.
2. The near infrared hyperspectral plastic sorting system based on band number constraint of claim 1, wherein: the near infrared hyperspectral data acquisition module applies region-of-interest processing to the sensor, selecting all row pixels in the spatial dimension and specific column pixels in the spectral dimension so as to select the required wave bands.
3. The near infrared hyperspectral plastic sorting system based on band number constraint of claim 1, wherein: in the spectrum analysis module, the continuous projection algorithm is applied to a spectral matrix X_{n×m} with n samples and m wavelengths; assuming that the required number of wavelengths is H, with the value range [M, N] of the optimal number of wavelengths H, M being the minimum number of wavelengths and N the maximum number of wavelengths, the implementation process is as follows:
1. arbitrarily selecting the jth column x_j of the original spectral matrix; when the iteration number t = 1, the corresponding x_j is denoted x_k(1);
2. The set of column vector positions that are not selected is noted as:
3. calculating the projection of x_j on each of the remaining column vectors:
wherein, I represents an identity matrix, and T represents the transposition of the matrix;
4. extracting spectral wavelengths containing the largest projection vector:
5. if t < H, let t = t + 1 and return to step 2 for the next iteration; if t = H, the cycle ends; when the cycle terminates, the resulting wavelength set {x_k(1), x_k(2), ..., x_k(H)} is the characteristic wavelength set selected by the SPA.
4. The near infrared hyperspectral plastic sorting system based on band number constraint of claim 1, wherein: before the partial least squares discriminant analysis model performs classification and discrimination, the near infrared spectrum data are classified and labeled by an unsupervised clustering algorithm based on a plastic sample library; a cosine similarity algorithm with a set threshold is used, and the target plastic samples are classified by adaptive labeling:
wherein P, Q is any two different spectral data;
and judging that the two spectrum data which participate in calculation are derived from the same type of plastic sample when the cosine value is larger than the threshold value, and calculating to obtain a corresponding label matrix.
5. The near infrared hyperspectral plastic sorting system based on band number constraint of claim 1, wherein: the partial least squares discriminant analysis model adopts k-fold cross-validation to determine the number of latent wavelength variables used by the model for least squares regression; specifically, the training set is randomly divided into k subsets, each time one subset serves as the cross-validation set and the remaining k-1 subsets form the training set, the parameters are varied, the training and prediction process is repeated k times, and the model is evaluated on the average of the k prediction results on the cross-validation set to find the optimal model parameters; the number of principal components giving the highest scores on the training set and the validation set is selected, subject to the validation score not exceeding the training score.
6. The near infrared hyperspectral plastic sorting method based on band quantity constraint is characterized by comprising the following steps of:
step S1: collecting near infrared spectrum data of the plastic articles;
step S2: spectral analysis, namely screening wavelengths according to near infrared spectral data based on a continuous projection algorithm constrained by the number of wave bands, and constructing a partial least square discriminant analysis algorithm model to carry out plastic classification and discrimination according to the screened wave bands and the number of the wave bands, wherein the wavelength screening comprises the following steps:
step S2.1.1: presetting a value range [ M, N ] of the optimal wavelength number H, wherein M is the minimum wavelength number, and N is the maximum wavelength number;
step S2.1.2: when H = M, performing the first wavelength selection based on the continuous projection algorithm;
step S2.1.3: when H = N, N-M wavelength selections based on the continuous projection algorithm are performed to obtain N-M sets of wavelengths; the minimum cross-validation root mean square error RMSECV_i (1 ≤ i ≤ N-M) of each set and the corresponding wavelength combination are calculated using cross-validation, and the smallest RMSECV_i and its wavelength combination are selected; obtaining the corresponding wave bands and the number of wave bands according to the wavelength combination; when N-M > 100 and M is greater than 2, taking the weight-corrected accuracy ACC as the model evaluation standard:
wherein N_cur represents the current number of wavelengths;
step S3: and sorting the plastic articles according to the sorting and distinguishing results.
7. The near infrared hyperspectral plastic sorting method based on band number constraint of claim 6, wherein: in the acquisition of the near infrared hyperspectral data in step S1, region-of-interest processing is applied to the sensor, selecting all row pixels in the spatial dimension and specific column pixels in the spectral dimension so as to select the required wave bands.
8. The near infrared hyperspectral plastic sorting method based on band number constraint of claim 6, wherein: in the spectral analysis of step S2, the continuous projection algorithm is applied to a spectral matrix X_{n×m} with n samples and m wavelengths; assuming that the required number of wavelengths is H, the procedure is as follows:
step S2.2.1: arbitrarily selecting the jth column x_j of the original spectral matrix; when the iteration number t = 1, the corresponding x_j is denoted x_k(1);
step S2.2.2: the set of column vector positions that are not selected is noted as:
step S2.2.3: calculating the projection of x_j on each of the remaining column vectors:
wherein, I represents an identity matrix, and T represents the transposition of the matrix;
step S2.2.4: extracting spectral wavelengths containing the largest projection vector:
step S2.2.5: if t < H, let t = t + 1 and return to step S2.2.2 for the next iteration; if t = H, the cycle ends; when the cycle terminates, the resulting wavelength set {x_k(1), x_k(2), ..., x_k(H)} is the characteristic wavelength set selected by the SPA.
9. The near infrared hyperspectral plastic sorting method based on band number constraint of claim 6, wherein: in step S2, before the partial least squares discriminant analysis model performs classification and discrimination, the near infrared spectrum data are classified and labeled by an unsupervised clustering algorithm based on a plastic sample library; a cosine similarity algorithm with a set threshold is used, and the target plastic samples are classified by adaptive labeling:
wherein P, Q is any two different spectral data;
and judging that the two spectrum data which participate in calculation are derived from the same type of plastic sample when the cosine value is larger than the threshold value, and calculating to obtain a corresponding label matrix.
10. The near infrared hyperspectral plastic sorting method based on band number constraint of claim 6, wherein: the partial least squares discriminant analysis model in step S2 adopts k-fold cross-validation to determine the number of latent wavelength variables used by the model for least squares regression; specifically, the training set is randomly divided into k subsets, each time one subset serves as the cross-validation set and the remaining k-1 subsets form the training set, the parameters are varied, the training and prediction process is repeated k times, and the model is evaluated on the average of the k prediction results on the cross-validation set to find the optimal model parameters; the number of principal components giving the highest scores on the training set and the validation set is selected, subject to the validation score not exceeding the training score.
CN202311700390.9A 2023-12-12 2023-12-12 Near infrared hyperspectral plastic sorting system and method based on band quantity constraint Active CN117708719B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311700390.9A CN117708719B (en) 2023-12-12 2023-12-12 Near infrared hyperspectral plastic sorting system and method based on band quantity constraint

Publications (2)

Publication Number Publication Date
CN117708719A true CN117708719A (en) 2024-03-15
CN117708719B CN117708719B (en) 2024-06-14

Family

ID=90161772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311700390.9A Active CN117708719B (en) 2023-12-12 2023-12-12 Near infrared hyperspectral plastic sorting system and method based on band quantity constraint

Country Status (1)

Country Link
CN (1) CN117708719B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106092957A (en) * 2016-06-02 2016-11-09 浙江农林大学 The near infrared spectrum recognition methods of mahogany furniture
WO2017153726A1 (en) * 2016-03-07 2017-09-14 Micromass Uk Limited Spectrometric analysis
WO2018010352A1 (en) * 2016-07-11 2018-01-18 上海创和亿电子科技发展有限公司 Qualitative and quantitative combined method for constructing near infrared quantitative model
CN108872143A (en) * 2018-05-22 2018-11-23 南京农业大学 A kind of wheat infection head blight level detection method based near infrared spectrum

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MAZIN ABED MOHAMMED 等: "Novel crow Swarm Optimization Algorithm and Selection Approach for Optimal Deep Learning COVID-19 Diagnostic Model", 《COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE》, 13 August 2022 (2022-08-13), pages 1 - 22 *
成忠 等: "连续投影算法及其在小麦近红外光谱波长选择中的应用", 《光谱学与光谱分析》, vol. 30, no. 4, 30 April 2010 (2010-04-30), pages 949 - 952 *
罗微 等: "PCA和SPA的近红外光谱识别白菜种子品种研究", 《光谱学与光谱分析》, vol. 36, no. 11, 15 November 2016 (2016-11-15), pages 3536 - 3541 *

Also Published As

Publication number Publication date
CN117708719B (en) 2024-06-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant