CN107918718B - Sample component content determination method based on online sequential extreme learning machine - Google Patents

Info

Publication number
CN107918718B
Authority
CN
China
Prior art keywords
sample
data
content
learning machine
spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711068234.XA
Other languages
Chinese (zh)
Other versions
CN107918718A (en
Inventor
单鹏
赵煜辉
张贝
淳宝生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University Qinhuangdao Branch
Original Assignee
Northeastern University Qinhuangdao Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University Qinhuangdao Branch
Priority to CN201711068234.XA
Publication of CN107918718A
Application granted
Publication of CN107918718B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16Z INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
    • G16Z99/00 Subject matter not provided for in other main groups of this subclass

Landscapes

  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention discloses a method for determining the component content of a sample based on an online sequential extreme learning machine, comprising: collecting spectral data of the sample and building a model with the online sequential extreme learning machine algorithm; and determining the component content of the sample with the resulting model. Because the model is built with the online sequential extreme learning machine algorithm, only the knowledge already learned is retained for later use, and the data already used need not be kept. When new spectral data arrive, only the hidden-layer output for the new data is computed, and the output weights between the hidden layer and the output layer are then updated dynamically from the learned knowledge, so modeling is fast. Compared with traditional modeling methods, the modeling speed is improved, unnecessary repeated computation and data storage are reduced, the accuracy and generalization performance of the model are improved, and the method can process data that arrive one sample at a time as well as data that arrive block by block.

Description

Sample component content determination method based on online sequential extreme learning machine
Technical Field
The invention relates to a method for determining the component content of a sample based on an online sequential extreme learning machine, and belongs to the technical field of sample component content determination.
Background
Near-infrared (NIR) spectroscopy is a rapid, non-destructive and low-cost indirect analysis technique. The NIR spectrum of a sample can be measured quickly with an infrared spectrometer, and, in combination with chemometric methods, a multivariate calibration model relating the NIR spectrum of the sample to the content of its active components can be established, so that the corresponding component content of an unknown sample can be predicted. In actual use, however, NIR spectral data are not generated all at once but arrive as a stream. If a model has been established on the existing data samples and new data samples are generated over time, the newly generated data and the previous data need to be modeled together in order to improve the generalization performance and prediction accuracy of the model. The simplest and most direct approach is to rerun the original algorithm on all the available data. This is acceptable when the data volume is small, but when the existing data are measured in gigabytes and each newly arrived batch may amount to several megabytes, modeling the original and new data together is time-consuming and laborious. Sometimes the previously arrived data have not yet been processed when still newer data arrive, and complete re-modeling is then clearly impossible. Online streaming algorithms are also increasingly being adapted to feed-forward neural networks with radial basis function (RBF) nodes. Many online streaming learning algorithms have appeared; typical examples are the GAP-RBF algorithm and the GGAP-RBF algorithm. These algorithms are intended to simplify the learning process and increase the learning speed, but they require information about the distribution of the input samples or the order of the input samples.
However, these algorithms are still slow to build a model, and their generalization performance is mediocre. They can also only process new data one sample at a time, not block by block.
In addition, when NIR spectra are measured, the original multivariate calibration model may become invalid because a different measuring instrument is used or the measuring conditions change. Re-establishing the model is time-consuming and laborious, and sometimes re-modeling is not feasible at all. A more acceptable approach is calibration migration (calibration transfer), which corrects between the spectral data of the master instrument and those of another instrument (the slave instrument). In essence, the spectra of the slave instrument are transformed to look more like the data of the master spectrometer, so that they can then be processed with the model of the master spectrometer. Over the past few years, different calibration migration techniques have been developed. Common methods include multiplicative scatter correction (MSC), direct standardization (DS), piecewise direct standardization (PDS), canonical correlation analysis (CCA), and the like. However, existing calibration migration methods still suffer from poor accuracy and stability of the predicted component content. In addition, multiplicative scatter correction requires an ideal spectrum of the sample to be measured, which is then used to correct the other measured spectra, but such an ideal spectrum is difficult to obtain in practice.
Disclosure of Invention
The invention aims to provide a method for determining the component content of a sample based on an online sequential extreme learning machine that effectively solves the problems of the prior art, in particular that existing algorithms are slow to build a model, have mediocre generalization performance, and can only process new data one sample at a time rather than block by block.
To solve these technical problems, the invention adopts the following technical solution: a method for determining the component content of a sample based on an online sequential extreme learning machine, in which spectral data of the sample are collected and a model is built with the online sequential extreme learning machine algorithm; and the component content of the sample is determined with the resulting model.
The method for measuring the content of the sample components based on the online sequential extreme learning machine specifically comprises the following steps:
S1. From the initial master spectra $SP_{master(0)}$, the corresponding sample component contents $y_0$, and the number $L$ of hidden-layer nodes, compute the initial hidden-to-output weight matrix $\alpha^{(0)}$, where $SP_{master(0)}$ and $y_0$ contain $M_0$ samples;
S2. When new master spectra $SP_{master(k+1)}$ and the corresponding sample component contents $y_{k+1}$ arrive, compute the hidden-to-output weight matrix $\alpha^{(k+1)}$ according to the online sequential extreme learning machine algorithm, where the $(k+1)$-th arriving data $SP_{master(k+1)}$ and $y_{k+1}$ contain $M_{k+1}$ samples and $k \ge 0$;
S3. If further new master spectra $SP_{master(k+1)}$ and corresponding sample component contents $y_{k+1}$ arrive, set $k = k + 1$ and go to S2; otherwise go to S4;
S4. Compute the predicted component content $pre\_y$ for the sample spectral data $sp_{masterPre}$ according to the following formula:
$pre\_y = H_{pre}\,\alpha^{(new)}$
wherein,
Figure BDA0001456253940000021
$\alpha^{(new)}$ is the latest hidden-to-output weight matrix; $sp_{masterPre}$ contains $N$ samples; $w$ and $b$ are the randomly generated orthogonal input weight matrix and the bias, respectively; and $g(w, sp_{masterPre}, b)$ is the activation function.
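The prediction in step S4 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the additive sigmoid node form g(w · sp + b), the QR-based construction of the orthogonal input weights, the synthetic data, and all function names are hypothetical, because the definition of $H_{pre}$ is available here only as an equation image.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_hidden_params(n_channels, n_hidden, rng):
    # Assumption: "orthogonal" input weights are obtained by QR-factorising a
    # Gaussian matrix, which gives orthonormal columns when n_channels >= n_hidden.
    w, _ = np.linalg.qr(rng.standard_normal((n_channels, n_hidden)))
    b = rng.standard_normal(n_hidden)
    return w, b

def hidden_output(sp, w, b):
    # Row i, column j holds g(w_j . sp_i + b_j); the result has shape (N, L).
    return sigmoid(sp @ w + b)

def predict_content(sp_master_pre, w, b, alpha_new):
    # pre_y = H_pre @ alpha_new
    return hidden_output(sp_master_pre, w, b) @ alpha_new

rng = np.random.default_rng(0)
w, b = make_hidden_params(n_channels=700, n_hidden=19, rng=rng)
sp_master_pre = rng.random((5, 700))       # N = 5 synthetic spectra
alpha_new = rng.standard_normal((19, 1))   # stand-in for trained output weights
pre_y = predict_content(sp_master_pre, w, b, alpha_new)
print(pre_y.shape)  # (5, 1)
```

Because $w$ and $b$ are fixed after random generation, prediction reduces to one matrix product per batch of spectra.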
Preferably, the method further comprises: modeling spectral data samples collected on a master spectrometer and on a slave spectrometer with the online sequential extreme learning machine algorithm, so that the spectral data of the slave spectrometer are migrated into the spectral data space of the master spectrometer; and then determining the component content of the sample with the content prediction model established on the master spectrometer.
More preferably, modeling the spectral data samples of the master spectrometer and the slave spectrometer with the online sequential extreme learning machine algorithm, so as to migrate the slave spectral data into the spectral data space of the master spectrometer, comprises the following steps:
S01. From the initial master spectra $SP_{master0}$, the initial slave spectra $sp_{slave0}$, and the number $L$ of hidden-layer nodes, generate the hidden-to-output weight matrix $\beta^{(0)}$, where $SP_{master0}$ contains $M_0$ samples;
S02. When new $SP_{master(k+1)}$ and $sp_{slave(k+1)}$ containing $M_{k+1}$ samples arrive, compute the hidden-to-output weight matrix $\beta^{(k+1)}$ according to the online sequential extreme learning machine algorithm, where $k \ge 0$;
S03. If further new $SP_{master(k+1)}$ and $sp_{slave(k+1)}$ arrive, set $k = k + 1$ and go to S02; otherwise go to S04;
S04. Migrate the test data $sp_{slaveTest}$, containing $N$ samples, according to the following formula:
$sp_{slaveTomaster} = H'_{test}\,\beta^{(new)}$
where $sp_{slaveTomaster}$ denotes the migrated spectral data and $\beta^{(new)}$ denotes the latest hidden-to-output weight matrix;
Figure BDA0001456253940000031
$w$ and $b$ are the randomly generated orthogonal input weight matrix and the bias, respectively; and $g(w, sp_{slaveTest}, b)$ is the activation function.
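Steps S01 and S04 amount to a multi-output regression from slave spectra to master spectra. The sketch below illustrates only the shapes and the initial fit. It assumes synthetic data in which the slave spectra are a scaled and offset copy of the master spectra, plain Gaussian input weights for brevity, and `np.linalg.lstsq` for the initial weights, which agrees with the closed form $(H'^T H')^{-1} H'^T SP$ when $H'$ has full column rank. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_channels, n_hidden, m0 = 50, 10, 32

# Synthetic stand-ins: the slave spectra are a distorted copy of the master spectra.
sp_master0 = rng.random((m0, n_channels))
sp_slave0 = 0.9 * sp_master0 + 0.05

w = rng.standard_normal((n_channels, n_hidden))  # random input weights (not orthogonalised here)
b = rng.standard_normal(n_hidden)

def hidden(sp):
    # Sigmoid hidden-layer output, shape (number of spectra, n_hidden).
    return 1.0 / (1.0 + np.exp(-(sp @ w + b)))

# S01: beta0 minimises ||H'_0 beta - SP_master0|| in the least-squares sense.
h0 = hidden(sp_slave0)
beta0, *_ = np.linalg.lstsq(h0, sp_master0, rcond=None)

# S04: migrate new slave spectra into the master space.
sp_slave_test = 0.9 * rng.random((4, n_channels)) + 0.05
sp_slave_to_master = hidden(sp_slave_test) @ beta0
print(beta0.shape, sp_slave_to_master.shape)  # (10, 50) (4, 50)
```

The output weight matrix has one column per spectral channel, so a single model maps a whole slave spectrum to a whole master-like spectrum.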
In the above method for determining the component content of a sample based on the online sequential extreme learning machine, the number $L$ of hidden-layer nodes is less than or equal to the number $M_0$ of initial samples. As a result, the OSELM-based sample component content prediction model computes faster and consumes fewer system resources.
Preferably, in step S1,
$\alpha^{(0)} = (H_0^{T} H_0)^{-1} H_0^{T} y_0$
wherein,
Figure BDA0001456253940000032
Preferably, in step S2,
Figure BDA0001456253940000035
wherein,
Figure BDA0001456253940000033
Figure BDA0001456253940000034
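The update equations of step S2 are available here only as image placeholders, so the sketch below uses the widely published OS-ELM recursive least-squares form, $P_{k+1} = P_k - P_k H^T (I + H P_k H^T)^{-1} H P_k$ and $\alpha^{(k+1)} = \alpha^{(k)} + P_{k+1} H^T (y_{k+1} - H \alpha^{(k)})$. Treat this as an assumption about what the omitted equations contain. The class name, method names, and synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OSELM:
    """Minimal OS-ELM regressor (illustrative sketch, not the patent's code)."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.standard_normal((n_features, n_hidden))  # fixed random input weights
        self.b = rng.standard_normal(n_hidden)                # fixed random biases
        self.P = None      # running inverse of H^T H
        self.alpha = None  # hidden-to-output weights

    def _hidden(self, x):
        return sigmoid(x @ self.w + self.b)

    def fit_initial(self, x0, y0):
        # S1: alpha0 = (H0^T H0)^-1 H0^T y0; needs at least n_hidden samples.
        h0 = self._hidden(x0)
        self.P = np.linalg.inv(h0.T @ h0)
        self.alpha = self.P @ h0.T @ y0

    def update(self, x, y):
        # S2: recursive least-squares update for a block of any size.
        h = self._hidden(x)
        gain = np.linalg.inv(np.eye(h.shape[0]) + h @ self.P @ h.T)
        self.P = self.P - self.P @ h.T @ gain @ h @ self.P
        self.alpha = self.alpha + self.P @ h.T @ (y - h @ self.alpha)

    def predict(self, x):
        return self._hidden(x) @ self.alpha

rng = np.random.default_rng(0)
x = rng.random((60, 10))
y = x @ rng.standard_normal((10, 1))

model = OSELM(n_features=10, n_hidden=5)
model.fit_initial(x[:20], y[:20])
for start in range(20, 60, 5):          # blocks of 5 samples arrive in turn
    model.update(x[start:start + 5], y[start:start + 5])

# Reference: batch least squares on all 60 samples with the same hidden layer.
alpha_batch, *_ = np.linalg.lstsq(model._hidden(x), y, rcond=None)
print(np.max(np.abs(model.predict(x) - model._hidden(x) @ alpha_batch)))
```

Only `P` and `alpha` are carried between updates, which is why the earlier spectra do not need to be stored.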
Preferably, in step S01,
$\beta^{(0)} = ({H'_0}^{T} H'_0)^{-1} {H'_0}^{T} SP_{master0}$
wherein,
Figure BDA0001456253940000041
Preferably, in step S02,
Figure BDA0001456253940000042
wherein,
Figure BDA0001456253940000043
Figure BDA0001456253940000044
In the invention, the optimal number $L$ of hidden-layer nodes is determined by k-fold cross-validation, and a sigmoid function is used as the activation function, which improves the accuracy of the predicted sample component content.
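Selecting $L$ by k-fold cross-validation can be sketched as follows. The fold assignment, the candidate list, the synthetic data, and the helper names are assumptions made for illustration; the default of 8 folds is borrowed from the experimental section of this document. A batch ELM fit stands in for the model inside each fold.

```python
import numpy as np

def elm_fit_predict(x_train, y_train, x_val, n_hidden, seed=0):
    # Batch ELM: random sigmoid hidden layer, least-squares output weights.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x_train.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)

    def hidden(x):
        return 1.0 / (1.0 + np.exp(-(x @ w + b)))

    beta, *_ = np.linalg.lstsq(hidden(x_train), y_train, rcond=None)
    return hidden(x_val) @ beta

def cv_rmse(x, y, n_hidden, k=8, seed=0):
    # Pooled root mean square error over k validation folds.
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, k)
    sq_err, count = 0.0, 0
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = elm_fit_predict(x[train], y[train], x[val], n_hidden, seed)
        sq_err += np.sum((pred - y[val]) ** 2)
        count += y[val].size
    return float(np.sqrt(sq_err / count))

def select_hidden_nodes(x, y, candidates, k=8):
    # Return the candidate L with the smallest cross-validated RMSE.
    scores = {n: cv_rmse(x, y, n, k) for n in candidates}
    return min(scores, key=scores.get), scores

rng = np.random.default_rng(0)
x = rng.random((64, 20))
y = np.sin(x.sum(axis=1, keepdims=True))
candidates = [2, 5, 10, 15, 20]
best_L, scores = select_hidden_nodes(x, y, candidates, k=8)
print(best_L, scores[best_L])
```

Keeping every candidate below the size of a training fold avoids the rank-deficient case in which there are fewer samples than hidden nodes.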
The method for determining the component content of a sample is suitable for online detection of all kinds of spectral samples, and works particularly well for determining the component content of tablets and corn.
Preferably, the slave spectral data migrated into the spectral data space of the master spectrometer are taken as new master spectra $SP_{master(k+1)}$; their component content is determined with the content prediction model established on the master spectrometer to obtain the corresponding sample component contents $y_{k+1}$; $k$ is set to $k + 1$ and the method returns to S2. This further improves the prediction accuracy of the model.
Compared with the prior art, the invention builds the model with the online sequential extreme learning machine algorithm, so previously used data need not be retained; only the previously learned knowledge is kept for later use. When new spectral data arrive, only the hidden-layer output for the new data is computed, and the output weights between the hidden layer and the output layer are then updated dynamically from the learned knowledge, so modeling is fast. Compared with traditional modeling methods, the modeling speed is increased, unnecessary repeated computation and data storage are reduced, the accuracy and generalization performance of the model are improved, and the method can process data that arrive one sample at a time as well as data that arrive block by block. In addition, the invention models spectral data samples collected on the master spectrometer and on the slave spectrometer with the online sequential extreme learning machine algorithm, migrating the slave spectral data into the spectral data space of the master spectrometer; the sample component content is then determined with the content prediction model established on the master spectrometer, which improves the accuracy and stability of the predicted component content. Experimental results show that the calibration migration method based on the online sequential extreme learning machine algorithm performs better on the tablet data set and the corn data set than PDS and the CCA-based spectral migration algorithm.
Drawings
FIG. 1 shows spectra in the maize dataset (a) for MP5, (b) for M5, (c) for MP 6;
FIG. 2 shows the tablet data set: (a) master spectrum, (b) slave spectrum, (c) deviation spectrum;
FIG. 3 is a diagram illustrating RMSEP as a function of the number of hidden nodes;
FIG. 4 is the M5 and MP5 deviations after migration (a) and the M5 and MP5 deviations without migration (b) of the maize dataset;
FIG. 5 is a schematic illustration of the effect of modeled sample number on RMSEP;
FIG. 6 is a diagram of the relationship between RMSEP and the number of hidden nodes;
FIG. 7 is a schematic diagram of predicted values of various types of spectra with respect to water;
FIG. 8 is a schematic diagram showing the predicted values of protein (a), starch (b), oil (c), and water (d);
FIG. 9 is a schematic diagram of an ELM-based migration algorithm versus OSELM-based migration algorithm modeling runtime comparison;
FIG. 10 is a schematic representation of the relationship of hidden nodes to the spectrum RMSEP in a tablet data set;
FIG. 11 is a schematic diagram of (a) the residual after spectral migration and (b) the residual without migration, based on the tablet data;
FIG. 12 is a graphical representation comparing the predicted value to the actual value of the first active ingredient in a pharmaceutical tablet;
FIG. 13 is a graphical representation comparing the predicted value and the actual value of (a) a second active ingredient and (b) a third active ingredient in a pharmaceutical tablet;
FIG. 14 is a graphical comparison of predicted results of PDS, CCA, TLOSELM based on moisture content of corn data;
FIG. 15 is a graphical comparison of predicted results for PDS, CCA, TLOSELM based on the third active ingredient of the tablet;
FIG. 16 is a schematic flow chart of the method of the present invention.
The invention is further described with reference to the following figures and detailed description.
Detailed Description
Example 1 of the invention: a method for determining the component content of a sample based on an online sequential extreme learning machine. As shown in FIG. 16, spectral data of the sample are collected and a model is built with the online sequential extreme learning machine algorithm; the component content of the sample is then determined with the resulting model.
The method specifically comprises the following steps:
S1. From the initial master spectra $SP_{master(0)}$, the corresponding sample component contents $y_0$, and the number $L$ of hidden-layer nodes, compute the initial hidden-to-output weight matrix $\alpha^{(0)}$, where $SP_{master(0)}$ and $y_0$ contain $M_0$ samples;
S2. When new master spectra $SP_{master(k+1)}$ and the corresponding sample component contents $y_{k+1}$ arrive, compute the hidden-to-output weight matrix $\alpha^{(k+1)}$ according to the online sequential extreme learning machine algorithm, where the $(k+1)$-th arriving data $SP_{master(k+1)}$ and $y_{k+1}$ contain $M_{k+1}$ samples and $k \ge 0$;
S3. If further new master spectra $SP_{master(k+1)}$ and corresponding sample component contents $y_{k+1}$ arrive, set $k = k + 1$ and go to S2; otherwise go to S4;
S4. Compute the predicted component content $pre\_y$ for the sample spectral data $sp_{masterPre}$ according to the following formula:
$pre\_y = H_{pre}\,\alpha^{(new)}$
wherein,
Figure BDA0001456253940000061
$\alpha^{(new)}$ is the latest hidden-to-output weight matrix; $sp_{masterPre}$ contains $N$ samples; $w$ and $b$ are the randomly generated orthogonal input weight matrix and the bias, respectively; and $g(w, sp_{masterPre}, b)$ is the activation function.
In order for the OSELM-based sample component content prediction model to compute faster and consume fewer system resources, the number $L$ of hidden-layer nodes is less than or equal to the number $M_0$ of initial samples.
In step S1,
$\alpha^{(0)} = (H_0^{T} H_0)^{-1} H_0^{T} y_0$
wherein,
Figure BDA0001456253940000062
In step S2,
Figure BDA0001456253940000063
wherein,
Figure BDA0001456253940000064
Figure BDA0001456253940000065
In order to predict content accurately from spectral data collected on a slave spectrometer, the method further comprises: modeling spectral data samples collected on the master spectrometer and on the slave spectrometer with the online sequential extreme learning machine algorithm, so that the spectral data of the slave spectrometer are migrated into the spectral data space of the master spectrometer; and then determining the component content of the sample with the content prediction model established on the master spectrometer.
Modeling the spectral data samples of the master spectrometer and the slave spectrometer with the online sequential extreme learning machine algorithm, so as to migrate the slave spectral data into the spectral data space of the master spectrometer, comprises the following steps:
S01. From the initial master spectra $SP_{master0}$, the initial slave spectra $sp_{slave0}$, and the number $L$ of hidden-layer nodes, generate the hidden-to-output weight matrix $\beta^{(0)}$, where $SP_{master0}$ contains $M_0$ samples;
S02. When new $SP_{master(k+1)}$ and $sp_{slave(k+1)}$ containing $M_{k+1}$ samples arrive, compute the hidden-to-output weight matrix $\beta^{(k+1)}$ according to the online sequential extreme learning machine algorithm, where $k \ge 0$;
S03. If further new $SP_{master(k+1)}$ and $sp_{slave(k+1)}$ arrive, set $k = k + 1$ and go to S02; otherwise go to S04;
S04. Migrate the test data $sp_{slaveTest}$, containing $N$ samples, according to the following formula:
$sp_{slaveTomaster} = H'_{test}\,\beta^{(new)}$
where $sp_{slaveTomaster}$ denotes the migrated spectral data and $\beta^{(new)}$ denotes the latest hidden-to-output weight matrix;
Figure BDA0001456253940000071
$w$ and $b$ are the randomly generated orthogonal input weight matrix and the bias, respectively; and $g(w, sp_{slaveTest}, b)$ is the activation function.
In order for the OSELM-based sample component content prediction model to compute faster and consume fewer system resources, the number $L$ of hidden-layer nodes is less than or equal to the number $M_0$ of initial samples.
Specifically, in step S01,
$\beta^{(0)} = ({H'_0}^{T} H'_0)^{-1} {H'_0}^{T} SP_{master0}$
wherein,
Figure BDA0001456253940000072
In step S02,
Figure BDA0001456253940000073
wherein,
Figure BDA0001456253940000074
Figure BDA0001456253940000075
In the method, the optimal number $L$ of hidden-layer nodes is determined by k-fold cross-validation, and the activation function is a sigmoid function.
Example 2: a method for determining the component content of a sample based on an online sequential extreme learning machine, in which spectral data of the sample are collected and a model is built with the online sequential extreme learning machine algorithm; the component content of the sample is then determined with the resulting model.
The method specifically comprises the following steps:
S1. From the initial master spectra $SP_{master(0)}$, the corresponding sample component contents $y_0$, and the number $L$ of hidden-layer nodes, compute the initial hidden-to-output weight matrix $\alpha^{(0)}$, where $SP_{master(0)}$ and $y_0$ contain $M_0$ samples;
S2. When new master spectra $SP_{master(k+1)}$ and the corresponding sample component contents $y_{k+1}$ arrive, compute the hidden-to-output weight matrix $\alpha^{(k+1)}$ according to the online sequential extreme learning machine algorithm, where the $(k+1)$-th arriving data $SP_{master(k+1)}$ and $y_{k+1}$ contain $M_{k+1}$ samples and $k \ge 0$;
S3. If further new master spectra $SP_{master(k+1)}$ and corresponding sample component contents $y_{k+1}$ arrive, set $k = k + 1$ and go to S2; otherwise go to S4;
S4. Compute the predicted component content $pre\_y$ for the sample spectral data $sp_{masterPre}$ according to the following formula:
$pre\_y = H_{pre}\,\alpha^{(new)}$
wherein,
Figure BDA0001456253940000081
$\alpha^{(new)}$ is the latest hidden-to-output weight matrix; $sp_{masterPre}$ contains $N$ samples; $w$ and $b$ are the randomly generated orthogonal input weight matrix and the bias, respectively; and $g(w, sp_{masterPre}, b)$ is the activation function.
In order for the OSELM-based sample component content prediction model to compute faster and consume fewer system resources, the number $L$ of hidden-layer nodes is less than or equal to the number $M_0$ of initial samples.
Specifically, in step S1,
$\alpha^{(0)} = (H_0^{T} H_0)^{-1} H_0^{T} y_0$
wherein,
Figure BDA0001456253940000082
Specifically, in step S2,
Figure BDA0001456253940000091
wherein,
Figure BDA0001456253940000092
Figure BDA0001456253940000093
To verify the effect of the invention, the inventors compared the OSELM-based (i.e., online sequential extreme learning machine) migration algorithm of the invention with the PDS algorithm and the CCA-based migration algorithm. In the experiment, the slave spectra are migrated into the master space with the PDS-, CCA- and OSELM-based models. The migrated spectral data are then fed into the prediction model, established on the master spectra, that relates spectra to component content, in order to predict the component content.
1.1 Experimental Environment
This experiment was carried out in Python 2.7. The computer runs 64-bit Windows 8.1, with an AMD A84500 CPU and 8 GB of memory. Several commonly used Python packages were used in the experiments, such as numpy, matplotlib, and sklearn. The programs used in the experiment were developed in the Eclipse integrated development environment.
1.2 Experimental data
In this experiment, a corn NIR spectral data set containing eighty NIR data samples was used. The data set contains three tables of NIR spectra, M5, MP5 and MP6 (NIR measurements of the same substance, corn, taken on different spectrometers), together with the physicochemical properties corresponding to the eighty spectral samples: water content (water), oil content (oil), protein content (protein), and starch content (starch). The spectra cover wavelengths from 1100 nm to 2498 nm at 2 nm intervals (700 channels). The MP5 spectra come from a FOSS NIRSystems 5000 serving as the master instrument, while M5 and MP6 come from a FOSS NIRSystems 6000 and a FOSS NIRSystems 5000, respectively, and serve as slave spectra.
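As a quick arithmetic check of the stated channel count, the wavelength axis can be rebuilt from its end points and spacing. The variable name is hypothetical.

```python
import numpy as np

# 1100 nm to 2498 nm inclusive, in 2 nm steps.
wavelengths_nm = np.arange(1100, 2498 + 2, 2)
print(wavelengths_nm.size)  # 700
```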
In FIG. 1, panel (a) is the spectrum of MP5, panel (b) that of M5, and panel (c) that of MP6. The plots of MP5 and MP6 are very similar because they were measured on the same type of spectrometer. The M5 plot differs markedly from the MP5 plot: compared with MP5, M5 is clearly shifted upward.
The tablet spectral data set, a set of NIR spectral data published by IDRC in 2002 that includes 654 tablets measured on two spectrometers, was also used in this experiment. The two spectrometers are FOSS NIRSs and Silver-Spring, respectively. The NIR spectral data from each spectrometer are divided into a calibration set (155 spectral data samples), a test set (460 spectral data samples), and a validation set (40 spectral data samples). The spectral wavelengths in the data set are concentrated between 1100 nm and 1750 nm.
Panels (a) and (b) of FIG. 2 show the validation-set spectra of the tablet data set collected for the same substance on spectrometer SPEC1 and spectrometer SPEC2, respectively. Panel (c) of FIG. 2 shows the difference between the spectra measured on SPEC1 and those measured on SPEC2. Panel (c) shows that the difference between the spectra obtained on the two spectrometers is small over most of the wavelength range. The two sets of spectra can even be called very similar, since the difference stays close to 0 and becomes relatively large only beyond 1700 nm.
1.3 design of the experiment
1.3.1 corn-based experiments
The corn NIR spectral data were divided into three parts: a calibration set, a validation set, and a test set. The calibration set contains 56 NIR spectral samples, the validation set 8, and the test set 16. The calibration set is used to build the online migration model, the validation set is used to select the optimal number of hidden nodes, and the test set is used to assess the generalization performance of the algorithm.
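The 56/8/16 partition can be sketched as an index split. The source does not say how samples were assigned to the three sets, so the random permutation, the seed, and the function name below are assumptions.

```python
import numpy as np

def split_corn(n_samples=80, n_cal=56, n_val=8, seed=0):
    # Assumption: samples are assigned to the three sets at random.
    idx = np.random.default_rng(seed).permutation(n_samples)
    return idx[:n_cal], idx[n_cal:n_cal + n_val], idx[n_cal + n_val:]

cal_idx, val_idx, test_idx = split_corn()
print(len(cal_idx), len(val_idx), len(test_idx))  # 56 8 16
```

The same index arrays would be applied to the master spectra, the slave spectra, and the reference contents so that rows stay aligned across instruments.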
1. Selection of the optimal number L of hidden nodes
To select the optimal number of hidden nodes for the migration model, the inventors used 8-fold cross-validation and plotted the root mean square error obtained by cross-validation for different numbers of hidden nodes. The root mean square error of the migrated spectra is calculated as follows:
Figure BDA0001456253940000101
Equation (1) is the formula for the root mean square error, where NumSample is the number of samples in the batch of NIR spectra to be migrated. FIG. 3 shows that, as the number of hidden nodes increases, the RMSEP first decreases and then slowly increases, and beyond a certain number of hidden nodes it rises markedly faster. The figure indicates that, in terms of the root mean square error of spectral migration, 19 hidden nodes is the most suitable setting.
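Equation (1) is available here only as an image placeholder, so the sketch below shows one plausible reading rather than the patent's exact formula: the squared differences between migrated and master spectra are summed over all samples and channels and divided by NumSample before taking the square root. Whether the original also normalises by the number of channels is not known. The function name and the toy data are hypothetical.

```python
import numpy as np

def rmsep_spectra(sp_migrated, sp_master):
    # Assumed form: sqrt( sum_i ||migrated_i - master_i||^2 / NumSample ).
    num_sample = sp_master.shape[0]
    sq_err = np.sum((sp_migrated - sp_master) ** 2)
    return float(np.sqrt(sq_err / num_sample))

sp_master = np.zeros((4, 3))         # 4 samples, 3 channels
sp_migrated = np.full((4, 3), 0.1)   # constant offset of 0.1 in every channel
print(rmsep_spectra(sp_migrated, sp_master))
```

In this toy case each sample contributes a squared error of 3 × 0.01, so the result equals 0.1 × √3.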
Therefore, the number L of hidden nodes of the model is set to the optimal value of 19. The algorithm is initially given 32 $sp_{master}$ NIR spectral data samples and 32 $sp_{slave}$ NIR spectral data samples. The slave and master samples serve as the input and output of the model, respectively, because the model is essentially a mapping from $sp_{slave}$ to $sp_{master}$. Building the initial migration model yields the initial output weight matrix $\beta^{(0)}$. Then, assuming that one NIR spectral data sample arrives at a time, the updated output weight matrix $\beta^{(k)}$ is computed following the OSELM approach. Whenever a slave spectrum needs to be migrated, it is substituted into the established migration model. In this way the NIR spectral data are migrated into the master spectral space by the OSELM algorithm.
The upper panel (a) of FIG. 4 shows the result of migrating $sp_{slaveTest}$ and then subtracting the corresponding $sp_{masterTest}$; the lower panel (b) shows the result of subtracting $sp_{masterTest}$ directly from $sp_{slaveTest}$. Panel (a) shows that, with transfer learning modeled by OSELM, most of the $sp_{slaveTest}$ spectra migrate well: the difference between the migrated $sp_{slaveTest}$ and $sp_{masterTest}$ fluctuates slightly around zero, whereas the difference between the unmigrated slave spectra and the master spectra fluctuates between 0.04 and 0.06.
2. Influence of initial sample number on algorithm precision during calibration and migration of spectrum
The model is a transfer learning model built on OSELM. One initialization question in the OSELM algorithm is how many samples to use as the initially available samples for generating the base model. The number of initial samples may affect the resulting migration model; Table 1 below shows the effect of the initial sample number on the model.
The results are also consistent with the essence of the OSELM algorithm: in online modeling of an NIR spectral data set that arrives as a stream, all incoming data samples are eventually used. The essential difference between OSELM and ELM is that OSELM is an online modeling algorithm that avoids unnecessary repeated computation. However, if the number of initially available NIR samples is smaller than the number of hidden nodes, solving for the output weight matrix $\beta^{(0)}$ may involve inverting a singular matrix, so the root mean square error of the migration result becomes larger.
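The singular-matrix problem can be seen from the rank of the initial hidden-layer output. With $M_0$ initial samples and $L$ hidden nodes, $H_0$ has shape $M_0 \times L$, so its rank (and that of $H_0^T H_0$) is at most $\min(M_0, L)$. The sketch below uses synthetic spectra and hypothetical names to show this.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_hidden = 30, 15
w = rng.standard_normal((n_channels, n_hidden))
b = rng.standard_normal(n_hidden)

def hidden(sp):
    # Sigmoid hidden-layer output, shape (M0, L).
    return 1.0 / (1.0 + np.exp(-(sp @ w + b)))

ranks = {}
for m0 in (10, 15, 32):
    h0 = hidden(rng.random((m0, n_channels)))
    # rank(H0^T H0) equals rank(H0), which cannot exceed min(M0, L).
    ranks[m0] = int(np.linalg.matrix_rank(h0))
    print(m0, ranks[m0], "invertible" if ranks[m0] == n_hidden else "singular")
```

With 10 initial samples the 15 × 15 matrix $H_0^T H_0$ cannot have full rank, which is the situation the text warns about.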
TABLE 1 initialization sample number vs. RMSEP
Figure BDA0001456253940000111
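The requirement that the initial sample count be at least the number of hidden nodes can be checked numerically. The sketch below is an assumption-laden illustration with synthetic spectra and a hypothetical helper name (initial_hidden_rank); it relies only on the fact that H0^T H0 has the same rank as the initial hidden-layer output matrix H0, so it cannot be inverted when H0 has fewer rows than columns.

```python
import numpy as np


def initial_hidden_rank(n_initial, n_hidden=19, n_wavelengths=40, seed=0):
    """Rank of the initial hidden-layer output matrix H0 (n_initial x n_hidden)."""
    rng = np.random.default_rng(seed)
    X0 = rng.uniform(0.0, 1.0, size=(n_initial, n_wavelengths))  # synthetic spectra
    w = rng.uniform(-1.0, 1.0, size=(n_wavelengths, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H0 = 1.0 / (1.0 + np.exp(-(X0 @ w + b)))
    return int(np.linalg.matrix_rank(H0))


for n_initial in (10, 19, 32):
    rank = initial_hidden_rank(n_initial)
    # H0^T H0 is 19 x 19 and can be inverted only when rank(H0) == 19.
    print(n_initial, rank, "invertible" if rank == 19 else "singular")
```

With 10 initial samples the rank is at most 10, so the 19 x 19 matrix is singular and the initial weights are not well defined.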
3. Influence of each sequentially arriving data block size on algorithm precision
In the above experiments it was assumed that samples arrive one by one. The following examines whether the manner in which the NIR spectral data arrive affects the accuracy of the model. By default, the number of hidden nodes is 15 and the number of initial samples is 32. Step in Table 2 denotes the number of NIR spectral samples arriving each time. The table shows that whether the NIR spectral data samples arrive one by one or block by block has no effect on the performance of the algorithm. All incoming NIR samples are eventually used, so the root mean square error of prediction on the test set is the same as long as the model is built with the same number of NIR samples and the same number of hidden nodes L.
TABLE 2 influence of streaming data sample size on RMSEP
Figure BDA0001456253940000121
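The block-size observation can be reproduced with a small numerical experiment. The sketch below assumes the standard OS-ELM recursion and synthetic data; the function name oselm_beta and the chosen block sizes are illustrative only. It runs the same samples through the update with different block sizes and compares the resulting output weights.

```python
import numpy as np


def oselm_beta(H, T, n_initial, step):
    """Run OS-ELM over precomputed hidden outputs H and targets T."""
    H0, T0 = H[:n_initial], T[:n_initial]
    P = np.linalg.inv(H0.T @ H0)
    beta = P @ H0.T @ T0
    for start in range(n_initial, len(H), step):
        Hk, Tk = H[start:start + step], T[start:start + step]
        S = np.eye(len(Hk)) + Hk @ P @ Hk.T
        P = P - P @ Hk.T @ np.linalg.solve(S, Hk @ P)
        beta = beta + P @ Hk.T @ (Tk - Hk @ beta)
    return beta


rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(64, 30))       # synthetic slave spectra
T = rng.uniform(0.0, 1.0, size=(64, 30))       # synthetic master spectra
w = rng.uniform(-1.0, 1.0, size=(30, 15))      # 15 hidden nodes
b = rng.uniform(-1.0, 1.0, size=15)
H = 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Same data, different block sizes (the last block of size 10 is shorter).
betas = {step: oselm_beta(H, T, n_initial=32, step=step) for step in (1, 4, 10)}
same = all(np.allclose(betas[1], betas[s], atol=1e-6) for s in (4, 10))
print("weights agree across block sizes:", same)
```

Since identical weights give identical predictions, the RMSEP on a test set does not depend on how the stream was cut into blocks.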
4. Influence of number of samples on prediction precision based on OSELM migration model
FIG. 5 shows the root mean square error between the NIR spectral data migrated by OSELM and sp_masterTest as NIR spectral data samples arrive sequentially. As online learning progresses and more NIR data samples accumulate, the root mean square error after migration of the test set generally becomes smaller. The figure also shows that the prediction error increases at around the 55th and 56th samples, possibly because relatively abnormal data samples arrived and made the model worse. In terms of the general trend, however, the longer the online learning lasts, the more knowledge about the corn spectral data is accumulated, and the accuracy of the transfer learning improves. From the data in the figure it can be concluded that the more samples arrive, the more accurate the model and the more stable its generalization ability. This conclusion matches the requirement of online modeling: if the data arrive not all at once but one sample or one block at a time, and predictions are needed in the interim, a model must be built on the existing sample data; when the next samples arrive, the model is updated rather than rebuilt from scratch, which is the essence of OSELM.
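For reference, the root mean square error used throughout these experiments can be computed as follows. This is a minimal sketch that assumes the error is averaged over every entry of the compared arrays (all samples and all wavelengths); the patent text does not spell out the averaging, and the function name rmsep is chosen here for illustration.

```python
import numpy as np


def rmsep(predicted, reference):
    """Root mean square error between predicted and reference values."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))


# Tiny worked example: errors of 3 and 4 give sqrt((9 + 16) / 2).
print(rmsep([0.0, 0.0], [3.0, 4.0]))
```

Calling this function on the migrated test spectra after each update would yield a curve of the kind described for FIG. 5.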
The previous experiments discussed the root mean square error between the migrated spectrum and the master spectrum, but the ultimate goal of the migration model is to allow sp_slave collected on SPEC_slave to use the online-updated model built on SPEC_master; that is, the model built on SPEC_master should ultimately predict the physicochemical properties corresponding to sp_slave. The following tests therefore examine the effect of migration in terms of physicochemical properties. Because the migration from sp_slave to sp_master is performed online, the prediction model from sp_master to the corresponding physicochemical property y also needs to be updated dynamically. In the following experiment, sp_slave collected on the slave spectrometer is first migrated online to sp_master, and the migrated spectrum is then substituted into the prediction model built from sp_master and y. In the corn data set, the moisture, protein, oil, and starch contents need to be predicted from the sp_slave data.
5. Performance of the migration model in terms of physicochemical indices
First, the optimal number of hidden nodes of the prediction model from the master spectrum to the physicochemical property is selected through k-fold cross validation. In the following experiment, the migrated spectral data are substituted into the newly established physicochemical property prediction model. In Table 3, L represents the number of hidden nodes of the migration model, and RMSEP represents the root mean square error of the physicochemical property prediction. If the number of hidden nodes is set too small, the root mean square error of the prediction is relatively large, and once the number of hidden nodes exceeds a certain point, the prediction error starts to increase again. The table shows very clearly that when the number of hidden nodes is larger than the number of initial samples, the root mean square error increases sharply.
TABLE 3 influence of number of hidden nodes on RMSEP of physicochemical indices
Figure BDA0001456253940000131
To show the result more clearly, the relationship between RMSEP and the number L of hidden nodes is plotted as a line chart. The trend can be seen more intuitively in FIG. 6.
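Selecting the number of hidden nodes by k-fold cross validation, as mentioned for the physicochemical prediction model, might look like the sketch below. It is an illustration under assumptions: a plain batch ELM fitted by least squares stands in for the online model, the data are synthetic, and the function names (elm_fit_predict, cv_rmse, select_hidden_nodes) and the candidate list are hypothetical.

```python
import numpy as np


def elm_fit_predict(X_train, y_train, X_test, n_hidden, rng):
    """Fit a batch ELM on the training fold and predict the test fold."""
    w = rng.uniform(-1.0, 1.0, size=(X_train.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)

    def hidden(X):
        return 1.0 / (1.0 + np.exp(-(X @ w + b)))

    beta, *_ = np.linalg.lstsq(hidden(X_train), y_train, rcond=None)
    return hidden(X_test) @ beta


def cv_rmse(X, y, n_hidden, k=5, seed=0):
    """k-fold cross-validated root mean square error for one value of L."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    fold_mse = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = elm_fit_predict(X[train_idx], y[train_idx], X[test_idx], n_hidden, rng)
        fold_mse.append(np.mean((pred - y[test_idx]) ** 2))
    return float(np.sqrt(np.mean(fold_mse)))


def select_hidden_nodes(X, y, candidates, k=5):
    """Return the candidate L with the lowest cross-validated error."""
    scores = {L: cv_rmse(X, y, L, k) for L in candidates}
    return min(scores, key=scores.get), scores


# Synthetic stand-in for master spectra and one physicochemical property.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(60, 20))
y = X @ rng.uniform(0.0, 1.0, size=20) + 0.01 * rng.standard_normal(60)

best_L, scores = select_hidden_nodes(X, y, candidates=(5, 10, 15, 20, 40))
print("selected L:", best_L)
```

Plotting the values in scores against L gives a curve comparable in spirit to the RMSEP versus L relationship discussed here.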
The abscissa in FIG. 7 represents water_test, and the ordinate represents the corresponding predicted value pre_water. Triangles represent the predicted values obtained by substituting sp_slaveTest directly into the physicochemical property prediction model, and five-pointed stars represent the predicted values obtained by migrating sp_slaveTest and then substituting it into the prediction model. Plus signs represent the predicted values obtained by substituting sp_masterTest into the model. FIG. 7 shows that the physicochemical properties predicted from the migrated spectral data are very close to the true values, whereas if the spectra collected on SPEC_slave are substituted without migration directly into the physicochemical property prediction model of SPEC_master, the prediction error is very large. FIG. 7 shows that the OSELM-based migration model performs very well on the corn data set.
In FIG. 8, the triangles, stars and plus signs in the subfigures have the same meanings as in FIG. 7. The abscissas of subfigures (a), (b), (c) and (d) are protein, starch, oil and moisture content in turn, and the ordinate of each subfigure is the predicted value of the corresponding abscissa. The first subfigure depicts the prediction model established for protein in the corn data set. The second subfigure depicts the prediction model established for starch. The third subfigure depicts the prediction model for oil content. The fourth subfigure is FIG. 7 above. The effect of spectral data migration can be clearly seen from the results in FIG. 8: in all four subfigures (a), (b), (c) and (d), the triangles and the stars are clearly separated.
6. Operational efficiency of OSELM
In the screenshot of FIG. 9, the first line is the time required for a conventional ELM to run once. The second line is the time required for the OSELM to migrate online 11 times. The third line is the time required to model 11 times with the ELM algorithm. FIG. 9 shows that, for this experiment on the corn data set, running the ELM migration algorithm once takes approximately 0.02 seconds, while modeling 11 times with the OSELM migration algorithm takes approximately 0.04 seconds. A simple calculation shows that the OSELM-based migration model is much more efficient than a migration model that uses ELM alone: at roughly 0.02 seconds per run, 11 ELM runs would take about 0.22 seconds. OSELM is faster than repeated modeling with traditional ELM because it reduces the amount of repeated computation. OSELM retains the knowledge from the previous modeling step and only needs to learn from the new data when they arrive.
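The two strategies compared in this section, rebuilding a batch ELM each time a sample arrives versus updating an OS-ELM in place, can be timed with a sketch like the one below. It is illustrative only: the hidden-layer outputs are replaced by random stand-in values, the sizes are assumptions, and the measured times depend on the machine and on the problem size, so they should not be expected to match the figures reported for FIG. 9.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_initial, n_updates = 15, 32, 11
n_total = n_initial + n_updates
H = rng.uniform(0.05, 0.95, size=(n_total, n_hidden))  # stand-in hidden outputs
T = rng.uniform(0.0, 1.0, size=(n_total, 20))          # stand-in master spectra

# Strategy 1: refit a batch ELM from scratch after every new sample.
t0 = time.perf_counter()
for k in range(1, n_updates + 1):
    n = n_initial + k
    beta_batch = np.linalg.solve(H[:n].T @ H[:n], H[:n].T @ T[:n])
batch_seconds = time.perf_counter() - t0

# Strategy 2: OS-ELM, one initial fit followed by 11 rank-one updates.
t0 = time.perf_counter()
P = np.linalg.inv(H[:n_initial].T @ H[:n_initial])
beta = P @ H[:n_initial].T @ T[:n_initial]
for k in range(n_initial, n_total):
    h, t = H[k:k + 1], T[k:k + 1]
    # Sherman-Morrison form of the update for a single new sample.
    P = P - (P @ h.T @ h @ P) / (1.0 + (h @ P @ h.T).item())
    beta = beta + P @ h.T @ (t - h @ beta)
online_seconds = time.perf_counter() - t0

print(f"batch refits: {batch_seconds:.6f} s, online updates: {online_seconds:.6f} s")
print("same final weights:", np.allclose(beta, beta_batch, atol=1e-8))
```

Both strategies end with the same output weights; the online version reaches them without revisiting earlier samples, which is where the saving comes from as the data set grows.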
7. Summary of experiments based on the corn data set:
The above experimental results show that the OSELM-based NIR spectral data migration model performs well on the corn data set. Substituting the migrated spectrum and the unmigrated spectrum into the physicochemical property prediction model gives clearly different results. Compared with traditional online learning algorithms, the algorithm is faster and more adaptable. On the corn data set, the model can handle data that arrive one, two, three, or more samples at a time, and the algorithm does not need the size of the next incoming data block to be specified before it starts running. The algorithm makes it possible to predict the corresponding moisture, oil, protein and starch contents from the slave NIR spectrum by directly using the model on the master instrument, without repeated modeling.
1.3.2 tablet-based experiments
The data in the tablet data set were collected on two different spectrometers. For convenience of description, spectrometer 1 is denoted SPEC_1 and spectrometer 2 is denoted SPEC_2. The NIR spectral data collected on SPEC_1 are denoted sp_1, and the NIR spectral data collected on SPEC_2 are denoted sp_2. The data set is provided as 9 different tables: a calibration set (calibrate_1, calibrate_2, calibrate_Y), a validation set (valid_1, valid_2, valid_Y), and a test set (test_1, test_2, test_Y), where the suffixes 1 and 2 denote NIR spectral data from spectrometers SPEC_1 and SPEC_2. In machine learning, the calibration set is typically used to build the model, the validation set is used to select the hyper-parameters of the model, and the test set is used to test the performance of the model: the test data are substituted into the established model, and the result is used to judge the generalization ability and accuracy of the model and whether overfitting has occurred. However, the calibration set contains only 155 samples, while the test set contains as many as 460. This does not match the usual test set proportions in machine learning, so the experimental data are re-divided in this experiment. The validation set is not needed for the time being, since the effect of hyper-parameter changes on the performance of the algorithm can be shown by the results of repeated experiments. In this experiment the test data are taken to be 40 NIR spectral data samples.
1. Selecting optimal number of hidden layer nodes based on spectrum migration
The optimal number of hidden nodes differs between data sets. The tablet data set has more NIR spectral samples than the corn data set, so when building the preliminary migration model to obtain β(0), the initial number of samples is set to 100. On this basis, the relationship between the root mean square error of the spectral migration and the number L of hidden nodes is explored.
The abscissa hiddenNum in FIG. 10 is the number of hidden nodes in the migration algorithm, and the ordinate RMSEP is the root mean square error between the migrated spectrum and the corresponding master spectrum. FIG. 10 shows that RMSEP fluctuates up and down as the number of hidden nodes increases but follows a downward overall trend, reaching its minimum when hiddenNum is about 59. As hiddenNum continues to increase, RMSEP begins to rise slowly. When the number of hidden nodes exceeds the number of initial samples, RMSEP increases dramatically.
FIG. 11 shows that the difference between the NIR spectral data migrated by the OSELM algorithm and the master spectrum mostly fluctuates around 0, generally within [-0.02, 0.02]. The migration effect for a few samples is not very good, probably because the difference in the original data is itself small. A careful look at the figure shows that the difference between the NIR data collected on SPEC_1 and the NIR data collected on SPEC_2 lies within [-0.1, 0.1], and that the difference between the spectra is close to 0 in most bands.
2. Effect of selection of activation function on algorithm accuracy
The choice of activation function can directly affect the performance of the algorithm. For example, the tanh function maps its output into (-1, 1) while the sigmoid function maps its output into (0, 1), so the results can differ considerably. This section therefore studies the influence of the activation function on the accuracy of the algorithm, using the common sigmoid, tanh, tribas and hardlim functions, and begins by selecting the optimal activation function. In Table 4, RMSEP is the root mean square error of the predicted value of the first physicochemical property in the tablet data set and N is the number of samples participating in modeling; 20 NIR samples are input initially, and the number of hidden nodes L is set to 15. The experimental results show that, whichever activation function is selected, RMSEP decreases as the number of samples learned by the model increases. Comparing the data in the table shows that the sigmoid function performs better on the tablet data set than the tanh, tribas and hardlim functions.
TABLE 4 Effect of activation function on Algorithm accuracy
Figure BDA0001456253940000151
Figure BDA0001456253940000161
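The four activation functions compared in Table 4 can be written out as follows. The definitions of tribas (triangular basis) and hardlim (hard limit) follow their usual conventions and are assumed to match what was used in the experiment; the dictionary name ACTIVATIONS is hypothetical.

```python
import numpy as np


def sigmoid(x):
    # Smooth, output in (0, 1).
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


def tanh(x):
    # Smooth, output in (-1, 1).
    return np.tanh(np.asarray(x, dtype=float))


def tribas(x):
    # Triangular basis: 1 - |x| on [-1, 1], zero elsewhere.
    return np.maximum(0.0, 1.0 - np.abs(np.asarray(x, dtype=float)))


def hardlim(x):
    # Hard limit: 1 where x >= 0, otherwise 0.
    return np.where(np.asarray(x, dtype=float) >= 0.0, 1.0, 0.0)


ACTIVATIONS = {"sigmoid": sigmoid, "tanh": tanh, "tribas": tribas, "hardlim": hardlim}

grid = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for name, fn in ACTIVATIONS.items():
    print(name, fn(grid))
```

Swapping the function used for the hidden layer is the only change needed to repeat the comparison with a given ELM or OS-ELM implementation.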
3. Initializing the relation between the input sample number and the hidden node
The following experiment explores how the number of NIR spectral data samples input at initialization and the number L of hidden nodes affect the performance of the algorithm when the migration model migrates the tablet NIR spectral data. In Table 5, the horizontal axis is the number L of hidden nodes, Sn is the number of NIR samples input at initialization, and the entries are RMSEP values. An empty cell in Table 5 indicates that the error there is large. The table shows that RMSEP increases sharply once the number of hidden nodes exceeds the number of initial samples, with the error sometimes reaching the thousands. This verifies again that the number of hidden nodes must not exceed the initial number of samples. The table also shows that, for a fixed number of hidden nodes, changing the number of NIR samples input at initialization has essentially no effect on RMSEP (provided the number of samples is greater than or equal to the number of hidden nodes).
TABLE 5 tablet data set initialization input sample number versus hidden layer node relationship
Figure BDA0001456253940000162
4. Effect of each incoming data block size on algorithm accuracy
Table 6 examines whether the size of the streamed data blocks affects the accuracy of the algorithm. In this experiment, the number of hidden nodes is set to 90, and the number of initial NIR samples is 100.
The data in Table 6 show that the size of each incoming data block does not affect the accuracy of the algorithm when streaming data are processed by the OSELM algorithm. The OSELM algorithm does not need to know the size of the next incoming data block. Unlike most other algorithms, it can handle either a single incoming data sample or data blocks of varying size.
TABLE 6 Block size of streaming data vs. RMSEP
Figure BDA0001456253940000163
Figure BDA0001456253940000171
5. Performance of the OSELM-based migration model in terms of physicochemical indices
Using the tablet NIR data set, the spectral data measured on the slave instrument SPEC_2 are first migrated online to the spectral space of the master instrument SPEC_1, and the prediction model for the first active ingredient in the tablet is then used to generate predicted values. The prediction of the first active ingredient from the spectra collected on the slave instrument can be observed in FIG. 12. SP1_predict in FIG. 12 is the predicted value obtained by substituting the test data of the master instrument into the first active ingredient prediction model. TLSP2_predict is the predicted value obtained from the first active ingredient prediction model after applying the OSELM migration model. SP2_predict is the predicted value obtained by substituting SP2 directly, without migration, into the prediction model for the first active ingredient. The figure shows a clear boundary separating the triangles from the plus signs and stars, so the calibration transfer has a significant effect on the slave spectra.
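The two-stage procedure described above, migrating the slave spectrum first and then applying the master instrument's content model, can be sketched as below. This is an illustration under assumptions: a batch ELM fitted by least squares is used for both stages in place of the online updates, the spectra and ingredient contents are synthetic, and the names ELM and predict_from_slave are hypothetical. The variable names tlsp2_predict and sp2_predict only mirror the labels used for FIG. 12.

```python
import numpy as np


class ELM:
    """Hypothetical batch ELM regressor used for both stages of the sketch."""

    def __init__(self, n_inputs, n_hidden, seed):
        rng = np.random.default_rng(seed)
        self.w = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)
        self.beta = None

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.w + self.b)))

    def fit(self, X, T):
        # Least-squares output weights (batch stand-in for the online update).
        self.beta, *_ = np.linalg.lstsq(self._hidden(X), T, rcond=None)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def predict_from_slave(sp2, transfer_model, content_model):
    """Stage 1: slave spectrum to master space. Stage 2: content prediction."""
    return content_model.predict(transfer_model.predict(sp2))


rng = np.random.default_rng(0)
sp1 = rng.uniform(0.0, 1.0, size=(140, 25))     # synthetic master spectra
sp2 = 0.9 * sp1 + 0.02                          # synthetic slave spectra
y = sp1 @ rng.uniform(0.0, 1.0, size=25)        # synthetic ingredient content

transfer_model = ELM(25, 20, seed=1).fit(sp2[:100], sp1[:100])
content_model = ELM(25, 20, seed=2).fit(sp1[:100], y[:100])

tlsp2_predict = predict_from_slave(sp2[100:], transfer_model, content_model)
sp2_predict = content_model.predict(sp2[100:])  # no migration, for comparison
print(tlsp2_predict.shape, sp2_predict.shape)   # (40,) (40,)
```

Keeping the two models separate means the content model is trained only on master-instrument data, which is the property the migration approach relies on.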
FIG. 13 plots the predicted values of SP1_test, TLSP2_test and SP2 for the second and third active ingredients in the tablets. The five-pointed stars and plus signs do not fall on the straight line, which only indicates that the parameters were not well chosen or that the data set is poorly suited to the physicochemical property prediction model. Nevertheless, the first subfigure shows that the predicted values of the plus signs and five-pointed stars are basically the same, while the triangles are clearly separated from both. The second subfigure shows much the same behavior as the first and is not discussed further.
1.3.3 comparative test
Whether the model established above is superior to other algorithms can only be determined by comparison. The PDS algorithm and the CCA algorithm described in the background section are used in the comparative experiments. The PDS algorithm has the window size as a hyper-parameter, and different window sizes affect its performance; the tables below show the performance of PDS under different window sizes. PDS, CCA and TLOSELM (namely, the OSELM-based migration algorithm provided by the invention) are used to migrate the sample spectral data, and the migrated spectra are then substituted into the online-established physicochemical property prediction model (whose hyper-parameters are selected through cross validation) to predict the physicochemical property values corresponding to the migrated spectra. For the corn data set, only the root mean square errors of two physicochemical properties, moisture and protein, are compared. For the tablet data set, the root mean square errors of the predicted values of the three active ingredients are compared.
Tables 7 and 8 respectively compare the prediction errors of the three algorithms for corn moisture and protein content. Tables 9, 10 and 11 respectively show the prediction errors of PDS, CCA and TLOSELM on the tablet data set for the first, second and third active ingredients. In these tables, N is the number of samples used to build the migration model, and RMSEP is the root mean square error of the physicochemical property prediction. From the data in Table 7 and FIG. 9, it can be concluded that, in the corn data set, the migration model based on the CCA algorithm is better than the PDS algorithm when the M5 spectrum is migrated to the MP5 spectrum, but PDS performs better when the MP6 spectrum is migrated to the MP5 spectrum. The TLOSELM algorithm (i.e., the migration algorithm of the invention based on the online sequential extreme learning machine) performs best of the three algorithms on the entire corn NIR spectral data set. As the number of modeling samples increases, the RMSEP of the TLOSELM and CCA models decreases continuously, whereas the PDS algorithm is unstable: its RMSEP initially decreases as the number of samples increases and then begins to rise as the number of samples continues to increase. Tables 9, 10 and 11 show that the TLOSELM algorithm also performs best of the three algorithms on the tablet data set. CCA predicts the first active ingredient better than PDS, but PDS is better than CCA for the second and third active ingredients. FIG. 14 shows the prediction results of the three algorithms for corn moisture content, and FIG. 15 shows the prediction results of the three algorithms for the third active ingredient of the tablet data.
TABLE 7 Prediction error of the corn data set for moisture under different algorithms
Figure BDA0001456253940000181
TABLE 8 prediction error of maize data set for protein under different algorithms
Figure BDA0001456253940000182
Figure BDA0001456253940000191
TABLE 9 prediction error of first active ingredient of tablet data set
Figure BDA0001456253940000192
TABLE 10 prediction error of second active ingredient in tablet data set
Figure BDA0001456253940000193
TABLE 11 prediction error of third active ingredient for tablet data set
Figure BDA0001456253940000201
The above experiments lead to the following conclusion: on the corn and tablet data sets, the TLOSELM algorithm of the invention migrates the slave spectrum to the master spectrum better than the PDS algorithm and the CCA algorithm. When MP6 is used as the slave spectrum and MP5 as the master spectrum in the corn data set, TLOSELM is still the best, but the CCA algorithm is not much better than the PDS algorithm. On the tablet data set, or when M5 is the slave spectrum and MP5 is the master spectrum, CCA and TLOSELM have a great advantage over PDS, and the TLOSELM algorithm of the invention performs best.

Claims (6)

1. A sample component content determination method based on an online sequential extreme learning machine, characterized by comprising the following steps: collecting spectral data samples of a sample, and modeling the spectral data samples collected on the master spectrometer and the slave spectrometer using the online sequential extreme learning machine algorithm, so as to migrate the spectral data of the slave spectrometer to the spectral data space of the master spectrometer; then determining the component content of the sample using a content prediction model established on the master spectrometer; wherein modeling the spectral data samples collected on the master spectrometer and the slave spectrometer using the online sequential extreme learning machine algorithm, so as to migrate the spectral data of the slave spectrometer to the spectral data space of the master spectrometer, comprises the following steps:
S01, according to the initial master spectrum
Figure 473232DEST_PATH_IMAGE002
and slave spectrum
Figure 859214DEST_PATH_IMAGE004
and the number L of hidden layer nodes, generating the weight matrix from the hidden layer to the output layer
Figure 232427DEST_PATH_IMAGE006
wherein the number of samples contained in
Figure 096478DEST_PATH_IMAGE002
is
Figure DEST_PATH_IMAGE008
; wherein,
Figure 707588DEST_PATH_IMAGE009
Figure DEST_PATH_IMAGE010
S02, when new data containing
Figure 389105DEST_PATH_IMAGE011
samples,
Figure 859400DEST_PATH_IMAGE013
and
Figure 418820DEST_PATH_IMAGE015
, arrive, calculating the weight matrix from the hidden layer to the output layer according to the online sequential extreme learning machine algorithm
Figure 290961DEST_PATH_IMAGE017
; wherein,
Figure 877800DEST_PATH_IMAGE019
Figure 100971DEST_PATH_IMAGE021
Figure 103562DEST_PATH_IMAGE023
Figure 220423DEST_PATH_IMAGE025
Figure 119109DEST_PATH_IMAGE027
S03, if further new samples
Figure 688630DEST_PATH_IMAGE013
and
Figure 432595DEST_PATH_IMAGE015
arrive, let k = k + 1 and go to S02; otherwise go to S04;
S04, migrating the test data containing N samples
Figure 341645DEST_PATH_IMAGE029
according to the following formula:
Figure 768822DEST_PATH_IMAGE031
wherein
Figure DEST_PATH_IMAGE033
represents the migrated spectral data;
Figure 560061DEST_PATH_IMAGE035
represents the latest weight matrix from the hidden layer to the output layer;
Figure DEST_PATH_IMAGE036
; w and b are respectively the randomly generated orthogonal input weight matrix and the bias;
Figure 107717DEST_PATH_IMAGE037
is the activation function.
2. The sample component content determination method based on an online sequential extreme learning machine according to claim 1, comprising the following steps:
S1, according to the initial master spectrum
Figure 199170DEST_PATH_IMAGE002
the corresponding sample component content
Figure 174079DEST_PATH_IMAGE039
and the number L of hidden layer nodes, calculating the initial weight matrix from the hidden layer to the output layer
Figure 718193DEST_PATH_IMAGE041
wherein
Figure 069540DEST_PATH_IMAGE002
and
Figure 687603DEST_PATH_IMAGE039
contain
Figure 725091DEST_PATH_IMAGE008
samples;
S2, when a new master spectrum
Figure 366288DEST_PATH_IMAGE013
and the corresponding sample component content
Figure 380380DEST_PATH_IMAGE043
arrive, calculating the weight matrix from the hidden layer to the output layer according to the online sequential extreme learning machine algorithm
Figure 118529DEST_PATH_IMAGE045
; wherein the (k+1)-th arriving data
Figure 700820DEST_PATH_IMAGE013
and
Figure 219526DEST_PATH_IMAGE043
contain
Figure DEST_PATH_IMAGE047
samples;
Figure 240572DEST_PATH_IMAGE048
S3, if a further new master spectrum
Figure 770910DEST_PATH_IMAGE013
and corresponding sample component content
Figure 320840DEST_PATH_IMAGE043
arrive, let k = k + 1 and go to S2; otherwise go to S4;
S4, calculating, according to the following formula, for the spectral data of the sample
Figure 831237DEST_PATH_IMAGE050
the corresponding predicted component content
Figure DEST_PATH_IMAGE052
Figure DEST_PATH_IMAGE054
wherein
Figure DEST_PATH_IMAGE055
is the latest weight matrix from the hidden layer to the output layer,
Figure DEST_PATH_IMAGE057
contains N samples; w and b are respectively the randomly generated orthogonal input weight matrix and the bias;
Figure 983870DEST_PATH_IMAGE058
is the activation function.
3. The sample component content determination method based on an online sequential extreme learning machine according to claim 1 or 2, characterized in that the number L of hidden nodes is less than or equal to the number of initial samples
Figure 368715DEST_PATH_IMAGE008
4. The sample component content determination method based on an online sequential extreme learning machine according to claim 3, characterized in that in step S1,
Figure DEST_PATH_IMAGE059
wherein,
Figure DEST_PATH_IMAGE060
5. The sample component content determination method based on an online sequential extreme learning machine according to claim 4, characterized in that in step S2,
Figure 886284DEST_PATH_IMAGE062
wherein,
Figure DEST_PATH_IMAGE064
Figure DEST_PATH_IMAGE066
Figure DEST_PATH_IMAGE068
6. The sample component content determination method based on an online sequential extreme learning machine according to claim 1 or 2, characterized in that the optimal number L of hidden nodes is determined by k-fold cross validation, and the activation function is the sigmoid function.
CN201711068234.XA 2017-11-03 2017-11-03 Sample component content determination method based on online sequential extreme learning machine Active CN107918718B (en)

Publications (2)

Publication Number Publication Date
CN107918718A CN107918718A (en) 2018-04-17
CN107918718B true CN107918718B (en) 2020-05-22

Family

ID=61896071

Country Status (1)

Country Link
CN (1) CN107918718B (en)

Also Published As

Publication number Publication date
CN107918718A (en) 2018-04-17

Similar Documents

Publication Publication Date Title
Nunez et al. Regression modeling strategies
CN107918718B (en) Sample component content determination method based on online sequential extreme learning machine
US20190130277A1 (en) Ensembling of neural network models
JP2021047854A (en) Method and apparatus for pruning neural network
WO2022206320A1 (en) Prediction model training and data prediction methods and apparatuses, and storage medium
CN108152239A (en) Sample component content determination method based on feature transfer
Smit et al. Statistical data processing in clinical proteomics
CN110390561B (en) High-speed prediction method and device for user-financial product selection tendency based on momentum-accelerated stochastic gradient descent
CN104298893B (en) Imputation method for missing gene expression data
WO2023035926A1 (en) Method for predicting microsatellite instability from pathological picture on basis of self-attention mechanism
US20140156569A1 (en) Method and apparatus for improving resilience in customized program learning network computational environments
FR2944125A1 (en) METHOD FOR IDENTIFYING AERODYNAMIC MODELS FOR AIRCRAFT SIMULATION METHOD
WO2020065806A1 (en) Processing device, processing method, and program
Chen et al. A unified recursive just-in-time approach with industrial near infrared spectroscopy application
CN112285056B (en) Method for selecting and modeling personalized correction set of spectrum sample
WO2017083411A1 (en) Wafer point by point analysis and data presentation
US20200227134A1 (en) Drug Efficacy Prediction for Treatment of Genetic Disease
WO2020130947A1 (en) Method and system for predicting quantitative measures of oil adulteration of an edible oil sample
CN111598844B (en) Image segmentation method and device, electronic equipment and readable storage medium
JP6484449B2 (en) Prediction device, prediction method, and prediction program
CN113049500A (en) Water quality detection model training and water quality detection method, electronic equipment and storage medium
JP7482782B2 (en) Sequence-based protein structure and properties determination
US20220318596A1 (en) Learning Molecule Graphs Embedding Using Encoder-Decoder Architecture
CN114154615A (en) Neural architecture searching method and device based on hardware performance
WO2020209086A1 (en) Data analysis device, data analysis method, and data analysis program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant