CN109632693A - A terahertz spectrum recognition method based on BLSTM-RNN - Google Patents
- Publication number
- CN109632693A CN201811504359.7A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/35—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
- G01N21/3581—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using far infrared light; using Terahertz radiation
- G01N21/3586—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using far infrared light; using Terahertz radiation by Terahertz time domain spectroscopy [THz-TDS]
Abstract
The present invention relates to a terahertz spectrum recognition method based on BLSTM-RNN and belongs to the technical field of spectral analysis and substance class detection. The terahertz spectral data set is first filtered to remove noise; cubic spline interpolation is then applied to each spectral curve, and the data within a common frequency range are intercepted and resampled, which completes data normalization. A BLSTM-RNN model is built to extract features automatically from the full-spectrum information of the training samples, and the model is trained over multiple iterations using the backpropagation through time (BPTT) algorithm and the Adam optimization algorithm, so that the test set is finally identified and classified with high accuracy. The invention can automatically extract features from high-dimensional terahertz spectral data sets and classify them effectively. It avoids the cumbersome procedure and high technical requirements of existing terahertz spectrum recognition methods, which, to improve classification accuracy, must first extract a small number of key spectral features and only then perform identification.
Description
Technical field
The present invention relates to a terahertz spectrum recognition method based on BLSTM-RNN (bidirectional long short-term memory recurrent neural network) and belongs to the technical field of spectral analysis and substance class detection.
Background art
Spectroscopy plays a very important role in material identification. Molecular vibrational spectroscopy techniques such as near-infrared and Raman spectroscopy have developed rapidly; they use the characteristic spectra exhibited by a material for qualitative identification and quantitative analysis. Terahertz (THz) spectra, which lie in the far-infrared band, also have a "fingerprint" character, and the terahertz band offers penetrability, safety, and spectral resolving power. These properties make THz technology significant for material identification and non-destructive testing. Among recognition methods based on THz spectral analysis, early approaches determined the material type directly from the characteristic absorption peaks of the spectrum. Identifying a substance by hand from its distinct spectral features in the terahertz range, however, is prone to classification errors, especially for mixtures that have no obvious characteristic absorption peaks in the THz band or whose peaks overlap. Research into fast and effective extraction of whole-spectrum features from terahertz spectra, together with corresponding recognition methods, therefore provides good support for studying the correspondence between material type and spectrum.
Statistical and machine learning methods are now widely used for terahertz spectral feature extraction and classification, for example principal component analysis (PCA) combined with either a support vector machine (SVM) or fuzzy recognition. Such methods first use PCA to decompose the high-dimensional terahertz spectrum into principal components and select those with larger contribution rates as the terahertz spectral features, thereby reducing the feature dimension. The SVM or the fuzzy recognizer then serves as the classifier. The SVM takes the extracted spectral features as input vectors, maps them from the low-dimensional nonlinear sample space to a high-dimensional or infinite-dimensional feature space, and, following the structural risk minimization principle, finds the optimal hyperplane in that feature space for classification. Fuzzy recognition classifies terahertz spectra using a similarity principle with a mathematical measure such as Euclidean closeness. Although PCA effectively removes redundant information, the principal components to retain must be chosen manually, and components with small contribution rates may still carry information important for distinguishing samples; discarding them hampers the subsequent recognition of similar data sets. SVM suits small-sample, low-dimensional classification, but its parameters are hard to determine and its computational cost is high. Fuzzy recognition is computationally simple but depends heavily on good feature engineering. The terahertz spectrum recognition methods above therefore suffer from cumbersome recognition procedures and high technical requirements for feature extraction.
Summary of the invention
To address the above problems, the present invention provides a terahertz spectrum recognition method based on BLSTM-RNN. The model learns features automatically and directly from the original high-dimensional terahertz spectral data, its parameters are updated through iterative training, and the result is a terahertz spectrum recognition and prediction model.
A terahertz spectrum recognition method based on BLSTM-RNN includes the following steps:
(1) acquire terahertz time-domain spectral data of a reference signal and of the material samples with a terahertz time-domain spectroscopy system, the substances to be detected comprising no fewer than two classes; convert the terahertz time-domain spectra into terahertz frequency-domain spectra by discrete Fourier transform;
(2) extract spectra of 4 optical parameters, namely transmittance, refractive index, absorption coefficient, and extinction coefficient, from the terahertz frequency-domain spectra;
(3) smooth the 4 optical parameter spectra, intercept the terahertz parameter spectra over the same frequency band, and unify their resolution, obtaining multiple groups of terahertz spectral data sets with a unified frequency range and resolution;
(4) take any one optical parameter of the preprocessed terahertz spectral data set as the spectral data vectors and the corresponding material class labels as the class vectors to form the training set;
(5) perform supervised training on the training set with the BLSTM-RNN model, updating the model parameters over a set number of training iterations;
(6) identify terahertz spectra with the updated BLSTM-RNN model.
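Steps (1) and (2) can be illustrated with a minimal sketch. It assumes NumPy, a real-valued time trace sampled at a fixed interval, and amplitude transmittance defined as the ratio of the sample and reference spectral magnitudes; the function names and the synthetic Gaussian pulses are illustrative assumptions, not part of the patent. Refractive index, absorption coefficient, and extinction coefficient additionally need the sample thickness and are not covered here.

```python
import numpy as np

def to_frequency_domain(signal, dt_ps):
    """Return (frequencies in THz, complex spectrum) of a real time-domain trace.

    dt_ps is the sampling interval in picoseconds; 1/ps equals 1 THz.
    """
    spectrum = np.fft.rfft(signal)                     # discrete Fourier transform
    freqs_thz = np.fft.rfftfreq(len(signal), d=dt_ps)  # matching frequency axis
    return freqs_thz, spectrum

def transmittance(sample, reference, dt_ps):
    """Amplitude transmittance |E_sample(f)| / |E_ref(f)| on the shared grid."""
    freqs, e_sam = to_frequency_domain(sample, dt_ps)
    _, e_ref = to_frequency_domain(reference, dt_ps)
    eps = 1e-12  # guards against division by zero in empty reference bins
    return freqs, np.abs(e_sam) / (np.abs(e_ref) + eps)

# Synthetic stand-in data (assumption): a Gaussian reference pulse and a
# sample pulse that is attenuated to half amplitude and delayed by 2 ps.
t = np.arange(1024) * 0.05  # time axis in ps
reference = np.exp(-((t - 10.0) ** 2) / 0.1)
sample = 0.5 * np.exp(-((t - 12.0) ** 2) / 0.1)
freqs, T = transmittance(sample, reference, dt_ps=0.05)
```

A pure delay changes only the phase of the spectrum, so the magnitude ratio in this toy case reflects the attenuation alone.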
Preferably, in step (3), the smoothing method is Savitzky-Golay smoothing and the resolution unification method is cubic spline interpolation.
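The preferred preprocessing can be sketched as follows, assuming SciPy's `savgol_filter` and `CubicSpline`. The window size 15, polynomial order 3, 0.9 to 6 THz band, and 6349 points are the values given in the embodiment; the function name, the synthetic input, and the choice to interpolate directly onto the final grid (merging interpolation and band interception into one call) are assumptions made for brevity.

```python
import numpy as np
from scipy.signal import savgol_filter
from scipy.interpolate import CubicSpline

def preprocess_spectrum(freqs_thz, values, f_min=0.9, f_max=6.0,
                        n_points=6349, window=15, polyorder=3):
    """Smooth one spectrum, then resample it on a common frequency grid."""
    # Savitzky-Golay smoothing: local polynomial fit over a sliding window
    smoothed = savgol_filter(values, window_length=window, polyorder=polyorder)
    # Cubic spline through the smoothed points (freqs must be strictly increasing)
    spline = CubicSpline(freqs_thz, smoothed)
    # Shared grid so that every spectrum has the same band and resolution
    grid = np.linspace(f_min, f_max, n_points)
    return grid, spline(grid)

# Synthetic stand-in spectrum (assumption): a decaying curve plus noise
rng = np.random.default_rng(0)
raw_freqs = np.linspace(0.5, 7.0, 400)
raw_values = np.exp(-0.3 * raw_freqs) + 0.01 * rng.normal(size=raw_freqs.size)
grid, spectrum = preprocess_spectrum(raw_freqs, raw_values)
```

Because every spectrum is evaluated on the same grid, the resulting vectors can be stacked directly into a training matrix.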
Preferably, in step (5), the overall framework of the BLSTM-RNN model consists of one input layer, one hidden layer, and one output layer. The hidden layer is a bidirectional LSTM neural unit and applies a ReLU activation function for nonlinear processing; the output layer uses a Softmax activation function; the input, hidden, and output layers are fully connected. Each LSTM neural unit contains 4 elements: an input gate, a forget gate, an output gate, and a neuron with a recurrent self-connection. The mapping function is Y = WX + b, where Y is the neuron output value, X is the neuron input value, and W and b are the weight matrix and bias matrix, respectively.
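The four LSTM elements and the bidirectional pass can be made concrete with a small NumPy forward-pass sketch. It uses the standard LSTM formulation with sigmoid gates and tanh cell activations; the ReLU nonlinearity mentioned above, the way a spectrum is cut into time steps, and all names and sizes here are assumptions rather than details taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm(n_in, n_hidden, rng):
    """One weight matrix W and bias b holding the four stacked gate blocks."""
    W = rng.normal(0.0, 0.1, size=(4 * n_hidden, n_in + n_hidden))
    b = np.zeros(4 * n_hidden)
    return W, b

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One time step: input, forget, output gates and the self-connected cell."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b  # affine map Y = WX + b
    i = sigmoid(z[0:n])            # input gate
    f = sigmoid(z[n:2 * n])        # forget gate
    o = sigmoid(z[2 * n:3 * n])    # output gate
    g = np.tanh(z[3 * n:4 * n])    # candidate cell update
    c = f * c_prev + i * g         # cell state with recurrent self-connection
    h = o * np.tanh(c)
    return h, c

def run_lstm(xs, W, b, n_hidden, reverse=False):
    """Scan 1 -> T, or T -> 1 when reverse=True; outputs stay in time order."""
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    outputs = [None] * len(xs)
    order = range(len(xs) - 1, -1, -1) if reverse else range(len(xs))
    for t in order:
        h, c = lstm_step(xs[t], h, c, W, b)
        outputs[t] = h
    return np.stack(outputs)

def blstm_forward(xs, fwd, bwd, n_hidden):
    """Concatenate forward and backward hidden states at every time step."""
    h_f = run_lstm(xs, *fwd, n_hidden)
    h_b = run_lstm(xs, *bwd, n_hidden, reverse=True)
    return np.concatenate([h_f, h_b], axis=1)

# Toy input (assumption): 20 time steps with 8 features each
rng = np.random.default_rng(0)
xs = rng.normal(size=(20, 8))
fwd, bwd = init_lstm(8, 16, rng), init_lstm(8, 16, rng)
out = blstm_forward(xs, fwd, bwd, 16)  # shape (20, 32)
```

Each row of `out` sees the whole sequence: the first half summarizes everything up to that step, the second half everything from that step onward.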
The specific steps for iteratively updating the model parameters in step (5) are as follows:
Step1: the training set of preprocessed terahertz spectral data is S = {(x1, y1), (x2, y2), ..., (xi, yi), ..., (xn, yn)}, i = 1, 2, ..., n, where xi is the terahertz optical parameter, i.e. the spectral data vector, and yi is the material class label, i.e. the class vector;
Step2: first input xi and carry out the forward-pass prediction: compute the output state values of the forward LSTM along the 1 → T direction, then the output state values of the backward LSTM along the T → 1 direction, obtaining the bidirectional feature output ot at each time step;
Step3: ot is passed through a softmax layer to obtain the predicted value, which is compared with the true material class label yi, and the loss is computed with the cross-entropy loss function;
Step4: then carry out the backward pass, differentiating the objective function: first differentiate with respect to the output ot, then compute the derivatives of the output state values of the forward LSTM along the T → 1 direction, then the derivatives of the output state values of the backward LSTM along the 1 → T direction;
Step5: obtain the gradients with the backpropagation through time (BPTT) algorithm and update the model weights W and bias matrix b with the Adam optimization algorithm, completing one training iteration;
Step6: repeat Step2 to Step5 and check whether the given maximum number of iterations has been reached; if so, model optimization is complete.
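Step3 and Step5 can be sketched in isolation. The NumPy code below trains only a softmax output layer on one fixed feature vector, which stands in for a BLSTM feature output; the backpropagation through the LSTM itself (Step4) is not shown. The learning rate 0.1 and the 30 iterations follow the embodiment, while the Adam constants beta1 = 0.9, beta2 = 0.999, and eps = 1e-8 are the commonly used defaults and, like every name here, are assumptions.

```python
import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()

def cross_entropy(probs, label):
    return -np.log(probs[label] + 1e-12)

class Adam:
    """Adam update for a single parameter array."""
    def __init__(self, shape, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = np.zeros(shape)  # running mean of gradients
        self.v = np.zeros(shape)  # running mean of squared gradients
        self.t = 0

    def step(self, param, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad ** 2
        m_hat = self.m / (1 - self.beta1 ** self.t)  # bias correction
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return param - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

# Toy setup (assumption): 4 features, 3 classes, one training example
feat = np.array([0.5, -0.2, 0.1, 0.3])
label = 2
W, b = np.zeros((3, 4)), np.zeros(3)
opt_W, opt_b = Adam(W.shape), Adam(b.shape)
losses = []
for epoch in range(30):                      # maximum number of iterations
    probs = softmax(W @ feat + b)            # Step3: prediction
    losses.append(cross_entropy(probs, label))
    d_logits = probs.copy()
    d_logits[label] -= 1.0                   # gradient of the loss w.r.t. logits
    W = opt_W.step(W, np.outer(d_logits, feat))  # Step5: Adam update
    b = opt_b.step(b, d_logits)
```

In a full model the same optimizer would receive the gradients of every LSTM weight, obtained by unrolling the network over the T time steps.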
The beneficial effects of the present invention are:
(1) compared with traditional recognition methods, the proposed method extracts features automatically, quickly, and effectively from the full-spectrum information of terahertz spectra and simplifies data preprocessing, providing a new and effective method for fast, accurate identification of terahertz spectral data;
(2) the method extracts effective features from the original high-dimensional terahertz spectra without resorting to dimension reduction techniques or manual selection of terahertz spectral features;
(3) the prediction model built by the method also achieves very high recognition accuracy on similar terahertz spectral data sets, meeting the high-accuracy recognition requirements of complex terahertz spectral data sets.
Brief description of the drawings
Fig. 1 is the flow chart of the present invention;
Fig. 2 shows the terahertz transmission spectra of five organic compounds with obvious peak features in the embodiment;
Fig. 3 shows the terahertz transmission spectra of five organic compounds without obvious peak features in the embodiment;
Fig. 4 shows the terahertz transmission spectra of five organic compounds with highly similar spectral lines in the embodiment.
Detailed description of the embodiments
The invention is described in further detail below with reference to specific embodiments, but the scope of protection of the invention is not limited to the content described.
Embodiment 1: as shown in Fig. 1, the collected raw terahertz spectral data are first preprocessed, the BLSTM-RNN model is then trained with supervision, and the trained model classifies terahertz spectra to obtain the corresponding material classes. The specific process is as follows. First, a compact terahertz time-domain spectroscopy transmission test platform, such as the compact frequency-domain spectral detection platform from Zomega, is started to obtain the frequency-domain absorption spectrum of each substance at the same resolution; alternatively, existing terahertz spectral data are used as the basis. In this way, terahertz transmission spectral data in the 0.9 to 6 THz band are obtained for 15 organic compounds: Anthraquinone, Benomyl, Carbazole, Mannose, Riboflavin, Acephate, Dicofol, Kojibiose, Pantothenate Calcium, Trehalulose, Maltohexaose, Maltoheptaose, Maltopentaose, Maltotetraose, and Maltotriose. The spectral data of these 15 substances are denoised with Savitzky-Golay smoothing filtering and, after cubic spline interpolation, 100 spectra of unified resolution are obtained for each substance, each spectral curve having 6349 data points. From the terahertz transmission spectra of each substance, 70 are randomly selected as the training set and the remaining 30 are used as the test set, and the spectra are divided into 4 data set types according to their spectral line characteristics to serve as experimental data. A recurrent neural network with bidirectional long short-term memory (LSTM) units is used to build the prediction model; its parameters are updated through a set number of training iterations on the training set, finally achieving automatic feature extraction and effective recognition of the spectrum of each substance in the test set. The specific recognition method includes the following steps:
A. Apply Savitzky-Golay filtering to the terahertz frequency-domain spectral data xmi of each sample, with filter order 3 and window size 15, to obtain the filtered spectrum ymi;
B. Apply cubic spline interpolation to the filtered data ymi obtained in step A so that the dimension m of each spectral data sequence increases to more than 6000;
C. Uniformly intercept the terahertz spectral data over the 0.9 to 6 THz band so that each spectral data sequence has 6349 dimensions, yielding multiple groups of terahertz spectral data with unified resolution and frequency range.
D. After the preprocessing of steps A to C, 100 spectra in the 0.9 to 6 THz band are obtained for each substance, each spectral curve having 6349 data points. From the terahertz transmission spectra of each substance, 70 are randomly selected as the training set and the remaining 30 are used as the test set.
E. The terahertz transmission spectra of the 15 substances obtained in step D are divided into dataset-1, dataset-2, dataset-3, and dataset-4 according to whether they have obvious peak features and how similar their spectral lines are. The terahertz absorption spectra of the five substances in dataset-1 all have obvious peak features that are easy to define manually; the five substances in dataset-2 have no obvious peak features, so features are hard to define manually; the 5 substances in dataset-3 have very similar spectral lines and no obvious peak features; dataset-4 is the combined spectral set of the 15 substances in dataset-1, dataset-2, and dataset-3. Sample spectral lines from dataset-1, dataset-2, and dataset-3 are shown in Fig. 2, Fig. 3, and Fig. 4, respectively.
F. In the embodiment, a BLSTM-RNN recurrent neural network model is built with 1 input layer, 1 output layer, and 1 hidden layer. The training parameters are then set: learning rate learning_rate=0.1, maximum number of iterations max_epoch=30, batch size batch_size=32, and number of hidden nodes n_hidden=256 for each direction of the bidirectional LSTM. During training, the prediction model is trained with the BPTT algorithm and the Adam adaptive-learning-rate optimization algorithm.
G. The BLSTM-RNN recurrent neural network extracts features automatically from the training sets of dataset-1, dataset-2, dataset-3, and dataset-4; after the prediction model has been obtained through a set number of training iterations, it predicts the classes of the test sets of dataset-1, dataset-2, dataset-3, and dataset-4. The specific steps are as follows:
G1. After preprocessing, the training set of terahertz spectral data can be written as S = {(x1, y1), (x2, y2), ..., (xi, yi), ..., (xn, yn)}, i = 1, 2, ..., n, where xi is the terahertz optical parameter, i.e. the spectral data vector, and yi is the material class label, i.e. the class vector.
G2. First input the feature sequence data xi and carry out the forward-pass prediction: compute the states of the forward RNN along the 1 → T direction, then the states of the backward RNN along the T → 1 direction, obtaining the bidirectional feature output ot at each time step;
G3. ot is fed to an average pooling layer and then a softmax layer to obtain the predicted value, and the loss is computed with the cross-entropy loss function;
G4. Then carry out the backward pass, differentiating the objective function: first differentiate with respect to the output ot, then compute the derivatives of the states of the forward RNN along the T → 1 direction, then the derivatives of the states of the backward RNN along the 1 → T direction;
G5. Update the model parameters with the optimization algorithm using the gradients obtained by backpropagation through time (BPTT), completing one training iteration;
G6. Repeat steps G2 to G5 and check whether the given maximum number of iterations has been reached; if so, model optimization is complete. Predict the classes of the test set and compute the accuracy.
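The data split of step D, the training settings of step F, and the accuracy computation of step G6 can be sketched with the Python standard library alone. The hyperparameter values are quoted from step F; the function names, the seed, and the placeholder sample identifiers are assumptions, and the three substance names are used only as example labels, not as a statement of which substances belong to which data set.

```python
import random

# Training settings quoted from step F of the embodiment
HPARAMS = {"learning_rate": 0.1, "max_epoch": 30, "batch_size": 32, "n_hidden": 256}

def split_per_class(samples_by_class, n_train=70, seed=0):
    """Randomly pick n_train spectra per substance for training; the rest are test data."""
    rng = random.Random(seed)
    train, test = [], []
    for label, spectra in samples_by_class.items():
        shuffled = list(spectra)
        rng.shuffle(shuffled)
        train += [(s, label) for s in shuffled[:n_train]]
        test += [(s, label) for s in shuffled[n_train:]]
    return train, test

def accuracy(predicted, actual):
    """Fraction of test samples whose predicted class equals the true class."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Placeholder data (assumption): 100 spectrum identifiers per substance
samples_by_class = {
    name: [f"{name}_{k}" for k in range(100)]
    for name in ("Anthraquinone", "Benomyl", "Carbazole")
}
train, test = split_per_class(samples_by_class)
```

Splitting per substance keeps the 70/30 ratio identical for every class, which a single global shuffle would only achieve on average.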
At this point, automatic feature extraction and effective predictive classification have been carried out directly on the test sets of dataset-1, dataset-2, dataset-3, and dataset-4. To better verify that the BLSTM-RNN model proposed here has an advantage over other common classification algorithms, the machine learning algorithms SVM and KNN and the neural network algorithms MLP and CNN were selected for comparison experiments. The kernel function of the SVM model is the radial basis function, with penalty coefficient C=1.0 and kernel parameter gamma='auto'; the KNN model uses n_neighbors=5 neighbors and algorithm='auto'; both obtain test-set accuracy by ten-fold cross-validation. The MLP model uses two hidden layers with 256 neurons each; the CNN model uses the LeNet-5 structure, with the convolution kernel and pooling layer parameters set as in LeNet-5. Dropout is added to the MLP and CNN models to prevent overfitting, with keep_prob=0.75 during training and keep_prob=1 during testing. The hidden layers use the ReLU activation function for nonlinear processing, and the output layer is followed by a softmax function that predicts the class and computes the cross-entropy loss. Training uses mini-batches of size 128 and the Adam optimization algorithm with adaptive learning rate.
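The SVM and KNN baselines can be set up as in the following sketch. The parameter names match scikit-learn's `SVC` and `KNeighborsClassifier`, which suggests that library, but the patent does not name it, so this is an assumption; the synthetic Gaussian clusters merely stand in for real spectra, and no accuracy figure from the patent is reproduced.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption): 5 classes, 20 samples each, 50 features
rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 5, 20, 50
centers = rng.normal(size=(n_classes, n_features))
X = np.vstack([c + 0.3 * rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# Baseline settings quoted from the comparison experiment
baselines = {
    "SVM": SVC(kernel="rbf", C=1.0, gamma="auto"),
    "KNN": KNeighborsClassifier(n_neighbors=5, algorithm="auto"),
}

# Ten-fold cross-validation; the mean fold accuracy is reported per model
scores = {name: cross_val_score(model, X, y, cv=10).mean()
          for name, model in baselines.items()}
```

With an integer `cv` and a classifier, `cross_val_score` stratifies the folds by class, so each fold here contains two samples of every class.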
The experimental accuracies of the 5 classification algorithms on the 4 types of terahertz transmission spectrum test sets are shown in Table 1 below. Overall, the traditional machine learning algorithms SVM and KNN predict the 4 groups of different types of terahertz spectral data with unsatisfactory accuracy; on the similar data set in particular, their recognition rate is only about 75%. Among the neural network algorithms, MLP recognizes the 4 test sets poorly, while CNN achieves higher accuracy, averaging 94.86%. The BLSTM-RNN classification model proposed by the present invention reaches an average recognition rate of 98.48% on the 4 test sets, better than these common classification algorithms. It can therefore be concluded that the method of the invention has good automatic feature extraction ability and high recognition accuracy on original high-dimensional terahertz spectra, achieves the goal of simplifying data preprocessing, and provides a new and effective method for fast, accurate identification of complex terahertz spectral data.
Table 1. Comparison of the results of this method and several other methods
The embodiments of the present invention have been explained in detail above with reference to the accompanying drawings, but the invention is not limited to these embodiments; within the knowledge of a person of ordinary skill in the art, various changes can also be made without departing from the concept of the invention.
Claims (4)
1. A terahertz spectrum recognition method based on BLSTM-RNN, characterized by comprising the following steps:
(1) acquire terahertz time-domain spectral data of a reference signal and of material samples with a terahertz time-domain spectroscopy system, the detected
substances covering no fewer than two classes; convert the terahertz time-domain spectra into terahertz frequency-domain
spectra by discrete Fourier transform;
(2) extract 4 optical parameter spectra from the terahertz frequency-domain spectra: transmittance, refractive index, absorption coefficient, and extinction coefficient;
(3) smooth the 4 optical parameter spectra, truncate the terahertz parameter spectra to the same frequency band, and resample them to a unified
resolution, obtaining multiple groups of terahertz spectral data sets with a unified frequency range and resolution;
(4) take any one optical parameter of the preprocessed terahertz spectral data set as the spectral data vector and the corresponding
material class label as the class vector to form the training set;
(5) perform supervised training on the training set with the BLSTM-RNN model, updating the model parameters over a certain number of
training iterations;
(6) identify terahertz spectra using the updated BLSTM-RNN model.
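Steps (1) and (2) can be illustrated with the sketch below. It is an assumption-laden example rather than the claimed procedure: the claim does not state the extraction formulas here, so the code uses the common thick-sample approximation for transmission THz-TDS (no Fabry-Perot echoes, sample in air), treats the amplitude ratio as the transmittance, and introduces the hypothetical names `optical_parameters`, `thickness`, `f_min`, and `f_max`.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s

def optical_parameters(e_ref, e_sam, dt, thickness, f_min, f_max):
    """Return the frequency axis and four parameter spectra within [f_min, f_max].

    e_ref, e_sam : time-domain reference and sample waveforms (same length)
    dt           : sampling interval in seconds
    thickness    : sample thickness d in metres (assumed to be known)
    """
    freqs = np.fft.rfftfreq(len(e_ref), dt)[1:]             # skip the DC bin
    h = np.fft.rfft(e_sam)[1:] / np.fft.rfft(e_ref)[1:]     # complex transfer function
    rho = np.abs(h)                                         # amplitude transmittance
    phi = -np.unwrap(np.angle(h))                           # phase delay of sample vs. reference
    band = (freqs >= f_min) & (freqs <= f_max)
    freqs, rho, phi = freqs[band], rho[band], phi[band]
    omega = 2.0 * np.pi * freqs
    n = 1.0 + C * phi / (omega * thickness)                 # refractive index
    alpha = (2.0 / thickness) * np.log(4.0 * n / (rho * (n + 1.0) ** 2))  # absorption coefficient, 1/m
    kappa = alpha * C / (2.0 * omega)                       # extinction coefficient
    return freqs, rho, n, alpha, kappa
```

The phase is unwrapped from the lowest frequency bin upward, so in practice the low-frequency end of the band needs an adequate signal-to-noise ratio for the refractive index to be reliable.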
2. The terahertz spectrum recognition method based on BLSTM-RNN according to claim 1, characterized in that, in step (3),
the smoothing method is Savitzky-Golay smoothing and the unified-resolution processing method is cubic spline interpolation.
3. The terahertz spectrum recognition method based on BLSTM-RNN according to claim 1, characterized in that, in step (5),
the overall architecture of the BLSTM-RNN model consists of an input layer, a hidden layer, and an output layer, wherein the hidden
layer is a bidirectional LSTM neural unit that applies the ReLU activation function for nonlinear processing, the output layer uses the
softmax activation function, and the input layer, hidden layer, and output layer are fully connected; each LSTM neural unit comprises 4
elements: an input gate, a forget gate, an output gate, and a neuron with a recurrent self-connection; the mapping function is Y = WX + b, where Y is the neuron
output value, X is the neuron input value, and W and b are the weight matrix and bias matrix, respectively.
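The architecture in claim 3 can be sketched as a NumPy forward pass. Several details are assumptions made for illustration and are not stated in the claim: the gate ordering inside the stacked weight matrix, the tanh nonlinearity inside the cell, averaging the ReLU features over time before the output layer, and all function names.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm(input_size, hidden_size, rng):
    """Weights for the four LSTM blocks, stacked as [input, forget, output, cell]."""
    w = rng.normal(0.0, 0.1, (4 * hidden_size, input_size + hidden_size))
    return w, np.zeros(4 * hidden_size)

def lstm_pass(seq, w, b):
    """Run one LSTM over seq (T, input_size); return hidden states (T, hidden)."""
    hidden = b.size // 4
    h, c, out = np.zeros(hidden), np.zeros(hidden), []
    for x_t in seq:
        z = w @ np.concatenate([x_t, h]) + b            # mapping Y = WX + b
        i, f, o = (sigmoid(z[k * hidden:(k + 1) * hidden]) for k in range(3))
        g = np.tanh(z[3 * hidden:])                     # candidate cell input
        c = f * c + i * g                               # self-connected cell state
        h = o * np.tanh(c)
        out.append(h)
    return np.array(out)

def blstm_classify(seq, fwd, bwd, w_out, b_out):
    """Bidirectional pass, ReLU, then a fully connected softmax output layer."""
    h_fwd = lstm_pass(seq, *fwd)                        # direction 1 -> T
    h_bwd = lstm_pass(seq[::-1], *bwd)[::-1]            # direction T -> 1, re-aligned in time
    o_t = np.concatenate([h_fwd, h_bwd], axis=1)        # per-step feature output
    features = np.maximum(o_t, 0.0).mean(axis=0)        # ReLU, then average over time (assumption)
    logits = w_out @ features + b_out
    e = np.exp(logits - logits.max())
    return e / e.sum()                                  # class probabilities
```

Here a spectrum of T points is fed as a sequence of T one-dimensional inputs; whether the patent feeds single points or longer segments per time step is not specified in this section.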
4. The terahertz spectrum recognition method based on BLSTM-RNN according to claim 1, characterized in that, in step (5),
the specific steps for iteratively updating the model parameters are as follows:
Step1: the training set of preprocessed terahertz spectral data is S = {(x_1, y_1), (x_2, y_2), ..., (x_i, y_i), ...,
(x_n, y_n)}, i = 1, 2, ..., n, where x_i is the terahertz optical parameter, i.e. the spectral data vector, and y_i is the material class label, i.e. the class
vector;
Step2: first input x_i and carry out the forward-propagation prediction: compute the output state values of the forward LSTM along the 1 → T direction,
then compute the output state values of the backward LSTM along the T → 1 direction, obtaining the binary feature output o_t of each time step;
Step3: o_t is passed through a softmax layer to obtain the predicted value, which is compared with the true material class label y_i, and the loss is calculated with the
cross-entropy loss function;
Step4: backpropagation is then carried out by differentiating the objective function: first differentiate with respect to the output o_t, then compute the derivatives of the
forward LSTM's output state values along the T → 1 direction, and then the derivatives of the backward LSTM's output state values along the 1 → T direction;
Step5: obtain the gradient values with the backpropagation through time algorithm, and update the model weights W and the bias
matrix b using the Adam optimization algorithm, completing one training iteration;
Step6: repeat Step2 to Step5 and check whether the given maximum number of iterations has been reached; if so, model
optimization is complete.
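The update loop of Step2 to Step6 can be sketched as below. To stay short, the sketch swaps the BLSTM for a single linear softmax layer Y = WX + b, so the gradient is computed in closed form instead of by backpropagation through time; only the Adam update and the stop-at-maximum-iterations logic correspond to the claim. The hyperparameter values and the name `adam_train` are assumptions.

```python
import numpy as np

def adam_train(x, y, n_classes, max_iters=200, lr=0.01,
               beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit a linear softmax layer with Adam for a fixed number of iterations."""
    n, d = x.shape
    params = {"W": np.zeros((n_classes, d)), "b": np.zeros(n_classes)}
    m = {k: np.zeros_like(p) for k, p in params.items()}    # first-moment estimates
    v = {k: np.zeros_like(p) for k, p in params.items()}    # second-moment estimates
    onehot = np.eye(n_classes)[y]
    losses = []
    for t in range(1, max_iters + 1):                       # stop at the maximum iteration count
        # Forward pass: Y = WX + b followed by softmax.
        logits = x @ params["W"].T + params["b"]
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        losses.append(-np.log(probs[np.arange(n), y]).mean())
        # Backward pass: gradient of the mean cross-entropy loss.
        delta = (probs - onehot) / n
        grads = {"W": delta.T @ x, "b": delta.sum(axis=0)}
        # Adam update with bias-corrected moment estimates.
        for k in params:
            m[k] = beta1 * m[k] + (1 - beta1) * grads[k]
            v[k] = beta2 * v[k] + (1 - beta2) * grads[k] ** 2
            m_hat = m[k] / (1 - beta1 ** t)
            v_hat = v[k] / (1 - beta2 ** t)
            params[k] -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, losses
```

In a full implementation the closed-form gradient would be replaced by the gradients from backpropagation through time over both LSTM directions, while the Adam update itself would stay the same.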
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811504359.7A CN109632693A (en) | 2018-12-10 | 2018-12-10 | A kind of tera-hertz spectra recognition methods based on BLSTM-RNN |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109632693A (en) | 2019-04-16 |
Family
ID=66072354
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811504359.7A Pending CN109632693A (en) | A kind of tera-hertz spectra recognition methods based on BLSTM-RNN | 2018-12-10 | 2018-12-10 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109632693A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160099010A1 (en) * | 2014-10-03 | 2016-04-07 | Google Inc. | Convolutional, long short-term memory, fully connected deep neural networks |
CN106599520A (en) * | 2016-12-31 | 2017-04-26 | 中国科学技术大学 | LSTM-RNN model-based air pollutant concentration forecast method |
CN107561033A (en) * | 2017-09-21 | 2018-01-09 | 上海理工大学 | Key substance is qualitative in mixture based on tera-hertz spectra and method for quantitatively determining |
CN107844751A (en) * | 2017-10-19 | 2018-03-27 | 陕西师范大学 | The sorting technique of guiding filtering length Memory Neural Networks high-spectrum remote sensing |
CN108458989A (en) * | 2018-04-28 | 2018-08-28 | 江苏建筑职业技术学院 | A kind of Coal-rock identification method based on Terahertz multi-parameter spectrum |
Non-Patent Citations (1)
Title |
---|
姜华 等: "一种双向长短时记忆循环神经网络的问句语义关系识别方法", 《福州大学学报(自然科学版)》 * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110412470A (en) * | 2019-04-22 | 2019-11-05 | 上海博强微电子有限公司 | Electric automobile power battery SOC estimation method |
CN110412470B (en) * | 2019-04-22 | 2021-09-21 | 上海博强微电子有限公司 | SOC estimation method for power battery of electric vehicle |
CN110261109B (en) * | 2019-04-28 | 2020-12-08 | 洛阳中科晶上智能装备科技有限公司 | Rolling bearing fault diagnosis method based on bidirectional memory cyclic neural network |
CN110261109A (en) * | 2019-04-28 | 2019-09-20 | 洛阳中科晶上智能装备科技有限公司 | A kind of Fault Diagnosis of Roller Bearings based on bidirectional memory Recognition with Recurrent Neural Network |
CN110108647A (en) * | 2019-04-30 | 2019-08-09 | 深圳市太赫兹科技创新研究院有限公司 | A kind of discrimination method and identification system of meat kind |
CN110068544B (en) * | 2019-05-08 | 2021-09-17 | 广东工业大学 | Substance identification network model training method and terahertz spectrum substance identification method |
CN110068544A (en) * | 2019-05-08 | 2019-07-30 | 广东工业大学 | Material identification network model training method and tera-hertz spectra substance identification |
CN110335653A (en) * | 2019-06-30 | 2019-10-15 | 浙江大学 | Non-standard case history analytic method based on openEHR case history format |
CN110646350A (en) * | 2019-08-28 | 2020-01-03 | 深圳和而泰家居在线网络科技有限公司 | Product classification method and device, computing equipment and computer storage medium |
CN111104891A (en) * | 2019-12-13 | 2020-05-05 | 天津大学 | Composite characteristic optical fiber sensing disturbing signal mode identification method based on BiLSTM |
CN111678599A (en) * | 2020-07-07 | 2020-09-18 | 安徽大学 | Laser spectrum noise reduction method and device based on deep learning optimization S-G filtering |
CN114088656A (en) * | 2020-07-31 | 2022-02-25 | 中国科学院上海高等研究院 | Terahertz spectrum substance identification method and system, storage medium and terminal |
CN112485218A (en) * | 2020-11-05 | 2021-03-12 | 电子科技大学中山学院 | Terahertz dangerous liquid identification method based on artificial neural network |
CN112485217A (en) * | 2020-12-02 | 2021-03-12 | 仲恺农业工程学院 | Method and device for constructing meat identification model applied to origin tracing |
CN112485217B (en) * | 2020-12-02 | 2023-04-25 | 仲恺农业工程学院 | Construction method and device of meat identification model applied to origin tracing |
CN112666119A (en) * | 2020-12-03 | 2021-04-16 | 山东省科学院自动化研究所 | Method and system for detecting ginseng tract geology based on terahertz time-domain spectroscopy |
CN112945897A (en) * | 2021-01-26 | 2021-06-11 | 广东省科学院智能制造研究所 | Continuous terahertz image non-uniformity correction method |
CN112945897B (en) * | 2021-01-26 | 2023-04-07 | 广东省科学院智能制造研究所 | Continuous terahertz image non-uniformity correction method |
CN113344051A (en) * | 2021-05-28 | 2021-09-03 | 青岛青源峰达太赫兹科技有限公司 | Neural network classification method based on terahertz data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109632693A (en) | A kind of tera-hertz spectra recognition methods based on BLSTM-RNN | |
Feilhauer et al. | Multi-method ensemble selection of spectral bands related to leaf biochemistry | |
CN109493287A (en) | A kind of quantitative spectra data analysis processing method based on deep learning | |
CN111126386B (en) | Sequence domain adaptation method based on countermeasure learning in scene text recognition | |
CN110717368A (en) | Qualitative classification method for textiles | |
CN109993236A (en) | Few sample language of the Manchus matching process based on one-shot Siamese convolutional neural networks | |
CN110705372A (en) | LIBS multi-component quantitative inversion method based on deep learning convolutional neural network | |
CN107679569A (en) | Raman spectrum substance automatic identifying method based on adaptive hypergraph algorithm | |
CN108596246A (en) | The method for building up of soil heavy metal content detection model based on deep neural network | |
CN108596085A (en) | The method for building up of soil heavy metal content detection model based on convolutional neural networks | |
Guo et al. | Deep learning for ‘artefact’removal in infrared spectroscopy | |
CN103207015A (en) | Spectrum reconstruction method and spectrometer device | |
Menaka et al. | Chromenet: A CNN architecture with comparison of optimizers for classification of human chromosome images | |
Drass et al. | Semantic segmentation with deep learning: detection of cracks at the cut edge of glass | |
Di Frischia et al. | Enhanced data augmentation using gans for Raman spectra classification | |
CN113408616B (en) | Spectral classification method based on PCA-UVE-ELM | |
Devlin et al. | Disentangled attribution curves for interpreting random forests and boosted trees | |
Shao et al. | A new approach to discriminate varieties of tobacco using vis/near infrared spectra | |
Zhang et al. | Characterizing dissolved organic matter in Taihu Lake with PARAFAC and SOM method | |
CN112966735B (en) | Method for fusing supervision multi-set related features based on spectrum reconstruction | |
Du et al. | Application of near-infrared spectroscopy and CNN-TCN for the identification of foreign fibers in cotton layers | |
CN110070004A (en) | A kind of field hyperspectrum Data expansion method applied to deep learning | |
Zhang et al. | Open set maize seed variety classification using hyperspectral imaging coupled with a dual deep SVDD-based incremental learning framework | |
Yu et al. | LSCA-net: A lightweight spectral convolution attention network for hyperspectral image processing | |
Xu et al. | Using deep learning algorithms to perform accurate spectral classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190416 |