CN110516701A - Method for rapidly predicting perovskite Curie temperature based on data mining - Google Patents
- Publication number
- CN110516701A CN201910648969.2A
- Authority
- CN
- China
- Prior art keywords
- sample
- curie temperature
- perovskite
- independent variable
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Investigating Or Analyzing Materials Using Thermal Means (AREA)
Abstract
The invention discloses a method for rapidly predicting the Curie temperature of perovskites based on data mining. The steps are as follows: 1) collect the Curie temperatures and chemical formulas of ABO3-type inorganic hybrid perovskite materials from the literature and databases; 2) generate the corresponding descriptors from the chemical formulas; 3) randomly divide the data set into a training set and a test set using the Euclidean distance determination method; 4) screen the independent variables using forward selection combined with support vector machine leave-one-out validation; 5) using the target variable and the screened independent variables, build a prediction model of the perovskite Curie temperature from the training set samples with a support vector machine; 6) predict the Curie temperatures of the test set samples with the established model. The method builds an efficient and fast prediction model from sample data drawn from the literature and databases. It is quick, convenient, low-cost and environmentally friendly, and it can also guide experimental work and avoid blind experimentation.
Description
Technical field
The present invention relates to a method for testing the electromagnetic properties of perovskites, and in particular to a method for determining the Curie temperature of perovskites. It applies to the technical field of perovskite property characterization, analysis and testing.
Background art
Perovskites have become a research hotspot owing to their stable crystal structure and unique physicochemical properties. They can be used as catalysts and as sensitizers in dye-sensitized solar cells. Some perovskites exhibit giant magnetoresistance and can therefore serve as giant magnetoresistance materials, which have good application prospects in magnetic refrigeration, magnetic storage and magnetic sensing.
The Curie temperature (symbol Tc) is the temperature at which the spontaneous magnetization of a magnetic material drops to zero. It is the critical point at which a ferromagnetic or ferrimagnetic substance becomes paramagnetic. Below the Curie point the substance is ferromagnetic, and its magnetic field is difficult to change. Above the Curie point the substance becomes paramagnetic, and its magnetic field changes readily with the surrounding magnetic field. The Curie temperature is the upper operating temperature limit of many magnetic materials, so studying it is of great significance.
Forward selection is a classical method of screening independent variables. Its principle is simple yet highly effective: some variable is retained at the start, other variables are then added step by step while each variable's contribution to the model is observed, and variables that contribute are kept while those that contribute little are rejected, until the model is optimal.
The support vector machine (SVM) is a machine learning method established by the mathematician Vladimir N. Vapnik and co-workers on the basis of statistical learning theory (SLT). It includes the support vector classification (SVC) algorithm and the support vector regression (SVR) algorithm. Support vector machines can model small data sets and still yield models with good predictive ability.
At present, the Curie temperature of a perovskite usually has to be measured experimentally. The chemicals used may cause chemical pollution, measuring the Curie temperatures of large numbers of perovskites with different atomic and structural parameters is laborious and inefficient, and some of the testing is done blindly. This cannot meet the current need for a comprehensive understanding of the Curie temperatures of perovskites across a series of compositions.
Summary of the invention
To solve the problems of the prior art, the object of the present invention is to overcome its deficiencies and provide a method for rapidly predicting the Curie temperature of perovskites based on data mining. The method predicts the Curie temperature of ABO3-type inorganic hybrid perovskite materials by theory and computation, using the Euclidean distance determination method and forward selection combined with support vector machine leave-one-out validation. With this data mining approach the result is obtained within a few seconds, which is convenient and efficient, saves labor, and is environmentally friendly.
To achieve the above object, the present invention adopts the following technical solution:
A method for rapidly predicting the Curie temperature of perovskites based on data mining, comprising the following steps:
1) Collect the Curie temperatures and chemical formulas of ABO3-type inorganic hybrid perovskite materials from the literature and databases as data set samples;
2) Using the collected atomic parameters and structural parameters, generate the corresponding atomic-parameter and structural-parameter descriptors from the chemical formulas, and delete samples with missing values during descriptor generation;
3) Using the Euclidean distance determination method, randomly divide the data set samples obtained in step 1) into a training set and a test set;
4) Take the Curie temperature collected in step 1) as the target variable and the atomic-parameter and structural-parameter descriptors generated in step 2) as independent variables; screen the independent variables on the training set using forward selection combined with support vector machine leave-one-out validation, and select the optimal subset of independent variables for modeling;
5) Using the target variable and the independent variables screened in step 4), build the prediction model of the perovskite material Curie temperature from the training set samples obtained in step 3) with a support vector machine;
6) Using the prediction model of the perovskite Curie temperature established in step 5), predict the Curie temperatures of the test set samples obtained in step 3).
As a preferred technical solution of the present invention, in step 3) the Euclidean distance determination method proceeds as follows:
3-1) Take the atomic-parameter and structural-parameter descriptors generated in step 2) as independent variables, and use the independent variables as the coordinates of each sample to construct a high-dimensional space;
3-2) Select the sample with the largest band gap;
3-3) Put the selected sample into the modeling set;
3-4) With that sample as the center and R as the radius, construct a sphere in the high-dimensional space, where the radius R is defined as:
$$R = c\left(\frac{V}{N}\right)^{1/K}$$
where c is a user-defined dissimilarity level, set to 0.5; V is the product of the ranges (maximum minus minimum) of the independent variables; N is the number of samples; and K is the dimensionality of the space;
3-5) Put the samples whose sample distance d is less than the radius R into the test set, where the distance d between sample i and sample i+1 is defined as:
$$d = \sqrt{\sum_{n=1}^{K}\left(x_{i,n} - x_{i+1,n}\right)^{2}}$$
where x_{i,n} is the n-th independent variable of sample i and x_{i+1,n} is the n-th independent variable of sample i+1;
3-6) Choose the sample with the largest band gap among the remaining samples and repeat steps 3-2) to 3-5) until all samples have been assigned to the modeling set or the test set.
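Steps 3-1) to 3-6) can be sketched in a few lines of Python. This is only an illustrative reading of the procedure, not the patent's implementation: the function name `sphere_exclusion_split`, the `priority` argument (the per-sample value used in step 3-2) to pick the next sphere center) and the toy data are assumptions, and the sketch presumes every descriptor has a nonzero range so that V and R are positive.

```python
import math

def sphere_exclusion_split(X, priority, c=0.5):
    """Split sample indices into a modeling set and a test set.

    X        : list of descriptor vectors, one list of floats per sample
    priority : per-sample value used to pick the next sphere center
               (assumed stand-in for the quantity named in step 3-2)
    c        : dissimilarity level (0.5 in the text)
    """
    n_samples, n_dims = len(X), len(X[0])
    # V: product of (max - min) over all independent variables
    volume = 1.0
    for k in range(n_dims):
        column = [row[k] for row in X]
        volume *= max(column) - min(column)
    # R = c * (V / N) ** (1 / K)
    radius = c * (volume / n_samples) ** (1.0 / n_dims)

    remaining = set(range(n_samples))
    modeling, test = [], []
    while remaining:
        # 3-2), 3-3): the remaining sample with the largest priority becomes a center
        center = max(remaining, key=lambda i: (priority[i], -i))
        remaining.remove(center)
        modeling.append(center)
        # 3-5): remaining samples closer than R to the center go to the test set
        close = [i for i in sorted(remaining) if math.dist(X[center], X[i]) < radius]
        for i in close:
            remaining.remove(i)
            test.append(i)
    return modeling, test, radius

# Toy data for illustration only (not taken from the source).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.1, 0.1]]
priority = [5.0, 4.0, 3.0, 2.0, 1.0]
modeling, test, radius = sphere_exclusion_split(X, priority)
```

In this toy case only the fifth point lies inside the sphere around the first center, so it is the only sample sent to the test set. With real descriptors of very different magnitudes, scaling them before computing distances would probably be needed, which the source does not discuss.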
As a preferred technical solution of the present invention, in step 4) the independent variables are screened by forward selection as follows:
Each time, one feature is selected from the features not yet selected such that the separability criterion J is maximal when it is combined with the features already selected, until the number of selected features reaches the specified dimension D;
Suppose k features have been selected, denoted X_k. Each of the m-k features x_j not yet selected is combined in turn with the selected feature set X_k and the J value is calculated, where j = 1, 2, ..., m-k. If the following holds:
$$J(X_k + x_1) \ge J(X_k + x_2) \ge \dots \ge J(X_k + x_{m-k})$$
then x_1 is selected, and the feature combination for the next step is X_{k+1} = X_k + x_1. The process starts with k = 0 and is carried out until k = D. In the forward selection, the number of selected features is 8.
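The greedy search described above can be sketched as follows. The function name `forward_selection` and the toy criterion are assumptions for illustration. In the method itself, J would come from the support vector machine leave-one-out validation on the training set, which is not reproduced here.

```python
def forward_selection(n_features, criterion, target_dim):
    """Greedy forward selection.

    At each step, add the not-yet-selected feature j that maximizes
    criterion(selected + [j]); stop when target_dim features are selected.
    """
    selected = []                       # X_k, empty at the start (k = 0)
    remaining = list(range(n_features))
    while len(selected) < target_dim and remaining:
        # Evaluate J(X_k + x_j) for every feature not yet selected
        best = max(remaining, key=lambda j: criterion(selected + [j]))
        selected.append(best)           # X_{k+1} = X_k + x_1
        remaining.remove(best)
    return selected

# Toy criterion (hypothetical): individual scores, with a penalty when the
# redundant pair (1, 3) is selected together.
scores = [0.1, 0.9, 0.5, 0.7]

def toy_criterion(subset):
    value = sum(scores[j] for j in subset)
    if 1 in subset and 3 in subset:
        value -= 0.6
    return value

chosen = forward_selection(4, toy_criterion, target_dim=2)
```

Feature 3 has the second-highest individual score, but the search picks feature 2 in the second step because J is evaluated on the combined set rather than on each feature alone.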
Compared with the prior art, the present invention has the following prominent substantive features and notable advantages:
1. The method overcomes the shortcoming of the traditional "cooking" (trial-and-error) approach and avoids repeated trial and error by predicting the Curie temperature of ABO3-type inorganic hybrid perovskite materials through theory and computation. The method predicts the Curie temperature with a support vector machine and cross-validates the results. Descriptors are generated from the collected atomic and structural parameters and fed into the model, the result is obtained within a few seconds, and the work can be done conveniently and quickly by one person;
2. The method involves no experiments or chemicals at any stage, produces no chemical pollution, and is consistent with green, environmentally friendly practice. The method is simple, easy to implement, and suitable for wide application;
3. The method can estimate the Curie temperature of ABO3-type inorganic hybrid perovskite materials in advance by model prediction, so that samples meeting the requirements can be selected for experimental verification. This improves experimental efficiency, provides guidance, and avoids blind experimentation.
Brief description of the drawings
Fig. 1 shows the modeling results of the support vector regression model of the perovskite Curie temperature in Embodiment one of the present invention.
Fig. 2 shows the leave-one-out cross-validation results of the support vector regression model of the perovskite Curie temperature in Embodiment two of the present invention.
Fig. 3 shows the independent test set results of the support vector regression model of the perovskite Curie temperature in Embodiment three of the present invention.
Specific embodiment
The above scheme is further described below with reference to specific embodiments. Preferred embodiments of the present invention are detailed as follows:
Embodiment one:
In this embodiment, a method for rapidly predicting the Curie temperature of perovskites based on data mining comprises the following steps:
1) Collect the Curie temperatures and chemical formulas of ABO3-type inorganic hybrid perovskite materials from the literature and databases as data set samples. Some of the chemical formulas and Curie temperatures are listed in Table 1:
Table 1. Data set samples: perovskite chemical formulas and Curie temperatures
Chemical formula | Tc/K | Chemical formula | Tc/K |
La0.7Sr0.3Mn0.5Cr0.5O3 | 226 | La0.9Pb0.1MnO3 | 235 |
La0.7Sr0.3Mn0.8Cr0.2O3 | 286 | La0.8Pb0.2MnO3 | 310 |
La0.7Sr0.3Mn0.9Cu0.1O3 | 350 | La0.7Pb0.3MnO3 | 358 |
La0.75Sr0.25MnO3 | 340 | La0.6Pb0.4MnO3 | 360 |
La0.7Sr0.3Mn0.6Cr0.4O3 | 242 | La0.5Pb0.5MnO3 | 355 |
La0.7Sr0.25Ag0.05MnO3 | 303 | La0.65Sr0.35MnO3 | 377 |
La0.7Sr0.05Ag0.25MnO3 | 363 | La0.55Pr0.1Sr0.35MnO3 | 353 |
La0.75Ba0.1Ag0.15MnO3 | 315 | La0.45Pr0.2Sr0.35MnO3 | 344 |
La0.7Ca0.3MnO3 | 250 | La0.35Pr0.3Sr0.35MnO3 | 334 |
La0.7Ag0.3MnO3 | 270 | La0.7Sr0.1Ag0.2MnO3 | 286.5 |
La0.89Sr0.11MnO3 | 195 | La0.67Sr0.33MnO3 | 372.5 |
La0.88Sr0.12MnO3 | 170 | La0.7Sr0.3MnO3 | 370 |
La0.875Sr0.125MnO3 | 188 | La0.7Sr0.3Mn0.95Fe0.05O3 | 330 |
La0.865Sr0.135MnO3 | 214 | La0.7Sr0.3Mn0.9Cr0.1O3 | 326 |
La0.855Sr0.145MnO3 | 230.5 | La0.7Sr0.3Mn0.85Cr0.15O3 | 304 |
La0.845Sr0.155MnO3 | 242 | La0.7Sr0.3Mn0.85Fe0.15O3 | 175 |
La0.835Sr0.165MnO3 | 260.5 | La0.68Nd0.02Ba0.3Mn0.9Cr0.1O3 | 300 |
La0.83Sr0.17MnO3 | 265 | La0.7Ba0.3Mn0.9Cr0.1O3 | 298 |
La0.825Sr0.175MnO3 | 283 | La0.7Sr0.3Mn0.9Fe0.1O3 | 261 |
La0.72Sr0.28MnO3 | 375 | La0.7Ba0.3Mn0.9Fe0.1O3 | 215 |
La0.69Sr0.31MnO3 | 380 | La0.67Ca0.33MnO3 | 275 |
La0.64Sr0.36MnO3 | 372 | La0.6Sr0.1Cu0.3MnO3 | 232 |
La0.52Sr0.48MnO3 | 330 | La0.65Ca0.18Sr0.17MnO3 | 323 |
La0.50Sr0.50MnO3 | 310 | La0.67Ba0.33Mn0.98Ti0.02O3 | 314 |
La0.48Sr0.52MnO3 | 290 | La0.7Ca0.2Sr0.1MnO3 | 315 |
La0.45Sr0.55MnO3 | 260 | La0.65Nd0.05Ca0.3MnO3 | 250 |
La0.4Sm0.3Sr0.3MnO3 | 256 | La0.6Sr0.2Ba0.2MnO3 | 354 |
Ba0.95Sr0.05MnO3 | 353 | La0.8Ba0.1Ca0.1Mn0.97Fe0.03O3 | 281 |
La0.7Sr0.3Mn0.93Fe0.07O3 | 296 | La0.9Mg0.1MnO3 | 160 |
La0.7Sr0.3Mn0.9Al0.1O3 | 310 | La0.8Ba0.2MnO3 | 295 |
La0.67Ca0.33Mn0.85V0.15O3 | 287.2 | La0.67Ba0.23Ca0.1MnO3 | 350 |
La0.6Nd0.1Ca0.15Sr0.15Mn0.9Fe0.1O3 | 298 | La0.7Ba0.3MnO3 | 328 |
La0.6Nd0.1Ca0.15Sr0.15Mn0.95Fe0.05O3 | 306 | La0.57Dy0.1Sr0.33MnO3 | 358 |
La0.6Nd0.1Ca0.15Sr0.15MnO3 | 326 |
2) Using the collected atomic parameters and structural parameters, generate the corresponding atomic-parameter and structural-parameter descriptors from the chemical formulas, and delete samples with missing values during descriptor generation. The number of samples with complete data is 67; their chemical formulas and Curie temperatures are as listed in Table 1 of step 1). A total of 147 descriptors are generated from the collected atomic and structural parameters; some of them are listed in Table 2:
Table 2. Selected descriptors
A_aff | A_Radius | A_Tm | A_Tb | A_work function(eV) |
B_aff | B_Radius | B_Tm | B_Tb | B_work function(eV) |
TF | A_modulus bulk | A_Density | A_ionic | A_quantum number |
rc | B_modulus bulk | B_Density | B_ionic | B_quantum number |
Za | A_group number | A_Hfus | Mass | A_atomic weight(10^-3 kg) |
Zb | B_group number | B_Hfus | R_a/R_b | B_atomic weight(10^-3 kg) |
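As a rough illustration of step 2), the sketch below parses a flat chemical formula into element amounts and forms a composition-weighted average of an elemental property over one lattice site. The source does not say how its 147 descriptors are computed, so the weighting scheme, the function names `parse_formula` and `site_average`, and the use of the atomic number as the example property are all assumptions.

```python
import re

def parse_formula(formula):
    """Return {element: amount} for a flat formula such as 'La0.7Sr0.3MnO3'.

    An element without an explicit amount counts as 1. Brackets are not handled.
    """
    composition = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        value = float(amount) if amount else 1.0
        composition[symbol] = composition.get(symbol, 0.0) + value
    return composition

def site_average(composition, site_elements, property_table):
    """Composition-weighted mean of an elemental property over one site."""
    total = sum(composition[e] for e in site_elements)
    weighted = sum(composition[e] * property_table[e] for e in site_elements)
    return weighted / total

# Atomic numbers, used here only as a stand-in elemental property.
ATOMIC_NUMBER = {"La": 57, "Sr": 38, "Mn": 25, "Cr": 24}

composition = parse_formula("La0.7Sr0.3Mn0.9Cr0.1O3")
# The A-site / B-site assignment is supplied by hand in this sketch.
z_a = site_average(composition, ["La", "Sr"], ATOMIC_NUMBER)
z_b = site_average(composition, ["Mn", "Cr"], ATOMIC_NUMBER)
```

A sample whose formula contains an element missing from the property table would raise a `KeyError` here, which is one natural point at which samples with missing values could be dropped.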
3) Using the Euclidean distance determination method, randomly divide the 67 data set samples obtained in step 2) into a training set and a test set at a ratio of 4:1; the training set and the test set contain 54 and 13 samples, respectively;
The Euclidean distance determination method proceeds as follows:
3-1) Take the atomic-parameter and structural-parameter descriptors generated in step 2) as independent variables, and use the independent variables as the coordinates of each sample to construct a high-dimensional space;
3-2) Select the sample with the largest band gap;
3-3) Put the selected sample into the modeling set;
3-4) With the selected sample as the center and R as the radius, construct a sphere in the high-dimensional space, where the radius R is defined as:
$$R = c\left(\frac{V}{N}\right)^{1/K}$$
where c is a user-defined dissimilarity level, set to 0.5; V is the product of the ranges (maximum minus minimum) of the independent variables; N is the number of samples; and K is the dimensionality of the space;
3-5) Put the samples whose sample distance d is less than the radius R into the test set, where the distance d between sample i and sample i+1 is defined as:
$$d = \sqrt{\sum_{n=1}^{K}\left(x_{i,n} - x_{i+1,n}\right)^{2}}$$
where x_{i,n} is the n-th independent variable of sample i and x_{i+1,n} is the n-th independent variable of sample i+1;
3-6) Choose the sample with the largest band gap among the remaining samples and repeat steps 3-2) to 3-5) until all samples have been assigned to the modeling set or the test set;
4) Take the Curie temperature collected in step 1) as the target variable and the atomic-parameter and structural-parameter descriptors generated in step 2) as independent variables. Screen the independent variables on the training set using forward selection combined with support vector machine leave-one-out validation, and select the 8 best independent variables as the optimal subset of independent variables for modeling;
The independent variables are screened by forward selection as follows:
Forward selection is a simple bottom-up search method. Each time, one feature is selected from the features not yet selected such that the separability criterion J is maximal when it is combined with the features already selected, until the number of selected features reaches the specified dimension D;
Suppose k features have been selected, denoted X_k. Each of the m-k features x_j not yet selected is combined in turn with the selected feature set X_k and the J value is calculated, where j = 1, 2, ..., m-k. If the following holds:
$$J(X_k + x_1) \ge J(X_k + x_2) \ge \dots \ge J(X_k + x_{m-k})$$
then x_1 is selected, and the feature combination for the next step is X_{k+1} = X_k + x_1. The process starts with k = 0 and is carried out until k = D. In the forward selection, the number of selected features is 8;
A fast prediction model of the perovskite Curie temperature is established with a support vector machine; the optimal variables selected are listed in Table 3:
Table 3. Optimal descriptors selected by forward selection combined with support vector machine leave-one-out validation
A_enthalpy vacancies Miedema(kJ·mole-1) | A_modulus rigidity(GPa) |
A_modulus Young(GPa) | A_distance core electron(Schubert)(A) |
B_nWS1/3Miedema(a.u.-1/3) | B_enthalpy surface Miedema(kJ·mole-1) |
B_enthalpy vacancies Miedema(kJ·mole-1) | B_ionic |
In this step, variables that are noisy or highly redundant are removed and the optimal variable subset for modeling is selected, which reduces noisy data and improves screening precision;
5) Using the target variable and the independent variables screened in step 4), that is, the optimal variable subset selected for modeling, build the prediction model of the perovskite material Curie temperature from the training set samples obtained in step 3) with a support vector machine;
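For reference, the regression model in step 5) is a support vector regression. The source does not state the kernel or hyperparameters it uses, so the block below gives only the standard ε-insensitive SVR primal problem from the general literature, not the specific configuration of this method. The symbols w, b, φ, C, ε, ξ_i and ξ_i^* do not appear in the source; y_i would be the Curie temperature of training sample i, x_i its vector of selected descriptors, and N the number of training samples.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\left(\xi_i + \xi_i^{*}\right) \\
\text{s.t.} \quad & y_i - w^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \\
& w^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i \ge 0,\ \xi_i^{*} \ge 0, \qquad i = 1,\dots,N
\end{aligned}
```

The constant C trades off model flatness against deviations larger than ε, and the mapping φ is defined implicitly by the chosen kernel.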
In this embodiment, the Curie temperatures of the test set samples are rapidly predicted with the established perovskite Curie temperature prediction model. The modeling results of the quantitative Curie temperature prediction model, built from the 54 perovskite samples with a support vector machine, are shown in Fig. 1.
In this embodiment, regression modeling is performed on the data of the 54 perovskite samples with the support vector regression algorithm, establishing a quantitative support vector regression model of the Curie temperature of inorganic hybrid perovskites. The correlation coefficient between the model-predicted Curie temperatures and the literature values is 0.9076. The method of this embodiment builds an efficient and fast prediction model from sample data drawn from the literature and databases. It is quick, convenient, low-cost and environmentally friendly, and it can also guide experimental work and avoid blind experimentation.
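The correlation coefficient quoted above compares predicted and literature values. Assuming it is the usual Pearson coefficient (the source does not name the variant), it could be computed as in the sketch below. The function name `pearson_r` and the input numbers are illustrative only and are not results from the source.

```python
import math

def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Covariance and variances, all without the common 1/n factor (it cancels)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)

# Made-up numbers for illustration only.
perfect = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
inverse = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

A value near 1 means the predictions track the literature values closely in a linear sense. It does not by itself measure the absolute error in kelvin.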
Embodiment two:
This embodiment is substantially the same as Embodiment one, with the following particular features:
In this embodiment, the 54 samples in the training set are numbered A1, A2, ..., A54. In the first step, A1, A2, ..., A53 are used as the training set to build model 1 with the same optimal independent variable subset as in Embodiment one, and model 1 is used to predict the Curie temperature of A54. In the second step, A1, A2, ..., A52, A54 are used as the training set to build model 2 with the same optimal independent variable subset, and model 2 is used to predict the Curie temperature of A53. And so on; after all 54 models have been built, the stability and reliability of the modeling method are judged from the errors between the predicted and true values.
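The leave-one-out procedure described above can be sketched generically. The helper names `leave_one_out_predictions`, `fit_mean` and `predict_mean` are hypothetical, and the mean predictor is only a placeholder standing in for the support vector regression model, which is not reproduced here.

```python
def leave_one_out_predictions(X, y, fit, predict):
    """For each sample i, train on all other samples and predict sample i."""
    predictions = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]      # all samples except sample i
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)    # one model per held-out sample
        predictions.append(predict(model, X[i]))
    return predictions

# Placeholder model (assumption): always predicts the mean of its training targets.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_mean(model, x):
    return model

X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.0, 6.0]
loo = leave_one_out_predictions(X, y, fit_mean, predict_mean)
```

With 54 training samples this loop builds 54 models, matching the count given in the text. The order in which samples are held out does not affect the result.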
The Curie temperatures of the training set samples are rapidly predicted with the established perovskite Curie temperature prediction model. The leave-one-out internal cross-validation results of the quantitative perovskite Curie temperature prediction model, built from the 54 perovskite samples with a support vector machine, are shown in Fig. 2.
In this embodiment, leave-one-out cross-validation is performed on the support vector machine quantitative prediction model of the perovskite Curie temperature built from the 54 samples. The correlation coefficient between the leave-one-out predicted Curie temperatures and the literature values is 0.8485. The method of this embodiment builds the leave-one-out cross-validated prediction models of the training set from sample data drawn from the literature and databases. It is quick, convenient, low-cost and environmentally friendly, and it also allows the stability and reliability of the modeling method to be assessed.
Embodiment three:
This embodiment is substantially the same as the preceding embodiments, with the following particular features:
In this embodiment, the Curie temperatures of the test set samples are rapidly predicted with the established perovskite Curie temperature prediction model. The independent test set prediction results of the quantitative perovskite Curie temperature prediction model, built from the 54 samples with a support vector machine, are shown in Fig. 3.
In this embodiment, the established support vector machine quantitative prediction model of the perovskite Curie temperature is used to predict the 13 samples in the independent test set, and good results are obtained. The correlation coefficient between the model-predicted Curie temperatures and the literature values is 0.7938. The method of this embodiment builds an efficient and fast prediction model from sample data drawn from the literature and databases. It is quick, convenient, low-cost and environmentally friendly, and it can also guide experimental work and avoid blind experimentation.
The embodiments of the present invention have been described above with reference to the drawings, but the present invention is not limited to these embodiments. Various changes may be made according to the purpose of the invention. Any change, modification, substitution, combination or simplification made within the spirit and principle of the technical solution of the present invention shall be regarded as an equivalent replacement and, provided it serves the purpose of the invention and does not depart from the technical principle and inventive concept of the method for rapidly predicting the Curie temperature of perovskites based on data mining, falls within the protection scope of the present invention.
Claims (3)
1. A method for rapidly predicting the Curie temperature of perovskites based on data mining, characterized by comprising the following steps:
1) Collect the Curie temperatures and chemical formulas of ABO3-type inorganic hybrid perovskite materials from the literature and databases as data set samples;
2) Using the collected atomic parameters and structural parameters, generate the corresponding atomic-parameter and structural-parameter descriptors from the chemical formulas, and delete samples with missing values during descriptor generation;
3) Using the Euclidean distance determination method, randomly divide the data set samples obtained in step 1) into a training set and a test set;
4) Take the Curie temperature collected in step 1) as the target variable and the atomic-parameter and structural-parameter descriptors generated in step 2) as independent variables; screen the independent variables on the training set using forward selection combined with support vector machine leave-one-out validation, and select the optimal subset of independent variables for modeling;
5) Using the target variable and the independent variables screened in step 4), build the prediction model of the perovskite material Curie temperature from the training set samples obtained in step 3) with a support vector machine;
6) Using the prediction model of the perovskite Curie temperature established in step 5), predict the Curie temperatures of the test set samples obtained in step 3).
2. The method for rapidly predicting the Curie temperature of perovskites based on data mining according to claim 1, characterized in that, in step 3), the Euclidean distance determination method comprises the following steps:
3-1) taking the atomic-parameter and structural-parameter descriptors generated in step 2) as the independent variables, and using the independent variables as the coordinates of each sample to create a high-dimensional space;
3-2) selecting the sample with the largest band gap;
3-3) adding the selected sample to the modeling set;
3-4) taking the sample as the center and R as the radius, constructing a sphere in the high-dimensional space, where the radius R is defined as:
$R = c \left( V / N \right)^{1/K}$
where c is a user-defined dissimilarity level, set to 0.5; V is the product of the ranges (maximum minus minimum) of the independent variables; N is the number of samples; and K is the dimension of the space;
3-5) adding the samples whose distance d is less than the radius R to the test set, where the distance d between sample i and sample i+1 is defined as:
$d = \sqrt{\sum_{n=1}^{K} \left( x_{i,n} - x_{i+1,n} \right)^2}$
where $x_{i,n}$ is the n-th independent variable of sample i and $x_{i+1,n}$ is the n-th independent variable of sample i+1;
3-6) selecting the sample with the largest band gap among the remaining samples and repeating steps 3-2) to 3-5) until all samples have been assigned to the modeling set or the test set.
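Steps 3-1) to 3-6) can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the radius is assumed to take the form R = c (V/N)^(1/K), which is consistent with the definitions of c, V, N and K given in the claim, and d is the ordinary Euclidean distance. The names `sphere_radius`, `distance_split` and `ranking_value` are hypothetical; `ranking_value` stands for whatever property is used to pick the next sphere center.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two samples in descriptor space."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def sphere_radius(X: Sequence[Sequence[float]], c: float = 0.5) -> float:
    """Assumed radius R = c * (V / N) ** (1 / K).

    V is the product of the (max - min) ranges of the variables,
    N the number of samples and K the number of variables.
    """
    n_samples, n_dims = len(X), len(X[0])
    volume = 1.0
    for k in range(n_dims):
        column = [row[k] for row in X]
        volume *= max(column) - min(column)
    return c * (volume / n_samples) ** (1.0 / n_dims)


def distance_split(
    X: Sequence[Sequence[float]],
    ranking_value: Sequence[float],
    c: float = 0.5,
) -> Tuple[List[int], List[int]]:
    """Return (modeling_indices, test_indices) following steps 3-2) to 3-6)."""
    radius = sphere_radius(X, c)
    remaining = list(range(len(X)))
    modeling: List[int] = []
    test: List[int] = []
    while remaining:
        # 3-2)/3-6): pick the remaining sample with the largest ranking value.
        center = max(remaining, key=lambda i: ranking_value[i])
        remaining.remove(center)
        modeling.append(center)  # 3-3)
        # 3-4)/3-5): samples strictly inside the sphere go to the test set.
        inside = [i for i in remaining if euclidean(X[i], X[center]) < radius]
        test.extend(inside)
        remaining = [i for i in remaining if i not in inside]
    return modeling, test


# Toy data: two tight pairs of points in a 2-D descriptor space.
X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
ranking = [10.0, 5.0, 8.0, 3.0]
modeling_idx, test_idx = distance_split(X, ranking)
```

Because each test sample lies within R of a modeling sample, the test set stays inside the region of descriptor space covered by the modeling set. A larger c puts more samples into the test set.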
3. The method for rapidly predicting the Curie temperature of perovskites based on data mining according to claim 1, characterized in that, in step 4), the independent variables are screened by the progressive method as follows:
at each step, one feature is selected from the features not yet selected such that the separability criterion J is maximized when it is combined with the features already selected, until the number of selected features reaches the specified dimension D;
suppose k features have been selected, denoted $X_k$; each of the m-k unselected features $x_j$ (j = 1, 2, ..., m-k) is combined in turn with the selected feature set $X_k$ and the value of J is calculated; if
$J(X_k + x_1) \ge J(X_k + x_2) \ge \dots \ge J(X_k + x_{m-k})$
then $x_1$ is selected, and the feature set for the next step is $X_{k+1} = X_k + x_1$; the procedure starts with k = 0 and continues until k = D;
in the progressive method, the number of selected features is 8.
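The progressive method of claim 3 corresponds to a greedy forward selection, which can be sketched as below. The criterion J is passed in as a function, since the claim does not give its formula. To keep the example dependency-free, J here is the negative leave-one-out mean absolute error of a 1-nearest-neighbour predictor, which is only a stand-in for the support vector machine leave-one-out evaluation named in claim 1. The names `forward_select` and `loo_score` and the toy data are hypothetical.

```python
from typing import Callable, List, Sequence


def forward_select(
    n_features: int,
    criterion: Callable[[List[int]], float],
    target_dim: int,
) -> List[int]:
    """Greedy forward selection: start with k = 0 and add, at each step,
    the unselected feature that maximizes J together with the selected set,
    until k = target_dim (D in the claim)."""
    selected: List[int] = []
    remaining = list(range(n_features))
    while len(selected) < target_dim and remaining:
        best = max(remaining, key=lambda j: criterion(selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected


def loo_score(X: Sequence[Sequence[float]], y: Sequence[float], cols: List[int]) -> float:
    """Negative leave-one-out mean absolute error of a 1-nearest-neighbour
    predictor restricted to the columns in `cols` (stand-in for SVM)."""
    total_error = 0.0
    for i in range(len(X)):
        best_dist, prediction = None, None
        for j in range(len(X)):
            if j == i:
                continue  # leave sample i out
            dist = sum((X[i][c] - X[j][c]) ** 2 for c in cols)
            if best_dist is None or dist < best_dist:
                best_dist, prediction = dist, y[j]
        total_error += abs(prediction - y[i])
    return -total_error / len(X)


# Toy data: the target depends only on column 0; column 1 is shuffled
# noise and column 2 is constant.
X = [[0, 5, 1], [1, 1, 1], [2, 4, 1], [3, 0, 1], [4, 3, 1], [5, 2, 1]]
y = [0, 10, 20, 30, 40, 50]


def J(cols: List[int]) -> float:
    return loo_score(X, y, cols)


first_feature = forward_select(3, J, 1)
two_features = forward_select(3, J, 2)
```

On this toy data the informative column is picked first. With the claimed setting, `target_dim` would be 8 and the criterion would be computed on the training set only.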
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910630966 | 2019-07-12 | ||
CN2019106309666 | 2019-07-12 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110516701A (en) | 2019-11-29 |
Family
ID=68623455
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910648969.2A Pending CN110516701A (en) | 2019-07-12 | 2019-07-18 | Method based on data mining quick predict perovskite Curie temperature |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516701A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111329847A (en) * | 2020-03-19 | 2020-06-26 | 上海大学 | Method for predicting insulin secretion promoting performance by using dihydrochalcone compound and application |
CN112116091A (en) * | 2020-08-24 | 2020-12-22 | 上海大学 | On-line forecasting method for rapidly forecasting band gap of organic-inorganic hybrid perovskite based on machine learning |
CN112133383A (en) * | 2020-08-21 | 2020-12-25 | 上海大学 | Method for predicting perovskite specific surface area based on genetic symbol regression |
CN112132185A (en) * | 2020-08-26 | 2020-12-25 | 上海大学 | Method for rapidly predicting band gap of double perovskite oxide based on data mining |
CN112132187A (en) * | 2020-08-27 | 2020-12-25 | 上海大学 | Method for rapidly judging perovskite structure stability based on random forest |
CN112132182A (en) * | 2020-08-20 | 2020-12-25 | 上海大学 | Method for rapidly predicting resistivity of ternary gold alloy based on machine learning |
CN112132177A (en) * | 2020-08-14 | 2020-12-25 | 上海大学 | Online forecasting method for rapid prediction of ABO3 perovskite band gap based on machine learning |
CN115599761A (en) * | 2021-08-05 | 2023-01-13 | 日立金属株式会社(Jp) | Database, material data processing system, and method for creating database |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106153725A (en) * | 2016-08-17 | 2016-11-23 | 西安长大公路养护技术有限公司 | A kind of pavement distress detection equipment and detection method thereof |
CN109473147A (en) * | 2018-10-08 | 2019-03-15 | 上海大学 | A kind of method of quick predict macromolecule forbidden bandwidth |
-
2019
- 2019-07-18 CN CN201910648969.2A patent/CN110516701A/en active Pending
Non-Patent Citations (2)
Title |
---|
亓呈明 等: "《机器学习、智能计算与高光谱遥感影像分类研究应用》", 31 May 2018 * |
刘怡飞 等: "Predicting the curie temperatures of LaxM1-x-zRzMnyN1-yO3 perovskites based on support vector Regression", 《COMPUTERS AND APPLIED CHEMISTRY》 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111329847A (en) * | 2020-03-19 | 2020-06-26 | 上海大学 | Method for predicting insulin secretion promoting performance by using dihydrochalcone compound and application |
CN112132177A (en) * | 2020-08-14 | 2020-12-25 | 上海大学 | Online forecasting method for rapid prediction of ABO3 perovskite band gap based on machine learning |
CN112132182B (en) * | 2020-08-20 | 2024-03-22 | 上海大学 | Method for rapidly predicting resistivity of ternary gold alloy based on machine learning |
CN112132182A (en) * | 2020-08-20 | 2020-12-25 | 上海大学 | Method for rapidly predicting resistivity of ternary gold alloy based on machine learning |
CN112133383B (en) * | 2020-08-21 | 2023-06-13 | 上海大学 | Method for predicting perovskite specific surface area based on genetic symbolic regression |
CN112133383A (en) * | 2020-08-21 | 2020-12-25 | 上海大学 | Method for predicting perovskite specific surface area based on genetic symbol regression |
CN112116091A (en) * | 2020-08-24 | 2020-12-22 | 上海大学 | On-line forecasting method for rapidly forecasting band gap of organic-inorganic hybrid perovskite based on machine learning |
CN112132185A (en) * | 2020-08-26 | 2020-12-25 | 上海大学 | Method for rapidly predicting band gap of double perovskite oxide based on data mining |
CN112132187A (en) * | 2020-08-27 | 2020-12-25 | 上海大学 | Method for rapidly judging perovskite structure stability based on random forest |
CN115599761A (en) * | 2021-08-05 | 2023-01-13 | 日立金属株式会社(Jp) | Database, material data processing system, and method for creating database |
US11803522B2 (en) | 2021-08-05 | 2023-10-31 | Proterial, Ltd. | Database, material data processing system, and method of creating database |
US11934360B2 (en) | 2021-08-05 | 2024-03-19 | Proterial, Ltd. | Database, material data processing system, and method of creating database |
CN115599761B (en) * | 2021-08-05 | 2024-04-16 | 株式会社博迈立铖 | Database, material data processing system and database manufacturing method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110516701A (en) | Method based on data mining quick predict perovskite Curie temperature | |
US20230141886A1 (en) | Method for assessing hazard on flood sensitivity based on ensemble learning | |
CN107194803A (en) | P2P net loan borrower credit risk assessment device | |
CN109711474A (en) | A kind of aluminium material surface defects detection algorithm based on deep learning | |
CN110443302A (en) | Load discrimination method and its application based on Fusion Features and deep learning | |
Ludlow et al. | The peaks formalism and the formation of cold dark matter haloes | |
CN110413973A (en) | Computer automatically generates the method and its system of set volume | |
CN110533116A (en) | Based on the adaptive set of Euclidean distance at unbalanced data classification method | |
CN105740635B (en) | A kind of cloud ideal solution evaluation method of transformer electromagnetic design scheme | |
CN107944495A (en) | A kind of household electricity load classification recognition methods based on deep layer forest algorithm | |
CN108647425B (en) | K-means high flow or low flow time forecasting procedure based on particle group optimizing | |
CN109543720A (en) | A kind of wafer figure defect mode recognition methods generating network based on confrontation | |
Cain | Sample-plot technique applied to alpine vegetation in Wyoming | |
CN107341363A (en) | A kind of Forecasting Methodology of proteantigen epitope | |
CN110033035A (en) | A kind of AOI defect classification method and device based on intensified learning | |
CN109002859A (en) | Sensor array feature selecting and array optimization method based on principal component analysis | |
CN104850868A (en) | Customer segmentation method based on k-means and neural network cluster | |
CN106951728B (en) | Tumor key gene identification method based on particle swarm optimization and scoring criterion | |
CN105279582B (en) | Super short-period wind power prediction technique based on dynamic correlation feature | |
CN108564009A (en) | A kind of improvement characteristic evaluation method based on mutual information | |
CN115130795A (en) | Method and device for evaluating development potential of heat storage, storage medium and computer equipment | |
CN110264010B (en) | Novel rural power saturation load prediction method | |
Guo et al. | Using a hidden Markov model to analyze the flood-season rainfall pattern and its temporal variation over East China | |
Paul et al. | An intelligent system for domestic appliance identification using deep dense 1-D convolutional neural network | |
CN110276478A (en) | Short-term wind power forecast method based on segmentation ant group algorithm optimization SVM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191129 |