CN112382350A - Machine learning estimation method for sensitivity, mechanical property and relation of energetic substances - Google Patents

Info

Publication number
CN112382350A
Authority
CN
China
Prior art keywords
model
descriptors
molecular
energetic
sensitivity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011311694.2A
Other languages
Chinese (zh)
Other versions
CN112382350B (en)
Inventor
蒲雪梅
邓倩倩
郭延芝
徐涛
刘建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN202011311694.2A priority Critical patent/CN112382350B/en
Publication of CN112382350A publication Critical patent/CN112382350A/en
Application granted granted Critical
Publication of CN112382350B publication Critical patent/CN112382350B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 - Machine learning, data mining or chemometrics
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 - Prediction of properties of chemical compounds, compositions or mixtures
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C60/00 - Computational materials science, i.e. ICT specially adapted for investigating the physical or chemical properties of materials or phenomena associated with their design, synthesis, processing, characterisation or utilisation
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Investigating Strength Of Materials By Application Of Mechanical Stress (AREA)

Abstract

The invention belongs to the technical field of compound performance evaluation and discloses a machine learning estimation method for the sensitivity and mechanical properties of energetic materials and the relationship between them. Based on molecular descriptors calculated by E-Dragon and several kinds of common molecular structure information, 7 QSPR models of the impact sensitivity and bulk modulus of nitro energetic compounds are established, which helps shorten the experimental research process for energetic materials and supports the design and comprehensive evaluation of novel energetic compounds.

Description

Machine learning estimation method for sensitivity, mechanical property and relation of energetic substances
Technical Field
The invention belongs to the technical field of compound performance evaluation, and particularly relates to a machine learning estimation method for sensitivity, mechanical property and relation of energetic substances.
Background
Energetic materials are compounds or mixtures that contain explosive groups, or oxidants and combustibles, and that can undergo chemical reaction independently and release energy. They are important components of military explosives, propellants and rocket propellant formulations. Energetic materials are widely used in the national defense science and technology industry, the aerospace industry and civil applications, so research on these compounds has both great academic significance and great application value. However, experiments suffer from long cycles, high cost, high risk and many influencing factors, which lead to poor reproducibility, and performance data for energetic materials that have not yet been synthesized cannot be obtained experimentally. In addition, practical applications place high demands on many properties (high detonation performance, good thermal stability, low sensitivity, excellent mechanical properties, environmental friendliness and the like), so development has been relatively slow. In-depth theoretical research therefore provides positive guidance for accelerating the research and development of energetic materials.
In the mid-20th century, simulating scientific experiments on electronic computers spread rapidly, and researchers could infer increasingly complex phenomena by simulating the structure and motion of matter. Computational simulation greatly accelerated research on energetic materials: by screening and designing energetic materials with computational models, experimental researchers can focus resources on molecules expected to improve performance, reduce sensitivity and reduce environmental hazards. However, although the results of computational simulation are accurate and reliable, the approach has limitations such as a complicated calculation process, demanding model requirements and long run times, and high-precision calculation can often be performed only for small batches of substances. Since the beginning of this century, rapid technological development has sharply increased computing power and produced explosive growth in data, and the combination of big data and artificial intelligence (including data mining, machine/statistical learning, deep learning, compressed sensing, etc.) has prompted the emergence of the fourth paradigm of scientific research. The fourth paradigm in the field of materials is also known as "Materials 4.0". Supported by large amounts of data, research in the fourth paradigm makes it possible to derive previously unknown, plausible theories computationally. Artificial intelligence methods can greatly change and enhance the role of computers in science and engineering. Machine learning is one of the fastest-developing branches of artificial intelligence in recent years, and its core statistical algorithms improve continuously through training.
Such techniques are suited to complex problems involving large combinatorial spaces or nonlinear processes, which conventional methods either cannot solve or can handle only at great computational expense. Applications in chemistry are growing at a surprising rate, and machine learning is widely used in material synthesis guidance, molecular design, drug discovery, property prediction for various substances and so on. Machine learning has also long been used to predict various important properties of energetic materials. For example, prior art 1 used multiple linear regression (MLR) and 20 topological descriptors to construct two-dimensional quantitative structure-activity relationships between the toxicity of 148 aromatic nitro compounds toward 9 different targets and their molecular structure features; the correlation coefficients R² of the 9 models range from 0.71 to 0.92. In 2014, prior art 2 used detonation heat, density and orbital energy difference as inputs to predict the detonation velocities of 54 high-nitrogen compounds with multiple linear regression (MLR) and a least squares support vector machine (LS-SVM); the test-set R² values of the two methods are 0.921 and 0.971, respectively. In 2019, prior art 3 extracted 104 data points from 65 CHNO high-energy explosives and, taking the composition, structure, heat of formation and loading density of the explosives as features, predicted the detonation velocity with an artificial neural network (ANN).
In 2016, prior art 4 collected experimental density values for 170 nitro energetic compounds; the resulting MLR and ANN models are robust, with test-set R² values of 0.886 and 0.931, respectively, providing new opportunities for predicting crystal density effectively and rapidly and for designing new high-performance energetic materials. In 2017 and 2018, prior art 5 used the MLR method to study the quantitative structure-activity relationships of 100 azole compounds and 36 tetrazole nitrogen oxide salts with respect to their molecular structures, with model R² values of 0.923 and 0.9321, respectively. In 2018, prior art 6 took molecular structure as the feature and studied autoignition temperature with the MLR method on the basis of 111 experimental data points for 54 energetic substances; the RMS of the QSPR model is 47.45 K, which simplifies determination of the autoignition temperature of energetic compounds. In the same year, prior art 7 used a hybrid SVR-GSA model with only molecular weight and the numbers of C, H, O and N atoms as descriptors to build a prediction model for the autoignition temperature of 53 organic energetic compounds, improving performance by 37.34% and 79.05% over two earlier models, respectively. Also in 2018, prior art 8 built a series of quantitative structure-activity relationship models for properties such as detonation performance, heat of formation and density of 109 CHONF energetic molecules, and comprehensively compared various molecular features and machine learning methods; the best feature and model were the sum over bonds and kernel ridge regression (KRR), respectively, providing guidance for further application of machine learning in this field.
In addition, many researchers have used machine learning methods to study the melting point, lattice energy, density, decomposition temperature and detonation velocity of energetic cocrystals, as well as the detonation performance and melting point of energetic ionic liquids.
Although there have been extensive studies on properties such as density, detonation properties, stability, etc., many problems remain in the field of energetic materials that are worthy of research and exploration.
First, research on the relationship between mechanical properties and molecular structure is still lacking. Mechanical properties are important practical properties of energetic materials and matter greatly for their preparation and use. They describe the ability to resist deformation and fracture under given temperature conditions and external forces, and are directly related to the energy and safe use of the substance. Scientists have paid more attention to the mechanical properties of energetic composites such as PBX explosives, and studies indicate that the types, contents and proportions of the composite components substantially affect those properties. The monomer energetic material serving as the main explosive accounts for 90 to 95 percent of the total content of an energetic composite (the polymer binder accounts for 5 to 10 percent) and directly affects the overall mechanical properties of the composite system, so selecting a main energetic material with excellent properties is the key to formulation design. Research on the mechanical properties of energetic materials is important for guiding formulation and structural-member design and for safety evaluation, life prediction and the like. However, no study of the quantitative structure-activity relationship between the mechanical properties of energetic materials and their structure exists at present.
Second, the accuracy of QSPR prediction models for sensitivity can still be improved. Sensitivity is one of the most important properties of an energetic material and a main index for evaluating its stability and reliability in use. Impact sensitivity is the most commonly used form; it represents how easily the material explodes or burns after mechanical impact and is generally expressed by the explosion probability of a sample in a drop hammer test or by the characteristic drop height at 50% explosion probability, h50. In 2009, prior art 9 built QSPR models for 156 nitro energetic compounds from 16 electrotopological state indices using a back propagation neural network (BPNN), multiple linear regression (MLR) and partial least squares (PLS); the R² values on the overall test set were 0.740, 0.715 and 0.718, respectively. In 2012, prior art 10 studied the same data set with ANN and MLR methods using 10 molecular descriptors as features; the test-set R² values were 0.8658 and 0.7222, respectively. The prediction accuracy improved significantly, but more work is still needed to explore and construct more accurate, more generalizable and more representative models.
Third, there is little research on the correlation between different properties such as sensitivity and mechanical properties. Studies indicate that the properties of energetic materials are highly correlated with one another. For example, mechanical properties are related to safety (under external force, an energetic material readily forms stress concentrations, which can create hot spots and cause accidental detonation), and impact sensitivity is the external expression of the instability of the explosive's molecular structure. Many researchers have studied the relationships between the sensitivity of energetic materials and other properties, and between different forms of sensitivity. For example, in 2004, prior art 11 studied the heat of explosion and the sensitivity of 30 nitramines under different predicted reaction paths by quantum mechanical methods and found a linear correlation between the heat of explosion and the natural logarithm of the sensitivity. In 2017, prior art 12 demonstrated theoretically the correlation between the detonation performance and the sensitivity of explosives; a study of 93 aliphatic nitro compounds found that log(h50) increases linearly with D^-4, P^-2 or E_G^-1, with a coefficient of determination R² close to 0.8. In 2014, prior art 13 introduced a new relationship between impact sensitivity and thermal decomposition activation energy based on 40 CHON nitroaromatic energetic compounds, showing that impact sensitivity is a function of the thermal decomposition activation energy, the atomic ratio of H to O and specific molecular structural parameters.
In 2016, prior art 14 proposed a simple general model for predicting friction sensitivity: taking thermal decomposition activation energy and molecular structure parameters as features, it built friction sensitivity models for 21 cyclic and acyclic nitramines with the MLR method, with a model RMS of 14.2 N. Also in 2016, prior art 15 studied the electric spark sensitivity and impact sensitivity of 28 CHON nitroaromatic energetic compounds, and the impact sensitivity and electrostatic sensitivity of 27 nitroaromatic compounds; in each case the two forms of sensitivity were clearly correlated, and the RMS values of the resulting linear models were 1.55 J and 2.4%, respectively. In 2018, prior art 16 again used linear modeling to build relationship models between the impact sensitivity and electric spark sensitivity of 11 explosives, and between the electric spark sensitivity and impact sensitivity of 31 nitroaromatics and 14 nitramines, with RMS values of 2.38 kbar and 1.31 J, respectively. The mutual influence and mutual restriction between different properties provide very important guidance for experimental research, but the relationship between sensitivity and mechanical properties has not yet been studied. In addition, although the most commonly used machine learning methods such as ANN are good at extracting abstract features, give highly accurate quantitative and classification models and can automatically generate and optimize entirely new structures, the resulting models lack interpretability.
With the rapid development of artificial intelligence and computer hardware, the impact sensitivity of a substance can be obtained by quantum chemical calculation or by empirical formulas. Although many such methods are now available, they still have limitations. The quantum chemical calculation method established in prior art 17 is highly accurate, but the process is complex and time-consuming and suits only a few specific molecules; calculating a large number of samples consumes enormous computing resources and time. QSPR models, by contrast, can easily predict a large number of samples in a short time, but, as noted above, their accuracy still needs improvement. As for mechanical properties, assessing the mechanical behavior (elasticity, plasticity and fracture) of molecular crystals is often complicated by the difficulty of preparing samples. Therefore, monomers such as HMX and energetic compounds such as energetic cocrystals are at present generally studied by molecular dynamics (MD) simulation, and the mechanical properties are usually calculated from the elastic constants. However, MD simulation takes a long time, cannot handle a large number of samples in a short time and cannot be applied to experiments on unsynthesized substances, so it is necessary to establish structure-property relationships to reduce the time cost of predicting unknown samples and to evaluate novel energetic materials.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) at present there is no research on the quantitative structure-activity relationship between the mechanical properties of energetic materials and their structure, nor on the relationship between sensitivity and mechanical properties;
(2) the prediction accuracy of existing sensitivity prediction models still has room for improvement, their generalization ability is insufficient, they are not strongly representative, and the resulting models lack interpretability;
(3) calculating the mechanical properties of a substance is complex and time-consuming, suits only a few specific molecules, cannot handle a large number of samples in a short time and cannot be applied to experiments on unsynthesized substances.
The difficulty in solving the above problems and defects is:
(1) data acquisition is difficult: experimental mechanical property data for a large number of monomer energetic materials are hard to obtain, and obtaining calculated data consumes substantial computing resources at high cost;
(2) the choice of methods is limited: although methods such as deep learning, which are more powerful than conventional machine learning, continue to advance, they cannot be applied here because of the limited amount of existing data, and common machine learning methods often lack interpretability.
The significance of solving the problems and the defects is as follows:
(1) the research on the mechanical properties of the energetic material has important significance for guiding the formulation of the energetic material and the design of structural members, and carrying out safety evaluation, life prediction and the like on the energetic material;
(2) the research on the relationship between the sensitivity and the mechanical property and the molecular structure is helpful for comprehensively evaluating the performance of the novel energetic material so as to accelerate the research and development process of the energetic material;
(3) the relationship of mutual influence and mutual restriction between the mechanical property and the sensitivity has very important guiding significance for experimental research.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a machine learning estimation method for sensitivity, mechanical property and relation of energetic substances.
The invention is realized as a machine learning estimation method for the sensitivity and mechanical properties of energetic materials and the relationship between them, comprising: taking the molecular descriptors calculated by E-Dragon and molecular structure information as features, constructing 7 quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of nitro energetic materials based on an artificial neural network (ANN) and the sure independence screening and sparsifying operator (SISSO) method, and using the constructed models to determine the relationship between the impact sensitivity and the mechanical properties of the nitro energetic materials and the quantitative relationship between these properties and the molecular structure.
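The SISSO idea named above can be illustrated with a deliberately reduced sketch. It is not the SISSO program and not the patent's implementation: it assumes one round of pairwise operators (+, -, *, /), a screening step that keeps the k candidates most correlated with the target, and a one-dimensional least-squares fit in place of the full sparse search. All function names are hypothetical.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)


def build_candidates(features):
    """Expand primary features with pairwise +, -, *, / (one round only, an assumption)."""
    cands = dict(features)
    for (na, a), (nb, b) in itertools.combinations(features.items(), 2):
        cands[f"({na}+{nb})"] = [p + q for p, q in zip(a, b)]
        cands[f"({na}-{nb})"] = [p - q for p, q in zip(a, b)]
        cands[f"({na}*{nb})"] = [p * q for p, q in zip(a, b)]
        if all(q != 0 for q in b):
            cands[f"({na}/{nb})"] = [p / q for p, q in zip(a, b)]
    return cands


def sis(cands, target, k):
    """Screening step: keep the k candidates most correlated with the target."""
    ranked = sorted(cands, key=lambda name: abs(pearson(cands[name], target)), reverse=True)
    return ranked[:k]


def fit_line(x, y):
    """Least-squares fit y ~ slope * x + intercept; returns (slope, intercept, rss)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    return slope, intercept, rss


def sisso_1d(features, target, k=5):
    """Pick the single screened candidate with the lowest residual sum of squares."""
    cands = build_candidates(features)
    best = None
    for name in sis(cands, target, k):
        if len(set(cands[name])) < 2:
            continue  # a constant candidate cannot be fitted
        slope, intercept, rss = fit_line(cands[name], target)
        if best is None or rss < best[3]:
            best = (name, slope, intercept, rss)
    return best
```

The appeal of this family of methods, compared with an ANN, is that the result is an explicit formula in named features, which addresses the interpretability concern raised in the background.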
Further, the machine learning estimation method for the sensitivity and mechanical properties of energetic materials and the relationship between them comprises the following steps:
step one, acquiring and processing data: obtaining impact sensitivity and bulk modulus values for 240 nitro compounds, mainly nitroaromatics, collecting 7 common features such as molecular weight and crystal density, and calculating the Dragon molecular descriptors of the 240 substances with SMILES strings as input;
step two, taking impact sensitivity and bulk modulus as target properties, screening out the corresponding final features step by step, and establishing quantitative structure-property relationship (QSPR) models between the impact sensitivity and bulk modulus of the energetic materials and their molecular structure using the ANN and SISSO methods;
step three, taking the impact sensitivity of the 240 nitro energetic compounds as output and the numbers of C, H, O and N atoms, molecular weight, crystal density, oxygen balance and bulk modulus as features, establishing a correlation formula between the sensitivity and mechanical properties of the energetic materials, and analyzing the relationship between the impact sensitivity and mechanical properties of the nitro energetic materials;
step four, comparing the performance of models that combine the ANN and SISSO methods with the two kinds of features, calculating the 5-fold cross-validation result of each model after finding its optimal parameters, and comparing how well the models suit features selected from experience versus molecular descriptors calculated from SMILES strings.
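The 5-fold cross-validation in step four can be sketched as follows. This is a minimal illustration, assuming a shuffled split into five equal folds; the function names and the fit/predict/score callables are hypothetical placeholders for whichever model (ANN or SISSO) is being evaluated.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # every k-th shuffled index goes to one fold
    for i in range(k):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, folds[i]


def cross_validate(fit, predict, score, X, y, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, and return (mean score, all scores)."""
    scores = []
    for train, val in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in val]
        scores.append(score([y[i] for i in val], preds))
    return sum(scores) / k, scores
```

With the 240 compounds described above, each fold would hold 48 samples for validation and 192 for training.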
Further, the data acquisition method comprises the following steps:
obtaining crystal structure files and SMILES strings for the 240 nitro energetic compounds, and obtaining their mechanical property data by molecular dynamics simulation with the Forcite module in Materials Studio (MS) software. After structure optimization, the COMPASS force field is applied under the NPT ensemble at 295 K with the Andersen thermostat and the Parrinello barostat, and the pressure is set to 0.0001 GPa. Van der Waals and electrostatic interactions are treated with the atom-based and Ewald summation methods, respectively, with a cutoff radius of 0.95 nm and long-range tail correction. The initial atomic velocities are assigned according to the Maxwell-Boltzmann distribution; the solution of Newton's equations of motion rests on periodic boundary conditions and the basic assumption that the time average is equivalent to the ensemble average; integration uses the Verlet method with a time step of 1 fs, and the trajectory is saved every 10 fs. After the system equilibrates, the 1 ns of trajectory following equilibration is used for mechanical property analysis to obtain the elastic coefficients C_ij (i, j = 1-6), from which the mechanical property parameters are calculated. The impact sensitivity values of the 240 nitro energetic materials are calculated with the empirical impact sensitivity prediction formula for nitro energetic materials proposed by W. P. Lai et al. in 2010.
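The text does not state which averaging scheme converts the elastic coefficients C_ij into moduli. As an illustration only, the sketch below assumes the standard Voigt averages over a 6x6 stiffness matrix; the function names are hypothetical, and the patent's own procedure may use a different scheme (for example Reuss or Hill averaging).

```python
def voigt_bulk_modulus(C):
    """Voigt-average bulk modulus from a 6x6 elastic stiffness matrix (0-based indices)."""
    diagonal = C[0][0] + C[1][1] + C[2][2]
    off_diagonal = C[0][1] + C[0][2] + C[1][2]
    return (diagonal + 2.0 * off_diagonal) / 9.0


def voigt_shear_modulus(C):
    """Voigt-average shear modulus from the same matrix."""
    diagonal = C[0][0] + C[1][1] + C[2][2]
    off_diagonal = C[0][1] + C[0][2] + C[1][2]
    shear = C[3][3] + C[4][4] + C[5][5]
    return (diagonal - off_diagonal + 3.0 * shear) / 15.0
```

The moduli come out in whatever unit the C_ij are given in (typically GPa for MD output).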
Further, the molecular descriptor calculation method includes:
(1) calculating a molecular descriptor:
1666 molecular descriptors are calculated online for each molecule from its SMILES string using E-Dragon 1.0 software; the crystal densities of the nitro energetic compounds are obtained from the CSD (Cambridge Structural Database), and the numbers of C, H, O and N atoms and the molecular weight of each substance are extracted from the molecular formula CaHbOcNd of the energetic material;
wherein a, b, c and d respectively represent the numbers of C, H, O and N atoms in a molecule, M is the relative molecular mass in g/mol, and OB is the oxygen balance of the energetic molecule in g/g; the oxygen balance is calculated as follows:
Figure BDA0002790027680000051
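The formula image is not reproduced in this text. As a hedged illustration, the sketch below assumes the conventional CO2-based definition of oxygen balance, OB = 16 (c - 2a - b/2) / M in g/g, using the symbols defined above (a, b, c, d for C, H, O, N). The atomic masses are standard values and the function names are hypothetical; the patent's own expression may differ in form.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007}


def molecular_weight(a, b, c, d):
    """Relative molecular mass M of CaHbOcNd in g/mol."""
    return (a * ATOMIC_MASS["C"] + b * ATOMIC_MASS["H"]
            + c * ATOMIC_MASS["O"] + d * ATOMIC_MASS["N"])


def oxygen_balance(a, b, c, d):
    """Assumed CO2-based oxygen balance in g/g: oxygen present minus oxygen
    needed to convert all C to CO2 and all H to H2O, per gram of compound."""
    oxygen_deficit_atoms = c - 2 * a - b / 2
    return ATOMIC_MASS["O"] * oxygen_deficit_atoms / molecular_weight(a, b, c, d)
```

For example, TNT (C7H5O6N3) gives a value near -0.74 g/g under this definition, matching the commonly quoted -74%.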
(2) the number of descriptors is reduced using statistical methods to obtain the required descriptors for building the QSPR model.
Further, in the step (2), the reducing the number of the descriptors by using a statistical method to obtain the required descriptors for constructing the QSPR model includes the following steps:
1) eliminating all descriptors that contain error information or whose exact values cannot be calculated;
2) removing descriptors for which more than 75% of samples have the same value;
3) omitting descriptors with a relative standard deviation (RSD) less than 0.05;
4) for each pair of descriptors whose Pearson correlation coefficient r is greater than 0.75, removing the descriptor that is less correlated with the target value;
5) removing descriptors with p values (probability values) greater than 0.005 using MLR forward stepwise regression.
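Steps 1) to 4) above can be sketched as a single filter. This is a minimal illustration under stated assumptions: descriptors are held in a plain dict of columns, failed calculations appear as None or non-finite values, and step 4) is done greedily in order of correlation with the target. Step 5) (forward stepwise MLR with a p-value cutoff) is not implemented here. The function names are hypothetical.

```python
import math
import statistics
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)


def screen_descriptors(table, target, same_frac=0.75, min_rsd=0.05, max_r=0.75):
    """table maps descriptor name -> list of values, one per compound."""
    n = len(target)
    kept = {}
    for name, col in table.items():
        # 1) drop descriptors with missing or non-finite values
        if any(v is None or not math.isfinite(v) for v in col):
            continue
        # 2) drop descriptors where more than 75% of samples share one value
        if Counter(col).most_common(1)[0][1] > same_frac * n:
            continue
        # 3) drop descriptors with relative standard deviation below 0.05
        mean = statistics.fmean(col)
        if mean != 0 and statistics.stdev(col) / abs(mean) < min_rsd:
            continue
        kept[name] = col
    # 4) greedy pass: visit descriptors from most to least correlated with the
    #    target and keep one only if |r| <= max_r against all already kept
    ranked = sorted(kept, key=lambda nm: abs(pearson(kept[nm], target)), reverse=True)
    selected = []
    for name in ranked:
        if all(abs(pearson(kept[name], kept[s])) <= max_r for s in selected):
            selected.append(name)
    return selected
```

The greedy ordering guarantees that, within any highly correlated pair, the descriptor more correlated with the target is the one that survives.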
Further, after screening, the feature descriptors and target properties used to build the QSPR models are organized into the following 4 data sets:
Dataset-1: with bulk modulus as the target property, 14 molecular descriptors of 9 types: structure descriptors, information indices, edge adjacency indices, BCUT descriptors, geometric descriptors, 3D-MoRSE descriptors, GETAWAY descriptors, atom-centred fragments and molecular properties;
Dataset-2: with impact sensitivity as the target property, 17 molecular descriptors of 8 types, including 2D autocorrelation descriptors, geometric descriptors, RDF descriptors, 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors and molecular properties;
Dataset-3: impact sensitivity together with 8 features: bulk modulus, crystal density, oxygen balance, molecular weight and the numbers of C, H, O and N atoms;
Dataset-4: with impact sensitivity and bulk modulus together as target properties, the descriptors from the two screenings are merged and de-duplicated, giving 26 descriptors of 10 types.
Further, the method for constructing the quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of the nitro energetic substances comprises:
using Dataset-1, Dataset-2, Dataset-3 and Dataset-4 as data sets, each randomly divided into two subsets, with 80% of the data as the training set and 20% as the test set; modeling the 4 data sets with the SISSO and ANN methods to obtain 7 quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of the nitro energetic substances;
the quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of the nitro energetic substances are as follows:
model-1: ANN model of bulk modulus with the corresponding 14 molecular descriptors (Dataset-1);
model-2: ANN model of impact sensitivity with the corresponding 17 molecular descriptors (Dataset-2);
model-3: ANN model of impact sensitivity with 8 features: bulk modulus, crystal density, oxygen balance, molecular weight and the numbers of C, H, O and N atoms (Dataset-3);
model-4: ANN model of impact sensitivity and bulk modulus with the corresponding 26 molecular descriptors (Dataset-4);
model-5: SISSO model of bulk modulus with 14 molecular descriptors (Dataset-1);
model-6: SISSO model of impact sensitivity with 17 molecular descriptors (Dataset-2);
model-7: SISSO model of impact sensitivity with 8 features: bulk modulus, crystal density, oxygen balance, molecular weight and the numbers of C, H, O and N atoms (Dataset-3).
Further, in step four, the two kinds of features are:
features such as oxygen balance, selected according to chemical experience (Dataset-3);
and molecular descriptors screened step by step by statistical methods according to the target property (Dataset-1, Dataset-2 and Dataset-4).
Further, in step four, comparing the model performance of the ANN and SISSO methods combined with the two kinds of features, comparing how well the two kinds of features suit the models, and obtaining the architecture of the best QSPR model for each data set comprises:
using the root mean square error RMSE, the Pearson correlation coefficient R and the coefficient of determination $R^2$ to comprehensively evaluate the performance of the 7 QSPR models on the training set and the test set;
the formula is as follows:
Figure BDA0002790027680000061
Figure BDA0002790027680000062
Figure BDA0002790027680000063
wherein N is the number of compounds in each data set, $y_i^{\mathrm{true}}$ is the true value, $y_i^{\mathrm{pred}}$ is the value predicted by the model, $\bar{y}^{\mathrm{true}}$ is the mean of the true values of the samples, and $\bar{y}^{\mathrm{pred}}$ is the mean of the predicted values of the samples.
Further, the optimal QSPR model corresponding to the data set is:
the best model with the 14 descriptors as features and the bulk modulus (mechanical property) as the target property is the ANN model (Model-1); the best model with the 17 descriptors as features and the impact sensitivity as the target property is also the ANN model (Model-2).
Further, the relational formula between the impact sensitivity and the mechanical property of the nitro energetic substance is as follows:
Figure BDA0002790027680000066
wherein $h_{50}$ is the impact sensitivity; a, b, c and d are the numbers of C, H, O and N atoms in the molecule, respectively; M is the molecular weight; OB is the oxygen balance; and K is the bulk modulus.
It is a further object of the invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of: taking the molecular descriptors calculated by E-Dragon and molecular structure information as features, constructing 7 quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of nitro energetic materials based on an artificial neural network (ANN) and the sure independence screening and sparsifying operator (SISSO) method, and using the constructed models to determine the relationship between the impact sensitivity and the mechanical properties of nitro energetic materials and the quantitative relationship between these two properties and the molecular structure.
It is another object of the present invention to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of: taking the molecular descriptors calculated by E-Dragon and molecular structure information as features, constructing 7 quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of nitro energetic materials based on an artificial neural network (ANN) and the sure independence screening and sparsifying operator (SISSO) method, and using the constructed models to determine the relationship between the impact sensitivity and the mechanical properties of nitro energetic materials and the quantitative relationship between these two properties and the molecular structure.
Another object of the present invention is to provide a machine learning estimation system for sensitivity and mechanical properties of energetic materials and their relationships, which implements the machine learning estimation method for sensitivity and mechanical properties of energetic materials and their relationships, the machine learning estimation system for sensitivity and mechanical properties of energetic materials and their relationships comprising:
a quantitative structure-activity relationship model construction module, used for constructing, with the molecular descriptors calculated by E-Dragon as features, 5 quantitative structure-activity relationship models relating the impact sensitivity and bulk modulus of nitro energetic materials to their molecular structure, based on an artificial neural network and the sure independence screening and sparsifying operator (SISSO) method;
and an impact sensitivity and mechanical property relationship determining module, used for determining the relationship between the impact sensitivity and the mechanical properties of nitro energetic materials by using the 2 quantitative structure-activity relationship models of impact sensitivity and bulk modulus constructed with molecular structure information as features.
Taking all of the above technical solutions together, the advantages and positive effects of the invention are as follows: 7 QSPR models (Models 1-7) of the impact sensitivity and bulk modulus (mechanical property) of nitro energetic compounds are established from the molecular descriptors calculated by E-Dragon and several common, easily obtained items of molecular structure information, which helps shorten experimental research on energetic materials and supports the design and comprehensive evaluation of novel energetic compounds. Experiments also show that the models have certain advantages and high accuracy in predicting impact sensitivity.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained from the drawings without creative efforts.
FIG. 1 is a schematic diagram of a machine learning estimation method for sensitivity and mechanical properties of energetic materials and relationships thereof according to an embodiment of the present invention.
FIG. 2 is a flow chart of a machine learning estimation method for sensitivity and mechanical properties of energetic materials and their relationships according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of model comparison using Dataset-1 according to an embodiment of the present invention;
in the figure: A: $R^2$ of the Dataset-1 models; B: R of the Dataset-1 models; C: RMSE of the Dataset-1 models.
FIG. 4 is a schematic diagram of model comparison using Dataset-2 according to an embodiment of the present invention;
in the figure: A: $R^2$ of the Dataset-2 models; B: R of the Dataset-2 models; C: RMSE of the Dataset-2 models.
FIG. 5 is a schematic diagram of model comparison using Dataset-3 according to an embodiment of the present invention;
in the figure: A: $R^2$ of the Dataset-3 models; B: R of the Dataset-3 models; C: RMSE of the Dataset-3 models.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the problems in the prior art, the invention provides a machine learning estimation method for sensitivity and mechanical property of energetic substances and relationship thereof, and the invention is described in detail below with reference to the accompanying drawings.
As shown in FIG. 1, the machine learning estimation method for the sensitivity and mechanical properties of energetic materials and their relationship provided by an embodiment of the present invention includes: taking the molecular descriptors calculated by E-Dragon and molecular structure information as features, constructing 7 quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of nitro energetic materials based on an artificial neural network (ANN) and the sure independence screening and sparsifying operator (SISSO) method, and using the constructed models to determine the relationship between the impact sensitivity and the mechanical properties of nitro energetic materials and the quantitative relationship between these two properties and the molecular structure.
As shown in fig. 2, the method for machine learning estimation of sensitivity and mechanical property of energetic material and relationship thereof according to the embodiment of the present invention includes the following steps:
S101, acquiring and processing data: obtaining the impact sensitivity and bulk modulus values of 240 nitro compounds, mainly nitroarenes, collecting 7 common features such as molecular weight and crystal density, and calculating the Dragon molecular descriptors of the 240 substances with SMILES strings as input;
S102, screening out the corresponding final features step by step with the impact sensitivity and the bulk modulus as target properties, and establishing quantitative structure-property relationship (QSPR) models between the impact sensitivity and bulk modulus of the energetic materials and their molecular structure with the artificial neural network (ANN) and SISSO methods;
S103, establishing a correlation formula between the sensitivity and the mechanical properties of energetic materials, with the impact sensitivity of the 240 nitro energetic compounds as output and the numbers of C, H, O and N atoms, the molecular weight, the crystal density, the oxygen balance and the bulk modulus as features, and analyzing the relationship between the impact sensitivity and the mechanical properties of nitro energetic materials;
S104, comparing the model performance of the ANN and SISSO methods combined with the two kinds of features, calculating the 5-fold cross-validation result of each model after its optimal parameters have been found, and comparing how well the features selected by experience and the molecular descriptors calculated from SMILES strings suit the models.
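By way of illustration only, the 5-fold cross-validation of step S104 can be sketched as the index bookkeeping below. The function name `k_fold_indices`, the fixed seed and the choice to cross-validate 192 samples (80% of the 240 compounds) are assumptions; the embodiment does not state which subset is cross-validated.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Assumed usage: cross-validate on a 192-sample training set.
splits = list(k_fold_indices(192, k=5))
for train, validation in splits:
    print(len(train), len(validation))
```

Each sample appears in exactly one validation fold, so averaging a metric over the five folds uses every sample once for validation.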
The data acquisition method provided by the embodiment of the invention comprises the following steps:
obtaining crystal structure files and SMILES strings of the 240 nitro energetic compounds, and obtaining the mechanical property data of the nitro energetic compounds by molecular dynamics simulation with the Forcite module of the Materials Studio (MS) software; after structure optimization, the COMPASS force field is applied under the NPT ensemble at 295 K, with Andersen temperature control and Parrinello pressure control and the pressure set to 0.0001 GPa; the van der Waals and electrostatic interactions are treated with the atom-based and Ewald summation methods, respectively, with a cutoff radius of 0.95 nm and a tail correction for the truncation. The initial atomic velocities are assigned from the Maxwell-Boltzmann distribution; Newton's equations of motion are solved under periodic boundary conditions and on the basic assumption that the time average is equivalent to the ensemble average; the integration uses the Verlet method with a time step of 1 fs, and the trajectory is saved every 10 fs; after the system is equilibrated, a 1 ns post-equilibration trajectory is used for mechanical property analysis to obtain the elastic constants $C_{ij}$ (i, j = 1-6), from which the mechanical property parameters are calculated. The impact sensitivity values of the 240 nitro energetic materials are calculated with the empirical impact sensitivity prediction formula for nitro energetic materials proposed by W. P. Lai et al. in 2010.
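As an illustrative sketch only, the step from the elastic constants to the bulk modulus can be written as below. The embodiment does not state which averaging scheme is applied to the elastic constants, so the Voigt average is assumed here; the function name `voigt_bulk_modulus` and the example stiffness matrix are hypothetical.

```python
def voigt_bulk_modulus(c):
    """Voigt-average bulk modulus from a 6x6 stiffness matrix in Voigt notation.

    `c` is a nested list; the result has the same units as its entries (e.g. GPa).
    The Voigt average is an assumption; the source does not name the scheme.
    """
    diagonal = c[0][0] + c[1][1] + c[2][2]
    off_diagonal = c[0][1] + c[0][2] + c[1][2]
    return (diagonal + 2.0 * off_diagonal) / 9.0


# Hypothetical isotropic stiffness matrix: C11 = 10, C12 = 4, C44 = 3 (GPa).
c11, c12, c44 = 10.0, 4.0, 3.0
stiffness = [
    [c11, c12, c12, 0.0, 0.0, 0.0],
    [c12, c11, c12, 0.0, 0.0, 0.0],
    [c12, c12, c11, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, c44, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, c44, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, c44],
]
bulk_modulus = voigt_bulk_modulus(stiffness)
print(bulk_modulus)  # 6.0
```

For an isotropic solid the Voigt value reduces to (C11 + 2 C12) / 3, which gives a quick check on the result.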
The method for calculating the molecular descriptor provided by the embodiment of the invention comprises the following steps:
(1) calculating a molecular descriptor:
calculating 1666 molecular descriptors for each molecule online from its SMILES string with the E-Dragon 1.0 software; obtaining the crystal density of each nitro energetic compound from the CSD (Cambridge crystal database), and extracting the numbers of C, H, O and N atoms and the molecular weight of each substance from the molecular formula $C_aH_bO_cN_d$ of the energetic material;
wherein a, b, c and d are the numbers of C, H, O and N atoms in the molecule, respectively, M is the relative molecular mass in g/mol, and OB is the oxygen balance of the energetic molecule in g/g; the oxygen balance is calculated as follows:
Figure BDA0002790027680000081
(2) the number of descriptors is reduced using statistical methods to obtain the required descriptors for building the QSPR model.
In step (2), reducing the number of descriptors by statistical methods to obtain the descriptors required for constructing the QSPR model, as provided by the embodiment of the present invention, includes the following steps:
1) eliminating all descriptors that contain error information, that is, for which no exact numerical value can be calculated;
2) removing descriptors for which more than 75% of the samples have the same value;
3) omitting descriptors with a relative standard deviation RSD of less than 0.05;
4) for any pair of descriptors whose Pearson correlation coefficient r is greater than 0.75, removing the descriptor that is less correlated with the target value;
5) removing descriptors with a p value (probability value) greater than 0.005 by MLR forward stepwise regression.
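Steps 1) to 4) above can be sketched as follows, for illustration only. The function names, the dictionary layout of the descriptor table and the toy data are hypothetical; the use of the absolute value of r and the greedy ordering in step 4) are assumptions, and step 5) (MLR forward stepwise regression on p values) is not reproduced here.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def reduce_descriptors(table, target, same_frac=0.75, min_rsd=0.05, max_r=0.75):
    """table maps descriptor name -> values (None marks a failed calculation)."""
    kept = {}
    for name, values in table.items():
        # Step 1: drop descriptors with missing or non-finite values.
        if any(v is None or not math.isfinite(v) for v in values):
            continue
        # Step 2: drop descriptors where more than 75% of samples share one value.
        if max(values.count(v) for v in set(values)) / len(values) > same_frac:
            continue
        # Step 3: drop descriptors with relative standard deviation below 0.05.
        mean = statistics.fmean(values)
        if mean != 0 and statistics.stdev(values) / abs(mean) < min_rsd:
            continue
        kept[name] = values
    # Step 4: visit descriptors from most to least target-correlated and keep one
    # only if it is not highly correlated with a descriptor already selected.
    ranked = sorted(kept, key=lambda n: abs(pearson(kept[n], target)), reverse=True)
    selected = []
    for name in ranked:
        if all(abs(pearson(kept[name], kept[s])) <= max_r for s in selected):
            selected.append(name)
    return selected


# Toy data (hypothetical): six samples, five candidate descriptors.
target = [1, 2, 3, 4, 5, 6]
table = {
    "good": [1.1, 2.0, 2.9, 4.2, 5.1, 5.8],
    "dup": [1, 2, 3, 4, 5, 9],             # redundant with "good"
    "const": [1, 1, 1, 1, 1, 2],           # more than 75% identical values
    "flat": [100, 101, 99, 100, 102, 98],  # RSD below 0.05
    "bad": [1.0, None, 3.0, 4.0, 5.0, 6.0],
    "indep": [3, 1, 4, 1, 5, 2],
}
selected = reduce_descriptors(table, target)
print(selected)
```

With the toy table, "bad", "const" and "flat" fall to steps 1) to 3), and "dup" is removed in step 4) because it is strongly correlated with "good" but less correlated with the target.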
The feature descriptors and the target properties for building the QSPR models provided by the embodiment of the invention are arranged into the following 4 data sets:
Dataset-1: 14 molecular descriptors obtained with the bulk modulus as the target property, covering 9 descriptor classes: constitutional descriptors, information indices, edge adjacency indices, BCUT descriptors, geometrical descriptors, 3D-MoRSE descriptors, GETAWAY descriptors, atom-centred fragments and molecular properties;
Dataset-2: 17 molecular descriptors obtained with the impact sensitivity as the target property, covering 8 descriptor classes, including 2D autocorrelation descriptors, geometrical descriptors, RDF descriptors, 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors and molecular properties;
Dataset-3: the impact sensitivity together with 8 features, namely bulk modulus, crystal density, oxygen balance, molecular weight and the numbers of C, H, O and N atoms;
Dataset-4: 26 descriptors covering 10 descriptor classes, obtained by merging and de-duplicating the two descriptor sets screened with the impact sensitivity and the bulk modulus, respectively, as the target property.
The method for constructing the quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of nitro energetic substances provided by the embodiment of the invention comprises the following steps:
adopting Dataset-1, Dataset-2, Dataset-3 and Dataset-4 as data sets, and randomly dividing each data set into two subsets, with 80% of the data as the training set and 20% as the test set; modeling the 4 data sets with the SISSO and ANN methods to obtain 7 quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of nitro energetic substances;
the 7 quantitative structure-activity relationship models of the impact sensitivity and bulk modulus of nitro energetic substances are as follows:
Model-1: ANN model of the bulk modulus with the corresponding 14 molecular descriptors (Dataset-1);
Model-2: ANN model of the impact sensitivity with the corresponding 17 molecular descriptors (Dataset-2);
Model-3: ANN model of the impact sensitivity with 8 features, namely bulk modulus, crystal density, oxygen balance, molecular weight and the numbers of C, H, O and N atoms (Dataset-3);
Model-4: ANN model of the impact sensitivity and bulk modulus with the corresponding 26 molecular descriptors (Dataset-4);
Model-5: SISSO model of the bulk modulus with the 14 molecular descriptors (Dataset-1);
Model-6: SISSO model of the impact sensitivity with the 17 molecular descriptors (Dataset-2);
Model-7: SISSO model of the impact sensitivity with 8 features, namely bulk modulus, crystal density, oxygen balance, molecular weight and the numbers of C, H, O and N atoms (Dataset-3).
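For illustration only, the random 80%/20% division of a data set described above can be sketched as follows. The function name `split_train_test` and the fixed seed are assumptions introduced for reproducibility; the embodiment specifies only that the division is random.

```python
import random


def split_train_test(n_samples, test_fraction=0.2, seed=0):
    """Randomly partition sample indices into training and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = round(n_samples * test_fraction)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# 240 compounds, as in the data sets above.
train_idx, test_idx = split_train_test(240)
print(len(train_idx), len(test_idx))  # 192 48
```

Using the same index split for the ANN and SISSO models built on one data set keeps their test-set metrics directly comparable.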
In step S104, the two kinds of features provided by the embodiment of the present invention are:
features such as oxygen balance, selected according to chemical experience (Dataset-3);
and molecular descriptors screened step by step by statistical methods according to the target property (Dataset-1, Dataset-2 and Dataset-4).
In step S104, comparing the model performance of the ANN and SISSO methods combined with the two kinds of features, comparing how well the two kinds of features suit the models, and obtaining the architecture of the best QSPR model for each data set, as provided by the embodiment of the present invention, comprises:
using the root mean square error RMSE, the Pearson correlation coefficient R and the coefficient of determination $R^2$ to comprehensively evaluate the performance of the 7 QSPR models on the training set and the test set;
the formula is as follows:
Figure BDA0002790027680000101
Figure BDA0002790027680000102
Figure BDA0002790027680000103
wherein N is the number of compounds in each data set, $y_i^{\mathrm{true}}$ is the true value, $y_i^{\mathrm{pred}}$ is the value predicted by the model, $\bar{y}^{\mathrm{true}}$ is the mean of the true values of the samples, and $\bar{y}^{\mathrm{pred}}$ is the mean of the predicted values of the samples.
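For illustration only, the three evaluation metrics named above can be computed as in the sketch below. The conventional definitions are assumed, with the coefficient of determination taken as one minus the ratio of the residual sum of squares to the total sum of squares; the formulas of the embodiment are given only as figures, so this reading is an assumption, as are the function names.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted values."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed to be 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values: every prediction is offset by +1 from the true value.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 3.0, 4.0, 5.0]
print(rmse(y_true, y_pred), pearson_r(y_true, y_pred), r_squared(y_true, y_pred))
```

The example shows why the three metrics are reported together: a constant offset leaves R at 1 while the RMSE and the coefficient of determination both reveal the error.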
The optimal QSPR model corresponding to the data set provided by the embodiment of the invention is as follows:
the best model with the 14 descriptors as features and the bulk modulus (mechanical property) as the target property is the ANN model (Model-1); the best model with the 17 descriptors as features and the impact sensitivity as the target property is also the ANN model (Model-2).
The relational formula between the impact sensitivity and the mechanical property of the nitro energetic substance provided by the embodiment of the invention is as follows:
Figure BDA0002790027680000106
wherein $h_{50}$ is the impact sensitivity; a, b, c and d are the numbers of C, H, O and N atoms in the molecule, respectively; M is the molecular weight; OB is the oxygen balance; and K is the bulk modulus.
The technical solution of the present invention is further illustrated by the following specific examples.
Example 1:
1. Introduction
The invention aims to research the relationship between the impact sensitivity and the mechanical property of the nitro energetic substance and the quantitative relationship between the two and the molecular structure by adopting two methods, namely ANN and SISSO. The content of the study includes the following four parts.
1) Establishing a QSPR model of the mechanical properties and molecular structure of energetic materials. Nitro compounds, as high energy density materials (HEDMs) widely used in civilian and military applications, remain the most important class of explosives today. Almost all energetic materials contain nitro groups attached to carbon, nitrogen or oxygen atoms (X-NO2, X = C, N or O). The nitro group supplies the energetic molecule with nitrogen, whose conversion to N2 on decomposition releases a large amount of energy, and also supplies the oxygen that is essential to the combustion or detonation of high-energy materials. A review of previous research shows that, in addition to studies covering several classes of nitro energetic substances (including nitroarenes, nitramines, aliphatic nitro compounds and nitrates), many researchers have studied nitroarenes alone, which have attracted great attention as an important component of nitro energetic substances. Therefore, 240 nitro compounds, mainly nitroarenes, were collected from the Cambridge crystal database; besides nitro groups, they contain N- and O-containing groups such as amino, hydroxyl, carboxyl, alkoxy and amide groups. With Dragon molecular descriptors as model features, a model relating the mechanical properties of nitro energetic substances to their molecular structure is constructed with the ANN and SISSO methods.
2) Establishing a QSPR model between the impact sensitivity of energetic materials and their molecular structure. This model is constructed in the same way as the QSPR model of the mechanical properties: the descriptors of the 240 molecules are calculated first, the final features are then screened out step by step according to the target property, and finally a model relating the descriptor features to the impact sensitivity is established with the ANN and SISSO methods.
3) Establishing a correlation between the sensitivity and the mechanical properties of energetic materials. The invention takes the impact sensitivity of 240 nitro energetic compounds as output and the numbers of C, H, O and N atoms, the molecular weight, the crystal density, the oxygen balance and the bulk modulus as features to establish a relational expression between sensitivity and mechanical properties, so that the relationship between the two can be analyzed from a theoretical point of view. The features other than the bulk modulus were selected on the basis of previous research experience. The numbers of C, H, O and N atoms and the molecular weight are the most basic properties of a substance and are often used as features to predict detonation performance, density, auto-ignition temperature and other properties. The crystal density is one of the most easily obtained properties of energetic materials and is also often used as a feature to predict detonation velocity and other related properties. As early as 1979, Kamlet and Adolph found a linear relationship between the logarithm of the impact sensitivity and the oxygen balance for energetic materials sharing the same decomposition mechanism. It is worth noting that the impact sensitivity and bulk modulus data used in the present invention are calculated values, not experimental values. With the rapid development of artificial intelligence and computer hardware, the impact sensitivity of a substance can be obtained by quantum chemical calculation or by empirical formulas; the predictions of these models are accurate and reliable, greatly save the time and cost of experimental research, and are widely used in the design of novel energetic materials.
Secondly, because the process triggered by impact is extremely complex, experimental impact sensitivity data vary widely with the test equipment, sample size, configuration and so on, and the measurements are generally not reproducible; the experimental data therefore serve only as an approximate indication of sensitivity, and integrating experimental data obtained under different conditions is difficult. For this reason, the invention adopts a computational approach in the absence of experimental data. In the prior art 18, CHON polynitro aromatic compounds were studied and a correlation formula for impact sensitivity was obtained from the molecular weight of the explosive molecule and the number of atoms of each element; although its results agree well with experimental data, the influence of the positions at which different groups are attached is not considered. The empirical formula adopted by the invention improves on this by introducing a correction factor for group position, which further improves the prediction accuracy.
Although there are many ways to calculate impact sensitivity, each has limitations. Existing quantum chemical methods are accurate, but the procedure is complex and time-consuming and is suitable only for a few specific molecules; calculating a large number of samples consumes enormous computing resources and time. The empirical formula method adopted by the invention usually requires the nitro compounds to be subdivided into several classes according to their structural features (such as polynitroarenes, nitramines, aliphatic polynitro compounds and nitro heterocyclic compounds), with a separate formula constructed for each class; in use, the correction terms are usually calculated by looking up the structural information (such as the number of groups and their relative positions) molecule by molecule, so large-scale calculation is time-consuming and labor-intensive. The QSPR model method, by contrast, can easily predict a large number of samples in a short time, but, as mentioned above, its accuracy still has room for improvement, which is why the invention establishes a sensitivity model. As for mechanical properties, the mechanical behaviour (elasticity, plasticity and fracture) of molecular crystals is often difficult to assess because samples are hard to prepare. At present, therefore, single compounds such as HMX and energetic compounds such as energetic cocrystals are generally studied by molecular dynamics (MD) simulation, and the various mechanical properties are usually calculated from the elastic constants.
However, MD simulation has drawbacks: the calculations take a long time, a large number of samples cannot be calculated in a short time, and it cannot be applied to experiments on uncomplicated substances. It is therefore necessary to establish a structure-property relationship to reduce the time cost of predicting unknown samples or evaluating novel energetic materials.
4) Comparing the model performance of the ANN and SISSO methods combined with the two kinds of features. One kind comprises features such as oxygen balance, selected according to chemical experience (Dataset-3); the other comprises molecular descriptors screened step by step by statistical methods according to the target property, without manual, experience-based selection (Dataset-1, Dataset-2 and Dataset-4). In addition to constructing QSPR models with the ANN and SISSO methods to predict sensitivity and mechanical properties, the invention also discusses and compares how well the two kinds of features suit the models.
2. Methods
2.1 Data collection
The invention obtains the crystal structure files and SMILES strings of 240 nitro energetic compounds from the Cambridge crystal database (CSD), and obtains the mechanical property data of the nitro energetic compounds by molecular dynamics simulation with the Forcite module of the Materials Studio (MS) software. After structure optimization, the COMPASS force field is applied under the NPT ensemble at 295 K, with Andersen temperature control and Parrinello pressure control and the pressure set to 0.0001 GPa; the van der Waals (vdW) and electrostatic (Coulomb) interactions are treated with the atom-based and Ewald summation methods, respectively, with a cutoff radius of 0.95 nm and a tail correction for the truncation. The initial atomic velocities are assigned from the Maxwell-Boltzmann distribution; Newton's equations of motion are solved under periodic boundary conditions and on the basic assumption that the time average is equivalent to the ensemble average; the integration uses the Verlet method with a time step of 1 fs, and the trajectory is saved every 10 fs. After the system is equilibrated, a 1 ns post-equilibration trajectory is used for mechanical property analysis to obtain the elastic constants $C_{ij}$ (i, j = 1-6), from which the mechanical property parameters are calculated. Among the mechanical properties, the tensile modulus (E), shear modulus (G) and bulk modulus (K), collectively called the engineering moduli, are generally used as criteria for evaluating the rigidity or hardness of a material; the bulk modulus K studied in the present invention is an important indicator of fracture strength.
Because experimental impact sensitivity data are scarce, and because poor reproducibility and similar problems mean that experimental data serve only as an approximate indication of sensitivity, the invention calculates the sensitivity data from the basic structural information of the nitro compounds with the reliable empirical formula proposed by Lai et al.
2.2 Molecular descriptor calculation
An important step in building a QSPR model is to quantify the structural information of the molecules under study. Molecular descriptors are mathematical representations of a molecule that convert the chemical information in its structure into useful numbers, each capturing a small part of the total chemical information contained in the real molecule. The invention uses the E-Dragon 1.0 software to calculate 1666 molecular descriptors online from SMILES strings. Table 1 shows the 20 classes into which the descriptors fall.
The crystal density of each nitro energetic compound is obtained from the CSD, and the numbers of C, H, O and N atoms and the molecular weight of each substance are extracted from the molecular formula. The oxygen balance is calculated using equation (1):
Figure BDA0002790027680000121
where the molecular formula of the CHON explosives studied in the invention is written as $C_aH_bO_cN_d$; a, b, c and d are the numbers of C, H, O and N atoms in the molecule, respectively; M is the relative molecular mass (g/mol); and OB is the oxygen balance (g/g) of the energetic molecule.
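As a hedged illustration, an oxygen balance in g/g can be computed from a, b, c, d and M as sketched below. Equation (1) survives only as a figure, so the conventional CO2-based definition is assumed here and may differ from the formula of the invention; the function names and the rounded atomic masses are likewise assumptions.

```python
# Rounded standard atomic masses in g/mol (assumed values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007}


def molecular_weight(a, b, c, d):
    """Relative molecular mass of CaHbOcNd in g/mol."""
    return (a * ATOMIC_MASS["C"] + b * ATOMIC_MASS["H"]
            + c * ATOMIC_MASS["O"] + d * ATOMIC_MASS["N"])


def oxygen_balance(a, b, c, d):
    """CO2-based oxygen balance in g of oxygen per g of compound (assumed form).

    Oxidizing CaHbOcNd completely to CO2 and H2O needs 2a + b/2 oxygen atoms;
    the molecule supplies c of them. Nitrogen is assumed to leave as N2.
    """
    oxygen_excess = c - 2 * a - b / 2
    return ATOMIC_MASS["O"] * oxygen_excess / molecular_weight(a, b, c, d)


# TNT, C7H5O6N3: a = 7, b = 5, c = 6, d = 3.
print(round(molecular_weight(7, 5, 6, 3), 2))  # 227.13
print(round(oxygen_balance(7, 5, 6, 3), 3))    # -0.74
```

A negative value indicates an oxygen-deficient molecule, which is the usual case for nitroarenes.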
2.3 reduction of Dragon descriptor
The present invention uses a statistical method to reduce the number of descriptors and obtain the required descriptors for building the QSPR model. Mainly comprises the following 5 steps:
(1) eliminating all descriptors containing wrong information (i.e. exact numerical values cannot be calculated);
(2) removing descriptors with more than 75% of samples having the same value;
(3) omitting descriptors with a Relative Standard Deviation (RSD) of less than 0.05;
(4) for each pair of descriptors whose Pearson correlation coefficient r is greater than 0.75, removing the descriptor that is less correlated with the target value;
(5) finally, removing descriptors with a p value (probability value) greater than 0.005 by MLR forward stepwise regression. Table 1 shows the screening results of each step with bulk modulus and impact sensitivity as the target properties.
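Screening steps (1) to (4) can be sketched with NumPy as follows. This is a minimal illustration, not the inventors' code: the function name, the greedy handling of correlated pairs and the toy data are assumptions, and step (5), the stepwise regression, is left out.

```python
import numpy as np

def screen_descriptors(X, y, names, same_frac=0.75, rsd_min=0.05, r_max=0.75):
    """Apply steps (1)-(4) to a descriptor matrix X (samples x descriptors)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = []
    for j in range(X.shape[1]):
        col = X[:, j]
        # (1) drop descriptors with missing or non-finite values
        if not np.all(np.isfinite(col)):
            continue
        # (2) drop descriptors where more than 75% of samples share one value
        _, counts = np.unique(col, return_counts=True)
        if counts.max() / col.size > same_frac:
            continue
        # (3) drop descriptors with relative standard deviation below 0.05
        mean, std = col.mean(), col.std(ddof=1)
        rsd = np.inf if mean == 0 else std / abs(mean)
        if rsd < rsd_min:
            continue
        keep.append(j)
    # (4) visit descriptors from most to least correlated with the target and
    #     skip any that has |r| > r_max with a descriptor already selected
    target_r = {j: abs(np.corrcoef(X[:, j], y)[0, 1]) for j in keep}
    selected = []
    for j in sorted(keep, key=lambda k: -target_r[k]):
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) <= r_max for k in selected):
            selected.append(j)
    return [names[j] for j in sorted(selected)]

# Toy data (hypothetical): six descriptors for 20 samples.
n = np.arange(1.0, 21.0)
X = np.column_stack([
    n,                                       # d0: informative
    n ** 2,                                  # d1: strongly correlated with d0
    np.r_[np.full(16, 5.0), [1, 2, 3, 4]],   # d2: 80% identical values
    np.r_[np.nan, n[1:]],                    # d3: contains a missing value
    100 + 0.1 * (n % 2),                     # d4: RSD far below 0.05
    np.where(n % 2 == 1, 1.0, 3.0),          # d5: weakly correlated with d0
])
y = 3.0 * n + 1.0
names = [f"d{j}" for j in range(X.shape[1])]
selected = screen_descriptors(X, y, names)
print(selected)
```

The greedy pass in step (4) is one possible reading of the rule; it always keeps the member of a correlated pair that tracks the target more closely.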
TABLE 1 reduction Process statistics for molecular descriptors with bulk modulus and impact sensitivity as target Properties, respectively
Figure BDA0002790027680000122
Figure BDA0002790027680000131
With the bulk modulus as the target property, 14 molecular descriptors are obtained, covering 9 types: constitutional descriptors, information indices, edge adjacency indices, BCUT descriptors, geometric descriptors, 3D-MoRSE descriptors, GETAWAY descriptors, atom-centred fragments and molecular properties, as shown in Table 2. With the impact sensitivity as the target property, 17 molecular descriptors are obtained, covering 8 types that include 2D autocorrelation descriptors, geometric descriptors, RDF descriptors, 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors and molecular properties, as shown in Table 3. Merging and de-duplicating the descriptors from the two screenings gives 26 descriptors covering 10 types, as shown in Table 4.
At this point, a total of four data sets have been obtained:
Dataset-1: the bulk modulus K and the corresponding 14 screened molecular descriptors,
Dataset-2: the impact sensitivity $h_{50}$ and the corresponding 17 screened molecular descriptors,
Dataset-3: the impact sensitivity and 8 features, namely bulk modulus, crystal density, oxygen balance, molecular weight and the numbers of C, H, O and N atoms,
Dataset-4: the bulk modulus and impact sensitivity and the corresponding 26 molecular descriptors.
TABLE 2 descriptor features screened for K as a target Property
Figure BDA0002790027680000132
TABLE 3 Descriptor features screened with $h_{50}$ as the target property
Figure BDA0002790027680000141
TABLE 4 Descriptor features screened with K and $h_{50}$ as the target properties
Figure BDA0002790027680000142
2.4 construction of ANN and SISSO models
Sure independence screening and sparsifying operator (SISSO) is a recently developed data analysis method based on compressed sensing. It aims to identify low-dimensional descriptors that capture the features and functional properties of the underlying physical mechanism, and it has been applied successfully to many problems in materials science. SISSO can identify the best descriptor from a very large number of feature combinations (physical properties), reduce the dimensionality of a large feature space, and identify features that are irrelevant to the problem, thereby further optimizing the feature space and ultimately yielding an explicit analytical function of the underlying physical properties. The method is also suitable for small data sets. For regression problems, SISSO has the potential to transform a complex nonlinear problem into a linear one. For classification problems, in addition to calculating specific values of a property, if two combined features achieve a satisfactory result, an intuitive and clear materials map can be drawn directly to explore the underlying principle or mechanism further.
SISSO consists of two main steps. 1) Feature-space construction (candidate descriptors). Algebraic and functional operators (such as addition, multiplication, exponential, power and root) are combined iteratively with the primary features. In each iteration, every feature (or pair of features) is combined exhaustively with every unary (or binary) operator, so that an arbitrarily large feature space can be built; once the features have been classified, addition and subtraction are allowed only between features of the same kind, and the complexity of the combined features can be limited. 2) Descriptor identification. The best one or more combined features are selected from the constructed feature space. In the first step, the size of the feature space depends on the number of operators, the number of primary features and the number of iterations. In the second step, the complexity depends on the dimension of the final feature, the size of the feature subspace and the type of model. During modelling, hyper-parameters such as the number of feature-construction iterations q, the final feature dimension Ω and the feature-subspace size SIS must be optimized by weighing model performance, time cost and similar considerations. The models of the invention are optimized mainly with respect to q and Ω.
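The two steps can be illustrated with a deliberately simplified NumPy sketch. It is not the SISSO program itself: the operator set is tiny, no check is made that added features are of the same kind, screening is done once rather than on successive residuals, and all function names and the synthetic data are assumptions.

```python
import itertools
import numpy as np

def build_features(primary, q):
    """Step 1: expand {name: values} by applying operators q times."""
    feats = dict(primary)
    for _ in range(q):
        new = {}
        for name, v in feats.items():                  # unary operator
            new[f"({name})^2"] = v ** 2
        for (n1, v1), (n2, v2) in itertools.combinations(feats.items(), 2):
            new[f"({n1}+{n2})"] = v1 + v2              # binary operators
            new[f"({n1}*{n2})"] = v1 * v2
            if np.all(v2 != 0):
                new[f"({n1}/{n2})"] = v1 / v2
        feats.update(new)
    return feats

def sis(feats, target, size):
    """Screening: keep the `size` features most correlated with the target."""
    def score(v):
        return abs(np.corrcoef(v, target)[0, 1]) if np.std(v) > 0 else 0.0
    ranked = sorted(feats, key=lambda n: score(feats[n]), reverse=True)
    return ranked[:size]

def sparsify(feats, names, target, omega):
    """Step 2: exhaustive least-squares search over all omega-tuples."""
    best_rmse, best_combo, best_coef = np.inf, None, None
    for combo in itertools.combinations(names, omega):
        A = np.column_stack([feats[n] for n in combo] + [np.ones(len(target))])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        rmse = float(np.sqrt(np.mean((A @ coef - target) ** 2)))
        if rmse < best_rmse:
            best_rmse, best_combo, best_coef = rmse, combo, coef
    return best_rmse, best_combo, best_coef

# Synthetic example: the target depends linearly on the ratio x1/x2.
rng = np.random.default_rng(0)
x1, x2, x3 = rng.uniform(1.0, 3.0, size=(3, 40))
y = 2.0 * x1 / x2 + 1.0
space = build_features({"x1": x1, "x2": x2, "x3": x3}, q=1)
subspace = sis(space, y, size=5)
rmse, combo, coef = sparsify(space, subspace, y, omega=1)
print(combo, coef)
```

Here q, size and omega play the roles of the hyper-parameters q, SIS and Ω named in the text; the number of candidate features grows quickly with q, which is why the time cost has to be weighed when choosing it.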
The artificial neural network (ANN), which has attracted much attention in recent years, is an information-processing system that imitates the structure and intelligent characteristics of the human brain. It offers parallel distributed processing and storage, high fault tolerance, self-organization and self-adaptation. As one of the most commonly used methods for QSPR modelling, it is used to explore and summarize regularities in data and to establish quantitative mathematical models from 'cause' to 'effect', and it has made great progress in fields such as biology, medicine and economics. The invention uses a multilayer perceptron (MLP-ANN). The parameters are searched over the following ranges: training algorithm 'lbfgs', 'sgd' or 'adam'; activation function 'identity', 'logistic', 'tanh' or 'relu'; number of hidden-layer neurons 2-20. The optimal parameters are selected after optimization.
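The search space described above can be enumerated in a few lines. The option strings coincide with the `solver` and `activation` choices of scikit-learn's `MLPRegressor`, but the source does not name the library, so that mapping, the dictionary keys and the single hidden layer are assumptions.

```python
from itertools import product

solvers = ["lbfgs", "sgd", "adam"]
activations = ["identity", "logistic", "tanh", "relu"]
hidden_sizes = range(2, 21)  # 2-20 hidden-layer neurons, inclusive

# One dictionary per candidate configuration; a single hidden layer is assumed.
grid = [
    {"solver": s, "activation": a, "hidden_layer_sizes": (h,)}
    for s, a, h in product(solvers, activations, hidden_sizes)
]
print(len(grid))  # 3 solvers * 4 activations * 19 sizes = 228
```

Each dictionary could be passed as keyword arguments to a regressor that accepts these names, and the configuration with the best validation score would then be kept.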
The main work of the invention is to model the four data sets with the two methods, SISSO and ANN. Each data set is randomly divided into two subsets, with 80% of the data used as the training set and 20% as the test set. The overall workflow is shown in fig. 2.
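A random 80/20 division can be sketched with the standard library alone. The function name and the fixed seed are illustrative assumptions; the sample count of 240 is the number of nitro compounds stated in the claims.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and return (train, test) index lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)   # seeded so the split is reproducible
    n_test = round(n_samples * test_fraction)
    return idx[n_test:], idx[:n_test]

train_idx, test_idx = train_test_split_indices(240)
print(len(train_idx), len(test_idx))
```

Fixing the seed lets the SISSO and ANN models be trained and tested on exactly the same compounds.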
3. Results
3.1 optimal QSPR model per dataset
After data preparation, feature selection and model construction, the invention tests the performance of only the best model in each case. Seven models are built on the 4 data sets:
Model-1: ANN model of bulk modulus with the 14 molecular descriptors (Dataset-1),
Model-2: ANN model of impact sensitivity with the 17 molecular descriptors (Dataset-2),
Model-3: ANN model of impact sensitivity with the 8 features bulk modulus, crystal density, oxygen balance, molecular weight and numbers of C, H, O and N atoms (Dataset-3),
Model-4: ANN model of impact sensitivity and bulk modulus with the 26 molecular descriptors (Dataset-4),
Model-5: SISSO model of bulk modulus with the 14 molecular descriptors (Dataset-1),
Model-6: SISSO model of impact sensitivity with the 17 molecular descriptors (Dataset-2),
Model-7: SISSO model of impact sensitivity with the 8 features bulk modulus, crystal density, oxygen balance, molecular weight and numbers of C, H, O and N atoms (Dataset-3).
To evaluate the performance of each QSPR model on the training and test sets, the invention uses the root mean square error (RMSE), the Pearson correlation coefficient (R) and the coefficient of determination ($R^2$), whose mathematical definitions are given by equations (2) to (4), respectively. Table 5 and Table 6 give the model parameters and evaluation parameters of the 7 models.
Figure BDA0002790027680000161
Figure BDA0002790027680000162
Figure BDA0002790027680000163
where N is the number of compounds in each data set, $y_i^{true}$ is the true value, $y_i^{pred}$ is the model-predicted value, $\bar{y}^{true}$ is the mean of the true values of the samples, and $\bar{y}^{pred}$ is the mean of the predicted values of the samples.
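Equations (2) to (4) are available here only as images, so the sketch below uses the conventional definitions of the three metrics, with the coefficient of determination taken as one minus the ratio of the residual to the total sum of squares. That choice, the function names and the sample numbers are assumptions.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted values."""
    mt = sum(y_true) / len(y_true)
    mp = sum(y_pred) / len(y_pred)
    num = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    den = math.sqrt(sum((t - mt) ** 2 for t in y_true)
                    * sum((p - mp) ** 2 for p in y_pred))
    return num / den

def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed to be 1 - SS_res / SS_tot."""
    mt = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mt) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical values, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(rmse(y_true, y_pred), pearson_r(y_true, y_pred), r_squared(y_true, y_pred))
```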
TABLE 5 parameters for the optimal QSPR model for each data set
Figure BDA0002790027680000166
TABLE 6 statistical parameters of the optimal QSPR model for each data set
Figure BDA0002790027680000167
Using the 14 molecular descriptors (9 types in total) as features, an ANN model relating the bulk modulus (a mechanical property) of nitro energetic compounds to their molecular structure is established (Dataset-1, Model-1). The $R^2$ values of the training and test sets are 0.92 and 0.81, respectively, so the model has good predictive performance. The invention also models this data set with SISSO (Dataset-1, Model-5), but the results are unsatisfactory: the $R^2$ values of the training and test sets are 0.71 and 0.63, respectively, which indicates under-fitting.
Using the 17 molecular descriptors (8 types in total) as features, an ANN model relating the impact sensitivity of nitro energetic compounds to their molecular structure is established (Dataset-2, Model-2). The $R^2$ values of the training and test sets are 0.93 and 0.91, respectively, so the model has good predictive performance. The invention also models this data set with the SISSO method (Dataset-2, Model-6); the $R^2$ values of the training and test sets are 0.91 and 0.85, respectively, so the performance is still good although the test set is slightly worse than for Model-2. The results of the invention are also compared with those of previous studies in Table 7, which shows that the data set of the invention is larger and that its model has the highest prediction accuracy.
TABLE 7 Performance comparison of different QSPR models for predicting $h_{50}$
Figure BDA0002790027680000171
Although many factors influence impact sensitivity, the studies of the invention show that the impact sensitivity of nitro energetic substances can be correlated with their mechanical properties (Dataset-3, Model-7). The $R^2$ values of the training and test sets of this model are 0.90 and 0.91, respectively. Analysis of the relationship between the two yields an analytical formula of quasi-linear form, given as formula (5). In this formula, the impact sensitivity is a function of the numbers of C, H, O and N atoms, the oxygen balance, the molecular weight and the bulk modulus, all of which are readily obtainable. The formula shows that, for isomers, when K is extremely small the value of the impact sensitivity $h_{50}$ is large and is strongly affected by fluctuations in K. This indirectly reflects the need to add additives such as binders to PBX explosives to improve their mechanical properties: if the mechanical properties (such as bulk modulus) of the main explosive are high, the sensitivity is high and the safety is low. The ANN model built on the same data set (Dataset-3, Model-3) performs almost the same as Model-7, with $R^2$ values of 0.91 and 0.89 for the training and test sets, respectively.
Figure BDA0002790027680000172
Here $h_{50}$ is the impact sensitivity; a, b, c and d are the numbers of C, H, O and N atoms in the molecule; M is the molecular weight; OB is the oxygen balance; and K is the bulk modulus.
In addition, a simple comparison of Model-1, Model-2 and Model-4 shows that the ANN model that outputs sensitivity and mechanical property simultaneously (Model-4) performs worse than Model-1 and Model-2, each of which outputs a single property, because during parameter tuning the model must balance the prediction accuracy of the two target properties. Although merging the features to build one model that predicts several properties at once benefits research to some extent, if accuracy is the priority it is better to build separate single-output models.
3.2 comparison of ANN and SISSO models
To compare the performance of the sensitivity and mechanical-property models objectively, the invention optimizes the parameters of SISSO and ANN separately and compares their 5-fold cross-validation results on identical sample splits of each data set. The ANN is again optimized over the following parameter ranges: training algorithm 'lbfgs', 'sgd' or 'adam'; activation function 'identity', 'logistic', 'tanh' or 'relu'; number of hidden-layer neurons 2-20. For the SISSO model, two parameters are optimized: the number of iterations q and the final feature dimension Ω. For Dataset-1, Dataset-2 and Dataset-3, figs. 3-5 compare the root mean square error (RMSE), Pearson correlation coefficient (R) and coefficient of determination ($R^2$) of the SISSO training and test sets for different numbers of iterations and final feature dimensions. In fig. 3, the training-set performance improves slightly as the final dimension and the number of iterations increase, but the test-set results show little difference between 0 and 1 iterations, and over-fitting occurs at 2 iterations. Fig. 4 shows a clear trend: the model accuracy increases with the number of iterations. As can be seen from fig. 5A, as the final dimension and the number of iterations increase, the training-set performance changes little, while the test-set performance gradually improves and its gap from the training-set performance gradually narrows. The cross-validation model parameters for the 3 data sets are selected by weighing model performance, the complexity of the SISSO analytical formula and the time cost of building the model. The finally selected SISSO and ANN parameters are shown in Tables 8-10.
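Using identical sample splits for both methods can be sketched as follows. This is a minimal standard-library illustration in which the function name, the seed and the way the folds are formed are assumptions.

```python
import random

def kfold_indices(n_samples, k=5, seed=0):
    """Return k (train, test) index pairs from one seeded shuffle."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]          # k disjoint folds
    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for m, f in enumerate(folds) if m != i for j in f)
        splits.append((train, test))
    return splits

# The same list of splits would be reused for the SISSO and the ANN models.
splits = kfold_indices(240, k=5)
print([len(test) for _, test in splits])
```

Because the splits are generated once and reused, any difference in the cross-validated scores reflects the models and not the partitioning.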
Table 8 model structure and statistical parameters for 5-fold cross validation results for data set 1
Figure BDA0002790027680000181
For SISSO, parameter a is q, b is SIS and c is Ω. For ANN, parameter a is the number of hidden-layer neurons, b is the training algorithm and c is the activation function.
Table 9 model structure and statistical parameters for 5-fold cross validation results for dataset 2
Figure BDA0002790027680000182
For SISSO, parameter a is q, b is SIS and c is Ω. For ANN, parameter a is the number of hidden-layer neurons, b is the training algorithm and c is the activation function.
Table 10 model structure and statistical parameters for 5-fold cross-validation results for data set 3
Figure BDA0002790027680000183
For SISSO, parameter a is q, b is SIS and c is Ω. For ANN, parameter a is the number of hidden-layer neurons, b is the training algorithm and c is the activation function.
Comparing the performance of the SISSO and ANN models on Dataset-1, Dataset-2 and Dataset-3 (Tables 8-10) shows that, for the mechanical-property and sensitivity data featurized by molecular descriptors (Dataset-1 and Dataset-2), the ANN models outperform the SISSO models, and the difference is particularly pronounced on Dataset-1. By contrast, for the sensitivity data featurized by mechanical properties and similar quantities (Dataset-3), the SISSO method outperforms the ANN. This may be due to differences between the two kinds of features in the amount of information they contain and in how transparent that information is. Deep learning models built on large data sets can extract useful information from abstract, generic features such as SMILES strings, whereas for small data sets higher accuracy is often achieved with hand-crafted, more specific features that rely on chemical intuition and domain expertise. The physical meaning of descriptors calculated from SMILES is less clear than that of parameters such as elemental composition, crystal density, mechanical properties, oxygen balance and molecular weight, and the useful information contained in parameters such as oxygen balance is more transparent than that in the descriptors. The ANN handles nonlinear models well, is good at mining the internal relationships among data, and suits complex nonlinear problems whose features, such as descriptors, have a somewhat ambiguous physical meaning. SISSO is relatively simple in principle, places high demands on the screening of the initial features, and suits the construction of models whose feature information is more definite. It is worth mentioning that SISSO has the potential to transform nonlinear problems into linear ones.
When the feature space is huge or highly correlated, SISSO is not subject to the limitations of the traditional LASSO method and retains the effectiveness of compressed sensing while handling the huge space. Its only limitation is that sufficient computational resources are needed to process a huge feature space. However, because the equations are approximate and correlations between features are unavoidable (one or more features may be accurately described by a nonlinear function of a subset of the remaining features), the equations found by SISSO are not necessarily unique, and the components of the descriptor may change as the dimension of the final feature changes.
In the invention, based on the molecular descriptors calculated by E-Dragon and several common, easily obtained items of molecular structure information, 7 QSPR models (Model-1 to Model-7) of the impact sensitivity and bulk modulus (a mechanical property) of nitro energetic compounds are established, which helps to shorten the experimental research process for energetic materials and to support the design and comprehensive evaluation of novel energetic compounds. The best model with the 14 descriptors as features and the bulk modulus as the target property is the ANN model (Dataset-1, Model-1), with $R^2$ of 0.92 for the training set and 0.81 for the test set. The best model with the 17 descriptors as features and the impact sensitivity as the target property is also the ANN model (Dataset-2, Model-2), with $R^2$ of 0.93 for the training set and 0.91 for the test set. Comparison of the ANN sensitivity model of the invention with similar models in the literature shows that it has an advantage in predicting impact sensitivity, since its accuracy is the highest.
In addition, the invention shows that there is a definite relationship between impact sensitivity and mechanical properties, and expresses the relationship between these two properties of nitro energetic compounds as a mathematical formula. Analysis of the formula shows that, for the same molecular formula, a smaller bulk modulus corresponds to a lower sensitivity, and that when the bulk modulus is small enough, even a slight further decrease in its value reduces the sensitivity considerably, which reflects the need to improve the mechanical properties of energetic materials. Comparing the input features and performance of the models, the invention concludes that the powerful ANN model is good at mining the internal relationships among data and can extract and learn the required information from features that carry little and ambiguous information (such as molecular descriptors), whereas the SISSO method, whose principle is relatively simple, is better suited to features with a definite physical meaning.
The results of the present invention demonstrate that the method of the present invention is useful.
It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided on a carrier medium such as a disk, CD-or DVD-ROM, programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier, for example. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., or by software executed by various types of processors, or by a combination of hardware circuits and software, e.g., firmware.
The above description covers only specific embodiments of the present invention and is not to be construed as limiting its scope of protection; all modifications, equivalents and improvements made within the spirit and principles of the invention are intended to fall within the scope defined by the appended claims.

Claims (10)

1. A machine learning estimation method for the sensitivity and mechanical properties of energetic materials and the relationship between them, characterized in that molecular descriptors calculated by E-Dragon and several items of basic molecular structure information are used as features; 7 quantitative structure-property relationship (QSPR) models of the impact sensitivity and bulk modulus of nitro energetic materials are constructed based on an artificial neural network (ANN) and the sure independence screening and sparsifying operator (SISSO) method; and the constructed models are used to determine the relationship between the impact sensitivity and the mechanical properties of the nitro energetic materials, and the quantitative relationships of the impact sensitivity and of the mechanical properties with the molecular structure, respectively.
2. The machine learning estimation method for the sensitivity and mechanical properties of energetic materials and the relationship between them of claim 1, wherein the method comprises:
step one, data acquisition and processing: acquiring the impact sensitivity and bulk modulus values of 240 nitro compounds, mainly nitroaromatics; collecting the molecular weight and crystal density features; and calculating the Dragon molecular descriptors of the 240 substances with SMILES strings as input;
step two, screening out the corresponding final features step by step with the impact sensitivity and the bulk modulus as target properties, and establishing QSPR models between the impact sensitivity and bulk modulus of the energetic materials and their molecular structure using the ANN and SISSO methods;
step three, establishing a correlation formula between the sensitivity and the mechanical properties of the energetic materials with the impact sensitivity of the 240 nitro energetic compounds as output and the numbers of C, H, O and N atoms, molecular weight, crystal density, oxygen balance and bulk modulus as features, and analysing the relationship between the impact sensitivity and the mechanical properties of the nitro energetic materials;
step four, comparing the performance of the models obtained by combining the two methods, ANN and SISSO, with the two kinds of features; after finding the optimal parameters for each model, calculating its 5-fold cross-validation results; and comparing how well the empirically selected features and the molecular descriptors calculated from SMILES strings suit each model.
3. The machine learning estimation method for the sensitivity and mechanical properties of energetic materials and the relationship between them of claim 2, wherein the data collection method comprises: obtaining the crystal structure files and SMILES strings of the 240 nitro energetic compounds, and obtaining their mechanical property data by molecular dynamics simulation with the Forcite module of the MS software; after structure optimization, applying the COMPASS force field in the NPT ensemble at 295 K with Andersen temperature control and Parrinello pressure control, the pressure being set to 0.0001 GPa; treating the van der Waals and electrostatic interactions with the atom-based and Ewald summation methods, respectively, with a cutoff radius of 0.95 nm and a tail correction for the cutoff; assigning the initial atomic velocities according to the Maxwell-Boltzmann distribution; solving Newton's equations of motion under periodic boundary conditions and on the basic assumption that the time average is equivalent to the ensemble average; integrating with the Verlet method with a time step of 1 fs and saving the trajectory every 10 fs; and, after the system has equilibrated, analysing the mechanical properties on a 1 ns post-equilibration trajectory to obtain the elastic coefficients $C_{ij}$ (i, j = 1-6), from which the mechanical property parameters are calculated.
4. The method for machine learning estimation of energetic material sensitivity and mechanical properties and their relationships of claim 1 wherein the molecular descriptor computation method comprises:
(1) calculating a molecular descriptor:
calculating 1666 molecular descriptors for each molecule online from SMILES strings using the E-Dragon 1.0 software; obtaining the crystal density of the nitro energetic compounds from the Cambridge Structural Database (CSD); and extracting the numbers of C, H, O and N atoms and the molecular weight of each substance from the molecular formula $C_aH_bO_cN_d$ of the energetic material;
wherein a, b, c and d are the numbers of C, H, O and N atoms in the molecule, M is the relative molecular mass in g/mol, and OB is the oxygen balance of the energetic molecule in g/g; the oxygen balance is calculated as follows:
Figure FDA0002790027670000021
(2) the number of descriptors is reduced using statistical methods to obtain the required descriptors for building the QSPR model.
5. The method for machine learning estimation of energetic material sensitivity and mechanical properties and their relationships according to claim 4, wherein in step (2), the reduction of the number of descriptors using statistical methods to obtain the required descriptors for building QSPR model comprises the following steps:
1) eliminating all descriptors that contain erroneous information, i.e. for which exact numerical values cannot be calculated;
2) removing descriptors with more than 75% of samples having the same value;
3) omitting descriptors with a relative standard deviation RSD less than 0.05;
4) for each pair of descriptors whose Pearson correlation coefficient r is greater than 0.75, removing the descriptor that is less correlated with the target value;
5) removing descriptors with a p value (probability value) greater than 0.005 by MLR forward stepwise regression.
6. The machine learning estimation method for the sensitivity and mechanical properties of energetic materials and the relationship between them of claim 4, characterized in that the screened feature descriptors and target properties used to construct the QSPR models are organized into the following 4 data sets:
Dataset-1: the 14 molecular descriptors obtained with the bulk modulus as the target property, covering 9 types: constitutional descriptors, information indices, edge adjacency indices, BCUT descriptors, geometric descriptors, 3D-MoRSE descriptors, GETAWAY descriptors, atom-centred fragments and molecular properties;
Dataset-2: the 17 molecular descriptors obtained with the impact sensitivity as the target property, covering 8 types that include 2D autocorrelation descriptors, geometric descriptors, RDF descriptors, 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors and molecular properties;
Dataset-3: the impact sensitivity and 8 features, namely bulk modulus, crystal density, oxygen balance, molecular weight and the numbers of C, H, O and N atoms;
Dataset-4: the 26 descriptors, covering 10 types, obtained by merging and de-duplicating the descriptors screened separately with the impact sensitivity and the bulk modulus as target properties.
7. The machine learning estimation method for the sensitivity and mechanical properties of energetic materials and the relationship between them of claim 1, wherein the method for constructing the QSPR models of the impact sensitivity and bulk modulus of the nitro energetic materials comprises: taking Dataset-1, Dataset-2, Dataset-3 and Dataset-4 as the data sets and randomly dividing each data set into two subsets, with 80% of the data as the training set and 20% as the test set; and modelling the 4 data sets with the SISSO and ANN methods to obtain 7 QSPR models of the impact sensitivity and bulk modulus of the nitro energetic materials;
the 7 QSPR models of the impact sensitivity and bulk modulus of the nitro energetic materials are as follows:
Model-1: ANN model of bulk modulus with the corresponding 14 molecular descriptors, Dataset-1;
Model-2: ANN model of impact sensitivity with the corresponding 17 molecular descriptors, Dataset-2;
Model-3: ANN model of impact sensitivity with the 8 features bulk modulus, crystal density, oxygen balance, molecular weight and numbers of C, H, O and N atoms, Dataset-3;
Model-4: ANN model of impact sensitivity and bulk modulus with the corresponding 26 molecular descriptors, Dataset-4;
Model-5: SISSO model of bulk modulus with the 14 molecular descriptors, Dataset-1;
Model-6: SISSO model of impact sensitivity with the 17 molecular descriptors, Dataset-2;
Model-7: SISSO model of impact sensitivity with the 8 features bulk modulus, crystal density, oxygen balance, molecular weight and numbers of C, H, O and N atoms, Dataset-3;
in step four, the two kinds of features are:
oxygen balance and related features selected according to chemical experience, as in Dataset-3;
molecular descriptors screened out step by step by statistical methods according to the target property, as in Dataset-1, Dataset-2 and Dataset-4;
in step four, comparing the performance of the models obtained by combining the two methods, ANN and SISSO, with the two kinds of features, comparing how well the two kinds of features suit each model, and obtaining the optimal QSPR model for each data set comprises:
comprehensively evaluating the performance of the 7 QSPR models on the training and test sets using the root mean square error RMSE, the Pearson correlation coefficient R and the coefficient of determination $R^2$;
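the formulas are as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i^{\mathrm{true}} - y_i^{\mathrm{pred}}\right)^{2}}$$
$$R = \frac{\sum_{i=1}^{N}\left(y_i^{\mathrm{true}} - \bar{y}^{\mathrm{true}}\right)\left(y_i^{\mathrm{pred}} - \bar{y}^{\mathrm{pred}}\right)}{\sqrt{\sum_{i=1}^{N}\left(y_i^{\mathrm{true}} - \bar{y}^{\mathrm{true}}\right)^{2}\,\sum_{i=1}^{N}\left(y_i^{\mathrm{pred}} - \bar{y}^{\mathrm{pred}}\right)^{2}}}$$
$$R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(y_i^{\mathrm{true}} - y_i^{\mathrm{pred}}\right)^{2}}{\sum_{i=1}^{N}\left(y_i^{\mathrm{true}} - \bar{y}^{\mathrm{true}}\right)^{2}}$$
where N is the number of compounds in each data set, $y_i^{\mathrm{true}}$ is the true value, $y_i^{\mathrm{pred}}$ is the value predicted by the model, $\bar{y}^{\mathrm{true}}$ is the mean of the true values of the samples, and $\bar{y}^{\mathrm{pred}}$ is the mean of the predicted values of the samples;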
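The three evaluation metrics named above (RMSE, Pearson R and R²) can be computed directly from the true and predicted values. The sketch below assumes the standard textbook definitions of these metrics; the function names are illustrative and are not taken from the patent.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error over N samples."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    n = len(y_true)
    mt = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mt) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Evaluating all three on both the training set and the test set of each model separates fitting quality from generalisation: a model with a low training RMSE but a much higher test RMSE is overfitted.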
the optimal QSPR models for the data sets are as follows:
with the 14 descriptors as features and the bulk modulus as the target property, the optimal model is the ANN model Model-1; with the 17 descriptors as features and the impact sensitivity as the target property, the optimal model is likewise an ANN model, Model-2;
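An ANN of this kind maps a vector of molecular descriptors to a single predicted property value. This passage does not state the network architecture, so the sketch below is only a generic one-hidden-layer forward pass: the ReLU activation, the layer sizes and every weight value are placeholders chosen for illustration.

```python
def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def dense(weights, bias, inputs):
    """Fully connected layer; `weights` holds one row of coefficients per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


def ann_predict(descriptors, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: descriptors -> hidden layer (ReLU) -> one scalar output."""
    hidden = relu(dense(w_hidden, b_hidden, descriptors))
    return dense(w_out, b_out, hidden)[0]


# Toy usage with two descriptors and two hidden units (placeholder weights).
prediction = ann_predict(
    [1.0, 2.0],
    w_hidden=[[1.0, 0.0], [0.0, 1.0]],
    b_hidden=[0.0, 0.0],
    w_out=[[1.0, 1.0]],
    b_out=[0.5],
)
```

For Model-1 the input vector would hold the 14 descriptors and the output would be the bulk modulus; for Model-2 it would hold the 17 descriptors and the output would be the impact sensitivity. In practice the weights are fitted on the training set rather than set by hand.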
the relation between the impact sensitivity and the mechanical property of the nitro energetic materials is given by the following formula:
Figure FDA0002790027670000053
where h50 is the impact sensitivity, a, b, c and d are the numbers of C, H, O and N atoms in the molecule, M is the molecular weight, OB is the oxygen balance, and K is the bulk modulus.
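Two of the inputs to this relation, M and OB, follow directly from the atom counts a, b, c and d. The text does not spell out how OB is computed; the sketch below assumes the common CO2-based convention for CHNO compounds, OB% = -1600 (2a + b/2 - c) / M, together with standard atomic masses. The function names are illustrative.

```python
# Standard atomic masses in g/mol (rounded values).
MASS_C, MASS_H, MASS_O, MASS_N = 12.011, 1.008, 15.999, 14.007


def molecular_weight(a, b, c, d):
    """Molecular weight of a compound with a C, b H, c O and d N atoms."""
    return a * MASS_C + b * MASS_H + c * MASS_O + d * MASS_N


def oxygen_balance(a, b, c, d):
    """CO2-based oxygen balance in percent (assumed convention, see lead-in)."""
    m = molecular_weight(a, b, c, d)
    return -1600.0 * (2 * a + b / 2 - c) / m


# TNT, C7H5N3O6: a=7 carbon, b=5 hydrogen, c=6 oxygen, d=3 nitrogen.
tnt_ob = oxygen_balance(7, 5, 6, 3)
```

For TNT this gives an oxygen balance of roughly -74 %, and for RDX (C3H6N6O6) roughly -21.6 %, in line with commonly tabulated values.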
8. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of: taking the molecular descriptors calculated by E-Dragon and molecular structure information as features, constructing 7 quantitative structure-property relationship (QSPR) models of the impact sensitivity and bulk modulus of nitro energetic materials based on the artificial neural network (ANN) method and the sure independence screening and sparsifying operator (SISSO) method; and, using the constructed QSPR models, determining the relation between the impact sensitivity and the mechanical property of the nitro energetic materials, as well as the quantitative relations between these properties and the molecular structure.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of: taking the molecular descriptors calculated by E-Dragon and molecular structure information as features, constructing 7 quantitative structure-property relationship (QSPR) models of the impact sensitivity and bulk modulus of nitro energetic materials based on the artificial neural network (ANN) method and the sure independence screening and sparsifying operator (SISSO) method; and, using the constructed QSPR models, determining the relation between the impact sensitivity and the mechanical property of the nitro energetic materials, as well as the quantitative relations between these properties and the molecular structure.
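SISSO first expands the primary features into a pool of candidate expressions using algebraic operators, then keeps the candidates most correlated with the target (sure independence screening), and finally selects the sparsest linear model from that subspace (sparsifying operator). The sketch below is a heavily simplified, one-dimensional illustration in pure Python and is not the SISSO implementation used in the patent; the operator set (products and ratios only), the function names and the subspace size are assumptions.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation (0.0 if either sequence is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)


def build_candidates(features):
    """Expand primary features with pairwise products and (where defined) ratios."""
    cands = dict(features)
    for (n1, v1), (n2, v2) in combinations(features.items(), 2):
        cands[f"({n1}*{n2})"] = [a * b for a, b in zip(v1, v2)]
        if all(b != 0 for b in v2):
            cands[f"({n1}/{n2})"] = [a / b for a, b in zip(v1, v2)]
    return cands


def fit_line(x, y):
    """Least-squares fit y ~ intercept + slope * x; returns (intercept, slope, rmse)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = 0.0 if sxx == 0 else sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    err = math.sqrt(sum((intercept + slope * a - b) ** 2 for a, b in zip(x, y)) / n)
    return intercept, slope, err


def sisso_1d(features, target, sis_size=3):
    """Simplified 1D SISSO: screen candidates by |r|, then pick the lowest-RMSE fit."""
    cands = build_candidates(features)
    # SIS step: keep the sis_size candidates most correlated with the target.
    ranked = sorted(cands, key=lambda k: abs(pearson(cands[k], target)), reverse=True)
    subspace = ranked[:sis_size]
    # SO step: for a 1D model this coincides with the top-ranked candidate;
    # full SISSO searches over tuples of candidates for higher dimensions.
    best = min(subspace, key=lambda k: fit_line(cands[k], target)[2])
    intercept, slope, err = fit_line(cands[best], target)
    return best, intercept, slope, err
```

Unlike an ANN, the result is an explicit formula in the primary features, which is what makes this family of methods attractive for deriving a closed-form relation between properties.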
10. A machine learning estimation system for the sensitivity and mechanical properties of energetic materials and their relation, which implements the machine learning estimation method for the sensitivity and mechanical properties of energetic materials and their relation according to any one of claims 1 to 7, the system comprising:
a QSPR model construction module, configured to construct, with the molecular descriptors calculated by E-Dragon as features, 5 quantitative structure-property relationship (QSPR) models relating the impact sensitivity and the bulk modulus of nitro energetic materials to their molecular structure, based on the artificial neural network (ANN) method and the sure independence screening and sparsifying operator (SISSO) method respectively;
and an impact sensitivity and mechanical property relation determination module, configured to determine the relation between the impact sensitivity and the mechanical property of the nitro energetic materials using 2 QSPR models of the impact sensitivity and bulk modulus of nitro energetic materials constructed with molecular structure information as features.
CN202011311694.2A 2020-11-20 2020-11-20 Machine learning estimation method for sensitivity and mechanical property of energetic substance and relation of energetic substance Active CN112382350B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011311694.2A CN112382350B (en) 2020-11-20 2020-11-20 Machine learning estimation method for sensitivity and mechanical property of energetic substance and relation of energetic substance

Publications (2)

Publication Number Publication Date
CN112382350A (en) 2021-02-19
CN112382350B (en) 2023-07-28

Family

ID=74585953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011311694.2A Active CN112382350B (en) 2020-11-20 2020-11-20 Machine learning estimation method for sensitivity and mechanical property of energetic substance and relation of energetic substance

Country Status (1)

Country Link
CN (1) CN112382350B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020003016A1 (en) * 2000-06-27 2002-01-10 Guy Ampleman Insensitive melt cast explosive compositions containing energetic thermoplastic elastomers
CN101339181A (en) * 2008-08-14 2009-01-07 南京工业大学 Organic compound explosive characteristic prediction method based on genetic algorithm
CN105601457A (en) * 2016-02-17 2016-05-25 中北大学 ETN-DNT eutecticevaporate energetic material and preparation method thereof
CN106631639A (en) * 2017-01-06 2017-05-10 中国工程物理研究院化工材料研究所 Method for improving the surface wettability of energetic material and the mechanical property of explosive
CN106886615A (en) * 2015-12-10 2017-06-23 南京理工大学 A kind of analogy method of RDX Quito component containing energy compound
CN109283104A (en) * 2018-11-15 2019-01-29 北京理工大学 Product cut size is distributed on-line monitoring method in crystal solution in a kind of RDX preparation process
CN109411029A (en) * 2018-10-10 2019-03-01 西安近代化学研究所 A kind of energy-containing compound Performance Prediction system
CN109581870A (en) * 2018-11-27 2019-04-05 中国工程物理研究院化工材料研究所 The temperature in the kettle dynamic matrix control method of energetic material reaction kettle
CN110728047A (en) * 2019-10-08 2020-01-24 中国工程物理研究院化工材料研究所 Computer-aided design system for predicting energetic molecules based on machine learning performance
CN110867217A (en) * 2019-11-18 2020-03-06 西安近代化学研究所 Method for calculating crystallization morphology of energetic material in solution
CN110890135A (en) * 2019-11-18 2020-03-17 西安近代化学研究所 Prediction method of energetic N-oxide crystal structure
CN111429980A (en) * 2020-04-14 2020-07-17 北京迈高材云科技有限公司 Automatic acquisition method for material crystal structure characteristics

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
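JIE XU et al.: "QSPR studies of impact sensitivity of nitro energetic compounds using three-dimensional descriptors", 《JOURNAL OF MOLECULAR GRAPHICS AND MODELLING》 *
QIANQIAN DENG et al.: "Probing impact of molecular structure on bulk modulus and impact sensitivity of energetic materials by machine learning methods", 《CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS》 *
Zhang Menghua (张孟华): "Study on the interfacial assembly behavior and properties of explosives under solvent and thermal induction", 《China Master's Theses Full-text Database, Engineering Science and Technology I》 *
Zhong Mi (钟汨): "Theoretical study on the sensitivity of solid-phase nitromethane and its regulation", 《China Master's Theses Full-text Database, Engineering Science and Technology I》 *
Qian Bowen (钱博文): "Quantitative structure-property relationship study of safety parameters such as the impact sensitivity of energetic materials", 《China Master's Theses Full-text Database, Engineering Science and Technology I》 *
Ma Xiufang (马秀芳): "Computational simulation study on the structure and properties of polymer-bonded explosives", 《China Doctoral Dissertations Full-text Database, Engineering Science and Technology I》 *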

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312853A (en) * 2021-06-28 2021-08-27 南京玻璃纤维研究设计院有限公司 Density prediction method based on molecular dynamics and ridge regression algorithm
CN114049922A (en) * 2021-11-09 2022-02-15 四川大学 Molecular design method based on small-scale data set and generation model
CN114049922B (en) * 2021-11-09 2022-06-03 四川大学 Molecular design method based on small-scale data set and generation model
CN114397420A (en) * 2021-12-17 2022-04-26 西安近代化学研究所 Method for determining compression potential energy of layered stacked energetic compound molecular crystal
CN114397420B (en) * 2021-12-17 2023-12-12 西安近代化学研究所 Determination method for compression potential energy of layered stacked energetic compound molecular crystals
CN115169083A (en) * 2022-06-17 2022-10-11 山东科技大学 Calculation method for pyrolysis kinetic parameters of polymer-based composite material
CN115169083B (en) * 2022-06-17 2024-03-19 山东科技大学 Method for calculating pyrolysis kinetic parameters of polymer matrix composite
CN115169111A (en) * 2022-07-04 2022-10-11 中北大学 Random forest based energetic material mechanical property prediction method and storage device
CN115762658A (en) * 2022-11-17 2023-03-07 四川大学 Eutectic density prediction method based on graph convolution neural network

Also Published As

Publication number Publication date
CN112382350B (en) 2023-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant