CN116776949A - Machine learning geochemical exploration data processing method and system based on ore-controlling factor constraints

Machine learning geochemical exploration data processing method and system based on ore-controlling factor constraints

Publication number
CN116776949A
Authority
CN
China
Prior art keywords
data
constraint
geological
loss function
chemical detection
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310790251.3A
Other languages
Chinese (zh)
Inventor
阴江宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Mineral Resources of Chinese Academy of Geological Sciences
Original Assignee
Institute of Mineral Resources of Chinese Academy of Geological Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Institute of Mineral Resources of Chinese Academy of Geological Sciences
Priority to CN202310790251.3A
Publication of CN116776949A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

The invention discloses a machine learning geochemical exploration data processing method and system based on ore-controlling factor constraints, and relates to the field of anomaly detection in geochemical exploration data. The method comprises: inputting raw geochemical exploration data of a study area into a geochemical anomaly identification model to obtain the reconstruction error corresponding to the raw data, and determining the multivariate geochemical anomalies of the study area from the reconstruction error. The geochemical anomaly identification model comprises a geologically constrained variational autoencoder network and a reconstruction error calculation module. The input of the network is the raw geochemical exploration data, its output is the reconstructed geochemical exploration data, and its loss function is obtained by adding a loss function based on geological constraints, as a regularization term, to the loss function of the original variational autoencoder network. The invention thereby builds a knowledge-constrained machine learning model and improves the effectiveness and accuracy of geochemical anomaly identification.

Description

Machine learning geochemical exploration data processing method and system based on ore-controlling factor constraints
Technical Field
The invention relates to the field of anomaly detection in geochemical exploration data, and in particular to a machine learning geochemical exploration data processing method and system based on ore-controlling factor constraints.
Background
In recent years, many researchers have turned their attention to machine learning-based geochemical data mining and pattern recognition and have produced many innovative results. Geochemical anomaly identification has been implemented with algorithms including the restricted Boltzmann machine, kernel Mahalanobis distance, one-class support vector machine, isolation forest, Gaussian mixture model, Kohonen neural network, association rule algorithms, recommender system algorithms, independent component analysis, and metric learning. Machine learning methods for geochemical anomaly identification have also been compared with one another. For example, Li Cangbai et al. compared support vector machine, random forest, and artificial neural network algorithms for geochemical anomaly extraction and built an extraction workflow based on supervised machine learning; Zheng Zeyu et al. compared unsupervised geochemical anomaly identification algorithms (isolation forest and one-class support vector machine) and found that both can effectively identify multivariate geochemical anomalies, with the former performing slightly better in data processing time.
Swarm intelligence optimization algorithms, such as the ant colony algorithm and the bat algorithm, are now also used to identify and extract geochemical anomalies, and the anomalies they identify show a high spatial correlation with geological bodies related to mineralization. These studies indicate that swarm intelligence optimization is an effective approach to geochemical anomaly identification. There are also hybrid methods, such as combining Kalman filtering and blind extraction with a support vector machine, in which the former are used to fuse geochemical data and extract element associations and the latter is used to separate geochemical anomalies from background information. The hybrid approach draws on the advantages of both and is a useful attempt at geochemical data processing.
Deep learning is a hierarchical machine learning approach with multiple levels of nonlinear transformation. Unlike ordinary neural networks, it emphasizes learning and extracting sample features through deeper network models, and it addresses the search for a global optimum through layer-by-layer training and backpropagation. It can learn complex spatio-temporal coupling relationships between multi-source prospecting information and mineral deposits, can characterize anomalies and patterns that conventional methods cannot find, and has been applied to mineralization anomaly identification and mineral resource potential assessment. Researchers began using deep learning models early on to identify geochemical anomalies under complex geological conditions, for example by effectively identifying multivariate geochemical anomalies with a continuous restricted Boltzmann machine. Building on that model, a multivariate geochemical anomaly identification model based on a deep autoencoder network was constructed. Exploiting the strong feature extraction capability of deep learning, a hybrid model combining a deep belief network with a one-class support vector machine was then built, in which the deep features extracted by the deep learning model serve as input to the one-class support vector machine anomaly detection algorithm, effectively extracting multivariate geochemical anomalies. A deep autoencoder network has also been combined with a density-based clustering algorithm: the deep geochemical features extracted by the autoencoder are used as input to the clustering algorithm to identify and extract multivariate geochemical anomalies.
By constructing a stacked denoising autoencoder network based on hierarchical clustering, deep features of geochemical data have been extracted and used as input to the unsupervised isolation forest anomaly detection algorithm, effectively extracting geochemical anomalies of lead-zinc-silver polymetallic deposits in northwestern Zhejiang. These studies found that deep learning models do not depend on distributional assumptions about exploration geochemical data, can handle complex, nonlinear spatial patterns, and can identify anomalies that traditional methods miss. Attempts have also been made to use all geochemical variables and to combine big-data thinking with deep learning so as to account fully for the complexity and diversity of element associations. This offers a new way to characterize geochemical spatial patterns with nonlinear features and to extract hidden anomalies, and it is well suited to geochemical anomaly identification.
The machine learning and deep learning approaches to exploration geochemical data described above ignore the spatial characteristics of the data. Deep learning algorithms have complex structures and many parameters, so they explain unseen data poorly and lack consistency with physical laws, which limits the further development of deep learning models in mineralization anomaly extraction and mineral resource potential assessment. Their main drawback is that the intermediate process of big data and deep learning models is a "black box": the interrelations and intrinsic connections among elements are hard to discern, even though this information has specific geological meaning and is an important indicator for deposit genesis and mineral exploration.
Therefore, taking the characteristics of exploration geochemistry into account, an important direction for future innovation in this field is to mine and integrate exploration geochemical big data with machine learning and deep learning algorithms that are constrained by ore-controlling conditions derived from the study of metallogenic laws, and thereby to achieve an organic fusion of prospecting models with machine learning and of theory-driven with data-driven approaches. Conventional, purely data-driven machine learning methods fit training data well when applied to scientific problems. However, because machine learning and deep learning structures are complex, have many parameters, and operate as black boxes, they explain unseen data poorly and lack consistency with physical laws. Consequently, even if a black-box deep learning model reaches high prediction and classification accuracy after extensive parameter tuning, it cannot express physical mechanisms or domain knowledge and cannot provide more reliable information, which severely limits the further development of deep learning models in mineralization anomaly extraction and mineral resource assessment.
Disclosure of Invention
The invention aims to provide a machine learning geochemical exploration data processing method and system based on ore-controlling factor constraints, in which a loss function based on geological constraints is added as a regularization term to the loss function of the original deep learning model, so as to accelerate network convergence, give what the network learns geological meaning, and achieve higher accuracy in the final identification result.
In order to achieve the above object, the present invention provides the following solutions:
The invention provides a machine learning geochemical exploration data processing method based on ore-controlling factor constraints, which comprises the following steps:
acquiring raw geochemical exploration data of a study area;
inputting the raw geochemical exploration data into a geochemical anomaly identification model to obtain the reconstruction error corresponding to the raw data, and determining the multivariate geochemical anomalies of the study area according to the reconstruction error; the geochemical anomaly identification model comprises a geologically constrained variational autoencoder network and a reconstruction error calculation module;
the input data of the geologically constrained variational autoencoder network is the raw geochemical exploration data, its output data is the reconstructed geochemical exploration data, and its loss function is obtained by adding a loss function based on geological constraints, as a regularization term, to the loss function of the original variational autoencoder network; the reconstruction error calculation module is used for calculating the reconstruction error from the raw geochemical exploration data and the corresponding reconstructed geochemical exploration data.
Optionally, the loss function based on geological constraints is determined as follows:
selecting geological elements and performing buffer analysis in a GIS environment;
counting the number of mineral deposits within the buffer zone at each buffer interval, and calculating the spatial distribution density of the deposits from those counts;
establishing a power-law function relating the spatial distribution density of the deposits to the buffer distance, with the buffer width as the abscissa and the spatial distribution density of the deposits as the ordinate;
normalizing the power-law function, and calculating the constraint weights of the different buffer zones on deposit formation;
and constructing the loss function based on geological constraints from the constraint weights.
Optionally, selecting geological elements and performing buffer analysis in a GIS environment specifically comprises:
determining the range of influence of the ore-controlling factor according to the geological characteristics of the area where the deposits are located, and determining the buffer interval and the buffer distance of the buffer zone according to that range of influence.
Optionally, establishing the power-law function relating the spatial distribution density of the deposits to the buffer distance, with the buffer width as the abscissa and the spatial distribution density of the deposits as the ordinate, specifically comprises:
plotting a log-log scatter diagram with the buffer width as the abscissa and the spatial distribution density of the deposits as the ordinate, and fitting a straight line to the log-log scatter diagram to obtain the power-law function relating the spatial distribution density of the deposits to the buffer distance.
Optionally, the loss function based on geological constraints is:
where $\mathrm{loss}_{\rho}$ is the loss function based on geological constraints, $x_i$ is the evidence layer data at position i, $f(x_i)$ is the reconstructed data of $x_i$, ρ is the spatial distribution density of the deposits, and the weighting term is the constraint weight at position i.
Optionally, the loss function of the geologically constrained variational autoencoder network is:
$\mathrm{loss}_{total} = \mathrm{loss}_{VAE} + \lambda\,\mathrm{loss}_{\rho}$
where $\mathrm{loss}_{total}$ is the loss function of the geologically constrained variational autoencoder network; $\mathrm{loss}_{\rho}$ is the loss function based on geological constraints; $\mathrm{loss}_{VAE}$ is the loss function of the original variational autoencoder network; and λ is the regularization coefficient.
Optionally, the quantitative metrics used to optimize the parameters of the geologically constrained variational autoencoder network are the ROC curve and the area under the curve (AUC).
The invention also provides a machine learning geochemical exploration data processing system based on ore-controlling factor constraints, which comprises:
a data acquisition module, used for acquiring raw geochemical exploration data of a study area;
a geochemical anomaly identification module, used for inputting the raw geochemical exploration data into a geochemical anomaly identification model to obtain the reconstruction error corresponding to the raw data, and determining the multivariate geochemical anomalies of the study area according to the reconstruction error; the geochemical anomaly identification model comprises a geologically constrained variational autoencoder network and a reconstruction error calculation module;
the input data of the geologically constrained variational autoencoder network is the raw geochemical exploration data, its output data is the reconstructed geochemical exploration data, and its loss function is obtained by adding a loss function based on geological constraints, as a regularization term, to the loss function of the original variational autoencoder network; the reconstruction error calculation module is used for calculating the reconstruction error from the raw geochemical exploration data and the corresponding reconstructed geochemical exploration data.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects:
The invention provides a machine learning model with added ore-controlling factor constraints that incorporates the actual metallogenic geological setting, thereby building a knowledge-constrained machine learning model and improving the effectiveness and accuracy of geochemical anomaly identification.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of the machine learning geochemical exploration data processing method based on ore-controlling factor constraints provided by an embodiment of the present invention;
FIG. 2 is an example diagram of a deposit boundary buffer zone provided by an embodiment of the present invention;
FIG. 3 is a log-log scatter plot of buffer width versus spatial distribution density of deposits provided by an embodiment of the present invention;
FIG. 4 is a diagram of the geochemical anomaly identification framework based on the geologically constrained variational autoencoder network provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of the input and output data provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.
In order to make the above objects, features, and advantages of the present invention easier to understand, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
When a data-driven machine learning or deep learning method is used to extract geochemical exploration anomalies, learning and training rely almost entirely on the geochemical data. The anomalies finally extracted often lack a geological interpretation, and other geological controls (rock bodies and structures) are not well reflected.
In the invention, the organic fusion of knowledge-driven (theory-driven) and machine learning (data-driven) approaches ensures that new patterns and laws are learned from big data with a machine (deep) learning model without neglecting metallogenic laws, ore-controlling mechanisms, and domain knowledge. Based mainly on the study of geological metallogenic laws, ore-controlling factors are extracted, quantified digitally, and added as constraints to machine learning, and anomalies are then extracted from the geochemical exploration data. Expert knowledge is thus added to the data-driven machine learning method, fusing data-driven and knowledge-driven approaches and improving the accuracy of anomaly extraction.
Example 1
This embodiment provides a machine learning geochemical exploration data processing method based on ore-controlling factor constraints. First, the ore-forming factors are summarized from the regional metallogenic laws established by previous researchers. The metallogenic law is then modeled and quantified through a power-law function established between the number of regional magmatic-hydrothermal deposits and the distance from the deposits to the ore-controlling factors. The quantified metallogenic law is built into the loss function of a deep learning model, the model is trained, and the trained model is then applied to identify anomalies.
As shown in fig. 1, the machine learning chemical detection data processing method based on the constraint of the mine control element provided in this embodiment includes:
Step 100: Acquire raw geochemical exploration data of the study area.
Step 200: Input the raw geochemical exploration data into a geochemical anomaly identification model to obtain the reconstruction error corresponding to the raw data, and determine the multivariate geochemical anomalies of the study area according to the reconstruction error. The geochemical anomaly identification model comprises a geologically constrained variational autoencoder network and a reconstruction error calculation module.
The input data of the geologically constrained variational autoencoder network is the raw geochemical exploration data, its output data is the reconstructed geochemical exploration data, and its loss function is obtained by adding a loss function based on geological constraints, as a regularization term, to the loss function of the original variational autoencoder network; the reconstruction error calculation module is used for calculating the reconstruction error from the raw geochemical exploration data and the corresponding reconstructed geochemical exploration data.
In this embodiment, the loss function based on geological constraints is determined as follows:
(1) Select geological elements and perform buffer analysis in a GIS environment. Specifically, determine the likely range of influence of the ore-controlling factor in the GIS environment according to the geological characteristics of the area where the deposits are located, and from it determine the buffer interval and the buffer distance of the buffer zone. As shown in FIG. 2, taking the formation contact zone as an example, the buffer interval is set to 1 km and the buffer distance to 10 km.
(2) Count the number of deposits n within the buffer zone at each buffer interval (e.g. 1 km, 2 km, 3 km … km), and calculate the cumulative number of deposits and the spatial distribution density ρ of the deposits,
where d is the width of the buffer.
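The counting in step (2) can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the deposit distances are invented, the helper names are hypothetical, and because formula (1) is not reproduced in the text, the sketch assumes the density is the cumulative count divided by the buffer width (ρ = n / d).

```python
# Hypothetical distances (km) from each known deposit to the nearest
# ore-controlling factor, e.g. as measured in a GIS. Invented values.
deposit_distances_km = [0.4, 0.9, 1.3, 1.8, 2.5, 2.7, 3.6, 4.1, 6.2, 9.5]


def cumulative_counts(distances, step_km=1.0, max_km=10.0):
    """Return (buffer width, cumulative deposit count) pairs.

    Buffers grow in steps of `step_km` (the buffer interval) up to
    `max_km` (the buffer distance), matching the 1 km / 10 km example.
    """
    n_steps = int(round(max_km / step_km))
    widths = [step_km * k for k in range(1, n_steps + 1)]
    return [(d, sum(1 for x in distances if x <= d)) for d in widths]


def densities_from_counts(counts):
    """ASSUMPTION: density = cumulative count / buffer width (n / d)."""
    return [(d, n / d) for d, n in counts]


counts = cumulative_counts(deposit_distances_km)
densities = densities_from_counts(counts)
for (d, n), (_, rho) in zip(counts, densities):
    print(f"d = {d:4.1f} km  n = {n:2d}  rho = {rho:.2f}")
```

If the density is instead defined per unit buffer area, only `densities_from_counts` would need to change.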
(3) With the buffer width as the abscissa and the spatial distribution density of the deposits as the ordinate, establish a power-law function relating the spatial distribution density of the deposits to the buffer distance. Specifically, plot a log-log scatter diagram with the buffer width d as the abscissa and the spatial distribution density ρ of the deposits as the ordinate, fit a straight line to it as shown in FIG. 3, and thereby determine the power-law function relating the spatial distribution density of the deposits to the buffer distance:
$\rho = c\,x^{\alpha-2}$ (2)
where c is a constant, α is the fractal dimension, and x is the buffer distance.
The power-law function shows that the spatial distribution density ρ of the deposits decays as a power law with the distance x between the deposits and the ore-controlling factors (i.e. the buffer distance).
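The straight-line fit on log-log axes in step (3) can be sketched with ordinary least squares. The data below are synthetic, and the function name is hypothetical. The conversion from slope to fractal dimension assumes the exponent is written as α − 2, following the statement in the granite example that "α − 2 corresponds to the slope"; if the exponent is defined differently, only the last line of the function changes.

```python
import math


def fit_power_law(x_values, rho_values):
    """Least-squares line through (log10 x, log10 rho).

    Returns (c, slope, alpha), where rho ~ c * x**slope and,
    by ASSUMPTION, slope = alpha - 2.
    """
    lx = [math.log10(x) for x in x_values]
    ly = [math.log10(r) for r in rho_values]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** intercept, slope, slope + 2.0


# Synthetic densities that follow rho = 14 * x**(-1.2) exactly.
widths_km = [float(k) for k in range(1, 11)]
rho = [14.0 * d ** -1.2 for d in widths_km]

c, slope, alpha = fit_power_law(widths_km, rho)
print(f"c = {c:.3f}, slope = {slope:.3f}, alpha = {alpha:.3f}")
```

With real counts the points scatter around the line, so the fit quality (for example the coefficient of determination) would normally be checked before the fitted function is used as a constraint.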
(4) Normalize the power-law function from step (3) and calculate the constraint weights of the different buffer zones on deposit formation,
where $\rho_{\max}$ is the maximum spatial distribution density of the deposits. The closer an area is to the ore-controlling factor, the greater its mineralization potential and the greater its constraint weight; the farther an area is from the ore-controlling factor, the smaller its mineralization potential and the smaller its constraint weight. Preferably, $\rho_{\max} = 14$,
(5) Based on the above weight function, the following loss function based on geological constraints is constructed.
where $x_i$ is the evidence layer data at position i, $f(x_i)$ is the reconstructed data of $x_i$, and the weighting term is the constraint weight at position i. If a prediction inconsistent with the metallogenic law appears at some position in the study area, error penalty terms of different magnitude can be set at that position according to its constraint weight. A smaller error penalty is set where the deep learning model predicts a high anomaly close to the ore-controlling factor or a low anomaly far from it. Conversely, a larger error penalty is set for a low anomaly close to the ore-controlling factor or a high anomaly far from it.
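The penalty behavior described in step (5) can be illustrated with a simple stand-in. The patent's own formula (4) is not reproduced in the text, so the form below is an assumption chosen only to match the qualitative description: reconstruction errors are rescaled to [0, 1] and penalized by their squared mismatch with the constraint weight, so that a high anomaly near the ore-controlling factor or a low anomaly far from it costs little.

```python
def geology_constraint_loss(x, x_recon, weights):
    """Hypothetical geological-constraint loss (NOT the patent's formula).

    x, x_recon: lists of samples (each a list of element values).
    weights: constraint weight per sample, in (0, 1].
    """
    errors = [
        sum((a - b) ** 2 for a, b in zip(xi, ri))
        for xi, ri in zip(x, x_recon)
    ]
    e_max = max(errors) or 1.0          # avoid division by zero
    scaled = [e / e_max for e in errors]
    mismatches = [(w - s) ** 2 for w, s in zip(weights, scaled)]
    return sum(mismatches) / len(mismatches)


# Two positions: the second has a large reconstruction error (an anomaly).
x = [[1.0, 2.0], [3.0, 4.0]]
x_recon = [[1.0, 2.0], [2.0, 4.0]]

# Anomaly located near the ore-controlling factor (weight 0.9): consistent.
loss_consistent = geology_constraint_loss(x, x_recon, [0.1, 0.9])
# Anomaly located far from it (weight 0.1): inconsistent, penalized more.
loss_inconsistent = geology_constraint_loss(x, x_recon, [0.9, 0.1])
print(loss_consistent, loss_inconsistent)
```

In a real training loop the same idea would be written with differentiable tensor operations so that the penalty contributes gradients to the encoder and decoder.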
In this embodiment, the loss function based on geological constraints is added, as a regularization term, to the loss function of the original variational autoencoder network:
$\mathrm{loss}_{total} = \mathrm{loss}_{VAE} + \lambda\,\mathrm{loss}_{\rho}$ (5)
where $\mathrm{loss}_{total}$ is the loss function of the geologically constrained variational autoencoder network; $\mathrm{loss}_{\rho}$ is the loss function based on geological constraints; $\mathrm{loss}_{VAE}$ is the loss function of the original variational autoencoder network; and λ is the regularization coefficient.
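Formula (5) is a weighted sum, which a few lines make concrete. The numeric loss values below are invented for illustration, and the function name is hypothetical.

```python
def total_loss(loss_vae, loss_geo, lam=1.0):
    """Formula (5): original VAE loss plus lam times the geological term."""
    return loss_vae + lam * loss_geo


# Invented loss values, used only to show the effect of lam.
loss_vae, loss_geo = 2.0, 0.5
for lam in (0.0, 0.5, 1.0):
    print(f"lambda = {lam:.1f}  total = {total_loss(loss_vae, loss_geo, lam):.2f}")
```

Setting the coefficient to 0 recovers the original VAE objective, and larger values give the geological constraint more influence on training.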
Granite is now taken as an example.
In the formula, θ and φ denote the decoding parameters and the encoding parameters, respectively; L is the number of samples drawn of the latent variable z; $f_{\theta}(z^{(i,l)})$ is the reconstruction of the input $x^{(i)}$; and $L_{geo\_con}$ relates the buffer zone of the granite boundary ($\omega_{geo\_con}$) to the identified geochemical pattern ($x^{(i)} - f_{\theta}(z^{(i,l)})$). This regularization term is a penalty term that makes geochemical pattern recognition reflect the influence of the geological body. $\omega_{geo\_con}$ can be estimated by:
where $\rho_{geo}$ is the deposit density corresponding to the granite boundary, ε is the buffer width of the granite boundary, α − 2 corresponds to the slope, and C is a constant.
The regularization coefficient λ balances the relative weight of the loss function of the original variational autoencoder network against the loss function based on geological constraints, and thereby penalizes predictions that are inconsistent with the metallogenic law.
In this embodiment, geochemical anomalies are identified from the reconstructed geochemical exploration data: the larger the reconstruction error, the more pronounced the geochemical anomaly.
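Scoring by reconstruction error can be sketched as below. The squared-error measure and the quantile threshold are assumptions made for illustration, since the text does not state how the error is measured or how a cut-off is chosen; the sample values and function names are invented.

```python
def reconstruction_error(x, x_recon):
    """Squared Euclidean distance between a sample and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_recon))


def score_and_flag(samples, recons, quantile=0.75):
    """Return anomaly scores and flags for scores above a quantile.

    ASSUMPTION: the threshold is the score at the given quantile of the
    sorted scores; only strictly larger scores are flagged.
    """
    scores = [reconstruction_error(x, r) for x, r in zip(samples, recons)]
    ranked = sorted(scores)
    threshold = ranked[int(quantile * (len(ranked) - 1))]
    flags = [s > threshold for s in scores]
    return scores, flags


# Invented one-element samples and their reconstructions.
samples = [[1.0], [2.0], [3.0], [4.0], [5.0]]
recons = [[1.1], [2.2], [3.0], [5.0], [5.3]]

scores, flags = score_and_flag(samples, recons)
print(flags)
```

In practice the scores would be mapped back to sample locations so that high values can be compared with known deposits and ore-controlling factors.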
The method provided in this embodiment is illustrated below using the identification of multivariate geochemical anomalies of the Gannan tungsten polymetallic deposits as a case.
(1) Model framework.
A geological-constraint loss function, constructed from the power-law function between the number of magmatic-hydrothermal deposits and the distance from the deposits to the ore-controlling factors, is added to a variational autoencoder (VAE) network to build a geochemical anomaly identification framework based on the variational autoencoder network, as shown in FIG. 4. The VAE is a generative model that combines variational inference with deep learning. Like an autoencoder network, it learns to reconstruct its input, but it assumes that the latent variables follow a Gaussian distribution and uses two neural networks to build two probability density models. One, called the inference network, performs variational inference on the original input data to produce the variational probability distribution of the latent variables. The other, called the generative network, restores an approximate probability distribution of the original data from the generated variational distribution of the latent variables. In other words, the generative capability of the variational autoencoder is used to reconstruct the geochemical data, and multivariate geochemical anomalies are identified from the reconstruction error.
The framework uses a probabilistic encoder to model the distribution of the latent variables, which extends the expressive power of the variational autoencoder network and allows it to characterize the difference between normal and anomalous data even when they share the same mean. In addition, the framework adds a loss function based on geological constraints: whereas the original variational autoencoder network relies on minimizing the Kullback-Leibler (K-L) divergence and the reconstruction error loss, the adjusted network relies on minimizing the K-L divergence, the reconstruction error loss, and the loss based on geological constraints. As a result, when reconstructing the input data the framework takes into account the correlation between the spatial distribution of deposits and the ore-controlling factors and filters out results inconsistent with the regional metallogenic law, which ensures both the accuracy of geochemical anomaly identification and the consistency of the identification results with the metallogenic law.
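The two standard VAE ingredients mentioned above, sampling a Gaussian latent variable and the K-L divergence term, can be written in closed form. The sketch below is the textbook formulation for a diagonal Gaussian posterior and a standard normal prior, given as background rather than as the patent's code; the function names and numeric values are illustrative.

```python
import math
import random


def kl_divergence(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), closed form."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
mu = [0.5, -0.2, 0.0, 1.0]      # invented encoder outputs, latent dim 4
log_var = [0.0, -1.0, 0.3, 0.0]

z = sample_latent(mu, log_var, rng)
print(len(z), kl_divergence(mu, log_var))
```

The K-L term is zero only when the encoder output equals the prior, which is why it acts as a regularizer on the latent space alongside the reconstruction error.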
(2) Model input.
The geochemical exploration raw data related to polymetallic mineralization in the study area are selected as the input of the geologically constrained VAE network. The data can be fed into the network one record at a time or in batches of several records.
(3) Model training.
The geologically constrained VAE network is tuned in much the same way as a deep autoencoder network; the main parameters to adjust are the network depth and the dimension of the latent variable. The difference is that the deep autoencoder network uses the reconstruction error as the quantitative index for parameter optimization, whereas the geologically constrained VAE network uses the ROC curve and the area under the curve (AUC). The AUC of the extracted anomalies is calculated for different network depths and is largest at a depth of 3, so the network depth is set to 3 for this example. The AUC of the extracted anomalies is then calculated for different dimensions of the latent variable z and is largest at a dimension of 4. Combined with the symmetric structure of the VAE network, the numbers of hidden-layer nodes are therefore 16, 8, 4, 8, and 16.
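The AUC-based selection described above can be sketched as follows. The AUC is computed with the pairwise (Mann-Whitney) definition, and the candidate setting with the largest AUC is kept. The labels (1 for a location with a known deposit, 0 otherwise), the setting names, and the scores are hypothetical and are not results from this application.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def best_setting(results, labels):
    """Return the setting name whose anomaly scores give the largest AUC."""
    return max(results, key=lambda name: auc(results[name], labels))

# Hypothetical anomaly scores for two candidate network depths.
labels = [1, 1, 0, 0]
results = {
    "depth=2": [0.9, 0.2, 0.5, 0.1],
    "depth=3": [0.9, 0.8, 0.3, 0.1],
}
for name, scores in results.items():
    print(name, auc(scores, labels))
print("selected:", best_setting(results, labels))
```

The same loop can be reused for the latent dimension or for the regularization coefficient by changing what the keys of `results` represent.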
In addition, the geological constraint involves an important parameter, the regularization coefficient λ, which balances the standard VAE loss against the geological-constraint loss. The larger λ is, the stronger the correlation between the extracted anomalies and the granite. The correlation increases as λ increases, and when λ = 1 the identified anomalies fit the granite best, so the regularization coefficient λ is set to 1.
(4) Model output.
Each input record is encoded and decoded by the geologically constrained VAE network to obtain the corresponding reconstructed data. After the network is run, the reconstruction error between the original data and the reconstructed data is calculated and used as the geochemical anomaly score, from which the multi-element geochemical anomalies are finally obtained.
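The scoring step can be sketched in a few lines. The sketch assumes that the reconstruction error is the sum of squared differences over the elements of each sample (the source does not specify the error measure) and ranks samples from most to least anomalous; the sample values and function names are hypothetical.

```python
def reconstruction_errors(original, reconstructed):
    """Per-sample anomaly score: sum of squared differences (assumed measure)."""
    return [sum((a - b) ** 2 for a, b in zip(x, x_hat))
            for x, x_hat in zip(original, reconstructed)]

def rank_by_score(scores):
    """Indices of the samples ordered from highest to lowest anomaly score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Two hypothetical samples with five element values each.
original = [[1.0, 2.0, 3.0, 4.0, 5.0],
            [1.0, 2.0, 3.0, 4.0, 5.0]]
reconstructed = [[1.0, 2.0, 3.0, 4.0, 5.0],   # reconstructed exactly
                 [2.0, 2.0, 3.0, 4.0, 7.0]]   # poorly reconstructed

scores = reconstruction_errors(original, reconstructed)
print(scores, rank_by_score(scores))
```

A sample that the network reconstructs poorly receives a high score and is a candidate geochemical anomaly; how the scores are thresholded or mapped is left open here.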
For example, in Fig. 4 the values of 5 geochemical elements (W, Sn, Mo, Bi, and Ag; more elements may be used, 5 being chosen here for illustration) at each sampling point are input. The encoding part includes four hidden layers, and the decoding part is symmetric with the encoding part and also has four hidden layers. The output data are then obtained by layer-by-layer calculation through the network. The output has the same dimension as the input and corresponds to the reconstructed values of the 5 geochemical elements (W, Sn, Mo, Bi, and Ag; see Fig. 5).
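The flow of dimensions through a symmetric network can be illustrated with a simplified sketch. It uses the hidden-layer node counts stated in the model training step (16, 8, 4, 8, 16) with a 5-element input and output. The weights are random, the activation is assumed to be tanh, and the bottleneck is treated deterministically; a real VAE would output a mean and a log-variance at the bottleneck and sample from them, so this is not the trained model of the application.

```python
import math
import random

# Input (5 elements), hidden layers 16-8-4-8-16, output (5 elements).
LAYER_SIZES = [5, 16, 8, 4, 8, 16, 5]

def init_layers(sizes, seed=0):
    """Create random weight matrices and zero biases for consecutive layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(x, layers):
    """Return the activations of every layer, starting with the input."""
    activations = [x]
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, activations[-1])) + b
             for row, b in zip(weights, biases)]
        is_last = idx == len(layers) - 1
        # Linear output layer, tanh on hidden layers (assumed activation).
        activations.append(z if is_last else [math.tanh(v) for v in z])
    return activations

sample = [0.1, 0.2, 0.3, 0.4, 0.5]   # hypothetical W, Sn, Mo, Bi, Ag values
acts = forward(sample, init_layers(LAYER_SIZES))
print([len(a) for a in acts])
```

The printed list shows the data being compressed to 4 dimensions and expanded back to 5, so the output can be compared element by element with the input.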
The AUC results show that the model with the ore-controlling factor constraint generalizes more accurately than the conventional model, and that the extracted geochemical anomalies correlate more strongly with the ore-controlling factors. By taking into account additional valuable information about the ore-forming process, the added constraint regularization term effectively improves the interpretability and generalization of geochemical pattern recognition, giving the machine learning model good reasoning ability and good consistency with prior geological knowledge.
Example two
To carry out the method of the above embodiment and achieve the corresponding functions and technical effects, a machine learning geochemical exploration data processing system based on ore-controlling factor constraints is provided below.
This embodiment provides a machine learning geochemical exploration data processing system based on ore-controlling factor constraints, comprising:
a data acquisition module, configured to acquire geochemical exploration raw data of a study area; and
a geochemical anomaly identification module, configured to input the geochemical exploration raw data into a geochemical anomaly identification model to obtain a reconstruction error corresponding to the raw data, and to determine the multi-element geochemical anomalies of the study area according to the reconstruction error. The geochemical anomaly identification model comprises a geologically constrained VAE network and a reconstruction error calculation module.
The input data of the geologically constrained VAE network are the geochemical exploration raw data, and its output data are the reconstructed geochemical exploration data. Its loss function is obtained by adding a loss function based on geological constraints, as a regularization term, to the loss function of the original VAE network. The reconstruction error calculation module calculates the reconstruction error from the geochemical exploration raw data and the corresponding reconstructed data.
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the other embodiments, and the identical or similar parts of the embodiments may be cross-referenced. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief; for the relevant details, refer to the description of the method.
The principles and embodiments of the present invention have been described herein with reference to specific examples, which are intended only to aid understanding of the method of the present invention and its core idea. Modifications made by those of ordinary skill in the art in light of the present teachings are also within the scope of the present invention. In view of the foregoing, this description should not be construed as limiting the invention.

Claims (8)

1. A machine learning geochemical exploration data processing method based on ore-controlling factor constraints, characterized by comprising the following steps:
acquiring geochemical exploration raw data of a study area;
inputting the geochemical exploration raw data into a geochemical anomaly identification model to obtain a reconstruction error corresponding to the geochemical exploration raw data, and determining multi-element geochemical anomalies of the study area according to the reconstruction error; wherein the geochemical anomaly identification model comprises a geologically constrained variational autoencoder network and a reconstruction error calculation module;
wherein the input data of the geologically constrained variational autoencoder network are the geochemical exploration raw data, the output data of the geologically constrained variational autoencoder network are the reconstructed geochemical exploration data, and the loss function of the geologically constrained variational autoencoder network is obtained by adding a loss function based on geological constraints, as a regularization term, to the loss function of the original variational autoencoder network; and the reconstruction error calculation module is configured to calculate the reconstruction error from the geochemical exploration raw data and the corresponding reconstructed geochemical exploration data.
2. The machine learning geochemical exploration data processing method based on ore-controlling factor constraints according to claim 1, wherein the loss function based on geological constraints is determined as follows:
selecting geological elements and performing buffer analysis in a GIS environment;
counting the number of deposits within the buffers at each buffer interval, and calculating the spatial distribution density of the deposits from the number of deposits;
establishing a power-law function between the spatial distribution density of the deposits and the buffer distance, with the buffer width as the abscissa and the spatial distribution density of the deposits as the ordinate;
normalizing the power-law function, and calculating the constraint weights of the different buffers on the deposits; and
constructing the loss function based on geological constraints according to the constraint weights.
3. The machine learning geochemical exploration data processing method based on ore-controlling factor constraints according to claim 2, wherein selecting geological elements and performing buffer analysis in a GIS environment specifically comprises:
determining the influence range of the ore-controlling factors according to the geological characteristics of the area where the deposits are located, and determining the buffer distance and the buffer interval of the buffers according to the influence range of the ore-controlling factors.
4. The machine learning geochemical exploration data processing method based on ore-controlling factor constraints according to claim 3, wherein establishing the power-law function between the spatial distribution density of the deposits and the buffer distance, with the buffer width as the abscissa and the spatial distribution density of the deposits as the ordinate, specifically comprises:
plotting a log-log scatter diagram with the buffer width as the abscissa and the spatial distribution density of the deposits as the ordinate, and fitting a straight line to the log-log scatter diagram to obtain the power-law function between the spatial distribution density of the deposits and the buffer distance.
5. The machine learning geochemical exploration data processing method based on ore-controlling factor constraints according to claim 2, wherein the loss function based on geological constraints is:
where $\mathrm{loss}_{\rho}$ is the loss function based on geological constraints, $x_i$ is the evidence-layer data at position $i$, $f(x_i)$ is the reconstructed data of $x_i$, and $\rho$, the spatial distribution density of the deposits, is the constraint weight at position $i$.
6. The machine learning geochemical exploration data processing method based on ore-controlling factor constraints according to claim 1, wherein the loss function of the geologically constrained variational autoencoder network is:
$\mathrm{loss}_{total} = \mathrm{loss}_{VAE} + \lambda\,\mathrm{loss}_{\rho}$
where $\mathrm{loss}_{total}$ is the loss function of the geologically constrained variational autoencoder network, $\mathrm{loss}_{\rho}$ is the loss function based on geological constraints, $\mathrm{loss}_{VAE}$ is the loss function of the original variational autoencoder network, and $\lambda$ is the regularization term coefficient.
7. The machine learning geochemical exploration data processing method based on ore-controlling factor constraints according to claim 1, wherein the quantitative indices for parameter optimization of the geologically constrained variational autoencoder network are the ROC curve and the area under the curve.
8. A machine learning geochemical exploration data processing system based on ore-controlling factor constraints, characterized by comprising:
a data acquisition module, configured to acquire geochemical exploration raw data of a study area; and
a geochemical anomaly identification module, configured to input the geochemical exploration raw data into a geochemical anomaly identification model to obtain a reconstruction error corresponding to the geochemical exploration raw data, and to determine multi-element geochemical anomalies of the study area according to the reconstruction error; wherein the geochemical anomaly identification model comprises a geologically constrained variational autoencoder network and a reconstruction error calculation module;
wherein the input data of the geologically constrained variational autoencoder network are the geochemical exploration raw data, the output data of the geologically constrained variational autoencoder network are the reconstructed geochemical exploration data, and the loss function of the geologically constrained variational autoencoder network is obtained by adding a loss function based on geological constraints, as a regularization term, to the loss function of the original variational autoencoder network; and the reconstruction error calculation module is configured to calculate the reconstruction error from the geochemical exploration raw data and the corresponding reconstructed geochemical exploration data.
CN202310790251.3A 2023-06-30 2023-06-30 Machine learning chemical exploration data processing method and system based on mineral control element restriction Pending CN116776949A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310790251.3A CN116776949A (en) 2023-06-30 2023-06-30 Machine learning chemical exploration data processing method and system based on mineral control element restriction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310790251.3A CN116776949A (en) 2023-06-30 2023-06-30 Machine learning chemical exploration data processing method and system based on mineral control element restriction

Publications (1)

Publication Number Publication Date
CN116776949A true CN116776949A (en) 2023-09-19

Family

ID=87991101

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310790251.3A Pending CN116776949A (en) 2023-06-30 2023-06-30 Machine learning chemical exploration data processing method and system based on mineral control element restriction

Country Status (1)

Country Link
CN (1) CN116776949A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191076A (en) * 2021-04-25 2021-07-30 中国地质大学(武汉) Method for constructing deep learning loss function based on mineralization law
US20210256392A1 (en) * 2020-02-10 2021-08-19 Nec Laboratories America, Inc. Automating the design of neural networks for anomaly detection
CN116346384A (en) * 2021-12-24 2023-06-27 兴唐通信科技有限公司 Malicious encryption flow detection method based on variation self-encoder

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张国芳等: ""基于变分自编码器的日线损率异常检测研究"", 《华东师范大学学报(自然科学版)》, no. 5, pages 5 - 6 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117216576A (en) * 2023-10-26 2023-12-12 山东省地质矿产勘查开发局第六地质大队(山东省第六地质矿产勘查院) Graphite gold ore prospecting method based on Gaussian mixture clustering analysis
CN117216576B (en) * 2023-10-26 2024-03-29 山东省地质矿产勘查开发局第六地质大队(山东省第六地质矿产勘查院) Graphite gold ore prospecting method based on Gaussian mixture clustering analysis
CN118035847A (en) * 2024-04-10 2024-05-14 山东司南地理信息有限公司 Data extraction method and system based on geological mineral exploration

Similar Documents

Publication Publication Date Title
CN116776949A (en) Machine learning chemical exploration data processing method and system based on mineral control element restriction
Chang et al. Lithofacies identification using multiple adaptive resonance theory neural networks and group decision expert system
Ghiasi-Freez et al. Improving the accuracy of flow units prediction through two committee machine models: an example from the South Pars Gas Field, Persian Gulf Basin, Iran
Rui et al. TOC content prediction based on a combined Gaussian process regression model
Matinkia et al. Prediction of permeability from well logs using a new hybrid machine learning algorithm
CN108897975A (en) Coalbed gas logging air content prediction technique based on deepness belief network
Xie et al. A coarse-to-fine approach for intelligent logging lithology identification with extremely randomized trees
Li et al. Feedback on a shared big dataset for intelligent TBM Part I: Feature extraction and machine learning methods
Feng Uncertainty analysis in well log classification by Bayesian long short-term memory networks
CN116665067B (en) Ore finding target area optimization system and method based on graph neural network
Xiongyan et al. Computational intelligent methods for predicting complex ithologies and multiphase fluids
Zoveidavianpoor A comparative study of artificial neural network and adaptive neurofuzzy inference system for prediction of compressional wave velocity
Yang et al. Oil logging reservoir recognition based on TCN and SA-BiLSTM deep learning method
Shirazy et al. K-means clustering and general regression neural network methods for copper mineralization probability in Chahar-Farsakh, Iran
Tan et al. Evaluation of complex petroleum reservoirs based on data mining methods
Feng et al. Application of Bayesian generative adversarial networks to geological facies modeling
Zhu et al. An automatic identification method of imbalanced lithology based on Deep Forest and K-means SMOTE
Yuan et al. Lithology identification by adaptive feature aggregation under scarce labels
Xu et al. An interpretable graph attention network for mineral prospectivity mapping
Noh et al. Explainable deep learning for supervised seismic facies classification using intrinsic method
Goliatt et al. Performance of evolutionary optimized machine learning for modeling total organic carbon in core samples of shale gas fields
Zhou et al. Sequential data-driven cross-domain lithology identification under logging data distribution discrepancy
Chen et al. A high-performance voting-based ensemble model of graph convolutional extreme learning machines for identifying geochemical anomalies related to mineralization
Pourreza et al. Estimation of geomechanical units using petrophysical logs, core and supervised intelligent committee machine method to optimize exploration drilling operations
Liu et al. Lithofacies identification of shale formation based on mineral content regression using LightGBM algorithm: A case study in the Luzhou block, South Sichuan Basin, China

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination