CN116151107A - Method, system and electronic equipment for identifying ore potential of magma type nickel cobalt - Google Patents

Publication number
CN116151107A
Authority
CN
China
Prior art keywords
olivine
data
identification model
potential
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310110434.6A
Other languages
Chinese (zh)
Other versions
CN116151107B (en)
Inventor
薛胜超
牛云云
王庆飞
张小豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences Beijing
Original Assignee
China University of Geosciences Beijing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences Beijing
Priority to CN202310110434.6A
Publication of CN116151107A
Application granted
Publication of CN116151107B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Geophysics And Detection Of Objects (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method, a system and electronic equipment for identifying the ore-forming potential of magma-type nickel cobalt. The method comprises: acquiring olivine data and preprocessing the olivine data to obtain olivine training data; constructing classification and regression trees based on a random forest to form an initial ore-forming potential identification model; training the initial ore-forming potential identification model on the olivine training data to obtain a first ore-forming potential identification model; testing the accuracy of the first ore-forming potential identification model on the out-of-bag data of the olivine training data to obtain a target ore-forming potential identification model; and identifying the magma-type nickel-cobalt ore-forming potential through the target ore-forming potential identification model. Taken together, these steps make effective use of sample features to identify the tectonic setting of the olivine and the mineralization status of the rock body, and thereby to judge the nickel-cobalt ore-forming potential of the tectonic setting in which the olivine occurs. The invention can be widely applied in the technical field of geological exploration.

Description

Method, system and electronic equipment for identifying ore potential of magma type nickel cobalt
Technical Field
The invention relates to the technical field of geological exploration, in particular to a method and a system for identifying the ore-forming potential of magma type nickel cobalt and electronic equipment.
Background
Olivine-rich mafic-ultramafic rocks are an important host of magmatic copper-nickel-cobalt sulfide deposits worldwide. Key metals such as nickel and cobalt are important resources with irreplaceable applications in new materials, new energy, information technology and other emerging industries. However, predicting the mineralization potential of the mafic-ultramafic rock bodies associated with such deposits remains a problem worldwide. Because olivine records the behavior of nickel and cobalt through changes in its composition during magma evolution, using olivine composition to assess the mineralization potential of nickel, cobalt and other metals in mafic-ultramafic rocks holds great promise.
Earth science has become data-intensive, and data on magmatic products, in particular on olivine, one of the important rock-forming minerals, are becoming increasingly abundant. Prior work has used traditional machine learning methods (logistic regression, random forest, naive Bayes and multi-layer perceptron) to discriminate the tectonic setting of basalt from olivine chemical composition. Compared with the discrimination diagrams widely used in academia, which are based on binary and ternary plots, these methods improve discrimination accuracy and reliability and avoid the inherent defects of discrimination diagrams, such as reliance on experience, subjectivity, lack of rigorous theory, contradictory results and limited applicability.
Although this work is a great advance over the discrimination-diagram method, when traditional machine learning methods are applied to high-dimensional feature data, the discrimination accuracy depends heavily on feature engineering: a feature-importance analysis of the data samples is needed to select the feature combination that contributes most to model accuracy, reducing the number of features used in modeling to obtain better accuracy. Notably, those results are limited to basalt and do not address the tectonic setting of the large layered intrusions and small mafic-ultramafic intrusions that developed widely around the world from the Archean to the Phanerozoic. Discriminating the tectonic setting of various types of mafic-ultramafic rocks from olivine composition is therefore an important direction for further work. More importantly, no published research has yet used global olivine composition data to identify the nickel-cobalt ore-forming potential of rock bodies in different tectonic settings, which severely restricts exploration targeting for these key metals.
Disclosure of Invention
In view of the above, the embodiments of the invention provide a method, a system and electronic equipment for accurately identifying the ore-forming potential of magma-type nickel cobalt.
The embodiment of the invention provides a method for identifying the ore-forming potential of magma-type nickel cobalt, which comprises the following steps: acquiring olivine data and preprocessing the olivine data to obtain olivine training data; constructing classification and regression trees based on a random forest to form an initial ore-forming potential identification model; performing model training on the initial ore-forming potential identification model according to the olivine training data to obtain a first ore-forming potential identification model; performing accuracy testing on the first ore-forming potential identification model according to the out-of-bag data of the olivine training data to obtain a target ore-forming potential identification model; and identifying the magma-type nickel-cobalt ore-forming potential through the target ore-forming potential identification model.
Optionally, the acquiring olivine data and preprocessing the olivine data to obtain olivine training data includes: performing standardization processing on the olivine data to obtain first olivine data; performing oversampling processing on the first olivine data by the SMOTE oversampling method to obtain second olivine data; and encoding the data labels of the second olivine data to obtain the olivine training data.
Optionally, the constructing classification and regression trees based on the random forest to form an initial ore-forming potential identification model includes: constructing a plurality of initial classification and regression trees based on the random forest; and combining all of the constructed initial classification and regression trees into the initial ore-forming potential identification model.
Optionally, the performing model training on the initial ore-forming potential identification model according to the olivine training data to obtain a first ore-forming potential identification model includes: performing bootstrap sampling with replacement on the olivine training data, the sampled data forming a training set; growing an initial classification and regression tree of the initial ore-forming potential identification model according to the training set to obtain a first classification and regression tree; and repeating the step of growing an initial classification and regression tree according to a training set until all initial classification and regression trees in the initial ore-forming potential identification model have been grown, to obtain the first ore-forming potential identification model.
Optionally, the growing an initial classification and regression tree of the initial ore-forming potential identification model according to the training set to obtain a first classification and regression tree includes: randomly selecting a subset containing one or more features from the training set to obtain a first subset; calculating the Gini index of the first subset; determining the optimal division feature and the optimal binary cut point according to the first subset and the Gini index; determining the node division of the initial classification and regression tree according to the optimal division feature and the optimal binary cut point; and repeating the step of randomly selecting a subset containing one or more features from the training set to obtain a first subset until a predetermined stopping condition is met, to obtain the first classification and regression tree.
Optionally, the performing accuracy testing on the first ore-forming potential identification model according to the out-of-bag data of the olivine training data to obtain a target ore-forming potential identification model includes: predicting the ore-forming potential of the out-of-bag data of the olivine training data with the first ore-forming potential identification model; calculating the identification accuracy of the first ore-forming potential identification model; and correcting the model according to the identification accuracy to obtain the target ore-forming potential identification model.
Optionally, the method further comprises: selecting equidistant coordinate points on a two-dimensional plane and using the selected coordinate points as new data; identifying the nickel-cobalt ore-forming potential of the new data with the target ore-forming potential identification model to obtain an identification result; generating contour lines according to the coordinate points and the identification result; and combining the olivine sample labels with the contour lines to generate a two-dimensional decision boundary.
The embodiment of the invention also provides a system for identifying the ore-forming potential of magma-type nickel cobalt, which comprises: a first module for acquiring olivine data and preprocessing the olivine data to obtain olivine training data; a second module for constructing classification and regression trees based on a random forest to form an initial ore-forming potential identification model; a third module for performing model training on the initial ore-forming potential identification model according to the olivine training data to obtain a first ore-forming potential identification model; a fourth module for testing the accuracy of the first ore-forming potential identification model according to the out-of-bag data of the olivine training data to obtain a target ore-forming potential identification model; and a fifth module for identifying the magma-type nickel-cobalt ore-forming potential through the target ore-forming potential identification model.
The embodiment of the invention also provides electronic equipment, which comprises a processor and a memory; the memory is used for storing programs; the processor executes the program to implement the method as described above.
Embodiments of the present invention also provide a computer storage medium storing a program that is executed by a processor to implement the method as described above.
The embodiment of the invention has the following beneficial effects: olivine training data are obtained by acquiring and preprocessing olivine data; classification and regression trees are constructed based on a random forest to form an initial ore-forming potential identification model; the initial model is trained on the olivine training data to obtain a first ore-forming potential identification model; the first model is tested for accuracy on the out-of-bag data of the olivine training data to obtain a target ore-forming potential identification model; and the magma-type nickel-cobalt ore-forming potential is identified through the target model. Taken together, these steps make effective use of sample features to identify the tectonic setting of the olivine and the mineralization status of the rock body, and thereby to judge the nickel-cobalt ore-forming potential of the tectonic setting in which the olivine occurs.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the overall steps of a method of an embodiment of the present invention;
FIG. 2 is a flowchart of the overall steps of a particular method of an embodiment of the present invention;
FIG. 3 is a flow chart of model training steps of an embodiment of the present invention;
FIG. 4 is a flow chart of data visualization steps of an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
To address the lack, in the prior art, of any identification of the nickel-cobalt ore-forming potential of rock bodies in different tectonic settings from olivine composition, the embodiment of the invention provides a method for identifying the ore-forming potential of magma-type nickel cobalt, which comprises the following steps: acquiring olivine data and preprocessing the olivine data to obtain olivine training data; constructing classification and regression trees based on a random forest to form an initial ore-forming potential identification model; performing model training on the initial ore-forming potential identification model according to the olivine training data to obtain a first ore-forming potential identification model; performing accuracy testing on the first ore-forming potential identification model according to the out-of-bag data of the olivine training data to obtain a target ore-forming potential identification model; and identifying the magma-type nickel-cobalt ore-forming potential through the target ore-forming potential identification model.
Specifically, olivine is one of the important rock-forming minerals. Table 1 lists the olivine genetic types, rock body types and mineralization types:
TABLE 1
[Table 1 appears as an image in the original publication and is not reproduced in this text.]
According to Table 1, the method establishes a random forest classification model for the olivine classification problem under each discrimination criterion, and combines the discrimination results of a plurality of learners so that the different criteria are identified well; the discrimination criterion may be the tectonic setting or the mineralization status of the rock body.
According to the formation age and tectonic unit of the rock body and the rock body type, the genetic type of olivine can be classified into A-GB, P-GB, PCB-LMI, PCB-SI, POB-LMI, POB-SI, ph-CB-V, ph-MORB, ph-OB-SI, ph-OLIP-V, ph-OLIP-VK; the formation age and tectonic unit may include Archean greenstone belts; Proterozoic greenstone belts, cratons and orogenic belts; and Phanerozoic mid-ocean ridges, cratons, orogenic belts and oceanic large igneous provinces; the rock body types may include layered intrusions, small intrusions and volcanic rocks.
According to the mineralization type, the mineralization status of olivine-bearing mafic-ultramafic rock bodies can be classified into A-GB-B, A-GB-M, P-GB-B, PCB-LMI-B, PCB-LMI-M, PCB-SI-M, POB-LMI-B, POB-LMI-M, POB-SI-B, POB-SI-M, ph-CB-SI-B, ph-CB-SI-M, ph-CB-V-B, ph-MORB-B, ph-OB-SI-B, ph-OB-SI-M, ph-OLIP-V-B, ph-OLIP-VK-B; wherein M denotes mineralized (ore-bearing) and B denotes barren. Based on the mineralization status associated with the olivine, the nickel-cobalt ore-forming potential of the environment in which the olivine formed can be identified.
Olivine data are obtained from magma-type nickel-cobalt ore-forming environments. A data preprocessing algorithm that compensates for the differences in scale between olivine features accelerates model convergence. With the bootstrap sampling method, the existing sample data can be used both to train and to evaluate the model, so that the nickel-cobalt ore-forming potential of the ore-forming environment can be identified under different classification criteria by classifying olivine composition.
Referring to fig. 1 and 2, fig. 1 is a flowchart of overall steps of a method according to an embodiment of the present invention, and fig. 2 is a flowchart of overall steps of a specific method according to an embodiment of the present invention, where the embodiment of the present invention specifically includes steps S100 to S500 as follows:
S100, acquiring olivine data and preprocessing the olivine data to obtain olivine training data.
Specifically, during the training phase, olivine from different ore-forming environments is collected and its data are obtained. The olivine data contain 8 compositional indexes of the olivine: Fo, Ni (nickel), SiO2 (silica), FeO (ferrous oxide), MgO (magnesium oxide), Cr2O3 (chromium oxide), MnO (manganese oxide) and CaO (calcium oxide). The Fo value, also called the Mg value, indicates how rich the olivine is in Mg (magnesium): Fo = 100 × Mg/(Mg + Fe). It should be noted that, because cobalt and nickel have similar geochemical behavior, they behave in a highly consistent, coupled manner during sulfide segregation (liquation) and ore formation in a magmatic system, so ascertaining the nickel mineralization also clarifies the cobalt enrichment. Preprocessing the olivine data includes the following steps S110 to S130:
S110, performing standardization processing on the olivine data to obtain first olivine data.
Specifically, after the olivine data are obtained, they are standardized as follows: for each attribute of the olivine, the mean of that attribute is subtracted from each value and the difference is divided by the standard deviation of that attribute. The standardized data are centered on 0 with a variance of 1, which removes the interference of large-valued features with model training. The olivine data obtained after this processing are the first olivine data.
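The standardization in step S110 can be illustrated with a minimal Python sketch. The function name `standardize`, the dictionary layout and the sample values are assumptions made for illustration only; they are not taken from the patent, and the population standard deviation is used so that each scaled column has variance exactly 1.

```python
from statistics import mean, pstdev


def standardize(columns):
    """Z-score each feature column: subtract the mean, divide by the standard deviation."""
    scaled = {}
    for name, values in columns.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation of this attribute
        # A constant column has sigma == 0; map it to zeros instead of dividing by zero.
        scaled[name] = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
    return scaled


# Hypothetical values for illustration only (not measured data).
sample = {"Fo": [88.0, 90.0, 86.0, 84.0], "Ni": [2400.0, 3100.0, 1800.0, 1500.0]}
scaled = standardize(sample)
```

After scaling, the Fo and Ni columns are on the same footing even though their raw magnitudes differ by orders of magnitude.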
S120, performing oversampling processing on the first olivine data by the SMOTE oversampling method to obtain second olivine data.
Specifically, because the different classes in the first olivine data contain different numbers of samples, the classes are imbalanced, so the SMOTE oversampling method is used to oversample the classes with few samples. For each sample in a minority class, SMOTE computes the Euclidean distance to all other samples of that minority class and takes the k nearest neighbors; k is usually 5, although other values may be used. One of the k neighbors is selected at random, and a point chosen at random on the line segment joining the selected neighbor and the original sample in feature space becomes a newly synthesized minority sample. The first olivine data together with the synthesized samples form the second olivine data, in which every class has the same number of samples, so that the model is less likely to focus excessively on the classes with many samples during training.
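The interpolation at the heart of SMOTE, as described in step S120, can be sketched as follows. This is a simplified illustration rather than the patent's implementation: the function name `smote`, its parameters and the toy points are assumptions, and the base sample for each synthetic point is chosen at random.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Synthesize n_new points on segments between minority samples and their neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base by Euclidean distance, excluding base itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature minority class (illustrative values only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6, k=2)
```

Each synthetic point lies between two existing minority samples, so the minority class grows without duplicating samples exactly.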
S130, encoding the data labels of the second olivine data to obtain the olivine training data.
Specifically, the data labels of the second olivine data are encoded, that is, converted to integer codes. The coding scheme is as follows: the four classification criteria have 12, 18, 3 and 3 labels respectively, and the labels under the four criteria are encoded as {0, …, 11}, {0, …, 17}, {0, 1, 2} and {0, 1, 2} respectively.
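A label encoding of the kind described in step S130 might look like the sketch below. The helper `encode_labels` is hypothetical, and assigning codes in sorted label order is an assumption; the patent only states the code ranges, not which label receives which code.

```python
def encode_labels(labels):
    """Map each distinct class label to an integer code 0..K-1 (sorted order, by assumption)."""
    classes = sorted(set(labels))
    to_code = {label: code for code, label in enumerate(classes)}
    return [to_code[label] for label in labels], classes


# Three of the mineralization-status labels named in the description, in arbitrary order.
codes, classes = encode_labels(["P-GB-B", "A-GB-M", "A-GB-B", "A-GB-M"])
# codes   -> [2, 1, 0, 1]
# classes -> ['A-GB-B', 'A-GB-M', 'P-GB-B']
```

Keeping the `classes` list alongside the codes allows predictions to be mapped back to the original labels.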
S200, constructing classification and regression trees based on the random forest to form an initial ore-forming potential identification model.
Specifically, a random forest is essentially an ensemble learning method, which accomplishes the learning task by building and combining multiple individual learners. Individual learners are typically generated from a training set by a learning algorithm. If the ensemble contains only individual learners of the same type, it is homogeneous; otherwise it is heterogeneous. The individual learners in a homogeneous ensemble are also called base learners, and the corresponding algorithm is called the base learning algorithm. According to how the individual learners are generated, ensemble algorithms fall into two classes: those in which strong dependencies exist between the individual learners, and those in which they do not. Ensemble algorithms with strong dependencies must generate their learners serially, whereas ensemble algorithms without strong dependencies can generate them in parallel.
The base learner of the embodiments of the present invention is a classification and regression tree.
Step S200 includes the following steps S210 to S220:
S210, constructing a plurality of initial classification and regression trees based on the random forest.
Specifically, the classification and regression tree (CART) is a binary decision tree in which each node has at most two subtrees; the initial classification and regression tree consists of a root node only.
S220, combining all of the constructed initial classification and regression trees into an initial ore-forming potential identification model.
Specifically, all of the constructed initial classification and regression trees are collectively referred to as the initial ore-forming potential identification model, so the initial ore-forming potential identification model contains a plurality of initial classification and regression trees.
S300, performing model training on the initial ore-forming potential identification model according to the olivine training data to obtain a first ore-forming potential identification model.
Specifically, the initial ore-forming potential identification model is trained on different training sets, which are drawn from the olivine training data by bootstrap sampling. The hyperparameter n_estimators of the model, that is, the number of decision trees in the random forest, is selected by five-fold cross-validation combined with grid search; training and experiments then yield the optimal value, which is the number of decision trees in the target ore-forming potential identification model. Identifying olivine data samples with the model configured with this optimal hyperparameter gives the target model its best performance. Five-fold cross-validation divides the data into 5 equal parts; in each experiment 1 part is used for testing and the remaining 4 parts for training, and the results of the 5 experiments are averaged. Grid search loops over every candidate value of each hyperparameter and selects the best-performing one. Combining five-fold cross-validation with grid search yields the optimal hyperparameter value of the target ore-forming potential identification model.
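The combination of five-fold cross-validation and grid search can be sketched generically in Python. Everything named here is an assumption for illustration: `k_fold_indices` and `grid_search` are hypothetical helpers, `fit_and_score` stands for whatever routine trains a forest with a given number of trees on the training indices and returns its accuracy on the test indices, and `toy_score` is a placeholder with an arbitrary peak at 100 trees, not a real model.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k roughly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def grid_search(candidates, n_samples, fit_and_score, k=5):
    """Return the candidate with the highest mean validation score, plus all mean scores."""
    scores = {}
    for value in candidates:
        fold_scores = [fit_and_score(value, train, test)
                       for train, test in k_fold_indices(n_samples, k)]
        scores[value] = mean(fold_scores)
    best = max(scores, key=scores.get)
    return best, scores


def toy_score(n_trees, train_idx, test_idx):
    # Placeholder scorer (assumption): stands in for training and evaluating a forest.
    return 1.0 - abs(n_trees - 100) / 1000


best, scores = grid_search([50, 100, 200], n_samples=10, fit_and_score=toy_score)
```

In practice `fit_and_score` would be replaced by a routine that trains the random forest on the olivine training data and measures its accuracy on the held-out fold.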
Referring to fig. 3, fig. 3 is a flowchart of a model training step according to an embodiment of the present invention, and step S300 includes the following steps S310 to S330:
S310, performing bootstrap sampling with replacement on the olivine training data, the sampled data forming a training set.
Specifically, sampling with replacement is performed on the olivine training data containing m olivine samples: a sample is drawn at random and recorded in the sampling set, and is then put back into the olivine training data, so that it may be selected again at the next draw. After m draws, a sampling set containing m samples is obtained and used as a training set. Bootstrap sampling can draw a plurality of different training sets from the olivine training data, which benefits model training.
Furthermore, it should be noted that, because the sampling is random, some samples of the initial data set may be drawn multiple times and some may never be drawn. The probability that a given sample is never drawn in m draws is:
$$\lim_{m \to \infty} \left( 1 - \frac{1}{m} \right)^{m} = \frac{1}{e} \approx 0.368$$
wherein m is the number of olivine samples in the olivine training data.
According to this formula, about 36.8% of the samples are never drawn. These samples are the out-of-bag samples (out-of-bag data) and can be used to evaluate the trained ore-forming potential identification model.
It can be understood that, according to the number T of classification and regression trees in the initial ore-forming potential identification model, T training sets each containing m samples can be obtained by bootstrap sampling. Each training set contains d features and is used to train the initial ore-forming potential identification model; a validation set consisting of the out-of-bag samples is also obtained for evaluating the trained ore-forming potential identification model.
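The bootstrap sampling of step S310 and the out-of-bag fraction of roughly 36.8% can be checked empirically with a short Python sketch. The helper name `bootstrap_split`, the seed and the sample count are assumptions chosen for illustration.

```python
import random


def bootstrap_split(m, rng):
    """Draw m indices with replacement; indices never drawn form the out-of-bag set."""
    in_bag = [rng.randrange(m) for _ in range(m)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(m) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(42)
m = 10_000
in_bag, out_of_bag = bootstrap_split(m, rng)
oob_fraction = len(out_of_bag) / m  # expected to be close to 1/e for large m
```

Calling `bootstrap_split` once per tree would give each of the T trees its own training set and its own out-of-bag set.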
S320, growing an initial classification and regression tree of the initial ore-forming potential identification model according to the training set to obtain a first classification and regression tree.
Specifically, embodiments of the present invention use the Gini index minimization criterion to select the optimal olivine feature for each node and, at the same time, to determine the optimal binary cut point for that feature. Step S320 includes the following steps S321 to S325:
S321, randomly selecting a subset containing one or more features from the training set to obtain a first subset.
Specifically, for each node of the first classification and regression tree, a subset of n (n < d) features is randomly selected from the d features of the training set as the first subset.
S322, calculating the Gini index of the first subset.
Specifically, for a given training set D, the Gini index is:
$$\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^{2}$$
wherein K is the number of olivine classes, C_k is the subset of olivine samples in training set D that belong to the k-th class, and Gini(D) represents the uncertainty of training set D. Under the condition of feature A, the Gini index of training set D is:
$$\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2)$$
wherein D_1 and D_2 are the two parts into which the olivine sample set D is divided according to whether the olivine feature A takes a certain possible value a, D_1 = {(x, y) ∈ D | A(x) = a}, D_2 = D − D_1, and Gini(D, A) represents the uncertainty of set D after division by A(x) = a.
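The two Gini quantities used in step S322 can be computed with a few lines of Python. The function names `gini` and `gini_split` and the toy labels (M for mineralized, B for barren) are illustrative assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini index of a label list: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Gini index after a binary split: size-weighted Gini of the two parts."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


mixed = gini(["M", "M", "B", "B"])                 # 0.5, maximally mixed for two classes
pure_split = gini_split(["M", "M"], ["B", "B"])    # 0.0, both parts are pure
poor_split = gini_split(["M", "B"], ["M", "B"])    # 0.5, the split separates nothing
```

A lower value after the split means the two child nodes are purer, which is why the split with the smallest Gini index is preferred.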
S323, determining the optimal division feature and the optimal binary cut point according to the first subset and the Gini index.
Specifically, for the first subset of n olivine features, the Gini index is calculated for each feature at every possible cut point of that feature; the feature A* with the smallest Gini index is selected as the optimal division feature, and the corresponding cut point a* as the optimal binary cut point for that feature.
S324, determining the node division of the initial classification and regression tree according to the optimal division feature and the optimal binary cut point.
Specifically, the optimal division feature and the optimal binary cut point define the split at the root node, and the training set is distributed to the two child nodes of the root node accordingly.
S325, repeating the step of randomly selecting a subset containing one or more features from the training set to obtain a first subset until a predetermined stopping condition is met, to obtain a first classification and regression tree.
Specifically, steps S321 to S324 are repeated recursively on each child node, using the samples assigned to that node as its training set, until a predetermined stopping condition is satisfied, thereby obtaining the first classification and regression tree.
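Steps S321 to S323 for a single node can be sketched in Python as below. Several choices here are assumptions rather than details from the patent: the function name `best_split`, the use of a threshold test (feature value ≤ t) with candidate thresholds at midpoints between consecutive distinct values, which is the usual CART treatment of continuous features, and the toy data. The demonstration passes `n_features` equal to the total feature count so the result does not depend on the random draw, whereas the described method uses n < d.

```python
import random
from collections import Counter


def gini(labels):
    """Gini index of a label list."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def best_split(rows, labels, n_features, rng):
    """Search a random feature subset for the (Gini, feature index, threshold) minimizing Gini."""
    candidates = rng.sample(range(len(rows[0])), n_features)
    best = None
    for f in candidates:
        values = sorted({row[f] for row in rows})
        # Candidate cut points: midpoints between consecutive distinct feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


# Toy data (illustrative only): feature 0 separates the classes, feature 1 does not.
rows = [[90.0, 0.1], [89.0, 0.9], [82.0, 0.2], [81.0, 0.8]]
labels = ["M", "M", "B", "B"]
score, feature, threshold = best_split(rows, labels, n_features=2, rng=random.Random(0))
# score == 0.0, feature == 0, threshold == 85.5
```

Growing a full tree would apply `best_split` recursively to the samples on each side of the chosen threshold until the stopping condition is reached.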
S330, repeating the steps of expanding the initial classification and the regression tree of the initial mining potential identification model according to the training set until all the initial classification and the regression tree expansion in the initial mining potential identification model are completed, and obtaining a first mining potential identification model.
Specifically, steps S310 to S320 are repeated to expand each of the initial classification and regression trees, and the resulting trees form the first mining potential identification model.
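As a rough sketch of how each tree can receive its own training set, the snippet below draws a bootstrap sample (sampling with replacement) per tree and records the samples that were never drawn, which are the out-of-bag data used later for testing. The function name, the sample count of 20, and the tree count of 5 are assumptions chosen only to keep the example small.

```python
import random


def bootstrap_sample(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
n_trees = 5             # illustrative value, far smaller than a real forest
samples = [bootstrap_sample(20, rng) for _ in range(n_trees)]
```

Each `in_bag` list would be used to grow one classification and regression tree, while the matching `out_of_bag` list holds samples that tree has never seen.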
S400, testing the accuracy of the first mining potential identification model according to the out-of-bag data of the olivine training data to obtain a target mining potential identification model.
Specifically, step S400 includes the following steps S410 to S430:
S410, carrying out ore-forming potential prediction on the out-of-bag data of the olivine training data by adopting the first ore-forming potential identification model.
Specifically, the first ore-forming potential identification model is used to predict the ore-forming potential of the out-of-bag data of the olivine training data. During prediction, the out-of-bag data are input into the first ore-forming potential identification model, and each classification and regression tree in the model makes a decision on the data to yield its own prediction result. The number of votes received by each result is then counted by a voting method, and the result with the most votes is selected as the final prediction result. The voting method is calculated as follows:
$$\mathrm{inx} = \arg\max_{j} \sum_{i} h_i^{j}(x)$$
where $x$ is an olivine data sample, $h_i$ is the $i$-th classification and regression tree, and each classification and regression tree predicts one class label from the set of class labels $\{c_1, c_2, c_3, \ldots, c_N\}$. $h_i^{j}(x)$ is the output of the $i$-th classification and regression tree on class label $c_j$, so the output of each tree is an N-dimensional vector. The prediction results of all the classification and regression trees are summed, and the argmax function returns the class index inx with the largest value in the summed vector, which gives the finally predicted class $c_{\mathrm{inx}}$.
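The voting rule can be illustrated with a short sketch. It assumes each tree has already produced a single class label; the function name `vote` and the two class labels are hypothetical and stand in for whatever classes the model is trained on.

```python
def vote(tree_predictions, class_labels):
    """Plurality vote: sum the one-hot outputs of all trees, return the argmax class."""
    totals = [0] * len(class_labels)
    for predicted in tree_predictions:
        # Each tree contributes 1 to the position of the label it predicted.
        totals[class_labels.index(predicted)] += 1
    inx = max(range(len(totals)), key=totals.__getitem__)
    return class_labels[inx], totals


# Hypothetical labels and per-tree predictions for one olivine sample.
class_labels = ["barren", "mineralized"]
tree_predictions = ["mineralized", "barren", "mineralized"]
final_label, totals = vote(tree_predictions, class_labels)
```

When two classes receive the same number of votes, this sketch returns the one that appears first in `class_labels`; the source does not state a tie-breaking rule.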
S420, calculating the recognition accuracy of the mining potential recognition model.
Specifically, the recognition accuracy of the mining potential recognition model is calculated and is quantified by error and precision. When the error and the precision are within a preset range, model training is complete and the target mining potential recognition model is obtained; when they are not within the preset range, the process proceeds to step S430, where the model is corrected.
S430, correcting the target mining potential identification model according to the identification accuracy.
Specifically, according to the identification accuracy, the optimal value of the model hyperparameter n_estimators (the number of decision trees in the random forest) is obtained through five-fold cross-validation and grid search, finally yielding the target mining potential identification model whose number of classification and regression trees equals that optimal value.
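The combination of five-fold cross-validation and grid search can be outlined as below. This is only a skeleton under stated assumptions: `toy_fit_and_score` is a placeholder for training a forest on the training folds and scoring it on the validation fold, and the candidate tree counts are illustrative rather than taken from the source.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds; yield (train, validation)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def grid_search(candidates, fit_and_score, n_samples, k=5):
    """Return the candidate with the highest mean validation score, plus all means."""
    mean_scores = {}
    for n_trees in candidates:
        scores = [
            fit_and_score(n_trees, train, val)
            for train, val in k_fold_indices(n_samples, k)
        ]
        mean_scores[n_trees] = sum(scores) / len(scores)
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores


def toy_fit_and_score(n_trees, train_idx, val_idx):
    # Placeholder: a real version would train on train_idx and score on val_idx.
    return 1.0 - abs(n_trees - 300) / 1000.0


best, scores = grid_search([100, 200, 300, 400], toy_fit_and_score, n_samples=50)
```

In practice the sample order would normally be shuffled before the folds are cut, so that each fold contains a mix of classes.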
S500, identifying the magma-type nickel-cobalt ore-forming potential through the target ore-forming potential identification model.
Specifically, after the target ore-forming potential identification model has been trained, olivine collected from any ore-forming environment can be identified by the model to determine the nickel-cobalt ore-forming potential of that environment. The identification can follow the approach described in step S410 and is not repeated here.
The method for identifying the ore potential of magma type nickel cobalt in the embodiment of the invention can also comprise the following visualization processing:
Referring to Fig. 4, which is a flowchart of the data visualization step according to an embodiment of the present invention: first, the olivine data are subjected to a dimension reduction operation, that is, they are mapped from high-dimensional data to two-dimensional data by a dimension reduction algorithm. The two-dimensional data obtained by the mapping are then used as a training set to train the olivine classification model. Next, equidistant coordinate points are selected on the two-dimensional plane and treated as new data, and the trained classification model is used to classify these new data. Contour lines are drawn from the selected coordinate points and the model's classification results on them. Finally, the olivine data sample labels are projected onto the two-dimensional plane and the two-dimensional decision boundary is drawn, completing the visualization of the high-dimensional data classification.
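The grid step of this visualization can be sketched as follows. The snippet assumes the dimension reduction has already produced two-dimensional points, and it uses a nearest-sample rule as a stand-in for the trained classification model; the function names, coordinates, and labels are all hypothetical.

```python
def make_grid(x_min, x_max, y_min, y_max, steps):
    """Return equidistant (x, y) points, `steps` per axis, arranged row by row."""
    xs = [x_min + i * (x_max - x_min) / (steps - 1) for i in range(steps)]
    ys = [y_min + j * (y_max - y_min) / (steps - 1) for j in range(steps)]
    return [[(x, y) for x in xs] for y in ys]


def nearest_label(point, samples):
    """Toy stand-in classifier: label of the nearest two-dimensional sample."""
    px, py = point
    nearest = min(samples, key=lambda s: (s[0][0] - px) ** 2 + (s[0][1] - py) ** 2)
    return nearest[1]


# Hypothetical 2-D points, assumed to come from the dimension reduction step.
samples = [((0.0, 0.0), "barren"), ((1.0, 1.0), "mineralized")]
grid = make_grid(0.0, 1.0, 0.0, 1.0, steps=5)
label_grid = [[nearest_label(p, samples) for p in row] for row in grid]
```

Adjacent grid cells whose labels differ trace the decision boundary; a plotting library would typically turn `label_grid` into filled contours and overlay the labelled samples.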
The embodiment of the invention also provides a magma type nickel-cobalt ore potential identification system, comprising: a first module for acquiring olivine data and preprocessing the olivine data to obtain olivine training data; a second module for constructing, based on a random forest, an initial mining potential identification model that uses classification and regression trees as base learners; a third module for carrying out model training on the initial mining potential identification model according to the olivine training data to obtain a first mining potential identification model; a fourth module for testing the accuracy of the first mining potential identification model according to the out-of-bag data of the olivine training data to obtain a target mining potential identification model; and a fifth module for identifying the magma-type nickel-cobalt mining potential through the target mining potential identification model.
The embodiment of the invention also provides electronic equipment, which comprises a processor and a memory; the memory is used for storing programs; the processor executes the program to implement the method as described above.
The embodiment of the invention also provides a computer storage medium, wherein the storage medium stores a program, and the program is executed by a processor to realize the method.
The embodiment of the invention has the following beneficial effects:
1. Unlike traditional discrimination methods based on binary or ternary diagrams, the method overcomes the input-dimension limitation of models built on those methods, makes effective use of sample features, and identifies the tectonic environment of the olivine and the mineralization state of the rock mass, thereby determining the nickel-cobalt ore-forming potential of the tectonic environment in which the olivine is located.
2. The initial nickel-cobalt ore-forming potential identification model fully learns from the data of eight olivine component indexes: Fo, Ni (nickel), SiO2 (silica), FeO (ferrous oxide), MgO (magnesium oxide), Cr2O3 (chromium oxide), MnO (manganese oxide) and CaO (calcium oxide). A model that accurately distinguishes the olivine tectonic environment and mineralization state is thereby trained, enabling accurate prediction of the olivine tectonic environment and the ore-forming potential of the rock mass.
3. The embodiment of the invention also provides a visualization tool for the identification effect of the target mining potential identification model, so that the classification effect of the random forest model on high-dimensional data can be observed more intuitively and effectively.
The following is an application scenario of the embodiment of the present invention:
First, olivine data are obtained and preprocessed to obtain olivine training data. Based on a random forest, 350 classification and regression trees are constructed to form an initial mining potential identification model. The initial mining potential identification model is trained on the olivine training data to obtain a first mining potential identification model. The accuracy of the first mining potential identification model is tested on the out-of-bag data of the olivine training data to obtain a target mining potential identification model. The magma-type nickel-cobalt ore potential is then identified under different classification standards through the target mining potential identification model, where the classification standards are tectonic environment, mineralization state, rock mass type and rock type. In testing, the olivine discrimination accuracy under the different classification standards in the embodiment of the invention is shown in Table 2:
Table 2

| Classification criterion | Accuracy |
| --- | --- |
| Tectonic environment | 99% |
| Mineralization state | 99% |
| Rock mass type | 100% |
| Rock type | 99% |
Wherein, according to the mineralization state of the olivine, the nickel cobalt ore forming potential of the olivine ore forming environment can be identified.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art or in a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments described above, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and these equivalent modifications or substitutions are included in the scope of the present invention as defined in the appended claims.

Claims (10)

1. The method for identifying the ore potential of the magma type nickel cobalt is characterized by comprising the following steps of:
obtaining olivine data and preprocessing the olivine data to obtain olivine training data;
constructing a classification and regression tree based on a random forest to form an initial mining potential identification model;
model training is carried out on the initial mining potential identification model according to the olivine training data, and a first mining potential identification model is obtained;
performing accuracy testing on the first mining potential identification model according to the out-of-bag data of the olivine training data to obtain a target mining potential identification model;
and identifying the magma-type nickel-cobalt ore-forming potential through the target ore-forming potential identification model.
2. The method for identifying the ore potential of magma type nickel cobalt according to claim 1, wherein the obtaining olivine data and preprocessing the olivine data to obtain olivine training data comprises:
performing standardization processing on the olivine data to obtain first olivine data;
performing oversampling treatment on the first olivine data by using an SMOTE oversampling method to obtain second olivine data;
and encoding the data tag of the second olivine data to obtain olivine training data.
3. The method for identifying the ore potential of magma type nickel cobalt according to claim 1, wherein the constructing classification and regression trees based on a random forest to form an initial mining potential identification model comprises:
constructing a plurality of initial classification and regression trees based on the random forest;
combining all of the constructed initial classification and regression trees into an initial mining potential identification model.
4. The method for identifying the ore potential of magma type nickel cobalt according to claim 1, wherein the model training the initial mining potential identification model according to the olivine training data to obtain a first mining potential identification model comprises:
performing bootstrap sampling with replacement on the olivine training data, the sampled data forming a training set;
expanding an initial classification and regression tree of the initial mining potential identification model according to the training set to obtain a first classification and regression tree;
and repeating the steps of expanding the initial classification and regression tree of the initial ore forming potential identification model according to the training set until all initial classification and regression tree expansion in the initial ore forming potential identification model are completed, and obtaining a first ore forming potential identification model.
5. The method for identifying the ore potential of magma type nickel cobalt according to claim 4, wherein the expanding the initial classification and regression tree of the initial mining potential identification model according to the training set to obtain a first classification and regression tree comprises:
randomly selecting a subset containing one or more features from the training set to obtain a first subset;
calculating a Gini index of the first subset;
determining the optimal dividing feature and the optimal binary dividing point according to the first subset and the Gini index;
determining node division of the initial classification and regression tree according to the optimal division feature and the optimal binary segmentation point;
repeating the step of randomly selecting a subset comprising one or more features from the training set to obtain a first subset until a predetermined stopping condition is met, to obtain a first classification and regression tree.
6. The method for identifying the ore potential of magma type nickel cobalt according to claim 1, wherein the performing the accuracy test on the first mining potential identification model according to the out-of-bag data of the olivine training data to obtain the target mining potential identification model comprises:
carrying out ore potential prediction on the out-of-bag data of the olivine training data by adopting the first ore potential identification model;
calculating the recognition accuracy of the mining potential recognition model;
and correcting the target mining potential identification model according to the identification accuracy.
7. The method for identifying the ore potential of magma type nickel cobalt according to claim 1, wherein the method further comprises:
equidistant coordinate points are selected on a two-dimensional plane, and the selected coordinate points are used as new data;
carrying out nickel-cobalt mining potential identification on the new data by adopting the target mining potential identification model to obtain an identification result;
generating a contour line according to the coordinate points and the identification result;
and combining the olivine sample label and the contour line to generate a two-dimensional decision boundary.
8. A magma type nickel-cobalt mining potential identification system, comprising:
the first module is used for acquiring olivine data and preprocessing the olivine data to obtain olivine training data;
the second module is used for constructing classification and regression trees based on the random forest to form an initial mining potential identification model;
the third module is used for carrying out model training on the initial mining potential identification model according to the olivine training data to obtain a first mining potential identification model;
the fourth module is used for testing the accuracy of the first mining potential identification model according to the out-of-bag data of the olivine training data to obtain a target mining potential identification model;
and the fifth module is used for identifying the magma-type nickel-cobalt mining potential through the target mining potential identification model.
9. An electronic device comprising a processor and a memory;
the memory is used for storing programs;
the processor executing the program implements the method of any one of claims 1 to 7.
10. A computer storage medium, characterized in that the storage medium stores a program, which is executed by a processor to implement the method of any one of claims 1 to 7.
CN202310110434.6A 2023-02-02 2023-02-02 Method, system and electronic equipment for identifying ore potential of magma type nickel cobalt Active CN116151107B (en)

Publications (2)

Publication Number Publication Date
CN116151107A true CN116151107A (en) 2023-05-23
CN116151107B CN116151107B (en) 2023-09-05

Family

ID=86373103

Country Status (1)

Country Link
CN (1) CN116151107B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108717575A (en) * 2018-04-25 2018-10-30 许昌学院 A kind of Bootstrap statistics estimating method of fractal sequences parameter Estimation
CN109711597A (en) * 2018-11-14 2019-05-03 东莞理工学院 A kind of Copper-nickel Sulfide Ore Deposit metallogenic prognosis method based on stratified random forest model
CN114739977A (en) * 2022-04-13 2022-07-12 重庆大学 Method and system for extracting oil paper insulation aging spectral characteristics based on random forest method
CN115148299A (en) * 2022-07-15 2022-10-04 中国地质大学(北京) XGboost-based ore deposit type identification method and system
CN115331752A (en) * 2022-07-22 2022-11-11 中国地质大学(北京) Method capable of adaptively predicting quartz forming environment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
QIUBING REN et al.: "Tectonic discrimination of olivine in basalt using data mining techniques based on major elements: a comparative study from multiple perspectives", 《BIG EARTH DATA》, vol. 3, pages 8 - 25, XP093066390, DOI: 10.1080/20964471.2019.1572452 *
刘月高 等: "新疆北山早二叠岩浆型铜镍硫化物矿床综合信息勘察模式", 《矿床地质》, vol. 38, no. 3, pages 644 - 666 *
薛胜超 等: "造山带岩浆铜镍硫化物矿床的混染模式-以天山-北山二叠纪铜镍矿为例", 《矿床地质》, vol. 41, no. 1, pages 1 - 20 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant