CN106250676B - Element geochemistry survey data preferred method based on information gain-ratio - Google Patents

Info

Publication number
CN106250676B
Authority
CN
China
Prior art keywords
data
grid
ratio
geochemistry
information gain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610575934.7A
Other languages
Chinese (zh)
Other versions
CN106250676A (en)
Inventor
王新华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences Beijing
Original Assignee
China University of Geosciences Beijing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences Beijing filed Critical China University of Geosciences Beijing
Priority to CN201610575934.7A priority Critical patent/CN106250676B/en
Publication of CN106250676A publication Critical patent/CN106250676A/en
Application granted granted Critical
Publication of CN106250676B publication Critical patent/CN106250676B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16CCOMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70Machine learning, data mining or chemometrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

To address the subjective bias that affects existing analyses of geochemical data, the present invention proposes a method, based on the information gain ratio, for selecting elements from element geochemical survey data so that the data can be analyzed accurately. The method comprises: obtaining the geochemical data of a survey area; building a blank spatial grid matrix over the survey area and drawing the corresponding calculation grid matrix; screening the data, rejecting erroneous data, and projecting the remaining data onto the calculation grid matrix by interpolation; projecting the coordinates of known mineral occurrences onto the calculation grid matrix, marking the mineral grade in the cells that correspond to known occurrences and filling an "unknown" mark into the cells that correspond to unknown locations; randomly selecting several ore-bearing cells and several unknown cells as training data; calculating the information gain ratio between each element and the ore-bearing class and selecting the ore-favorable elements; and selecting from the survey-area geochemical data according to the ore-favorable elements. The invention allows geochemical data to be analyzed accurately.

Description

Element geochemistry survey data preferred method based on information gain-ratio
Technical field
The present invention relates to a method, based on the information gain ratio, for selecting elements from element geochemical survey data.
Background
Analysis of geochemical data is a very important step in mineral exploration. In existing exploration practice, the tested stream-sediment, soil, or soil-gas sample data are projected into a planar rectangular coordinate system, and contour maps or single-element anomaly areas delineated by a threshold are drawn; the analysis itself is done manually and relies mainly on the worker's experience. This approach demands a high level of skill from survey personnel, and because it cannot be formalized into a systematic method it is difficult to exchange or teach, so each worker can only make subjective analyses and judgments based on past experience. Because workers differ in knowledge and experience, different workers can reach different conclusions about the mineral resources of the same area. Moreover, there is no unified standard for evaluating which elements in a region are favorable to mineralization, which directly affects exploration and mineral resource evaluation. A quantitative method for selecting elements from exploration geochemical data is therefore urgently needed.
Summary of the invention
In view of the above problems, the present invention provides a method, based on the information gain ratio, for selecting elements from element geochemical survey data, which allows metallogenic factors to be judged accurately.
To achieve this, the method comprises the following steps:
Step 1: Select a survey area and obtain the geochemical data within it.
Step 2: Build a blank spatial grid matrix over the survey area, and draw the calculation grid matrix from the spatial grid matrix.
Step 3: Screen the data obtained in Step 1 and reject erroneous data; project the remaining data by interpolation onto the calculation grid matrix built in Step 2 to form the element attribute grid matrix.
Step 4: Project the coordinates of known mineral occurrences onto the element attribute grid matrix of Step 3. Divide the mineral grades of the known occurrences into several levels, fill a mark indicating the grade level into each calculation grid cell that corresponds to a known occurrence, and fill an "unknown" mark into each cell that corresponds to an unknown location, forming the mineral grade grid matrix.
Step 5: From the mineral grade grid matrix of Step 4, randomly select several ore-bearing cells (corresponding to known occurrences) and several unknown cells (corresponding to unknown locations) as training data.
Step 6: Calculate the information gain ratio between each element and the ore-bearing class, and select the elements whose information gain ratio ranks in the top 30% as ore-favorable elements.
Step 7: Select from the survey-area geochemical data according to the ore-favorable elements chosen in Step 6.
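The selection rule in Step 6 can be illustrated with a minimal sketch. The function name `select_favorable_elements`, the rounding-up rule, and the gain-ratio values below are assumptions made for illustration; the source only states that the top 30% of elements are chosen (six of 18 in Embodiment 1, which is consistent with rounding up).

```python
def select_favorable_elements(gain_ratios, percent=30):
    """Return the element names ranked in the top `percent` by gain ratio.

    gain_ratios: dict mapping element symbol -> information gain ratio.
    The count is rounded up (an assumption), using integer arithmetic
    to avoid floating-point surprises.
    """
    ranked = sorted(gain_ratios, key=gain_ratios.get, reverse=True)
    k = (len(ranked) * percent + 99) // 100  # ceiling of n * percent / 100
    return ranked[:k]


# Hypothetical gain-ratio values, for illustration only.
example = {"W": 0.41, "Au": 0.38, "Sb": 0.30, "Cu": 0.05, "Zn": 0.02,
           "Pb": 0.04, "Ni": 0.01, "Co": 0.03, "Cr": 0.02, "Mo": 0.06}
selected = select_favorable_elements(example)
print(selected)
```

With 18 elements and the 30% cut-off, this rule keeps six elements.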
Further, the geochemical data in Step 1 are obtained as follows: lay out a geochemical survey grid in the survey area at a scale of 1:200,000 to 1:50,000, collect stream-sediment or soil geochemical samples in the survey area according to the survey grid, and test the samples to obtain the geochemical data.
Further, a data transformation step is included before or after Step 3; this step transforms all of the data into data that follow a normal distribution.
Further, the cell size of the blank spatial grid matrix built in Step 2 is 10 m × 10 m to 1000 m × 1000 m.
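As a rough illustration of the blank grid in Step 2, the sketch below computes the grid dimensions for a rectangular bounding box and maps a projected coordinate to its cell. The function names, the rectangular extent, and the 16.5 km × 16 km example area are assumptions; the source specifies only the cell-size range.

```python
import math


def build_grid(x_min, y_min, x_max, y_max, cell_size):
    """Return (n_rows, n_cols) of a blank grid covering the bounding box."""
    n_cols = math.ceil((x_max - x_min) / cell_size)
    n_rows = math.ceil((y_max - y_min) / cell_size)
    return n_rows, n_cols


def cell_index(x, y, x_min, y_min, cell_size):
    """Map a projected coordinate (in metres) to its (row, col) grid cell."""
    col = int((x - x_min) // cell_size)
    row = int((y - y_min) // cell_size)
    return row, col


# Hypothetical 16.5 km x 16 km extent (264 km^2) with 100 m cells.
rows, cols = build_grid(0, 0, 16500, 16000, 100)
print(rows, cols, rows * cols)
```

A smaller cell size gives finer spatial detail at the cost of more cells to interpolate and store.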
Further, the ratio of ore-bearing cells to unknown cells in Step 5 is 1:5 to 5:1, preferably 1:1.
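The random selection of training cells in Step 5 might look like the following sketch. The function name, the dictionary layout, and the fixed seed are assumptions; the convention that a mark of 0 means "unknown" and a positive mark means a known occurrence follows Embodiment 1.

```python
import random


def sample_training_cells(labels, n_ore, n_unknown, seed=0):
    """Randomly pick ore-bearing and unknown cells as training data.

    labels: dict mapping (row, col) -> grade mark, where 0 means unknown
    and a positive integer means a known occurrence (assumed convention).
    """
    ore = sorted(cell for cell, mark in labels.items() if mark > 0)
    unknown = sorted(cell for cell, mark in labels.items() if mark == 0)
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(ore, n_ore), rng.sample(unknown, n_unknown)


# Hypothetical 10 x 10 grid whose first column holds known occurrences.
labels = {(r, c): (1 if c == 0 else 0) for r in range(10) for c in range(10)}
ore_cells, unknown_cells = sample_training_cells(labels, 5, 5)
```

Passing equal counts gives the preferred 1:1 ratio; `random.Random.sample` raises `ValueError` if a class has fewer cells than requested.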
Further, the data transformation is a Box-Cox transformation, defined as follows. Let $Y_i$ be the i-th value in the original data and $Y_i^{(\lambda)}$ the i-th value after transformation. When $\lambda \neq 0$, $Y_i^{(\lambda)} = \dfrac{Y_i^{\lambda} - 1}{\lambda}$; when $\lambda = 0$, $Y_i^{(\lambda)} = \ln(Y_i)$. Here $\lambda$ is the constant that maximizes the likelihood function.
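The transformation can be sketched in plain Python as below. The source does not say how the likelihood-maximizing λ is found; the grid search over λ from -2 to 2 and the normal-model profile log-likelihood used here are assumptions, and the function names are hypothetical.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox transformation to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def log_likelihood(values, lam):
    """Profile log-likelihood of lam under a normal model (up to a constant)."""
    n = len(values)
    t = box_cox(values, lam)
    mean = sum(t) / n
    var = sum((x - mean) ** 2 for x in t) / n  # must be non-zero
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)


def best_lambda(values, grid=None):
    """Pick the lam on a coarse grid that maximizes the log-likelihood."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # assumed search range
    return max(grid, key=lambda lam: log_likelihood(values, lam))


# Values whose logarithms are symmetric, so lam near 0 is expected.
sample = [math.exp(x) for x in (-2, -1, 0, 1, 2)]
lam = best_lambda(sample)
transformed = box_cox(sample, lam)
```

Element concentrations are positive, so the positivity check mainly guards against zeros or below-detection placeholders left in the data.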
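Further, the information gain ratio in Step 6 is calculated as follows: $\mathrm{GainRatio}(A) = \dfrac{\mathrm{Info}(D) - \mathrm{Info}_A(D)}{\mathrm{SplitInfo}_A(D)}$, where $\mathrm{Info}(D) = -\sum_{j=1}^{v} p_j \log_2 p_j$, $\mathrm{Info}_A(D) = \sum_{i=1}^{m} \dfrac{|D_i|}{|D|}\,\mathrm{Info}(D_i)$, and $\mathrm{SplitInfo}_A(D) = -\sum_{i=1}^{m} \dfrac{|D_i|}{|D|} \log_2 \dfrac{|D_i|}{|D|}$. In these formulas, $|D|$ is the number of tuples in data set $D$, $m$ is the number of values of attribute $A$, $D_i$ is the set of tuples that take the i-th value of attribute $A$, $v$ is the number of values of the class attribute, and $p_j$ is the probability that a tuple belongs to class $j$.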
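A minimal sketch of the gain-ratio calculation for one element follows. It assumes the element's concentrations have already been discretized into categories (the source does not describe a discretization scheme, so the "hi"/"lo" bins are hypothetical) and that the class label is the ore-bearing mark of each training cell.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(attribute_values, class_labels):
    """Information gain ratio of a discretized attribute w.r.t. the class."""
    n = len(class_labels)
    partitions = {}
    for a, y in zip(attribute_values, class_labels):
        partitions.setdefault(a, []).append(y)
    # Expected class entropy after splitting on the attribute.
    info_a = sum(len(p) / n * entropy(p) for p in partitions.values())
    split_info = entropy(attribute_values)
    if split_info == 0:  # attribute takes a single value: no usable split
        return 0.0
    return (entropy(class_labels) - info_a) / split_info


# Hypothetical training cells: 1 = ore-bearing, 0 = unknown (treated as barren).
ore = [1, 1, 0, 0]
informative = ["hi", "hi", "lo", "lo"]    # separates the classes perfectly
uninformative = ["hi", "lo", "hi", "lo"]  # independent of the classes
print(gain_ratio(informative, ore), gain_ratio(uninformative, ore))
```

Dividing by the split information penalizes attributes that fragment the data into many small bins, which plain information gain would otherwise favor.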
Further, a mineralization-grade judgment step follows Step 7 and comprises the following:
1. Extract the data of the ore-favorable elements and draw a contour map for each of them.
2. Judge the mineralization location and mineralization grade from how the contour maps of the ore-favorable elements intersect.
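One way to approximate the intersection analysis on the grid, rather than on drawn contour maps, is to count for each cell how many ore-favorable elements exceed an anomaly threshold. This is an assumed simplification: the function name, the nested-list grids, and the threshold values are hypothetical, and the source does not specify how thresholds are chosen.

```python
def overlap_count(element_grids, thresholds):
    """Count, per cell, how many elements exceed their anomaly threshold.

    element_grids: dict mapping element symbol -> 2-D list of values.
    thresholds: dict mapping element symbol -> anomaly threshold.
    """
    names = list(element_grids)
    n_rows = len(element_grids[names[0]])
    n_cols = len(element_grids[names[0]][0])
    counts = [[0] * n_cols for _ in range(n_rows)]
    for name in names:
        for r in range(n_rows):
            for c in range(n_cols):
                if element_grids[name][r][c] > thresholds[name]:
                    counts[r][c] += 1
    return counts


# Hypothetical 2 x 2 grids for two elements.
grids = {"Au": [[1, 5], [7, 2]], "W": [[9, 9], [0, 0]]}
limits = {"Au": 4, "W": 8}
counts = overlap_count(grids, limits)
print(counts)
```

Cells with a count of two or more correspond to intersecting anomaly areas, which Embodiment 1 treats as possible mineralized zones.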
The method of the present invention judges the influence of each element on mineralization grade by calculating information gain ratios, so the judgment is simple and intuitive. It forms a systematic set of calculation and data-processing procedures that workers can readily exchange and teach, which can improve the efficiency of geochemical exploration to some extent. It screens out the ore-favorable elements through a calculation that covers every element, so the influence of each element is taken into account, giving high validity, wide applicability, and good accuracy. The method gives geochemical mineral exploration a concise and direct way to select ore-favorable elements. It reduces reliance on workers' personal experience and the influence of individual subjectivity on the evaluation of ore-favorable elements, so that the favorability of each element in the survey area is evaluated objectively and the elements that guide exploration are selected. Replacing manual selection with computation when selecting and evaluating survey-area geochemical data makes the evaluation more objective and improves the efficiency and quality of data processing; the method is practical, accurate, easy to implement, and fast.
Description of the drawings
Fig. 1 shows the operating steps of the information-gain-ratio-based method for selecting geochemical data according to the present invention;
Fig. 2 shows the calculation grid matrix of Embodiment 1, filled with mineral grade level marks and unknown marks;
Fig. 3 is a partial enlarged view of Fig. 2;
Fig. 4 shows the ranking of the information gain ratios of the elements in Embodiment 1;
Fig. 5 shows the layout of the calculation grid matrix.
Detailed description
The present invention is further described below with reference to the accompanying drawings.
Embodiment 1
As shown in Figs. 1-4, the method of this embodiment proceeds as follows:
1. Collect 1:50,000 stream-sediment geochemical data for a gold field in Gansu Province, comprising 18 elements in total (Ag, As, Au, Bi, Cd, Co, Cr, Cu, Hg, Mo, Ni, Pb, Rb, Sb, Sn, Ti, W, Zn) and covering a survey area of 264 square kilometres.
2. Build a blank spatial grid matrix with 100 m × 100 m cells covering the spatial extent of the survey area, and draw the calculation grid matrix M from the spatial grid matrix. The calculation grid matrix is shown in Fig. 5.
3. Check the geochemical data obtained in step 1 and reject erroneous data. Apply spatial Kriging interpolation to the data of each element to project the remaining geochemical data onto the calculation grid matrix, forming the element attribute grid matrix. Each cell of the element attribute grid matrix is an 18-dimensional attribute vector holding the value of every element. Let $m_{i,j}$ denote a cell of the element attribute grid matrix: $m_{i,j} = (Ag_{i,j},\ As_{i,j},\ Au_{i,j},\ Bi_{i,j},\ \ldots,\ Sn_{i,j},\ Ti_{i,j},\ W_{i,j},\ Zn_{i,j})$.
4. Project the coordinates of known mineral occurrences onto the element attribute grid matrix of step 3. Fill 3, 2, or 1 into the corresponding cell according to the ore grade of the known occurrence, from high to low, and fill 0 into the cells that correspond to unknown locations. Append this value to the cell vector $m_{i,j}$ as the 19th attribute; the element attribute grid matrix thereby becomes the mineral grade grid matrix. An unknown location is a place where it is not known whether minerals are present.
5. As shown in Figs. 2 and 3, randomly select from the mineral grade grid matrix of step 4 60 ore-bearing cells (corresponding to known occurrences) and 60 unknown cells (corresponding to unknown locations) as training data. In the calculation, unknown locations are treated as containing no minerals.
6. Calculate the information gain ratio between each element and the ore-bearing class as follows: $\mathrm{GainRatio}(A) = \dfrac{\mathrm{Info}(D) - \mathrm{Info}_A(D)}{\mathrm{SplitInfo}_A(D)}$, where $\mathrm{Info}(D) = -\sum_{j=1}^{v} p_j \log_2 p_j$, $\mathrm{Info}_A(D) = \sum_{i=1}^{m} \dfrac{|D_i|}{|D|}\,\mathrm{Info}(D_i)$, and $\mathrm{SplitInfo}_A(D) = -\sum_{i=1}^{m} \dfrac{|D_i|}{|D|} \log_2 \dfrac{|D_i|}{|D|}$. In these formulas, $|D|$ is the number of tuples in data set $D$, $m$ is the number of values of attribute $A$, $D_i$ is the set of tuples that take the i-th value of attribute $A$, $v$ is the number of values of the class attribute, and $p_j$ is the probability that a tuple belongs to class $j$. The resulting ranking of the information gain ratios of the elements is shown in Fig. 4. The elements whose information gain ratio ranks in the top 30% are chosen as ore-favorable elements.
7. According to the information gain ratio ranking, select the top 30% as ore-favorable elements: of the 18 elements, W, Au, Sb, Sn, Ag, and Hg are the ore-favorable elements to be given priority when exploring this area.
8. From the data of the six elements in step 7, draw contour maps of the six elements. If the anomaly areas of two or more elements intersect, the intersection is a possible mineralized zone. The more ore-favorable elements occur in an intersection, the higher the likelihood of mineralization there; and the higher the contents of the ore-favorable elements at a location, the higher the likelihood that high-grade mineralization occurs there.
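The labelling in step 4, where a grade mark is appended to each cell vector as the 19th attribute, can be sketched as follows. The function name, the dictionary-based grid, and the rule of keeping the highest mark when several occurrences fall in one cell are assumptions made for illustration.

```python
def label_cells(element_grid, occurrences, x_min, y_min, cell_size):
    """Append a grade mark to every cell vector as its last attribute.

    element_grid: dict mapping (row, col) -> list of 18 element values.
    occurrences: list of (x, y, mark) with mark in {1, 2, 3}; cells with
    no known occurrence receive the mark 0 ("unknown").
    """
    marks = {}
    for x, y, mark in occurrences:
        cell = (int((y - y_min) // cell_size), int((x - x_min) // cell_size))
        # Assumption: keep the highest mark if occurrences share a cell.
        marks[cell] = max(mark, marks.get(cell, 0))
    return {cell: values + [marks.get(cell, 0)]
            for cell, values in element_grid.items()}


# Hypothetical two-cell grid with one high-grade occurrence at (150 m, 20 m).
grid = {(0, 0): [1.0] * 18, (0, 1): [2.0] * 18}
labelled = label_cells(grid, [(150.0, 20.0, 3)], 0.0, 0.0, 100.0)
```

Each labelled cell then carries 19 values, with the last one serving as the class attribute in the gain-ratio calculation.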
In the above method, to further improve the precision of the calculation, the data may be transformed either before they are projected onto the calculation grid matrix or after projection. The data transformation is a Box-Cox transformation, defined as follows. Let $Y_i$ be the i-th value in the original data and $Y_i^{(\lambda)}$ the i-th value after transformation. When $\lambda \neq 0$, $Y_i^{(\lambda)} = \dfrac{Y_i^{\lambda} - 1}{\lambda}$; when $\lambda = 0$, $Y_i^{(\lambda)} = \ln(Y_i)$. Here $\lambda$ is the constant that maximizes the likelihood function.
The method of this embodiment judges the influence of each element on mineralization grade by calculating information gain ratios, so the judgment is simple and intuitive. It forms a systematic set of calculation and data-processing procedures that workers can readily exchange and teach, which can improve the efficiency of geochemical exploration to some extent. It screens out the ore-favorable elements through a calculation that covers every element, so the influence of each element is taken into account, giving high validity, wide applicability, and good accuracy.
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention is therefore defined by the claims.

Claims (6)

1. A method for selecting elements from element geochemical survey data based on the information gain ratio, characterized in that the method comprises the following steps:
Step 1: Select a survey area and obtain the geochemical data within it.
Step 2: Build a blank spatial grid matrix over the survey area, and draw the calculation grid matrix from the spatial grid matrix.
Step 3: Screen the data obtained in Step 1 and reject erroneous data; project the remaining data by interpolation onto the calculation grid matrix built in Step 2 to form the element attribute grid matrix.
Step 4: Project the coordinates of known mineral occurrences onto the element attribute grid matrix of Step 3. Divide the mineral grades of the known occurrences into several levels, fill a mark indicating the grade level into each calculation grid cell that corresponds to a known occurrence, and fill an "unknown" mark into each cell that corresponds to an unknown location, forming the mineral grade grid matrix.
Step 5: From the mineral grade grid matrix of Step 4, randomly select several ore-bearing cells (corresponding to known occurrences) and several unknown cells (corresponding to unknown locations) as training data.
Step 6: Calculate the information gain ratio between each element and the ore-bearing class, and select the elements whose information gain ratio ranks in the top 30% as ore-favorable elements.
Step 7: Select from the survey-area geochemical data according to the ore-favorable elements chosen in Step 6.
The information gain ratio in Step 6 is calculated as follows: $\mathrm{GainRatio}(A) = \dfrac{\mathrm{Info}(D) - \mathrm{Info}_A(D)}{\mathrm{SplitInfo}_A(D)}$, where $\mathrm{Info}(D) = -\sum_{j=1}^{v} p_j \log_2 p_j$, $\mathrm{Info}_A(D) = \sum_{i=1}^{m} \dfrac{|D_i|}{|D|}\,\mathrm{Info}(D_i)$, and $\mathrm{SplitInfo}_A(D) = -\sum_{i=1}^{m} \dfrac{|D_i|}{|D|} \log_2 \dfrac{|D_i|}{|D|}$. In these formulas, $|D|$ is the number of tuples in data set $D$, $m$ is the number of values of attribute $A$, $D_i$ is the set of tuples that take the i-th value of attribute $A$, $|D_i|$ is the number of tuples that take the i-th value of attribute $A$, $v$ is the number of values of the class attribute, and $p_j$ is the probability that a tuple belongs to class $j$.
2. The method according to claim 1, characterized in that the geochemical data in Step 1 are obtained as follows: lay out a geochemical survey grid in the survey area at a scale of 1:200,000 to 1:50,000, collect stream-sediment or soil geochemical samples in the survey area according to the survey grid, and test the samples to obtain the geochemical data.
3. The method according to claim 1, characterized in that a data transformation step is included before or after Step 3; the data transformation step transforms all of the data into data that follow a normal distribution.
4. The method according to claim 1, characterized in that the cell size of the blank spatial grid matrix built in Step 2 is 10 m × 10 m to 1000 m × 1000 m.
5. The method according to claim 1, characterized in that the ratio of ore-bearing cells to unknown cells in Step 5 is 1:5 to 5:1, preferably 1:1.
6. The method according to claim 3, characterized in that the data transformation is a Box-Cox transformation, defined as follows. Let $Y_i$ be the i-th value in the original data and $Y_i^{(\lambda)}$ the i-th value after transformation. When $\lambda \neq 0$, $Y_i^{(\lambda)} = \dfrac{Y_i^{\lambda} - 1}{\lambda}$; when $\lambda = 0$, $Y_i^{(\lambda)} = \ln(Y_i)$. Here $\lambda$ is the constant that maximizes the likelihood function.
CN201610575934.7A 2016-07-20 2016-07-20 Element geochemistry survey data preferred method based on information gain-ratio Expired - Fee Related CN106250676B (en)

Publications (2)

Publication Number Publication Date
CN106250676A CN106250676A (en) 2016-12-21
CN106250676B true CN106250676B (en) 2018-08-14

Family

ID=57613571

Country Status (1)

Country Link
CN (1) CN106250676B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180814

Termination date: 20210720
