CN116595399B - Analysis method for inconsistent element correlation problem in coal - Google Patents
- Publication number
- CN116595399B (application CN202310704173.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- coal
- pivot
- coordinate
- coordinates
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention relates to the technical field of analysis of elemental data in coal, and discloses an analysis method for the problem of inconsistent element correlation in coal, comprising the following steps: step one, processing the element data in coal with a method based on symmetrical pivot coordinates; step two, adding weight coefficients to the symmetrical pivot coordinates to form a weighted pivot coordinate method; step three, improving the weighted pivot coordinate method by constructing an orthogonal coordinate system, calculating the weighting coefficients of the coordinates, defining a variation matrix, obtaining the relationship between the weight values, and calculating the weighted pivot coordinates; and step four, performing hierarchical clustering analysis on the processed element data. The method resolves the inconsistency between results obtained on the two reporting bases (whole-coal basis and ash basis), and the clustering results allow the geochemistry of the coal to be re-analyzed and interpreted intuitively.
Description
Technical Field
The invention relates to the technical field of analysis of elemental data in coal, in particular to an analysis method for the problem of inconsistent elemental correlation in coal.
Background
The occurrence state of elements in coal is of great significance to coal mining and to research on coal-hosted mineral resources. The elements in coal are mainly divided into two categories, major elements and trace elements. Trace elements in coal are of research interest in three respects; in particular, they often include valuable strategic metals. The metal elements in coal strongly influence its industrial use and the environment. Studying their content distribution and occurrence state has theoretical and practical significance for correctly evaluating their positive and negative effects on the industrial use of coal, for preventing harmful trace metals in coal from damaging the environment, and for protecting the ecological environment. For example, Finkelman considers that determining the occurrence state of an element in coal helps to evaluate the effect of that element on the environment. The occurrence state of an element refers to its physical and chemical state at a given stage of geochemical migration and to the way it combines with associated elements. From the geochemical point of view, the occurrence state of an element mainly refers to its binding state, that is, the form in which the element exists.
Methods for exploring the occurrence states of elements in coal can be roughly classified into physical experimental methods, chemical experimental methods, and mathematical analysis methods. The mathematical analysis methods mainly include correlation analysis (correlation with ash, with the various forms of sulfur, and with major elements), cluster analysis, factor analysis, discriminant analysis, and the like. Of these, correlation analysis and cluster analysis are very common. Several standard methods require the coal to be ashed before geochemical analysis. However, researchers are often interested in the composition of the whole coal, not of its ash. The geochemical data for any coal sample can be converted between the ash basis and the whole-coal basis. For the same set of samples, univariate statistics (mean, variance, distribution, etc.) and bivariate statistics (correlation coefficients, etc.) may differ significantly depending on which basis is used. These differences are not real; they are artifacts of the compositional nature of most geochemical data. Because compositional data are constrained to a constant sum, e.g. 100% or 1,000,000 ppm, they have a curved (non-Euclidean) geometry, so the Euclidean assumptions on which most statistical tests rely do not hold, which leads to erroneous results. Applying a suitable transformation allows the data to be represented in Euclidean space without producing mathematically inconsistent results.
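The basis dependence described above can be illustrated with a small, entirely hypothetical data set (the ash yields and element contents below are invented for illustration and do not come from the patent). The sketch converts whole-coal contents to the ash basis by dividing by the ash fraction, then compares the Pearson correlation with ash on each basis against the log ratio of two elements, which the conversion leaves unchanged.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def to_ash_basis(whole, ash):
    """Convert whole-coal contents to the ash basis: content / ash fraction."""
    return [w / (a / 100.0) for w, a in zip(whole, ash)]

# Hypothetical samples: ash yield (%) and two element contents (ppm, whole-coal basis).
ash_yield = [5.0, 10.0, 20.0, 40.0]
elem_a_whole = [10.0, 12.0, 14.0, 16.0]
elem_b_whole = [2.0, 3.0, 5.0, 9.0]

elem_a_ash = to_ash_basis(elem_a_whole, ash_yield)
elem_b_ash = to_ash_basis(elem_b_whole, ash_yield)

# Correlation of element A with ash yield on each basis.
r_whole = pearson(elem_a_whole, ash_yield)
r_ash = pearson(elem_a_ash, ash_yield)

# The log ratio of two elements does not depend on the basis,
# because the ash fraction cancels in the ratio.
logratio_whole = [math.log(a / b) for a, b in zip(elem_a_whole, elem_b_whole)]
logratio_ash = [math.log(a / b) for a, b in zip(elem_a_ash, elem_b_ash)]

print(f"r(element A, ash), whole-coal basis: {r_whole:+.3f}")
print(f"r(element A, ash), ash basis:        {r_ash:+.3f}")
```

In this invented example the sign of the correlation with ash flips between the two bases, while the log ratios agree, which is the motivation for working with log-ratio coordinates.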
As research has deepened, many scholars in China and abroad have explored the occurrence characteristics, material sources, enrichment mechanisms, and origins of the metal elements enriched in coal and its by-products during migration. Correlation analysis studies the degree of correlation between two or more groups of elements. It can be used to judge the occurrence state of elements in coal, namely from the correlation coefficients between the ash yield of the coal and the trace-element and major-element contents. For example, when the occurrence state is determined from the Pearson correlation coefficient between an element and the ash in coal, a positive correlation is taken to mean that the element is inorganically bound, and a negative correlation that it is organically bound.
The occurrence of trace elements in coal is important from both a scientific and an environmental point of view, since the behavior of a trace element in coal depends not only on its content but also on its chemical form. Statistical methods are among the most commonly used indirect methods of interpreting element occurrence patterns. Many researchers have shown that the sum of the element contents in coal is a constant. Aitchison points out that data subject to such a constant-sum constraint are compositional data. On the whole-coal basis, the constant-sum constraint on the elements in coal is: the sum of the element contents (excluding organic C, H, N and S) and the loss on ignition (LOI, LOI = 100% - ash yield) is 100%. On the ash basis it is: the sum of the contents of the major elements and the trace elements is 100%.
The concept of compositional data was proposed in 1866, but for a long time thereafter there was little progress in methods for processing such data. In 1897, Pearson, in an article discussing the problem of spurious correlation, pointed out that processing compositional data is relatively complex and that applying correlation analysis directly to compositional data may give erroneous results. When compositional data are processed, the relationship between the components cannot be ignored. Traditional statistical methods are designed mainly for unconstrained data, and applying them directly to compositional data can give erroneous results. It was not until 1986 that Aitchison pointed out that compositional data lie in a simplex space, which differs from Euclidean space as follows: data in Euclidean space can take any value in the real domain, whereas compositional data in the simplex are restricted by the constant-sum constraint. Because of this constraint, D-dimensional compositional data actually carry only (D-1)-dimensional information. For example, the space of three-dimensional compositional data can be represented in two dimensions, as an equilateral triangle with side length 1; two-dimensional compositional data carry only one-dimensional information, and the space they span is a line segment of length 1. The distance measure commonly used in Euclidean space is the Euclidean distance, whereas Aitchison takes the distance measure in the simplex to be the Aitchison distance. To analyze compositional data with statistical methods defined in Euclidean space, the data must first be transformed into Euclidean space, that is, the constant-sum constraint among the components must be removed.
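As a minimal sketch of the difference between the two distance measures, the code below uses the standard pairwise log-ratio definition of the Aitchison distance from the compositional-data literature (the source text does not give the formula, so this form is an assumption). The compositions `x` and `y` are hypothetical.

```python
import math

def aitchison_distance(x, y):
    """Aitchison distance from all pairwise log ratios:
    d^2 = 1/(2D) * sum_i sum_j (ln(x_i/x_j) - ln(y_i/y_j))^2."""
    d = len(x)
    total = 0.0
    for i in range(d):
        for j in range(d):
            total += (math.log(x[i] / x[j]) - math.log(y[i] / y[j])) ** 2
    return math.sqrt(total / (2 * d))

def euclidean_distance(x, y):
    """Ordinary Euclidean distance between two vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(x, y)))

# Two hypothetical three-part compositions expressed as proportions.
x = [0.2, 0.3, 0.5]
y = [0.1, 0.6, 0.3]

# The same composition as x, expressed in ppm instead of proportions.
x_ppm = [v * 1_000_000 for v in x]

print(aitchison_distance(x, y), aitchison_distance(x_ppm, y))
print(euclidean_distance(x, y), euclidean_distance(x_ppm, y))
```

The Aitchison distance depends only on ratios between components, so changing the constant sum (proportions versus ppm) leaves it unchanged, whereas the Euclidean distance changes with the units.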
As for transformation methods, in 1986 Aitchison proposed the additive log-ratio transformation (alr; rendered in this text as the asymmetric logarithmic ratio conversion method), which selects any one component as the denominator; the transformed data overcome the constant-sum constraint and take values in the real domain. On this basis, Aitchison also proposed the centered log-ratio transformation (clr; the symmetric logarithmic ratio conversion method), which uses the geometric mean as the denominator of the log ratios; however, the transformed data sum to zero, so the constant-sum constraint is not overcome. In 2002, Wang Huiwen, Liu Jiang et al. proposed a spherical coordinate transformation method which, unlike the two methods above, allows zero values and is therefore more widely applicable. In 2003, the alr and clr transformations were improved on the basis of the geometry of compositional data, and the isometric log-ratio transformation (ilr; the equidistant logarithmic ratio conversion method) was proposed. Its core is to define new coordinates with respect to an orthonormal basis, so that the Aitchison distance between two compositions in the simplex equals the Euclidean distance between them after transformation into Euclidean space.
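The two Aitchison transformations can be sketched in a few lines. This is an illustrative implementation of the standard alr and clr definitions, not code from the patent; the function names, the `ref_index` parameter and the sample composition are hypothetical.

```python
import math

def alr(x, ref_index=-1):
    """Additive log-ratio transform: log of each component over a chosen
    reference component; the reference itself is dropped (D -> D-1 values)."""
    ref_index %= len(x)
    return [math.log(v / x[ref_index]) for k, v in enumerate(x) if k != ref_index]

def clr(x):
    """Centered log-ratio transform: log of each component over the
    geometric mean of all components (D -> D values that sum to zero)."""
    log_gm = sum(math.log(v) for v in x) / len(x)
    return [math.log(v) - log_gm for v in x]

# Hypothetical three-part composition.
x = [0.2, 0.3, 0.5]

alr_x = alr(x)   # two values, using the last component as denominator
clr_x = clr(x)   # three values whose sum is zero up to rounding

print(alr_x)
print(clr_x, sum(clr_x))
```

The zero sum of the clr values is the residual constraint mentioned in the text: the clr coordinates are still linearly dependent, which is what the ilr transformation removes.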
When studying compositional data transformations, Filzmoser treated a special case of the isometric log-ratio (ilr) transformation as the pivot coordinate method (PC). In 2009, Filzmoser pointed out that the results of mean and variance analysis of compositional data are affected by the constant-sum constraint. In 2010, Filzmoser reached the same conclusion for correlation analysis of compositional data, and proposed a stability measure (Stability), based on the ilr transformation, for measuring the relationship between compositional variables. In 2013, Geboy applied the stability method to elements in the Pond Creek coal in the United States and showed that stability is not a complete measure of correlation and is strongly affected by the difference between the two variables. In 2017, Hron proposed the weighted pivot coordinate method (WPC), which introduces weight coefficients into the pivot coordinate method but still changes the dimension of the data in the same way as the pivot coordinate method. The symmetrical pivot coordinate method (SPC), proposed in the same year, expresses the strength of association between each variable and the others by selecting specific orthogonal coordinates. In 2021, Hron et al. further proposed the weighted symmetric pivot coordinate method (WSPC), which assigns a coordinate system to each variable; each coordinate system carries the information of that variable in terms of its log ratios to the other variables. The WSPC method also introduces weight coefficients that down-weight variables with larger variance, thereby limiting their influence on the other variables.
WSPC has several drawbacks, including the following:
Processing of high-dimensional data is difficult: the WSPC algorithm is a sample-point-based projection method that requires selecting a few representative sample points and then projecting all data points onto the orthogonal axes of these sample points. When the data dimension is high, selecting representative sample points becomes more difficult, and the results obtained after projection may contain errors.
Scalability is insufficient: the WSPC algorithm handles large-scale data poorly because it must process each sample point and compare it with the center point, and the amount of calculation grows with the size of the data.
It is sensitive to noise and outliers: because the WSPC algorithm is a projection method based on sample points, noise or abnormal values in the sample points can strongly affect the result and make the dimension reduction inaccurate. Data quality therefore requires attention when the WSPC algorithm is used.
Disclosure of Invention
The invention aims to provide an analysis method for the problem of inconsistent element correlation in coal, and solves the problem in the background technology.
In order to achieve the above object, the present invention provides an analysis method for a problem of inconsistent correlation of elements in coal, comprising the steps of:
step one, processing the element data in coal with a method based on symmetrical pivot coordinates;
step two, adding weight coefficients to the symmetrical pivot coordinates to form a weighted pivot coordinate method;
step three, improving the weighted pivot coordinate method by constructing an orthogonal coordinate system, calculating the weighting coefficients of the coordinates, defining a variation matrix, obtaining the relationship between the weight values, and calculating the weighted pivot coordinates;
and step four, performing hierarchical clustering analysis on the processed element data.
Preferably, in step one, the element data in coal are processed and analyzed with a method based on symmetrical pivot coordinates;
the elements in coal are compositional data; for D-dimensional compositional data $x = (x_1, \ldots, x_D)$, every two components form a log ratio, so that the D components together form $D(D-1)/2$ log ratios, as follows:
$$\xi_{ij} = \ln\frac{x_i}{x_j}, \quad 1 \le i < j \le D$$
where $\xi_{ij}$ represents a log ratio, $x_i$ represents the i-th component of the D-dimensional compositional data, and $x_j$ represents the j-th component.
Preferably, the symmetrical pivot coordinate method arises from the choice of a special orthonormal basis in the isometric log-ratio (ilr) transformation, and is expressed as:
$$z_i = \sqrt{\frac{D-i}{D-i+1}}\,\ln\frac{x_i}{\sqrt[D-i]{\prod_{k=i+1}^{D} x_k}}, \quad i = 1, \ldots, D-1$$
where $z_i$ represents the i-th pivot coordinate, $D$ the dimension of the compositional data, $x_i$ the i-th component, $k$ the index of a component, and $x_k$ the k-th element;
for $x_1$ this is expressed as:
$$z_1 = \sqrt{\frac{D-1}{D}}\,\ln\frac{x_1}{\sqrt[D-1]{\prod_{i=2}^{D} x_i}}$$
where $z_1$ represents the pivot coordinate of $x_1$, $D$ the dimension of the compositional data, $x_1$ the 1st component, $i$ the index of a component, and $x_2$ the 2nd element;
the log ratios of element $x_1$ with the other elements are respectively:
$$\ln\frac{x_1}{x_2},\ \ln\frac{x_1}{x_3},\ \ldots,\ \ln\frac{x_1}{x_D}$$
where $x_1$, $x_2$, $x_3$ and $x_D$ represent the 1st, 2nd, 3rd and D-th components of the D-dimensional compositional data.
Preferably, in step two, weight coefficients are added to the symmetrical pivot coordinates to give a weighted pivot coordinate method, in which a weight coefficient is attached to each log ratio, namely:
$$\alpha_2\ln\frac{x_1}{x_2} + \alpha_3\ln\frac{x_1}{x_3} + \cdots + \alpha_D\ln\frac{x_1}{x_D}$$
where $\alpha_2, \ldots, \alpha_D$ represent weight coefficients;
where $\alpha_2 + \cdots + \alpha_D$ [value not reproduced in the source text]; the weights $w_i$ of the weighted pivot coordinate method are further calculated from the $\alpha_k$ in the formula; for the transformed coordinate $z_i$ of the weighted pivot coordinate method, the weights corresponding to $x_1$ are expressed as:
[equation not reproduced in the source text]
where $w_i$ represents a weight set, $w_i = (w_{1i}, \ldots, w_{iD})'$, whose entries are weight coefficients;
for $x_i$, the weight set is as follows:
[equation not reproduced in the source text]
where each entry is a weight coefficient.
Preferably, in step three, the weighted symmetrical pivot coordinate method is a further improvement based on the weighted pivot coordinate method; it constructs an orthogonal coordinate system whose first two coordinates represent $x_1$ and $x_2$ in the compositional data $X = (x_1, \ldots, x_n)$; the weighting coefficients of these two coordinates are defined as follows, where γ, α and β are calculated from the compositional data X:
[equation not reproduced in the source text]
where γ, α and β are all intermediate variables for calculating the normalized weights;
the variation matrix of the compositional data is defined as:
$$T = (t_{ij})_{i,j=1}^{n}, \qquad t_{ij} = \operatorname{var}\left(\ln\frac{x_i}{x_j}\right)$$
where $T$ represents the variation matrix, $t_{ij}$ its entry for the pair of elements $(i, j)$, $x_i$ the i-th element, and $x_j$ the j-th element;
the normalized weight $\alpha^*$ of $x_1$ and the normalized weight $\beta^*$ of $x_2$ are expressed as:
[equation not reproduced in the source text]
the values of γ and c are further calculated from $\alpha^*$ and $\beta^*$:
[equation not reproduced in the source text]
the relationship finally obtained between the weight values is:
[equation not reproduced in the source text]
the weighted symmetrical pivot coordinates are calculated from the above expressions.
Preferably, in the fourth step, hierarchical clustering is performed on the element data in the processed coal, and the occurrence state of the element in the coal is analyzed through the combination of the clustering result graphs.
Therefore, the analysis method for the inconsistent element correlation problem in the coal has the following beneficial effects:
(1) With the aid of computer techniques such as machine learning, the invention provides a method that resolves the consistency problem intuitively: it resolves the inconsistency between the two bases, and the geochemistry of the coal can be re-analyzed and interpreted intuitively from the clustering results.
(2) The method yields a more reasonable geochemical interpretation and reduces or eliminates the spurious results that traditional statistical methods produce on low-quality data sets, so that it genuinely helps the analysis of occurrence patterns in coal, improves the interpretability of the statistical results, and provides valuable reference information for the mining and utilization of coal.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a schematic flow chart of a method for analyzing a problem of inconsistent element correlation in coal according to the present invention;
FIG. 2 is a graph of elemental data processing in a large moat mining area coal in accordance with an embodiment of the present invention;
FIG. 3 is a graph of elemental data processing in coal in an African helminth mining area according to an embodiment of the present invention;
FIG. 4 is a graph of clustering results for a large moat mining area according to an embodiment of the present invention;
FIG. 5 is a graph of clustering results of the African helminth mining area according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further described below through the attached drawings and the embodiments.
Unless defined otherwise, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "disposed," "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
Examples
As shown in fig. 1, the method for analyzing the inconsistent correlation problem of elements in coal according to the invention comprises the following steps:
Step one, processing the element data in coal with a method based on symmetrical pivot coordinates.
The elements in coal are compositional data, and such data are mainly handled by log-ratio transformation. The existing compositional data transformation methods were studied, including the additive (asymmetric) log-ratio transformation, the centered (symmetric) log-ratio transformation, the isometric (equidistant) log-ratio transformation, weighted symmetric pivot coordinates and the stability method, and in the present invention the asymmetric log-ratio transformation is improved on the basis of these existing methods. On the basis of the symmetrical pivot coordinate method, the invention provides a new method for analyzing coal-geology compositional data.
For D-dimensional compositional data $x = (x_1, \ldots, x_D)$, every two components form a log ratio, so that the D components together form $D(D-1)/2$ log ratios, which can be expressed as linear combinations of one another. The log ratios are as follows:
$$\xi_{ij} = \ln\frac{x_i}{x_j}, \quad 1 \le i < j \le D$$
where $\xi_{ij}$ represents a log ratio, $x_i$ represents the i-th component of the D-dimensional compositional data, and $x_j$ represents the j-th component.
Because of the constant-sum constraint on compositional data, these log ratios contain redundant information. In fact, only D-1 log ratios are needed to infer the relationships among all components. However, choosing the D-1 log ratios is difficult: the additive log-ratio (alr) transformation arbitrarily sacrifices one component, and the isometric log-ratio (ilr) transformation sacrifices the last component. The pivot coordinate method handles this problem in a similar way to the ilr transformation.
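The redundancy can be made explicit. As a short worked relation (standard log-ratio algebra, not quoted from the source), any pairwise log ratio is the difference of two log ratios taken against a common reference component, here chosen as $x_D$:

```latex
\xi_{ij} = \ln\frac{x_i}{x_j}
         = \ln\frac{x_i}{x_D} - \ln\frac{x_j}{x_D}
         = \xi_{iD} - \xi_{jD},
\qquad i, j = 1, \ldots, D .
```

The D-1 ratios $\xi_{1D}, \ldots, \xi_{D-1,D}$ therefore generate all the others, and the choice of reference component is what makes the selection arbitrary.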
The symmetrical pivot coordinate method arises from the choice of a special orthonormal basis in the isometric log-ratio (ilr) transformation, and is expressed as:
$$z_i = \sqrt{\frac{D-i}{D-i+1}}\,\ln\frac{x_i}{\sqrt[D-i]{\prod_{k=i+1}^{D} x_k}}, \quad i = 1, \ldots, D-1$$
where $z_i$ represents the i-th pivot coordinate, $D$ the dimension of the compositional data, $x_i$ the i-th component, $k$ the index of a component, and $x_k$ the k-th element.
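A minimal sketch of the pivot coordinate calculation follows, assuming the usual form from the compositional-data literature: each $z_i$ is a scaled log ratio of $x_i$ to the geometric mean of the components after it. The function names and the sample composition are hypothetical; the final check compares squared norms with the centered log-ratio (clr) coefficients, which an isometric transformation should preserve.

```python
import math

def pivot_coordinates(x):
    """Pivot coordinates z_1..z_{D-1} of a D-part composition (assumed form):
    z_i = sqrt((D-i)/(D-i+1)) * ln(x_i / geometric_mean(x_{i+1}, ..., x_D))."""
    d = len(x)
    z = []
    for i in range(1, d):                      # i = 1, ..., D-1 (1-based, as in the formula)
        rest = x[i:]                           # x_{i+1}, ..., x_D
        log_gm_rest = sum(math.log(v) for v in rest) / (d - i)
        scale = math.sqrt((d - i) / (d - i + 1))
        z.append(scale * (math.log(x[i - 1]) - log_gm_rest))
    return z

def pivot_for_part(x, j):
    """First pivot coordinate after moving component j (0-based) to the front."""
    reordered = [x[j]] + [v for k, v in enumerate(x) if k != j]
    return pivot_coordinates(reordered)[0]

def clr(x):
    """Centered log-ratio coefficients, used here only to check the isometry."""
    log_gm = sum(math.log(v) for v in x) / len(x)
    return [math.log(v) - log_gm for v in x]

# Hypothetical four-part composition.
x = [0.1, 0.2, 0.3, 0.4]
z = pivot_coordinates(x)

norm_z = sum(v * v for v in z)
norm_clr = sum(v * v for v in clr(x))
print(z)
print(norm_z, norm_clr)
```

`pivot_for_part(x, 1)` gives the coordinate that carries the information on the second component, following the reordering idea of placing the component of interest first.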
For example, in the pivot coordinate calculation, the information on $x_1$ is contained only in the coordinate $z_1$ and not in the other coordinates. If another component is of interest, e.g. $x_2$, then $x_2$ is placed in the first position and the pivot coordinate transformation $z_i$ is applied to $x_2$ and the remaining components. In this way, (D-1)-dimensional pivot coordinate systems can be constructed for the compositional data, one for each component placed first; they are all rotations of one another, and only the first coordinate of each is used to interpret the respective component. The biggest difference from the isometric log-ratio (ilr) transformation is that no dimension reduction is achieved: taking the first coordinate of each system still yields D coordinates. The method is also a one-to-many transformation, so transforming back to the space of the element data in coal is complex.
For $x_1$, in turn, this can be expressed as:
$$z_1 = \sqrt{\frac{D-1}{D}}\,\ln\frac{x_1}{\sqrt[D-1]{\prod_{i=2}^{D} x_i}}$$
where $z_1$ represents the pivot coordinate of $x_1$, $D$ the dimension of the compositional data, $x_1$ the 1st component, $i$ the index of a component, and $x_2$ the 2nd element.
If only element $x_1$ is considered, the log ratios of $x_1$ with the other elements are respectively:
$$\ln\frac{x_1}{x_2},\ \ln\frac{x_1}{x_3},\ \ldots,\ \ln\frac{x_1}{x_D}$$
where $x_1$, $x_2$, $x_3$ and $x_D$ represent the 1st, 2nd, 3rd and D-th components of the D-dimensional compositional data.
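Assuming the pivot coordinate for $x_1$ takes its usual form from the compositional-data literature, it can be rewritten as a sum of the pairwise log ratios of $x_1$ with the other elements. The centered log-ratio coefficient $\operatorname{clr}_1$ is included for comparison; that it is the second expression being compared is an assumption, not something the source states.

```latex
\begin{aligned}
z_1 &= \sqrt{\frac{D-1}{D}}\,\ln\frac{x_1}{\sqrt[D-1]{\prod_{k=2}^{D} x_k}}
     = \sqrt{\frac{D-1}{D}}\cdot\frac{1}{D-1}\sum_{k=2}^{D}\ln\frac{x_1}{x_k}
     = \frac{1}{\sqrt{D(D-1)}}\sum_{k=2}^{D}\ln\frac{x_1}{x_k},\\[4pt]
\operatorname{clr}_1 &= \ln\frac{x_1}{\sqrt[D]{\prod_{k=1}^{D} x_k}}
     = \frac{1}{D}\sum_{k=2}^{D}\ln\frac{x_1}{x_k},
\qquad\text{so}\qquad
z_1 = \sqrt{\frac{D}{D-1}}\;\operatorname{clr}_1 .
\end{aligned}
```

Both quantities weight every pairwise log ratio equally and differ only by a constant factor, which is the natural starting point for replacing the equal weights with individual weight coefficients.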
Step two, adding weight coefficients to the symmetrical pivot coordinates to form the weighted pivot coordinate method.
The above formulas show that the two expressions are very similar, differing only by a factor. The weighted pivot coordinate method is obtained by adding a weight coefficient to each log ratio, namely:
$$\alpha_2\ln\frac{x_1}{x_2} + \alpha_3\ln\frac{x_1}{x_3} + \cdots + \alpha_D\ln\frac{x_1}{x_D}$$
where $\alpha_2, \ldots, \alpha_D$ represent weight coefficients;
where $\alpha_2 + \cdots + \alpha_D$ [value not reproduced in the source text]; the weights $w_i$ of the weighted pivot coordinate method are further calculated from the $\alpha_k$ in the formula. For the transformed coordinates of the weighted pivot coordinate method, the weights corresponding to $x_1$ are expressed as:
[equation not reproduced in the source text]
where $w_i$ represents a weight set, $w_i = (w_{1i}, \ldots, w_{iD})'$, whose entries are weight coefficients;
for $x_i$, the weight set is as follows:
[equation not reproduced in the source text]
where each entry is a weight coefficient.
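The idea of attaching a weight to each log ratio can be sketched as follows. The weight values are hypothetical, and the normalization the source imposes on the $\alpha_k$ is not legible, so none is enforced here. The only property checked is that equal weights $1/\sqrt{D(D-1)}$ reproduce the ordinary (unweighted) pivot coordinate of $x_1$ in its usual literature form.

```python
import math

def weighted_first_coordinate(x, alphas):
    """Weighted sum of the log ratios of x_1 with every other component:
    alpha_2*ln(x_1/x_2) + ... + alpha_D*ln(x_1/x_D).
    `alphas` holds alpha_2..alpha_D; their normalization is left to the caller."""
    if len(alphas) != len(x) - 1:
        raise ValueError("need one weight per log ratio (D-1 weights)")
    return sum(a * math.log(x[0] / xk) for a, xk in zip(alphas, x[1:]))

# Hypothetical four-part composition.
x = [0.1, 0.2, 0.3, 0.4]
d = len(x)

# Equal weights: should coincide with the unweighted pivot coordinate z_1.
uniform = [1.0 / math.sqrt(d * (d - 1))] * (d - 1)
z1_uniform = weighted_first_coordinate(x, uniform)
z1_pivot = math.sqrt((d - 1) / d) * (
    math.log(x[0]) - sum(math.log(v) for v in x[1:]) / (d - 1)
)

# Hypothetical unequal weights that down-weight the last log ratio.
unequal = [0.4, 0.4, 0.05]
z1_unequal = weighted_first_coordinate(x, unequal)

print(z1_uniform, z1_pivot, z1_unequal)
```

Unequal weights let log ratios involving noisy or highly variable components contribute less to the coordinate that represents $x_1$.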
Step three: improve the weighted pivot coordinate method by constructing an orthogonal coordinate system, calculating the weighting coefficients of the coordinates, defining a variation matrix, obtaining the relationship between the weight values, and calculating the weighted pivot coordinates.
The core of the weighted pivot coordinate method is to introduce weight coefficients that represent the different importance of the data components, with the aim of constructing an orthonormal coordinate system.
The weighted symmetrical pivot coordinate method is a further improvement on the weighted pivot coordinate method. It constructs an orthogonal coordinate system in which the first two coordinates represent the weighting coefficients of $x_1$ and $x_2$ in the component data $X = (x_1, \ldots, x_n)$. The two coordinates are defined as follows, where $\gamma$, $\alpha$ and $\beta$ can be calculated from the component data X:
$\gamma$, $\alpha$ and $\beta$ are all intermediate variables for calculating the normalized weights.
First, the first two coordinates of the weighted symmetrical pivot coordinates are calculated from the first two elements $x_1$ and $x_2$ of the element composition data $X = (x_1, \ldots, x_n)$, which has n elements in total, and the correspondence between the coordinates is determined. The two elements involved in the correlation calculation (e.g. $x_i$ and $x_j$) are assigned to positions $x_1$ and $x_2$. The variation matrix of the component data set X is constructed and used as the basis for the weights of the weighted symmetrical pivot coordinates. The variation matrix is defined as follows:
t represents a variation matrix, T represents a vector, $x_i$ represents the i-th element, and $x_j$ represents the j-th element.
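A variation matrix for a set of samples can be computed as in the following sketch. It assumes the usual definition from compositional data analysis, in which entry $(i, j)$ is the variance of $\ln(x_i/x_j)$ across samples. The function name `variation_matrix` and the use of the sample variance (divisor n-1) are assumptions.

```python
import math
from statistics import variance

def variation_matrix(samples):
    """Return the D x D variation matrix of a list of compositions.

    samples : list of rows, each row a strictly positive composition
    Entry [i][j] is the sample variance of ln(x_i / x_j) over all rows
    (assumed definition); the matrix is symmetric with a zero diagonal.
    """
    D = len(samples[0])
    T = [[0.0] * D for _ in range(D)]
    for i in range(D):
        for j in range(i + 1, D):
            log_ratios = [math.log(row[i] / row[j]) for row in samples]
            T[i][j] = T[j][i] = variance(log_ratios)
    return T
```

A small entry indicates that two elements vary almost proportionally across the samples, which is why the matrix is a natural basis for choosing weights.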
For $x_1$ and $x_2$, the normalized weights $\alpha^*$ and $\beta^*$ are expressed as:
the values of $\gamma$ and $c$ are then calculated from $\alpha^*$ and $\beta^*$:
the relationship finally obtained between the weight values is:
The weighted symmetrical pivot coordinates are calculated based on the above expressions.
Step four: analyze the processed element data of the coal and calculate the Pearson correlation coefficients for the two mining areas.
The analysis of the element data of the large moat mining area is shown as an example in Fig. 1.
The analysis of the element data of the America helminth mining area is shown as an example in Fig. 2.
Hierarchical clustering is performed on the Pearson correlation coefficients obtained for the two mining areas, and the modes of occurrence of the elements in the coal are analyzed in combination with the clustering result graphs, as shown in Fig. 4 and Fig. 5.
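The correlation and clustering step can be sketched as follows. The patent does not state the linkage rule or the distance used, so average linkage on the distance $d = 1 - r$ is an assumption here, and the element labels and values in the demo are purely illustrative, not data from the mining areas.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def average_linkage(corr, labels):
    """Agglomerative clustering on d = 1 - r (assumed distance).

    Returns the merge history as (left labels, right labels, distance).
    """
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average pairwise distance between the two clusters
                d = sum(1 - corr[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append(([labels[i] for i in clusters[a]],
                       [labels[i] for i in clusters[b]], d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Illustrative transformed coordinates for three hypothetical elements.
coords = {"A": [1.0, 2.0, 3.0, 4.0], "B": [2.0, 4.0, 6.0, 8.1], "C": [4.0, 3.0, 2.0, 1.0]}
labels = list(coords)
corr = [[pearson(coords[p], coords[q]) for q in labels] for p in labels]
merges = average_linkage(corr, labels)
```

In this demo the strongly correlated pair A and B is merged first, and C joins last. On real data the merge history would be drawn as a dendrogram and read together with the correlation coefficients.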
Therefore, the analysis method for the problem of inconsistent element correlation in coal resolves the inconsistency under the two references, and the clustering results allow the geochemistry of the coal to be analyzed and interpreted intuitively.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution of the invention may be modified or replaced by equivalents without the modified technical solution departing from the spirit and scope of the technical solution of the present invention.
Claims (1)
1. An analysis method for the problem of inconsistent element correlation in coal, characterized by comprising the following steps:
step one, performing data processing on the elements in coal by a method based on symmetrical pivot coordinates;
the elements in coal are component data; for the D-dimensional component data $x = (x_1, \ldots, x_D)$, every two data form a logarithmic ratio, so that the D data together form groups of logarithmic ratios, as follows:
$\xi_{ij}$ represents a logarithmic ratio, $x_i$ represents the i-th data of the D-dimensional component data, and $x_j$ represents the j-th data of the D-dimensional component data;
the symmetrical pivot coordinate method arises from the choice of a special orthogonal basis in the isometric log-ratio transformation method, and its expression is as follows:
$Z_i$ represents the pivot coordinates, D represents the dimension of the data, $x_i$ represents the i-th data of the D-dimensional component data, k is the index of the data, and $x_k$ represents the k-th element;
when the symmetrical pivot coordinate method is used to calculate the transformed data corresponding to $x_1$, $x_1$ is contained only in the coordinate $z_1$ and not in the other coordinates; if another part is to be analyzed, for example $x_2$, then $x_2$ is placed in the first position of all the data and the $z_i$ pivot coordinate transformation is applied to $x_2$ and the remaining data, so that D-1 dimensional pivot coordinates are constructed for the component data; these coordinate systems are all rotations of one another, and only the first coordinate of each is used to interpret the respective part;
for $x_1$, the expression is:
$Z_i$ represents the pivot coordinates, D represents the dimension of the data, $x_1$ represents the 1st data of the D-dimensional component data, i is the index of the data, and $x_2$ represents the 2nd element;
the logarithmic ratios of element $x_1$ to the other elements are, respectively:
$x_1$ represents the 1st data of the D-dimensional component data, $x_2$ represents the 2nd data, $x_3$ represents the 3rd data, and $x_D$ represents the D-th data;
step two, forming the weighted pivot coordinate method by adding weight coefficients to the construction of the symmetrical pivot coordinates;
in step two, one weight coefficient is added to each logarithmic ratio:
$\alpha_2, \ldots, \alpha_D$ represent the weight coefficients;
where, for the sum $\alpha_2 + \ldots + \alpha_D$, the weights $w_i$ of the weighted pivot coordinate method are calculated from the $\alpha_k$ in the formula; for $x_1$ in the $Z_i$ transformed coordinates of the weighted pivot coordinate method, the corresponding weights are expressed as:
$w_i$ represents a weight set, $w_i = (w_{1i}, \ldots, w_{iD})'$, where each entry represents a weight coefficient;
for $x_i$, the weight set is as follows:
is a weight coefficient;
step three, improving the weighted pivot coordinate method by constructing an orthogonal coordinate system, calculating the weighting coefficients of the coordinates, defining a variation matrix, obtaining the relationship between the weight values, and calculating the weighted pivot coordinates;
in step three, the weighted symmetrical pivot coordinate method is an improvement on the weighted pivot coordinate method and constructs an orthogonal coordinate system in which the first two coordinates represent the weighting coefficients of $x_1$ and $x_2$ in the component data $X = (x_1, \ldots, x_n)$; the two coordinates are defined as follows, where $\gamma$, $\alpha$ and $\beta$ are calculated from the component data X:
$\gamma$, $\alpha$ and $\beta$ are all intermediate variables for calculating the normalized weights;
the variation matrix of the component data is defined as:
t represents a variation matrix, T represents a vector, $x_i$ represents the i-th element, and $x_j$ represents the j-th element;
for $x_1$ and $x_2$, the normalized weights $\alpha^*$ and $\beta^*$ are expressed as:
the values of $\gamma$ and $c$ are then calculated from $\alpha^*$ and $\beta^*$:
the relationship finally obtained between the weight values is:
the weighted symmetrical pivot coordinates are calculated based on the above expressions;
step four, performing hierarchical clustering analysis on the processed element data of the coal;
in step four, hierarchical clustering is performed on the processed element data of the coal, and the modes of occurrence of the elements in the coal are analyzed in combination with the clustering result graphs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310704173.0A CN116595399B (en) | 2023-06-14 | 2023-06-14 | Analysis method for inconsistent element correlation problem in coal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116595399A (en) | 2023-08-15 |
CN116595399B (en) | 2024-01-05 |
Family
ID=87608085
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310704173.0A Active CN116595399B (en) | 2023-06-14 | 2023-06-14 | Analysis method for inconsistent element correlation problem in coal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116595399B (en) |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011174871A (en) * | 2010-02-25 | 2011-09-08 | Keio Gijuku | Correlation evaluation method, correlation evaluating device, operation reproducing device |
CN103226736A (en) * | 2013-03-27 | 2013-07-31 | 东北电力大学 | Method for predicting medium and long term power load based on cluster analysis and target theory |
JP2014157491A (en) * | 2013-02-15 | 2014-08-28 | Toyo Tire & Rubber Co Ltd | Cluster analysis method, cluster analysis apparatus and computer program |
CN109100411A (en) * | 2018-06-14 | 2018-12-28 | 中国矿业大学 | A kind of chemometrics application method for coal Soluble Organic Matter |
CN109191001A (en) * | 2018-09-21 | 2019-01-11 | 常州工学院 | Evaluation in Education Quality method based on principal component analysis |
CN109657733A (en) * | 2018-12-28 | 2019-04-19 | 中国农业科学院农业质量标准与检测技术研究所 | Variety discriminating method and system based on constituent structure feature |
CN112182730A (en) * | 2020-10-27 | 2021-01-05 | 湖北工业大学 | Industrial building legacy value comprehensive evaluation method based on analytic hierarchy process |
CN112446591A (en) * | 2020-11-06 | 2021-03-05 | 太原科技大学 | Evaluation system for student comprehensive capacity evaluation and zero sample evaluation method |
CN112668622A (en) * | 2020-12-22 | 2021-04-16 | 中国矿业大学(北京) | Analysis method and analysis and calculation device for coal geological composition data |
CN112801135A (en) * | 2020-12-31 | 2021-05-14 | 浙江浙能镇海发电有限责任公司 | Fault line selection method and device for power plant service power system based on characteristic quantity correlation |
CN113139247A (en) * | 2021-04-19 | 2021-07-20 | 北京工业大学 | Mechanical structure uncertainty parameter quantification and correlation analysis method |
CN113254497A (en) * | 2021-05-19 | 2021-08-13 | 中国地质大学(北京) | Comprehensive analysis and anomaly extraction method based on geochemical composition data |
CN113592957A (en) * | 2021-08-06 | 2021-11-02 | 北京易航远智科技有限公司 | Multi-laser radar and multi-camera combined calibration method and system |
CN114530249A (en) * | 2022-02-15 | 2022-05-24 | 北京浩鼎瑞生物科技有限公司 | Disease risk assessment model construction method based on intestinal microorganisms and application |
CN115270861A (en) * | 2022-07-20 | 2022-11-01 | 武汉理工大学 | Product composition data monitoring method and device, electronic equipment and storage medium |
CN115565623A (en) * | 2022-10-19 | 2023-01-03 | 中国矿业大学(北京) | Method and system for analyzing coal geological components, electronic equipment and storage medium |
CN115619304A (en) * | 2021-07-15 | 2023-01-17 | 中国科学院沈阳计算技术研究所有限公司 | Logistics node site selection planning method based on clustering algorithm |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108416686B (en) * | 2018-01-30 | 2021-10-19 | 中国矿业大学 | Ecological geological environment type division method based on coal resource development |
GB2585258B (en) * | 2019-01-30 | 2022-10-19 | Bruker Daltonics Gmbh & Co Kg | Mass spectrometric method for determining the presence or absence of a chemical element in an analyte |
JP7388137B2 (en) * | 2019-11-07 | 2023-11-29 | オムロン株式会社 | Integrated analysis method, integrated analysis device, and integrated analysis program |
US20230130034A1 (en) * | 2021-10-14 | 2023-04-27 | Cgg Services Sas | System and method for probability-based determination of stratigraphic anomalies in a subsurface |
Non-Patent Citations (4)
Title |
---|
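Determination of non-negative variable weight coefficients based on spherical coordinate transformation of compositional data; Zhao Qin et al.; Journal of Chongqing Technology and Business University (Natural Science Edition); Vol. 36 (No. 4); 89-94 * |
Application of log-ratio transformation and partial least squares in the extraction of geochemical assemblage anomalies: a case study of the lead-zinc deposits in northwestern Hunan; Wang Kun; Xiao Keyan; Cong Yuan; Geophysical and Geochemical Exploration (No. 01); 141-148 * |
Combined forecasting of compositional data; Zhang Xiaoqin; Chen Jiajia; Yuan Jing; Chinese Journal of Applied Probability and Statistics (No. 03); 307-316 * |
Application of machine learning in coal geochemistry; Xu Na et al.; Journal of China Coal Society; Vol. 47 (No. 5); 1895-1907 * |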
Also Published As
Publication number | Publication date |
---|---|
CN116595399A (en) | 2023-08-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||