CN112634121B

CN112634121B - Rapid processing method for big data in oversized metal in-situ analyzer

Info

Publication number: CN112634121B
Application number: CN202011343868.3A
Authority: CN
Inventors: 袁良经; 贾云海; 张翘楚; 于雷; 张纯岩; 史玉涛
Original assignee: Ncs Testing Technology Co ltd
Current assignee: Ncs Testing Technology Co ltd
Priority date: 2020-11-26
Filing date: 2020-11-26
Publication date: 2024-01-30
Anticipated expiration: 2040-11-26
Also published as: CN112634121A

Abstract

The invention belongs to the technical field of in-situ analysis of oversized metal samples, and in particular relates to a method for rapidly processing big data in an oversized metal in-situ analyzer based on GPU parallel computing on the CUDA platform. On an oversized metal in-situ analyzer that integrates processing, scanning and characterization, single-spark integrated spectral intensity data are converted into a data structure that meets the requirements of GPU parallel computing on the CUDA platform. Parallel operation and algorithm optimization on a CPU+GPU heterogeneous platform greatly improve the efficiency of big-data processing, and the distribution characterization of the oversized metal in-situ analyzer is finally realized. The method handles a large data volume with high computing speed, high accuracy and strong real-time performance, and all data results are completed within 5 minutes.

Description

Rapid processing method for big data in oversized metal in-situ analyzer

Technical Field

The invention belongs to the technical field of oversized metal in-situ analysis, and particularly relates to a method for rapidly processing big data in an oversized metal in-situ analyzer based on a GPU parallel operation technology of a CUDA platform.

Background

Characterizing the distribution of each element over a metal surface avoids the shortcoming of point analysis (direct-reading spectrometry, wet chemical analysis of drilled samples and the like) and of small-area scanning (small-area in-situ scanning analysis), in which a point or a small region is taken to represent the whole surface. The oversized metal in-situ analyzer integrates processing, scanning and characterization, and uses single-spark integration to characterize the distribution of each element over an oversized metal surface through a one-to-one correspondence between position and content. However, because the analysis surface is large, in-situ characterization of oversized metal samples produces massive data (GB level). The difficulty is how to rapidly perform the distribution characterization calculations for each element on such data, for example intensity-to-content conversion, matrix interference calculation, third-element interference calculation, the positions of highest and lowest content, sorting of the surface data, statistical segregation degree and inclusion signal thresholds. For point analysis or small-area distribution characterization these calculations pose no great problem, but for oversized samples the huge amount of raw information exceeds the limits of traditional calculation methods, and ensuring that the characterization results are displayed in real time is a scientific difficulty to be solved.

In recent years, with the continuous development and innovation of GPU (Graphics Processing Unit) technology, the parallel computing capability of GPUs has become increasingly important. With the popularization of GPUs, NVIDIA has provided a cost-effective parallel computing platform based on the CUDA architecture, and the efficiency of big-data processing can be greatly improved through parallel operation and algorithm optimization on a CPU+GPU heterogeneous platform. However, the technology is mostly used for image processing, AI, network big-data processing and the like, and is rarely applied in the traditional chemical field.

Under these conditions, on an oversized metal in-situ analyzer integrating processing, scanning and characterization, the single-spark integrated spectral intensity data can be converted into a data structure that meets the requirements of GPU parallel computing on the CUDA platform; the efficiency of processing the oversized data set is greatly improved through parallel operation and algorithm optimization on a CPU+GPU heterogeneous platform, and the distribution characterization of the oversized metal in-situ analyzer is finally realized.

Disclosure of Invention

In view of the above technical problems, the invention aims to provide a method for rapidly processing big data in an oversized metal in-situ analyzer based on GPU parallel computing on the CUDA platform. The method handles a large data volume (GB level), computes quickly with strong real-time performance, can complete all data results within 5 minutes, and is suitable for characterizing the surface element distribution of large-area samples (length greater than 100 mm and width greater than 100 mm).

In order to achieve the above object, the present invention provides the following technical solutions:

A method for rapidly processing big data in an oversized metal in-situ analyzer, comprising the following steps:

S1, performing full-coverage scanning of the surface of an oversized sample with a full-automatic segregation analyzer in line scanning mode to obtain L scan absolute intensity files, where L is the number of scanning lines, each scan absolute intensity file contains C_n spectral analysis channels, and each spectral analysis channel contains N_i single-spark data; drawing a calibration working curve of intensity ratio versus content ratio: $C_i/C_r = a(I_i/I_r)^3 + b(I_i/I_r)^2 + c(I_i/I_r) + d$; where C_i is the element content of the analysis channel; C_r is the element content of the matrix channel; I_i is the absolute intensity of the analysis channel element; I_r is the absolute intensity of the matrix channel element; a is the cubic term coefficient; b is the quadratic term coefficient; c is the first-order term coefficient; d is the constant term; i is the analysis channel number; r is the matrix channel number;

S2, converting the calibration working curve obtained in step S1 into a coefficient matrix CM[E_n, 8], where E_n is the number of analysis elements; the 8 parameters are, in order, the analysis channel i, the matrix channel r, the cubic term coefficient a, the quadratic term coefficient b, the first-order term coefficient c, the constant term d, the upper limit of the curve content ratio and the lower limit of the curve content ratio;

S3, in the GPU parallel operation, 8 streams are used for synchronous parallel operation; the number of cycles is the integer obtained by dividing the number L of scan absolute intensity files by 8; all scan absolute intensity files are fed into the 8 streams in sequence, one file per stream, and the files in excess wait for the next cycle; the number of streams used in the last cycle equals the number of scan absolute intensity files not yet calculated; in each cycle the 8 scan absolute intensity files in the 8 streams are calculated in parallel, and 8 operations are carried out in sequence in each stream, the specific operation steps being as follows:

S3.1, reading the absolute intensity I_i of the analysis channel elements and the coefficient matrix CM[E_n, 8] converted from the calibration working curve in step S2 into GPU memory by stream;

S3.2, converting the absolute intensity into an intensity ratio;

$R = I_i/I_r$, where R is the intensity ratio of the analysis channel element, I_i is the absolute intensity of the analysis channel element, and I_r is the absolute intensity of the matrix channel element;

S3.3, converting the intensity ratio into a content ratio;

converting the intensity ratio R obtained in step S3.2 into a content ratio CR according to the coefficient matrix CM[E_n, 8] converted from the calibration working curve in step S2, where $CR = C_i/C_m$; C_i is the element content of the analysis channel; C_m is the element content of the matrix channel; applying out-of-limit processing to any calculated content ratio that exceeds the upper or lower limit of the curve content ratio;

S3.4, converting the content ratio into content;

calculating the content ratios CR_i of all analysis channels; calculating the matrix channel content C_m according to the rule that the element contents of all analysis channels sum to 100%; and then calculating the element content RC_i of each spectral analysis channel according to $C_i = C_m \times CR_i$;

S3.5, performing third element interference correction on the element content;

calculating the element content after the third element interference correction according to the following formula;

where A_k is the additive interference coefficient, M_k is the multiplicative interference coefficient, C_{i,corr} is the analysis channel element content after third-element interference correction, C_k is the content of the interfering element, and RC_i is the element content before interference correction;

S3.6, two-dimensional conversion of the intensity line scanning data;

projecting the time-ordered absolute intensities I_i of the analysis channel elements, according to the random uniform distribution principle, onto the row two-dimensional array IntM_i[Rows_i, Cols] of the area covered by the scanned single-line data file, where Rows_i is the number of rows and Cols is the number of columns of the two-dimensional array; in the actual calculation a three-dimensional array IntM_i[C_n, Rows_i, Cols] is used to represent the two-dimensional distribution of each analysis channel, where C_n is the number of spectral analysis channels;

S3.7, converting the intensity time sequence array into a space position array;

projecting the IntM_i[Rows_i, Cols] array of each analysis channel obtained in step S3.6, according to its position on the analysis surface, onto the two-dimensional array IntM[Rows, Cols] of the whole analysis surface to obtain the intensity distribution array of the whole analysis surface; in the actual calculation a three-dimensional array IntM[C_n, Rows, Cols], i.e. the channel intensity distribution three-dimensional matrix, is used to represent the two-dimensional distribution of each channel, where C_n is the number of spectral analysis channels;

S3.8, two-dimensional conversion of the content line scanning data;

projecting the time-ordered element contents C_{i,corr} of each row after the third-element interference correction of step S3.5, according to the random uniform distribution principle, onto the row two-dimensional array ConM_i[Rows_i, Cols] of the area covered by the scanned single-line data file; in the actual calculation a three-dimensional array ConM_i[E_n, Rows_i, Cols] is used to represent the two-dimensional distribution of each channel, where E_n is the number of analysis elements;

S3.9, converting the content time sequence array into a space position array;

projecting the row two-dimensional array ConM_i[Rows_i, Cols] of step S3.8, according to its coordinates on the analysis surface, onto the two-dimensional array ConM[Rows, Cols] of the whole analysis surface to obtain the content distribution array of the whole analysis surface; in the actual calculation a three-dimensional array ConM[E_n, Rows, Cols], i.e. the element content distribution three-dimensional matrix, is used to represent the two-dimensional distribution of each channel, where E_n is the number of analysis elements;

S4, dividing the number C_n of spectral analysis channels by 8 and taking the integer to obtain the number of cycles; feeding the channel intensity distribution three-dimensional matrix IntM[C_n, Rows, Cols] obtained in step S3.7 into the 8 streams channel by channel, each stream handling the data IntM[Rows, Cols] of one channel; the number of streams used in the last cycle equals the number of channels not yet calculated; each cycle uses the 8 streams to calculate a plurality of intensity parameter values for 8 channels, and the following 3 steps are carried out in sequence in each stream:

S4.1, reading the intensity distribution array IntM[Rows, Cols] of one channel in each stream;

S4.2, Shell-sorting the intensity distribution array IntM[Rows, Cols] and calculating a plurality of parameter values;

S4.3, iteratively calculating the inclusion signal threshold for the intensity distribution array IntM[Rows, Cols], the threshold being calculated as $INT_{inc} = INT_{avg} + 3 \times INT_{sd}$, where INT_avg is the mean of all signal intensities of the channel, INT_sd is the absolute standard deviation of all signal intensities of the channel, and INT_inc is the threshold; after the signals exceeding the threshold are removed, the mean and the absolute standard deviation are recalculated and the threshold is recalculated, and the iteration is repeated until fewer than 3 data are removed, at which point INT_inc is the inclusion signal threshold of the channel; the data above the threshold among all signal data are statistically analyzed, the frequency of intensity signals in each intensity segment is calculated, and the element inclusion signal statistical distribution matrix IntD[D_i, 256] is output, where D_i is the number of sparks occurring in each signal segment;

S5, dividing the number E_n of analysis elements by 8 and taking the integer to obtain the number of cycles; feeding the element content distribution three-dimensional matrix ConM[E_n, Rows, Cols] into the 8 streams element by element, each stream handling the data ConM[Rows, Cols] of one element; the number of streams used in the last cycle equals the number of elements not yet calculated; each cycle uses the 8 streams to calculate a plurality of content parameter values for 8 elements, the specific implementation steps being as follows:

S5.1, reading the content distribution array ConM[Rows, Cols] of one element in each stream;

S5.2, Shell-sorting the content distribution array ConM[Rows, Cols] and calculating a plurality of parameter values;

S5.3, estimating the standard deviation by the interquartile range method and calculating the statistical segregation degree of the content from the robust coefficient of variation; performing statistical analysis on the content matrix, counting the occurrences in each content segment, and outputting the element content distribution matrix ConD[E_n, 256], where E_n is the number of analysis elements;

S6, writing the channel intensity distribution three-dimensional matrix IntM[C_n, Rows, Cols] obtained in step S3.7, the element content distribution three-dimensional matrix ConM[E_n, Rows, Cols] obtained in step S3.9, and the intensity and content parameter values calculated in steps S4 and S5 from GPU memory into CPU memory;

S7, displaying the data results on the CPU in the form of graphs or charts, the data results being segregation, inclusion content and morphology.

In step S3.2, the GPU threads are set to (1024, 1, 1) and the thread blocks to (1024, (N_i+1023)/1024, C_n), where N_i is the number of single-spark data obtained per line per spectral analysis channel and C_n is the number of spectral analysis channels.

In steps S3.3, S3.4 and S3.5, the GPU threads are set to (1024, 1, 1) and the thread blocks to (1024, (N_i+1023)/1024, E_n), where N_i is the number of single-spark data obtained per line per spectral analysis channel and E_n is the number of analysis elements.

In steps S3.6 and S3.7, the GPU threads are set to (32, 1) and the thread blocks to ((Rows_i+1)/32, (Cols+1)/32, C_n), where C_n is the number of spectral analysis channels.

In steps S3.8 and S3.9, the GPU threads are set to (32, 1) and the thread blocks to ((Rows_i+1)/32, (Cols+1)/32, E_n), where E_n is the number of analysis elements.

In step S4, the intensity parameter values include the maximum, the minimum, the median, the mean, the absolute deviation, the 0.135%, 0.5%, 2.5%, 1/4, 3/4, 97.5%, 99.5% and 99.865% quantiles, the position of the maximum, the position of the minimum, the inclusion threshold and the inclusion intensity signal distribution.

In step S4.1, the GPU threads are set to (32, 32) and the thread blocks to ((Rows+1)/32, (Cols+1)/32).

In steps S4.2 and S4.3, the GPU threads are set to (1024, 1, 1) and the thread blocks to (1024, (N_i+1023)/1024, C_n), where N_i is the number of single-spark data obtained per line per spectral analysis channel and C_n is the number of spectral analysis channels.
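The iterative inclusion threshold of step S4.3 can be illustrated with a short CPU reference sketch in Python. It is not the GPU implementation of the invention: the function name `inclusion_threshold` is hypothetical, the "absolute standard deviation" is assumed here to be the population standard deviation, and the value returned is assumed to be the threshold of the pass in which fewer than 3 data were removed.

```python
import statistics

def inclusion_threshold(signals):
    """Iterate INT_inc = INT_avg + 3 * INT_sd until fewer than 3 data are removed."""
    kept = list(signals)
    while True:
        avg = statistics.fmean(kept)
        sd = statistics.pstdev(kept)  # assumption: population standard deviation
        threshold = avg + 3.0 * sd
        remaining = [v for v in kept if v <= threshold]
        removed = len(kept) - len(remaining)
        kept = remaining
        if removed < 3:
            return threshold

# Invented test signal: a flat background of 100 sparks plus 5 strong outliers.
example = [10.0] * 100 + [1000.0] * 5
print(inclusion_threshold(example))
```

In this invented example the first pass removes the five outliers, and the second pass removes nothing, so the iteration stops with the threshold computed from the background alone.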

In step S5, the content parameter values include the maximum, the minimum, the median, the mean, the absolute deviation, the 0.135%, 0.5%, 2.5%, 1/4, 3/4, 97.5%, 99.5% and 99.865% quantiles, the position of the maximum, the position of the minimum, the content distribution, the statistical segregation degree and the statistical porosity.

In steps S5.2 and S5.3, the GPU threads are set to (1024, 1, 1) and the thread blocks to (1024, (N_i+1023)/1024, E_n), where N_i is the number of single-spark data obtained per line per spectral analysis channel and E_n is the number of analysis elements.
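Step S5.3 names the interquartile range method and the robust coefficient of variation without giving formulas. The Python sketch below shows one common way such a quantity is computed and is an assumption, not the formula of the invention: the robust standard deviation is taken as IQR / 1.349 (the interquartile range of a normal distribution is about 1.349 standard deviations), and the robust coefficient of variation as that value divided by the median. The function name `robust_cv` is hypothetical.

```python
import statistics

def robust_cv(contents):
    """Robust coefficient of variation: (IQR / 1.349) / median."""
    q1, median, q3 = statistics.quantiles(contents, n=4)
    robust_sd = (q3 - q1) / 1.349  # assumption: normal-consistent IQR scaling
    return robust_sd / median

# Invented content values, for illustration only.
print(robust_cv([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Because quartiles and the median are insensitive to a few extreme values, an estimate of this kind is less affected by inclusion signals than the ordinary standard deviation divided by the mean.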

All data results for the method were completed within 5 minutes.

The scanning area of the method is 100-1000 mm in length and 100-500 mm in width.

Compared with the prior art, the invention has the beneficial effects that:

On the oversized metal in-situ analyzer integrating processing, scanning and characterization, namely the "large-scale metal component segregation degree analyzer" disclosed in prior application No. 201911374898.8, the invention converts single-spark integrated spectral intensity data into a data structure that meets the requirements of GPU parallel computing on the CUDA platform. Parallel operation and algorithm optimization on a CPU+GPU heterogeneous platform greatly improve the efficiency of big-data processing, and the distribution characterization of the oversized metal in-situ analyzer is finally realized. The method handles a large data volume with high computing speed, high accuracy and strong real-time performance, and all data results are completed within 5 minutes.

Drawings

FIG. 1 is a schematic structural diagram of a full-automatic segregation analyzer integrating processing, scanning analysis and result characterization into a whole, which is adopted by the invention;

FIG. 2 is a diagram of a single spark data structure;

FIG. 3 is a two-dimensional data GPU memory map;

FIG. 4 is a three-dimensional data GPU memory map;

FIG. 5 is a GPU calculation;

FIG. 6 is a CPU calculation result;

FIG. 7 is a scan of the surface of the 1000 mm × 500 mm oversized sample in the example;

FIG. 8 is a flow chart of a parallel operation of scanning absolute intensity files;

FIG. 9 is a flow chart of the parallel operation of the intensity distribution array;

FIG. 10 is a flow chart of the parallel operation of the content distribution array.

Wherein the reference numerals are as follows:

1. X axis

2. Horizontal sample stage

3. Z axis

4. Cutting tool

5. Tool magazine

6. Sample surface processing module

7. W axis

8. Segregation degree analysis module

9. Sample to be measured

10. Y axis

Detailed Description

The invention will be further described with reference to the drawings and examples.

The invention establishes a method for rapidly processing big data in an oversized metal in-situ analyzer based on a GPU parallel operation technology of a CUDA platform, which comprises the following steps:

S1, performing full-coverage scanning of the surface of an oversized sample with a full-automatic segregation analyzer in line scanning mode to obtain L scan absolute intensity files, where L is the number of scanning lines, each scan absolute intensity file contains C_n spectral analysis channels, and each spectral analysis channel contains N_i single-spark data; drawing a calibration working curve of intensity ratio versus content ratio: $C_i/C_r = a(I_i/I_r)^3 + b(I_i/I_r)^2 + c(I_i/I_r) + d$; where C_i is the element content of the analysis channel; C_r is the element content of the matrix channel; I_i is the absolute intensity of the analysis channel element; I_r is the absolute intensity of the matrix channel element; a is the cubic term coefficient; b is the quadratic term coefficient; c is the first-order term coefficient; d is the constant term; i is the analysis channel number; r is the matrix channel number.

As shown in FIG. 1, the invention uses a full-automatic segregation analyzer that integrates processing, scanning analysis and result characterization. The analyzer comprises a high-precision three-dimensional numerical control workbench, a sample surface processing module, a segregation degree analysis module and a result characterization module. The high-precision three-dimensional numerical control workbench comprises a horizontal sample stage 2, which moves precisely along the horizontal X axis 1 and Y axis 10 and holds the sample 9 to be measured, and a Z axis 3 and a W axis 7, which are parallel to each other and perpendicular to the plane of the X and Y axes. The sample surface processing module 6 and the segregation degree analysis module 8 are mounted on the Z axis 3 and the W axis 7 respectively so as to be vertically movable, and are located above the sample 9 to be measured on the horizontal sample stage 2. The sample surface processing module 6 comprises a tool 4 for machining the surface of the sample 9 to be measured.

The full-automatic segregation analyzer collects the spectral intensity of each channel using the single-spark digital integration technique of an all-digital solid-state spark light source, and scans the sample surface with full coverage in line scanning mode. The scanning analysis area is a rectangle of X × Y (in mm), the line spacing is D (in mm), the scanning speed is V (in mm/s) and the single-spark acquisition frequency is Q. The number of scanning lines is L = Y/D, the number of single-spark data obtained per line per spectral analysis channel is N_i = X/V × Q, and the total number of single sparks over the full area is N = N_i × L. The single-spark intensity data are stored line by line. The number of spectral analysis channels is C_n, the total number of working curves (the number of analysis elements) is E_n, and the matrix channel number is r. The calibration working curve is drawn as intensity ratio versus content ratio with the matrix channel as the reference channel. Each analysis element has at most 4 additive interferences and at most 4 multiplicative interferences. The single-spark data structure is shown in FIG. 2.
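To make the size of the data set concrete, the scan relations can be evaluated for a given set of scan settings. The Python sketch below is illustrative only: the function name `scan_counts` and every numeric setting are hypothetical and are not taken from the embodiment, and the per-line spark count is read as N_i = X/V × Q, which is an interpretation of the text.

```python
def scan_counts(x_mm, y_mm, line_spacing_mm, speed_mm_s, spark_hz):
    """Return (L, N_i, N) for a rectangular line scan."""
    lines = round(y_mm / line_spacing_mm)                   # L = Y / D
    sparks_per_line = round(x_mm / speed_mm_s * spark_hz)   # N_i = X / V * Q (assumed reading)
    return lines, sparks_per_line, lines * sparks_per_line  # N = N_i * L

# Hypothetical settings, chosen only to show the order of magnitude.
L, N_i, N = scan_counts(x_mm=1000.0, y_mm=500.0, line_spacing_mm=2.0,
                        speed_mm_s=1.0, spark_hz=500.0)
print(L, N_i, N)
```

With these invented settings each channel holds 125 million single-spark values, so a file set with many channels stored as 4-byte values would reach the GB level mentioned in the background.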

S2, converting the calibration working curve obtained in step S1 into a coefficient matrix CM[E_n, 8], where E_n is the number of analysis elements; the 8 parameters are, in order, the analysis channel i, the matrix channel r, the cubic term coefficient a, the quadratic term coefficient b, the first-order term coefficient c, the constant term d, the upper limit of the curve content ratio and the lower limit of the curve content ratio.
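One possible in-memory layout for CM[E_n, 8] is a row-major flat array, which is a common way to pass a small matrix to a GPU kernel. The Python sketch below only illustrates that layout: the column order follows step S2, while the flat layout, the helper names `flatten_cm` and `cm_get`, and all coefficient values are assumptions made for illustration.

```python
# Column order of one CM row, following step S2.
COLUMNS = ("i", "r", "a", "b", "c", "d", "upper", "lower")

def flatten_cm(rows):
    """Flatten E_n rows of 8 parameters into one row-major list."""
    flat = []
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError("each calibration curve needs exactly 8 parameters")
        flat.extend(float(v) for v in row)
    return flat

def cm_get(flat, element, name):
    """Read parameter `name` of analysis element `element` from the flat array."""
    return flat[element * len(COLUMNS) + COLUMNS.index(name)]

# Two hypothetical calibration curves (values invented for illustration).
cm = flatten_cm([
    [1, 0, 0.0, 0.1, 1.2, 0.01, 5.0, 0.0],
    [2, 0, 0.0, 0.0, 0.8, 0.00, 2.0, 0.0],
])
print(cm_get(cm, 1, "c"))
```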

S3, in the GPU parallel operation, 8 streams (Stream 1 to Stream 8) are used for synchronous parallel operation. The number of cycles is the integer obtained by dividing the number L of scan absolute intensity files by 8. All scan absolute intensity files are fed into the 8 streams in sequence, one file per stream, and the files in excess wait for the next cycle; the number of streams used in the last cycle equals the number of scan absolute intensity files not yet calculated. In each cycle the 8 scan absolute intensity files in the 8 streams are calculated in parallel, and 8 operations are carried out in sequence in each stream. The flow chart is shown in FIG. 8, and the specific operation steps are as follows:

S3.1, reading the absolute intensity I_i of the analysis channel elements and the coefficient matrix CM[E_n, 8] converted from the calibration working curve in step S2 into GPU memory by stream.

S3.2, converting the absolute intensity into an intensity ratio;

$R = I_i/I_r$, where R is the intensity ratio of the analysis channel element, I_i is the absolute intensity of the analysis channel element, and I_r is the absolute intensity of the matrix channel element.

In this step the GPU threads are set to (1024, 1, 1) and the thread blocks to (1024, (N_i+1023)/1024, C_n); the GPU memory map is shown in FIG. 3. N_i is the number of single-spark data obtained per line per spectral analysis channel, and C_n is the number of spectral analysis channels.
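As a CPU reference for step S3.2, the following Python sketch computes the intensity ratio for every spark of every channel and shows the ceiling division behind the (N_i+1023)/1024 term of the block setting. The function names and sample intensities are hypothetical, and the sketch assumes that no matrix-channel intensity is zero.

```python
def blocks_needed(n_items, threads_per_block=1024):
    """Ceiling division: enough blocks of 1024 threads to cover n_items."""
    return (n_items + threads_per_block - 1) // threads_per_block

def intensity_ratios(intensity, matrix_channel):
    """intensity[c][n] is the absolute intensity of channel c at spark n."""
    reference = intensity[matrix_channel]
    # Assumes every matrix-channel intensity is non-zero.
    return [[value / ref for value, ref in zip(channel, reference)]
            for channel in intensity]

# Invented intensities: channel 1 plays the role of the matrix channel r.
ratios = intensity_ratios([[10.0, 20.0], [5.0, 10.0]], matrix_channel=1)
print(ratios, blocks_needed(2500))
```

On the GPU each thread would handle one spark of one channel, which is why the element-wise division maps naturally onto the thread and block settings given for this step.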

S3.3, converting the intensity ratio into a content ratio;

converting the intensity ratio R obtained in step S3.2 into a content ratio CR according to the coefficient matrix CM[E_n, 8] converted from the calibration working curve in step S2, where $CR = C_i/C_m$; C_i is the element content of the analysis channel; C_m is the element content of the matrix channel; applying out-of-limit processing to any calculated content ratio that exceeds the upper or lower limit of the curve content ratio.

In this step the GPU threads are set to (1024, 1, 1) and the thread blocks to (1024, (N_i+1023)/1024, E_n); the GPU memory map is shown in FIG. 3. N_i is the number of single-spark data obtained per line per spectral analysis channel, and E_n is the number of analysis elements.
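The conversion of step S3.3 amounts to evaluating the cubic working curve for each intensity ratio and then handling values outside the calibrated range. The text does not say what the out-of-limit processing does; the Python sketch below assumes that it clamps the result to the curve limits, and the function name and coefficients are hypothetical.

```python
def content_ratio(r, a, b, c, d, upper, lower):
    """Evaluate CR = a*r^3 + b*r^2 + c*r + d and clamp it to [lower, upper]."""
    cr = ((a * r + b) * r + c) * r + d  # Horner form of the cubic
    # Assumption: out-of-limit processing is a clamp to the curve limits.
    return min(max(cr, lower), upper)

# Invented coefficients for a nearly linear curve.
print(content_ratio(0.25, 0.0, 0.0, 2.0, 0.5, upper=2.0, lower=0.0))
```

The Horner form needs three multiplications and three additions per spark, which keeps the per-thread work small when the same evaluation runs for every spark of every element.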

S3.4, converting the content ratio into content;

calculating the content ratios CR_i of all analysis channels; calculating the matrix channel content C_m according to the rule that the element contents of all analysis channels sum to 100%; and then calculating the element content RC_i of each spectral analysis channel according to $C_i = C_m \times CR_i$.

In this step the GPU threads are set to (1024, 1, 1) and the thread blocks to (1024, (N_i+1023)/1024, E_n); the GPU memory map is shown in FIG. 3. N_i is the number of single-spark data obtained per line per spectral analysis channel, and E_n is the number of analysis elements.
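The 100% rule of step S3.4 can be made explicit. Assuming that the sum includes the matrix element itself, C_m + Σ C_i = 100 together with C_i = C_m × CR_i gives C_m = 100 / (1 + Σ CR_i). The Python sketch below applies this reading to one spark; the function name and the ratio values are hypothetical.

```python
def contents_from_ratios(ratios):
    """Return (C_m, [RC_i]) in percent for the content ratios of one spark."""
    # Assumption: the matrix element is counted in the 100% total.
    c_m = 100.0 / (1.0 + sum(ratios))
    return c_m, [c_m * cr for cr in ratios]

# Invented content ratios for two analysis elements.
c_m, contents = contents_from_ratios([0.125, 0.125])
print(c_m, contents)
```

Here the two ratios of 0.125 give a matrix content of 80% and element contents of 10% each, which sum to 100% as required.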

S3.5, performing third-element interference correction on the element content;

where A_k is the additive interference coefficient, M_k is the multiplicative interference coefficient, C_{i,corr} is the analysis channel element content after third-element interference correction, C_k is the content of the interfering element, and RC_i is the element content before interference correction.

In this step the GPU threads are set to (1024, 1, 1) and the thread blocks to (1024, (N_i+1023)/1024, E_n); the GPU memory map is shown in FIG. 3. N_i is the number of single-spark data obtained per line per spectral analysis channel, and E_n is the total number of analysis elements.

S3.6, two-dimensional conversion of the intensity line scanning data;

projecting the time-ordered absolute intensities I_i of the analysis channel elements, according to the random uniform distribution principle, onto the row two-dimensional array IntM_i[Rows_i, Cols] of the area covered by the scanned single-line data file, where Rows_i is the number of rows and Cols is the number of columns of the two-dimensional array. The array corresponds to the surface data of the covered area; for each analysis channel the data are converted from a one-dimensional array into a two-dimensional array, giving the intensity distribution of the analysis surface at the position of the scanning line. Because each analysis channel corresponds to one two-dimensional array, a three-dimensional array IntM_i[C_n, Rows_i, Cols] is used in the actual calculation to represent the two-dimensional distribution of each analysis channel, where C_n is the number of spectral analysis channels.

In this step the GPU threads are set to (32, 1) and the thread blocks to ((Rows_i+1)/32, (Cols+1)/32, C_n), where C_n is the number of spectral analysis channels; the GPU memory map is shown in FIG. 3.

S3.7, converting the intensity time sequence array into a space position array;

projecting the IntM_i[Rows_i, Cols] array of each analysis channel obtained in step S3.6, according to its position on the analysis surface, onto the two-dimensional array IntM[Rows, Cols] of the whole analysis surface to obtain the intensity distribution array of the whole analysis surface. This converts the one-dimensional time sequence array into a two-dimensional space position array and gives the positional intensity distribution of each analysis channel over the whole analysis surface. Because each analysis channel corresponds to one two-dimensional array, a three-dimensional array IntM[C_n, Rows, Cols] (the channel intensity distribution three-dimensional matrix) is used in the actual calculation to represent the two-dimensional distribution of each channel, where C_n is the number of spectral analysis channels.

In this step the GPU threads are set to (32, 1) and the thread blocks to ((Rows+1)/32, (Cols+1)/32, C_n), where C_n is the number of spectral analysis channels; the GPU memory map is shown in FIG. 3.

S3.8, two-dimensional conversion of the content line scanning data;

projecting the time-ordered element contents C_{i,corr} of each row after the third-element interference correction of step S3.5, according to the random uniform distribution principle, onto the row two-dimensional array ConM_i[Rows_i, Cols] of the area covered by the scanned single-line data file. The array corresponds to the surface data of the covered area; for each analysis channel the data are converted from a one-dimensional array into a two-dimensional array, giving the content distribution of the analysis surface at the position of the scanning line. Because each analysis element corresponds to one two-dimensional array, a three-dimensional array ConM_i[E_n, Rows_i, Cols] is used in the actual calculation to represent the two-dimensional distribution of each channel, where E_n is the number of analysis elements.

In this step the GPU threads are set to (32, 1) and the thread blocks to ((Rows_i+1)/32, (Cols+1)/32, E_n), where E_n is the number of analysis elements; the GPU memory map is shown in FIG. 3.

S3.9, converting the content time sequence array into a space position array;

projecting the row two-dimensional array ConM_i[Rows_i, Cols] of step S3.8, according to its coordinates on the analysis surface, onto the two-dimensional array ConM[Rows, Cols] of the whole analysis surface to obtain the content distribution array of the whole analysis surface. This converts the one-dimensional time sequence array into a two-dimensional space position array and gives the positional content distribution of each spectral analysis channel over the whole analysis surface. Because each analysis element corresponds to one two-dimensional array, a three-dimensional array ConM[E_n, Rows, Cols] (the element content distribution three-dimensional matrix) is used in the actual calculation to represent the two-dimensional distribution of each channel, where E_n is the number of analysis elements.

In this step the GPU threads are set to (32, 1) and the thread blocks to ((Rows+1)/32, (Cols+1)/32, E_n), where E_n is the number of analysis elements; the GPU memory map is shown in FIG. 3.

S4, the number of cycles is obtained as the integer result of dividing the number of spectral analysis channels C_n by 8. The channel intensity distribution three-dimensional matrix IntM[C_n, Rows, Cols] obtained in step S3.7 is fed into 8 streams by channel, each stream being responsible for one channel's data ConM_i[Rows_i, Cols]; the number of streams used in the last cycle equals the number of channels not yet computed. In each cycle the 8 streams calculate the intensity parameter values of 8 channels. The intensity parameter values include the maximum, minimum, median, average, absolute deviation, 0.135% quantile, 0.5% quantile, 2.5% quantile, 1/4 quantile, 3/4 quantile, 97.5% quantile, 99.5% quantile, 99.865% quantile, position of the maximum, position of the minimum, inclusion threshold, and inclusion intensity signal distribution. The following 3 steps are carried out in sequence in each stream; the flow chart is shown in Fig. 9, and the specific implementation steps are as follows:

S4.1, reading the intensity distribution array IntM[Rows, Cols] of one channel in each stream; GPU thread setting (32, 32), thread block setting ((Rows+1)/32, (Cols+1)/32).

S4.2, performing Shell sort on the intensity distribution array IntM[Rows, Cols], and calculating the maximum, minimum, median, average, absolute deviation, 0.135% quantile, 0.5% quantile, 2.5% quantile, 1/4 quantile, 3/4 quantile, 97.5% quantile, 99.5% quantile, 99.865% quantile, position of the maximum, and position of the minimum.

GPU thread setting (1024, 1, 1), thread block setting (1024, (N_i+1023)/1024, C_n), where N_i is the number of single sparks obtained per line for each spectral analysis channel and C_n is the number of spectral analysis channels.
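The sort-then-read-off pattern of step S4.2 can be illustrated with a small sequential sketch. This is not the GPU implementation described in the patent: it is a plain Python Shell sort with gap halving, and the linear-interpolation quantile rule, the function names `shell_sort`, `quantile` and `summarize`, and the dictionary keys are all assumptions made for illustration. The source does not define "absolute deviation", so it is left out.

```python
def shell_sort(values):
    """Return a new ascending list sorted with Shell sort (gap halving)."""
    data = list(values)
    gap = len(data) // 2
    while gap > 0:
        # Gapped insertion sort for the current gap.
        for i in range(gap, len(data)):
            current = data[i]
            j = i
            while j >= gap and data[j - gap] > current:
                data[j] = data[j - gap]
                j -= gap
            data[j] = current
        gap //= 2
    return data


def quantile(sorted_values, q):
    """Quantile of pre-sorted data by linear interpolation (assumed rule)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def summarize(values):
    """Compute a subset of the S4.2 parameters for one channel (a list)."""
    ordered = shell_sort(values)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": sum(ordered) / len(ordered),
        "median": quantile(ordered, 0.5),
        "q0.135": quantile(ordered, 0.00135),
        "q25": quantile(ordered, 0.25),
        "q75": quantile(ordered, 0.75),
        "q99.865": quantile(ordered, 0.99865),
        # Positions refer to the first occurrence in the unsorted input.
        "argmin": values.index(ordered[0]),
        "argmax": values.index(ordered[-1]),
    }
```

Once the data is sorted, every quantile is a constant-time lookup, which is presumably why the sort is performed once per channel before the statistics are read off.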

S4.3, iteratively calculating the inclusion signal threshold for the intensity distribution array IntM[Rows, Cols]. The threshold is calculated as INT_inc = INT_avg + 3 × INT_sd, where INT_avg is the average of all signal intensities of the channel, INT_sd is the absolute standard deviation of all signal intensities of the channel, and INT_inc is the threshold. After the signals exceeding the threshold are eliminated, the average and the absolute standard deviation are recalculated and the threshold is recalculated; this is iterated until the amount of eliminated data is less than 3, at which point INT_inc is the inclusion signal threshold of the channel. The data above the threshold among all signal data are statistically analyzed, the frequency of intensity signals in each intensity segment is counted, and the element inclusion signal statistical distribution matrix IntD[D_i, 256] is output, where D_i is the number of sparks occurring in each signal segment.
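The iteration in step S4.3 can be sketched sequentially as follows. This is a minimal illustration, not the GPU kernel of the patent. It assumes the population standard deviation for INT_sd (the source only says "absolute standard deviation"), and it assumes 256 equal-width segments between the threshold and the largest signal for the histogram; the function names are hypothetical.

```python
import statistics


def inclusion_threshold(intensities, min_removed=3):
    """Iterate INT_inc = INT_avg + 3 * INT_sd until fewer than 3 values are removed."""
    kept = list(intensities)
    while True:
        avg = statistics.fmean(kept)
        sd = statistics.pstdev(kept)  # assumption: population standard deviation
        threshold = avg + 3 * sd
        remaining = [v for v in kept if v <= threshold]
        removed = len(kept) - len(remaining)
        if removed < min_removed:
            return threshold
        kept = remaining


def inclusion_histogram(intensities, threshold, segments=256):
    """Count above-threshold signals per intensity segment (assumed equal widths)."""
    above = [v for v in intensities if v > threshold]
    counts = [0] * segments
    if not above:
        return counts
    width = (max(above) - threshold) / segments
    for v in above:
        index = min(int((v - threshold) / width), segments - 1)
        counts[index] += 1
    return counts
```

Each pass either removes at least three values or stops, so the loop terminates on any finite input.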

S5, the number of cycles is obtained as the integer result of dividing the number of analysis elements E_n by 8. The element content distribution three-dimensional matrix ConM[E_n, Rows, Cols] is fed into 8 streams by element, each stream being responsible for one element's data ConM[Rows, Cols]; the number of streams used in the last cycle equals the number of elements not yet computed. In each cycle the 8 streams calculate the content parameter values of 8 channels. The content parameter values include the maximum, minimum, median, average, absolute deviation, 0.135% quantile, 0.5% quantile, 2.5% quantile, 1/4 quantile, 3/4 quantile, 97.5% quantile, 99.5% quantile, 99.865% quantile, position of the maximum, position of the minimum, content distribution, statistical segregation degree, and statistical porosity. The flow chart is shown in Fig. 10, and the specific implementation steps are as follows:

S5.2, performing Shell sort on the content distribution array ConM[Rows, Cols], and calculating the maximum, minimum, median, average, absolute deviation, 0.135% quantile, 0.5% quantile, 2.5% quantile, 1/4 quantile, 3/4 quantile, 97.5% quantile, 99.5% quantile, 99.865% quantile, position of the maximum, and position of the minimum.

GPU thread setting (1024, 1, 1), thread block setting (1024, (N_i+1023)/1024, E_n), where N_i is the number of single-spark data obtained per line for each spectral analysis channel and E_n is the number of analysis elements. The GPU memory map is shown in Fig. 3.

S5.3, the standard deviation is calculated by the interquartile range method, and the statistical segregation degree of the content is calculated from the robust coefficient of variation. The content matrix is statistically analyzed by row, the number of occurrences of each content segment is counted, and the element content distribution matrix ConD[E_n, 256] is output, where E_n is the number of analysis elements.

GPU thread setting (1024, 1, 1), thread block setting (1024, (N_i+1023)/1024, E_n), where N_i is the number of single-spark data obtained per line for each spectral analysis channel and E_n is the number of analysis elements.
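The source names the interquartile range method and the robust coefficient of variation but gives no formulas. The sketch below shows one common reading, offered only as an assumption: the standard deviation is estimated as IQR / 1.349 (the factor that makes the estimate consistent for normally distributed data) and divided by the median. The function name `robust_cv` is hypothetical, and the patent's exact definition of statistical segregation degree may differ.

```python
import statistics


def robust_cv(contents):
    """Robust coefficient of variation from the interquartile range (assumed formula)."""
    q1, median, q3 = statistics.quantiles(contents, n=4, method="inclusive")
    robust_sd = (q3 - q1) / 1.349  # assumption: normal-consistency factor
    return robust_sd / median
```

Because quartiles and the median are barely affected by a few extreme values, an estimate of this kind is less sensitive to inclusion signals than the ordinary standard deviation divided by the mean.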

S6, writing the channel intensity distribution three-dimensional matrix IntM[C_n, Rows, Cols] obtained in step S3.7, the element content distribution three-dimensional matrix ConM[E_n, Rows, Cols] obtained in step S3.9, and each parameter calculated in steps S4 and S5 from GPU memory into CPU memory.

S7, displaying the data results on the CPU in the form of graphs or charts.

Examples

A 1000 mm × 500 mm oversized sample surface is scanned; the scan is shown in Fig. 7. This embodiment uses the analysis of this sample as an example. The method according to the invention comprises the following steps:

S1, an OPA-1000 large-scale component segregation degree analyzer is used. The analyzer is equipped with 14 photomultiplier tubes (13 analysis channels, numbered 0 to 12, and one matrix channel, numbered 13). A working curve containing 13 analysis elements and one matrix element is selected. The scanning start position (x, y, w) and the scanning area (1000 mm × 500 mm) are set, and the scanning mode is the fully automatic mode of continuous line-by-line scanning. At the start of each line, gas is flushed for 5 seconds and pre-burning lasts 5 seconds; scanning uses a line spacing of 4 mm, a scanning speed of 1 mm/s, and an acquisition frequency of 500 times/s. The number of scan lines is 125, and the excitation spot diameter is 4 mm. In total, 125 scan absolute intensity files are obtained; each data file contains 14 channels, and each channel contains 500000 spark intensity data.
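As a rough check, the figures quoted in step S1 can be re-derived from the scan settings. The short Python sketch below does only that arithmetic. The 4-byte storage size per intensity value is an assumption, since the source does not state the data type, and the variable names are illustrative.

```python
# Scan parameters taken from step S1 of the example.
scan_length_mm = 1000
scan_width_mm = 500
line_spacing_mm = 4
scan_speed_mm_per_s = 1
acquisition_rate_per_s = 500
channels = 14

# One scan line every 4 mm across the 500 mm width.
lines = scan_width_mm // line_spacing_mm
# Seconds per line (length / speed) times sparks acquired per second.
sparks_per_line = scan_length_mm // scan_speed_mm_per_s * acquisition_rate_per_s
total_values = lines * channels * sparks_per_line

# Assumption: each intensity is stored as a 4-byte value.
bytes_per_value = 4
total_gigabytes = total_values * bytes_per_value / 1e9

print(lines, sparks_per_line, total_values, total_gigabytes)
```

Under that assumption the raw intensity data alone comes to about 3.5 GB, which is consistent with the GB-grade data volume mentioned in the background section.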

S2, converting the calibration working curve into the coefficient matrix CM[13, 8].

S3, during GPU parallel operation, the 125 scan intensity data files are fed into 8 streams (Stream 1 to Stream 8) for synchronous parallel operation. Eight files are input at a time for 15 cycles; the remaining 5 files are fed in the 16th cycle, in which only 5 streams are started for parallel calculation. Each stream processes only one intensity data file per cycle. 8 operations are performed in sequence in each stream. The specific implementation steps are as follows:
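The cycle counts in step S3 follow from splitting the work into groups of at most 8. The host-side sketch below illustrates only this partitioning, not the CUDA stream handling itself; the function name `stream_batches` is hypothetical.

```python
def stream_batches(n_items, n_streams=8):
    """Split item indices into cycles of at most n_streams items each."""
    return [
        list(range(start, min(start + n_streams, n_items)))
        for start in range(0, n_items, n_streams)
    ]


file_cycles = stream_batches(125)   # the 125 scan files of step S3
channel_cycles = stream_batches(14) # the 14 channels of step S4
element_cycles = stream_batches(13) # the 13 elements of step S5
```

For 125 files this gives 16 cycles with 5 files in the last one; for 14 channels it gives 2 cycles with 6 in the last; for 13 elements it gives 2 cycles with 5 in the last, matching the counts stated in the example.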

S3.1, a scanning path is edited for the sample surface to be analyzed so that the sample surface is fully covered, and the scanning path file of the sample surface is converted into position coordinate information. The program-controlled fully automatic segregation analyzer scans the sample surface line by line to obtain the absolute intensity I_i of each analysis channel element. The absolute intensities and the coefficient matrix CM[13, 8] converted from the calibration working curve in step S2 are read into GPU memory by stream.

S3.2, converting the absolute intensity into an intensity ratio. In this step, GPU thread setting (1024, 1, 1), thread block setting (1024, (500000+1023)/1024, 14); the GPU memory map is shown in Fig. 3.
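The term (500000+1023)/1024 in the thread block setting is the usual integer ceiling division for covering N items with 1024-thread blocks. A small Python sketch of the arithmetic follows; the function name `blocks_needed` is illustrative and not from the source.

```python
def blocks_needed(n_items, threads_per_block=1024):
    """Ceiling division: smallest block count whose threads cover n_items."""
    return (n_items + threads_per_block - 1) // threads_per_block


sparks_per_channel = 500_000
blocks = blocks_needed(sparks_per_channel)
# Threads in the last block that have no spark to process.
spare_threads = blocks * 1024 - sparks_per_channel
```

Here `blocks` is 489 and `spare_threads` is 736, so a kernel launched this way would normally check that its global index is below the item count before touching memory.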

S3.3, converting the intensity ratio into a content ratio. The intensity ratio R obtained in step S3.2 is converted into a content ratio CR using the coefficient matrix CM[13, 8] converted from the calibration working curve in step S2, and out-of-limit processing is applied to calculated content ratios that exceed the upper or lower limit of the curve content ratio. In this step, GPU thread setting (1024, 1, 1), thread block setting (1024, (500000+1023)/1024, 13); the GPU memory map is shown in Fig. 3.

S3.4, converting the content ratio into content. From the content ratios CR_i of all analysis channels, the matrix channel content C_m is calculated according to the rule that the element contents of all analysis channels sum to 100%; the element content C_i of each analysis channel element is then calculated as C_i = C_m × CR_i. In this step, GPU thread setting (1024, 1, 1), thread block setting (1024, (500000+1023)/1024, 13); the GPU memory map is shown in Fig. 3.
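One way to read step S3.4 is sketched below for a single spark. It assumes that the 100% total includes the matrix element itself, so that C_m + Σ C_m × CR_i = 100 and therefore C_m = 100 / (1 + Σ CR_i). That closed form is an inference from the stated rule and is not written out in the source; the function name is hypothetical.

```python
def contents_from_ratios(content_ratios):
    """Return (matrix content, element contents) in percent for one spark.

    Assumes C_m + sum(C_m * CR_i) = 100, i.e. C_m = 100 / (1 + sum(CR_i)).
    """
    c_matrix = 100.0 / (1.0 + sum(content_ratios))
    element_contents = [c_matrix * cr for cr in content_ratios]
    return c_matrix, element_contents


c_m, c_i = contents_from_ratios([0.01, 0.02, 0.07])
```

With these illustrative ratios the matrix content is 100 / 1.1, about 90.9%, and the matrix and element contents together sum to 100%.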

S3.5, performing third-element interference correction on the element content. In this step, GPU thread setting (1024, 1, 1), thread block setting (1024, (500000+1023)/1024, 13); the GPU memory map is shown in Fig. 3.

S3.6, two-dimensional conversion of the intensity line scanning data. The absolute intensity I_i of each analysis channel element, obtained in time sequence, is projected according to the random uniform distribution principle onto the row two-dimensional array IntM_i[40, 10000] of the area corresponding to the scanned single-line data file. The array corresponds to the surface data of the covered area; for each analysis channel the data is converted from a one-dimensional array into a two-dimensional array, giving the intensity distribution of the analysis surface at the position of the scan line. Because each analysis channel corresponds to one two-dimensional array, a three-dimensional array IntM_i[14, 40, 10000] is used in the actual calculation to represent the two-dimensional distribution of each channel. In this step, GPU thread setting (32, 1), thread block setting ((40+1)/32, (10000+1)/32, 14); the GPU memory map is shown in Fig. 3.

S3.7, the IntM_i[40, 10000] array of each channel is projected, according to its position information on the analysis surface, onto the two-dimensional array IntM[5000, 10000] of the whole analysis surface, giving the intensity distribution array of the whole analysis surface. This converts the one-dimensional time-sequence array into a two-dimensional spatial position array and yields the positional intensity distribution of each analysis channel over the whole analysis surface. Because each analysis channel corresponds to one two-dimensional array, a three-dimensional array IntM[14, 5000, 10000] is used in the actual calculation to represent the two-dimensional distribution of each channel.

GPU thread setting (32, 1), thread block setting ((10000+1)/32, (5000+1)/32, 14); the GPU memory map is shown in Fig. 3.

S3.8, two-dimensional conversion of the content line scanning data. Each line of content data obtained in time sequence is projected according to the random uniform distribution principle onto the row two-dimensional array ConM_i[40, 10000] of the area corresponding to the scanned single-line data file. The array corresponds to the surface data of the covered area; for each analysis channel the data is converted from a one-dimensional array into a two-dimensional array, giving the content distribution of the analysis surface at the position of the scan line. Because each analysis element corresponds to one two-dimensional array, a three-dimensional array ConM_i[13, 10000, 5000] is used in the actual calculation to represent the two-dimensional distribution of each channel.

GPU thread setting (32, 1), thread block setting ((40+1)/32, (5000+1)/32, 13); the GPU memory map is shown in Fig. 3.

S3.9, the row two-dimensional array ConM_i[40, 10000] from step S3.8 is projected, according to its coordinate position information on the analysis surface, onto the two-dimensional array ConM[5000, 10000] of the whole analysis surface, giving the content distribution array of the whole analysis surface. This converts the one-dimensional time-sequence array into a two-dimensional spatial position array and yields the positional content distribution of each analysis channel over the whole analysis surface. Because each analysis element corresponds to one two-dimensional array, a three-dimensional array ConM[13, 5000, 10000] is used in the actual calculation to represent the two-dimensional distribution of each channel, where E_n is the number of analysis elements.

GPU thread setting (32, 1), thread block setting ((5000+1)/32, (10000+1)/32, 13); the GPU memory map is shown in Fig. 3. N_i is the total number of sparks in the corresponding scan file, and E_n is the number of analysis elements.
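The projection of per-line arrays onto the whole-surface array can be pictured as copying each line's band of rows to its row offset. The sketch below is a sequential illustration for one element or channel, under the assumption that the bands are contiguous and ordered by scan line, so that 125 lines × 40 rows give the 5000 rows of the example. The function name and the tiny demonstration sizes are hypothetical.

```python
def assemble_surface(bands, rows_per_band, cols):
    """Copy per-line bands (rows_per_band x cols each) into one full array."""
    total_rows = len(bands) * rows_per_band
    surface = [[0.0] * cols for _ in range(total_rows)]
    for line_index, band in enumerate(bands):
        # First surface row covered by this scan line (assumed contiguous).
        row_offset = line_index * rows_per_band
        for r in range(rows_per_band):
            for c in range(cols):
                surface[row_offset + r][c] = band[r][c]
    return surface


# Tiny demonstration: 3 scan lines, 2 rows each, 4 columns.
# Cell value 10 * line + row makes the source band easy to recognise.
demo_bands = [[[10 * k + r for _ in range(4)] for r in range(2)] for k in range(3)]
demo_surface = assemble_surface(demo_bands, rows_per_band=2, cols=4)
```

Each output cell depends on exactly one input cell, so the copy is naturally parallel, which fits the one-thread-per-cell style of the two-dimensional thread settings quoted in the text.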

S4, the number of cycles, 2, is obtained as the integer result of dividing 14 by 8; 8 streams are used in the first cycle and 6 streams in the second. The intensity distribution array IntM[14, 5000, 10000] is fed into 8 streams by channel, each stream being responsible for the operation on one channel's data. In each cycle the 8 streams calculate the intensity parameter values of 8 channels, and 3 steps are carried out in sequence in each stream. The specific implementation steps are as follows:

S4.1, reading the intensity distribution array IntM[5000, 10000] of one channel in each stream; GPU thread setting (32, 32), thread block setting ((5000+1)/32, (10000+1)/32).

S4.2, performing Shell sort on the intensity distribution array IntM[5000, 10000], and calculating the maximum, minimum, median, average, absolute deviation, 0.135% quantile, 0.5% quantile, 2.5% quantile, 1/4 quantile, 3/4 quantile, 97.5% quantile, 99.5% quantile, 99.865% quantile, position of the maximum, and position of the minimum. GPU thread setting (1024, 1, 1), thread block setting (1024, (5000 × 10000+1023)/1024, 14).

S4.3, iteratively calculating the inclusion signal threshold for the intensity distribution array IntM[5000, 10000]. The threshold is calculated as INT_inc = INT_avg + 3 × INT_sd, where INT_avg is the average of all signal intensities of the channel, INT_sd is the absolute standard deviation of all signal intensities of the channel, and INT_inc is the threshold. After the signals exceeding the threshold are eliminated, the average and the absolute standard deviation are recalculated and the threshold is recalculated; this is iterated until the amount of eliminated data is less than 3, at which point INT_inc is the inclusion signal threshold of the channel. The data above the threshold among all signal data are statistically analyzed, the frequency of intensity signals in each intensity segment is counted, and the element inclusion signal statistical distribution matrix IntD[D_i, 256] is output, where D_i is the number of sparks occurring in each signal segment.

GPU thread setting (1024, 1, 1), thread block setting (1024, (5000 × 10000+1023)/1024, 14).

S5, the number of cycles, 2, is obtained as the integer result of dividing 13 by 8. The content distribution array ConM[13, 5000, 10000] is fed into 8 streams by element, each stream being responsible for the operation on one element's data ConM2[5000, 10000]; the remaining elements wait for the next cycle, and the number of streams used in the last cycle is 5. In each cycle the 8 streams calculate the content parameter values of 8 channels. The specific implementation steps are as follows:

S5.2, performing Shell sort in each stream on the content distribution array ConM2[5000, 10000] that was read in, and calculating the maximum, minimum, median, average, absolute deviation, 0.135% quantile, 0.5% quantile, 2.5% quantile, 1/4 quantile, 3/4 quantile, 97.5% quantile, 99.5% quantile, 99.865% quantile, position of the maximum, and position of the minimum.

GPU thread setting (1024, 1, 1), thread block setting (1024, (5000 × 10000+1023)/1024, 13). The GPU memory map is shown in Fig. 3.

S5.3, the standard deviation is calculated by the interquartile range method, and the statistical segregation degree of the content is calculated from the robust coefficient of variation. The content matrix is statistically analyzed by row, the number of occurrences of each content segment is counted, and the element content distribution matrix ConD[13, 256] is output. GPU thread setting (1024, 1, 1), thread block setting (1024, (5000 × 10000+1023)/1024, 13).

S6, writing all parameters calculated in steps S4 and S5, together with the channel intensity distribution three-dimensional matrix IntM[14, 5000, 10000] and the element content distribution three-dimensional matrix ConM[13, 5000, 10000], from GPU memory into CPU memory.

S7, displaying the data results on the CPU in the form of graphs or charts.

The calculation result obtained for the sample with the method of the invention (see Fig. 5) was compared with the result of the conventional CPU calculation (see Fig. 6). The results were completely consistent, and the total calculation time fell from 3.5 hours to 15 minutes, as shown in the table below. This is a speedup of about 14 times (210 minutes / 15 minutes), a very good application effect.

Step	Conventional method	The invention
CPU-to-GPU memory transfer	0	1 minute
Calculation	3.5 hours	13 minutes
GPU-to-CPU memory transfer	0	1 minute
Total time	3.5 hours	15 minutes

Claims

1. A method for rapidly processing big data in an oversized metal in-situ analyzer is characterized by comprising the following steps:

S3.2, converting the absolute intensity into an intensity ratio;

S3.3, converting the intensity ratio into a content ratio;

S3.4, converting the content ratio into content;

S3.5, performing third-element interference correction on the element content;

S3.6, two-dimensional conversion of the intensity line scanning data;

S3.7, converting the intensity time-sequence array into a spatial position array;

The IntM_i[Rows_i, Cols] array of each analysis channel obtained in step S3.6 is projected, according to its coordinate position information on the analysis surface, onto the two-dimensional array IntM[Rows, Cols] of the whole analysis surface to obtain the intensity distribution array of the whole analysis surface; in the actual calculation, a three-dimensional array IntM[C_n, Rows, Cols], i.e. the channel intensity distribution three-dimensional matrix, is used to represent the two-dimensional distribution of each channel, where C_n is the number of spectral analysis channels;

S3.8, two-dimensional conversion of the content line scanning data;

S3.9, converting the content time-sequence array into a spatial position array;

S6, writing the channel intensity distribution three-dimensional matrix IntM[C_n, Rows, Cols] obtained in step S3.7, the element content distribution three-dimensional matrix ConM[E_n, Rows, Cols] obtained in step S3.9, and each intensity and content parameter value calculated in steps S4 and S5 from GPU memory into CPU memory;

2. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein in step S3.2 the GPU thread setting is (1024, 1, 1) and the thread block setting is (1024, (N_i+1023)/1024, C_n), where N_i is the number of single sparks obtained per line for each spectral analysis channel and C_n is the number of spectral analysis channels.

3. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein in steps S3.3, S3.4 and S3.5 the GPU thread setting is (1024, 1, 1) and the thread block setting is (1024, (N_i+1023)/1024, E_n), where N_i is the number of single-spark data obtained per line for each spectral analysis channel and E_n is the number of analysis elements.

4. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein in steps S3.6 and S3.7 the GPU thread setting is (32, 1) and the thread block setting is ((Rows_i+1)/32, (Cols+1)/32, C_n), where C_n is the number of spectral analysis channels.

5. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein in steps S3.8 and S3.9 the GPU thread setting is (32, 1) and the thread block setting is ((Rows_i+1)/32, (Cols+1)/32, E_n), where E_n is the number of analysis elements.

6. The method according to claim 1, wherein in step S4 the intensity parameter values include the maximum, minimum, median, average, absolute deviation, 0.135% quantile, 0.5% quantile, 2.5% quantile, 1/4 quantile, 3/4 quantile, 97.5% quantile, 99.5% quantile, 99.865% quantile, position of the maximum, position of the minimum, inclusion threshold, and inclusion intensity signal distribution.

7. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein in step S4.1 the GPU thread setting is (32, 32) and the thread block setting is ((Rows+1)/32, (Cols+1)/32).

8. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein in steps S4.2 and S4.3 the GPU thread setting is (1024, 1, 1) and the thread block setting is (1024, (N_i+1023)/1024, C_n), where N_i is the number of single sparks obtained per line for each spectral analysis channel and C_n is the number of spectral analysis channels.

9. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein in step S5 the content parameter values include the maximum, minimum, median, average, absolute deviation, 0.135% quantile, 0.5% quantile, 2.5% quantile, 1/4 quantile, 3/4 quantile, 97.5% quantile, 99.5% quantile, 99.865% quantile, position of the maximum, position of the minimum, content distribution, statistical segregation degree, and statistical porosity.

10. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein in steps S5.2 and S5.3 the GPU thread setting is (1024, 1, 1) and the thread block setting is (1024, (N_i+1023)/1024, E_n), where N_i is the number of single-spark data obtained per line for each spectral analysis channel and E_n is the number of analysis elements.

11. The method for rapid processing of large data in an oversized metal in-situ analyzer of claim 1, wherein all data results of the method are completed within 5 minutes.

12. The method for rapidly processing big data in an oversized metal in-situ analyzer according to claim 1, wherein the scanning area of the method is 100-1000 mm in length and 100-500 mm in width.