CN117235593A

CN117235593A - Refining device process optimization method and system based on big data

Info

Publication number: CN117235593A
Application number: CN202311149999.1A
Authority: CN
Inventors: 王珠; 王若暄; 张雅洁
Original assignee: China University of Petroleum Beijing
Current assignee: China University of Petroleum Beijing
Priority date: 2023-09-07
Filing date: 2023-09-07
Publication date: 2023-12-15

Abstract

The invention relates to a refining device process optimization method and system based on big data, comprising the following steps: acquiring current working condition and process position data of a refining device; and selecting proper excellent parameter points from a preset excellent parameter recommendation table of different working conditions to output based on the current working condition and the process position data, and optimizing the process parameters of the refining device under the current working condition. The invention has the advantages of strong practicability, wide application range and the like, has excellent application prospect and commercial value, and can be widely applied to the technical field of refining device process optimization.

Description

Refining device process optimization method and system based on big data

Technical Field

The invention relates to the technical field of refining device process optimization, in particular to a refining device process optimization method and system based on big data.

Background

With the development of computer technology and the continuous expansion of application fields, a great amount of historical data is gradually accumulated in an enterprise information system, and the data covers various processes of the process. At present, informatization of most of domestic enterprises is mainly in a management level, and only data acquisition, storage and retrieval are performed, so that the requirements of data analysis and processing cannot be met.

Data clustering is a common method for efficiently processing vast amounts of data; its purpose is to determine a limited set of categories that describe a data set based on the similarity between data objects. Clustering is an analysis process that divides a set of objects into several classes of similar objects according to their characteristics. As an unsupervised learning method that groups a sample data set according to a given criterion or the intrinsic properties and rules of the data, it handles the complex and numerous process parameters in a production flow well.

As a powerful analysis tool for data mining, cluster analysis generally has two uses:

(1) As an independent data mining tool, finding the distribution characteristics of data;

(2) As a data preprocessing step of some other data analysis methods, data which has been grouped based on a certain pattern is provided to other methods, and further, the other methods are allowed to perform professional analysis on the corresponding data division results.

Currently, cluster analysis has been successfully applied in many fields including image processing, pattern recognition, business, biology, geography, web services, intelligence retrieval, and the like. Through cluster analysis, the information hidden in a large amount of seemingly disordered data can be concentrated, extracted and refined to find the internal rules of the object under study and to mine potential patterns from them. This can help enterprises and merchants adjust market policies, reduce risks, face the market rationally and make correct decisions, and can help governments adjust future management policies and economic structures and respond actively to ecological development and the like.

SVDD (support vector data description) is an emerging intelligent algorithm. Its basic idea is to find, for a batch of sample data distributed in a feature space, the most suitable hypersphere containing all the data points, and to use the hypersphere as a decision boundary for classifying target samples and abnormal samples, thereby detecting abnormal behavior. The SVDD algorithm is an extension of the traditional SVM algorithm and is widely applied to fault detection, early warning, intrusion detection and process optimization.

However, when the sample size and data dimension are too high, SVDD requires a large computational cost, which is disadvantageous for process optimization, abnormality detection and the like in a refining device. When the K-means clustering algorithm is combined with the SVDD method, a large amount of data can first be preliminarily processed by the K-means algorithm, SVDD hypersphere training can then be carried out on each class of data, and the optimal process parameter values can be obtained by extracting the data segments of the optimal working conditions. Combining the two methods reduces the complexity of the hypersphere algorithm and the classification error rate, and provides a guarantee for the process optimization of the refining device.

In actual production, the manufacturing procedure of the production flow of the chemical product is relatively complex, and the accuracy of each technological parameter setting directly influences the quality of the finished product. The existing refining device has the following problems for improving the quality of products:

(1) A large amount of historical data exists in the refining devices of various enterprises, but the data are not reasonably utilized to improve the product quality of the refining devices and the economic benefit of the enterprises;

(2) Each refining device of an enterprise has different production requirements under different production plans, and various working conditions are contained in the actual production process, so that the optimal values of key parameters in the refining devices under different working conditions are difficult to be determined manually;

(3) How to analyze key technological parameters of the refining device through a large amount of historical data to obtain optimal industrial parameter values;

(4) How to ensure that enterprises can obtain and accumulate key parameter optimal values under different working conditions for a long time, and ensure the long-term effectiveness of training results.

Disclosure of Invention

Aiming at the problems, the invention aims to provide a refining device process optimization method and system based on big data, which optimize the refining device process by utilizing a large amount of historical data of refining devices in enterprises, provide a guarantee for process optimization of the refining devices, and simultaneously provide a reliable basis for factories to pursue higher product quality and economic benefit.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

In a first aspect, the invention provides a process optimization method of a refining device based on big data, comprising the following steps:

acquiring current working condition and process position data of a refining device from an upper cloud platform or a DCS;

and selecting proper excellent parameter points from a preset excellent parameter recommendation table of different working conditions to output based on the current working condition and the process position data, and optimizing the process parameters of the refining device under the current working condition.

Further, the excellent parameter recommendation table under different working conditions is obtained by training K-means and SVDD hypersphere by using historical data of a refining device in an enterprise, and comprises the following steps:

1.1 Collecting historical data of an enterprise refining device, preprocessing the historical data, segmenting the data matrix, and forming an input data matrix;

1.2 Aiming at the input data matrix of each segment of data, classifying by adopting an improved K-means clustering method, performing hypersphere training by utilizing an SVDD method, and discarding abnormal hyperspheres by combining visual analysis to obtain typical process parameter points with different hyperspheres and time stamps;

1.3 Judging whether the number of typical process parameters screened by the segmentation meets the preset condition, if so, indicating that the segmentation number is reasonable, training all the segments, then entering the step 1.4), otherwise, reselecting the segmentation number, and returning to the step 1.1) until the preset condition is met;

1.4 Based on the obtained typical technological parameter points, carrying out artificial mechanism fusion and screening;

1.5 Outputting the quality index meeting the requirement, the corresponding working condition data and the corresponding excellent process position number point, and storing the quality index, the corresponding working condition data and the corresponding excellent process position number point into an excellent parameter recommendation table.

Further, in the step 1.1), the step of collecting and preprocessing historical data of the enterprise refining device, and segmenting the data to form an input data matrix includes:

acquiring historical data of key process variables related to product quality according to field experience;

selecting a time period for which all key process variables are commonly and effectively sampled;

determining an initial segmentation number and segmenting the data;

and forming a vector by combining the output measured values of each key process variable at the same moment for each data segment to form an input data matrix.

Further, in the step 1.2), for the input data matrix of each segment of data, after classifying by adopting an improved K-means clustering method, performing hypersphere training by using an SVDD method, and discarding abnormal hyperspheres by combining visual analysis to obtain typical process parameter points of different hyperspheres and with time stamps, including:

adopting an improved K-means clustering method to realize the selection of optimal clustering numbers, and dividing and label setting the categories of an input data matrix;

Performing hypersphere training on each type of data by adopting an SVDD method, and extracting data characteristics of different working conditions;

performing visual analysis on the hyperspheres in each segment of data, discarding hyperspheres whose internal data volume is less than a preset value and hyperspheres that appear abnormal in the visualization, and retaining the remaining normal hyperspheres;

and substituting all the preprocessed data into the reserved hypersphere again, and extracting representative optimal parameters of different working conditions by adopting a median method to serve as typical process parameter points.

Further, in the step 1.4), based on the obtained typical process parameter points, the artificial mechanism fusion and screening are performed, which means that the extracted typical process points are matched with the time stamp, the working condition and the quality data, the process parameter points with the quality meeting the preset requirements are selected and stored in an excellent parameter recommendation table, and the method comprises the following steps:

product quality index and position number data anomaly screening: if the fluctuation of the product quality near the time stamp of a process point is greater than a preset value, the typical process point is discarded; if the position number data of the process point are abnormal, i.e., 0 or negative, the typical process point is likewise discarded;

typical process point similarity comparison: the position number data of the typical process points are compared in pairs; if the similarity is greater than a preset value, the quality indexes are compared; if the quality indexes differ, it is checked whether the fluctuation of the position number data near the time stamps of the two typical process points is greater than a preset value; if the fluctuation for only one of the two process points is greater than the preset value, that typical process point is discarded; and if the fluctuations for both process points are greater than the preset value, or both are smaller than another preset value, both typical process points are discarded.
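By way of non-limiting illustration, the anomaly screening described above may be sketched as follows. The sketch assumes that the fluctuation of product quality is measured as a population standard deviation over a window around the time stamp; the function name, the tag names `T101` and `F201`, the window and the threshold are hypothetical and are not specified by the method itself. The pairwise similarity comparison is not covered.

```python
from statistics import pstdev

def screen_typical_points(points, quality_series, window, max_fluctuation):
    """Keep typical process points that pass the two anomaly checks.

    points: list of dicts {"timestamp": int, "tags": {name: value}}
    quality_series: dict mapping timestamp -> product quality index
    window: half-width (in timestamp units) of the neighbourhood examined
    max_fluctuation: preset limit on the quality fluctuation (assumed here
        to be a population standard deviation)
    """
    kept = []
    for point in points:
        t = point["timestamp"]
        # Quality values recorded near the time stamp of the process point.
        nearby = [q for ts, q in quality_series.items() if abs(ts - t) <= window]
        if len(nearby) >= 2 and pstdev(nearby) > max_fluctuation:
            continue  # product quality fluctuates too much near this point
        if any(value <= 0 for value in point["tags"].values()):
            continue  # position number data is 0 or negative: abnormal
        kept.append(point)
    return kept

# Hypothetical data for illustration only.
quality = {0: 90.0, 1: 90.2, 2: 89.9, 10: 80.0, 11: 95.0, 12: 70.0, 20: 91.0, 21: 91.1}
candidates = [
    {"timestamp": 1, "tags": {"T101": 350.0, "F201": 12.5}},   # stable, valid
    {"timestamp": 11, "tags": {"T101": 352.0, "F201": 12.0}},  # unstable quality
    {"timestamp": 20, "tags": {"T101": 351.0, "F201": 0.0}},   # zero reading
]
kept = screen_typical_points(candidates, quality, window=1, max_fluctuation=1.0)
```

In this illustration only the first candidate is retained; the second is rejected for quality fluctuation and the third for a zero reading.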

Further, the selecting, based on the current working condition and the process position data, a suitable excellent parameter point from a preset excellent parameter recommendation table of different working conditions to output, and optimizing the process parameters of the refining device under the current working condition includes:

acquiring current DCS process position number data through an interface, and identifying the current working condition;

obtaining selectable excellent technological parameter points under the current operation condition from an excellent parameter recommendation table;

and calculating the distance between the current operating parameters and all selectable excellent process parameter points under the working condition, and extracting the excellent process parameter point with the minimum distance as the hot-standby output.
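By way of non-limiting illustration, the selection of the nearest excellent parameter point may be sketched as follows. The sketch assumes a Euclidean distance over unscaled tag values and a recommendation table held as a dictionary keyed by working condition; the function name, tag names and condition labels are hypothetical.

```python
import math

def recommend(current, condition, table):
    """Return the excellent point closest to the current operating parameters.

    current: dict tag -> value read from the DCS
    condition: identifier of the current working condition
    table: dict condition -> list of excellent points (dict tag -> value)
    """
    candidates = table.get(condition, [])
    if not candidates:
        return None  # no excellent point stored for this working condition

    def distance(point):
        # Euclidean distance over the tags of the current reading (assumption).
        return math.sqrt(sum((current[tag] - point[tag]) ** 2 for tag in current))

    return min(candidates, key=distance)

# Hypothetical recommendation table and DCS reading.
table = {
    "high_load": [{"temp": 350.0, "flow": 12.0}, {"temp": 360.0, "flow": 15.0}],
    "low_load": [{"temp": 330.0, "flow": 8.0}],
}
current = {"temp": 358.0, "flow": 14.0}
best = recommend(current, "high_load", table)
```

In practice the tags would likely need to be normalized before the distance is computed, since variables such as temperature and flow have different ranges; the method leaves the distance measure open.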

Further, the method comprises the steps of: acquiring working condition and process position number data sets in a preset time period, expanding and updating the hypersphere cluster based on the acquired data sets, and obtaining an updated excellent parameter recommendation table of different working conditions.

Further, the acquiring the working condition and process position number data set in the preset time period, expanding and updating the super-sphere cluster based on the acquired data set, and obtaining the updated excellent parameter recommendation table of different working conditions, including:

3.1 Acquiring a DCS position number data set in a preset time period, and preprocessing it to obtain the latest data set;

3.2 Sequentially judging whether the latest data set falls into the existing hypersphere, if so, entering the step 3.3), otherwise, entering the step 3.4);

3.3 Judging which existing hypersphere the latest data set falls into, judging whether the hypersphere data amount is less than a set threshold value, if so, adding the latest data set into the hypersphere and retraining the hypersphere to acquire new hypersphere characteristics, extracting new typical process points, otherwise, not updating;

3.4 Recording the latest data set, forming a new input data matrix, performing hyper-sphere cluster training, and storing the extracted process parameter points into an excellent parameter recommendation table.
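By way of non-limiting illustration, the branching of steps 3.2) to 3.4) may be sketched as follows. The sketch makes two simplifying assumptions: membership is tested sample by sample, and each hypersphere is treated as a Euclidean ball in the input space rather than a kernel-space SVDD boundary. The retraining itself is not shown; the function and field names are hypothetical.

```python
import math

def route_new_samples(samples, spheres, min_count):
    """Decide, for each new sample, how the hypersphere cluster should react.

    samples: list of points (tuples of floats) from the latest data set
    spheres: list of dicts {"center": tuple, "radius": float, "data": list}
    min_count: hyperspheres holding fewer points than this are retrained
    Returns (indices of spheres to retrain, samples outside every sphere).
    """
    to_retrain = set()
    unmatched = []
    for x in samples:
        hit = None
        for idx, sphere in enumerate(spheres):
            if math.dist(x, sphere["center"]) <= sphere["radius"]:
                hit = idx
                break
        if hit is None:
            unmatched.append(x)              # step 3.4: train new hyperspheres
        elif len(spheres[hit]["data"]) < min_count:
            spheres[hit]["data"].append(x)   # step 3.3: sparse sphere, add data
            to_retrain.add(hit)
        # otherwise the sphere already holds enough data and is not updated
    return sorted(to_retrain), unmatched

# Hypothetical hypersphere cluster and new data.
spheres = [
    {"center": (0.0, 0.0), "radius": 1.0, "data": [(0.1, 0.1)]},
    {"center": (5.0, 5.0), "radius": 1.0, "data": [(5.0, 5.0)] * 10},
]
new = [(0.5, 0.0), (5.2, 5.1), (9.0, 9.0)]
retrain, unmatched = route_new_samples(new, spheres, min_count=5)
```

Here the first sample is added to the sparse first hypersphere, which is flagged for retraining; the second falls into a well-populated hypersphere and changes nothing; the third falls outside all hyperspheres and is kept for new hypersphere cluster training.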

In a second aspect, the present invention provides a process optimization system for a refining apparatus based on big data, comprising:

the data acquisition module is used for acquiring the current working condition and process position data of the refining device;

and the parameter optimization module is used for selecting proper excellent parameter points from the preset excellent parameter recommendation tables of different working conditions to output based on the current working condition and the process position data, and optimizing the process parameters of the refining device under the current working condition.

In a third aspect, the present invention provides a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods.

In a fourth aspect, the present invention provides a computing device comprising: one or more processors, memory, and one or more programs, wherein one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods.

Due to the adoption of the technical scheme, the invention has the following advantages:

(1) The invention analyzes a large amount of historical data in the refining device based on the K-means clustering-SVDD method, and effectively utilizes the historical data storage of enterprises;

(2) The improved K-means algorithm is adopted to classify huge data under different working conditions and label the huge data, and the problem of inaccurate manually set clustering number is avoided, so that the manual work load is reduced, and the accuracy of clustering results is improved by drawing a trend chart of the clustering number and a loss function;

(3) SVDD (support vector data description) hypersphere training is carried out class by class on the clustering results, and the combination of the two methods reduces the complexity of the algorithm on the one hand and improves the accuracy of data feature extraction on the other hand;

(4) The method adopts the modes of algorithm training, mechanism fusion, screening and data segmentation, so that the working condition with good product quality can be extracted more carefully, the optimal parameter of the most effective working condition can be extracted from the working condition, and reliable guidance is provided for enterprise production;

(5) The hypersphere visual analysis provides clearer data characteristics for actual production, provides conditions for deep mining of data, and provides clearer man-machine interaction modes.

(6) The online updating optimization ensures effective expansion and updating of the hypersphere cluster and maintains its comprehensiveness, so that the hypersphere cluster covers the various working conditions and optimization points, which ultimately ensures the long-term effectiveness of the method;

therefore, the invention has the advantages of strong practicability, wide application range and the like, has excellent application prospect and commercial value, and can be widely applied to the technical field of process optimization of refining devices.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Like parts are designated with like reference numerals throughout the drawings. In the drawings:

FIG. 1 is a flow chart of a process optimization method of a refining device based on big data provided by an embodiment of the invention;

FIG. 2 is a process optimization offline training flow chart provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of a data acquisition and preprocessing method according to an embodiment of the present invention;

FIG. 4 is a flow chart of an algorithm training portion provided by an embodiment of the present invention;

FIG. 5 is a block diagram of an SVDD algorithm according to an embodiment of the present invention, wherein triangles represent abnormal data, circles represent support vectors, and squares represent normal data;

FIG. 6 is a schematic diagram of an optimal parameter extraction method according to an embodiment of the present invention;

FIG. 7 is a flow chart of a process optimization online recommendation function provided by an embodiment of the present invention;

FIG. 8 is a flow chart of a process optimization online update function provided by an embodiment of the present invention;

FIG. 9 is a graph of preliminary training K values versus loss function selections provided by an embodiment of the present invention;

FIG. 10 is a hypersphere visualization distribution diagram of the data of segment 5_2 according to an embodiment of the present invention;

FIG. 11 is a hypersphere visualization distribution diagram of the data of segment 3_2 according to an embodiment of the present invention;

FIG. 12 is a hypersphere visualization distribution diagram of the data of segment 3_3 according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which are obtained by a person skilled in the art based on the described embodiments of the invention, fall within the scope of protection of the invention.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.

In some embodiments of the present application, a process optimization method for a refining device based on big data is provided, which is mainly divided into two parts: offline training, and online recommendation and update optimization. In the offline training part, a large amount of historical data of a refining device in an enterprise is reasonably segmented, K-means clustering is carried out on each segment of data, and SVDD hypersphere training is then carried out on each class of data; visual analysis is then performed to discard abnormal hyperspheres and hyperspheres containing too little data, and typical process points are extracted from the retained hyperspheres; process parameter points with better quality are then selected through artificial mechanism fusion and screening and stored in an excellent parameter recommendation table. In the online recommendation and update optimization part, suitable excellent parameter points are selected from the excellent parameter recommendation table and output based on the acquired current working condition and process position data, providing a basis for on-site operators and guiding production; meanwhile, the excellent parameter recommendation table is updated with data from a preset time period, which ensures effective expansion and updating of the hypersphere cluster, maintains its comprehensiveness, and provides a reliable basis for factories pursuing higher product quality and economic benefit.

In accordance therewith, in other embodiments of the present invention, a process optimization system, apparatus, and storage medium for a big data based refining plant is provided.

Example 1

As shown in fig. 1, the present embodiment provides a process optimization method of a refining device based on big data, which includes the following steps:

1) Offline training: performing data segmentation by utilizing a large amount of historical data of a refining device in an enterprise, performing K-means and SVDD hypersphere training, performing hypersphere screening by visual analysis to obtain a hypersphere cluster and data characteristics thereof, and combining artificial mechanism fusion and screening to obtain excellent parameter recommendation tables of different working conditions;

2) On-line recommendation: and (3) selecting proper excellent parameter points from the excellent parameter recommendation tables of different working conditions obtained in the step (1) to output based on the obtained current working condition and process position data, and optimizing the process parameters of the refining device under the current working condition.

In some embodiments, the method further comprises the steps of:

3) Online updating: acquiring working condition and process position number data sets in a preset time period, expanding and updating the super-sphere cluster in the step 1) based on the acquired data sets, and obtaining an excellent parameter recommendation table of different updated working conditions.

In some embodiments, as shown in fig. 2, the step 1) includes the following steps:

1.1 Collecting historical data of an enterprise refining device, preprocessing the historical data, and reasonably segmenting a data matrix to form an input data matrix;

1.2 Subjecting the input data matrix of each segment of data to an algorithm training stage: after classification by an improved K-means clustering method, hyperspheres are trained using the support vector data description (SVDD) method, and abnormal hyperspheres are discarded in combination with visual analysis, so as to obtain time-stamped typical process parameter points of the different hyperspheres;

1.3 Judging whether the number of typical process parameters screened from the segments meets a preset condition, namely that it is not less than the number of quality index distribution intervals of the segments; if so, the segmentation number is reasonable, and step 1.4) is entered after all the segments have been trained; otherwise, the segmentation number is reselected and the method returns to step 1.1) until the preset condition is met;

In some embodiments, in step 1.1), the offline training should select long-period historical data of key process parameters in the refining device, and the collected data should be in a steady state so as not to affect the training result. Because different production plans impose different production demands in field production, and the actual production process includes various working conditions, a large amount of historical data covering different production plans and working conditions can be analyzed to search for the optimal parameters.

As shown in fig. 3, specifically, the method comprises the following steps:

1.1.1 For the refining device, the historical data is collected according to key process variables related to the product quality, which are given by field experience.

In this embodiment, the determined key process variables mainly include the following three types:

(1) the key process variable long-term history data comprises output measured values of flow, liquid level, pressure, temperature and the like;

(2) judging process indexes of the product quality, such as chromaticity, selectivity, conversion rate and the like;

(3) determining indexes of process load and working condition, such as hydrogen peroxide and catalyst concentration;

1.1.2 Determining an initial segmentation number and reasonably segmenting the data;

1.1.3 For each data segment, selecting a time segment for which all the key process variables are effectively sampled together, and forming a vector by the output measured values of each key process variable at the same moment to form an input data matrix.
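By way of non-limiting illustration, steps 1.1.2) and 1.1.3) may be sketched as follows. The sketch assumes that each key process variable is available as a mapping from time stamp to measured value, with `None` marking an invalid sample, and that segments are contiguous and of roughly equal length; the function and variable names are hypothetical.

```python
def build_input_matrices(series, n_segments):
    """Align key process variables on shared time stamps and split them.

    series: dict variable name -> dict timestamp -> measured value
            (None marks an invalid sample)
    n_segments: initial number of segments (at least 1)
    Returns at most n_segments segments; each is a list of (timestamp, vector).
    """
    names = sorted(series)
    # Time stamps at which every variable has a valid sample.
    common = set.intersection(*(
        {t for t, v in series[name].items() if v is not None} for name in names
    ))
    # One vector per time stamp, holding the value of each variable.
    rows = [(t, [series[name][t] for name in names]) for t in sorted(common)]
    if not rows:
        return []
    size = -(-len(rows) // n_segments)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]

# Hypothetical measurements; time stamp 3 is invalid for "temp",
# and time stamps 5 and 6 are each missing for one variable.
series = {
    "temp": {1: 350.0, 2: 351.0, 3: None, 4: 352.0, 5: 353.0},
    "flow": {1: 12.0, 2: 12.1, 3: 12.2, 4: 12.3, 6: 12.4},
}
segments = build_input_matrices(series, n_segments=2)
```

Each row keeps its time stamp so that the typical process points extracted later can still be matched against working condition and quality data.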

In some embodiments, in the step 1.2), as shown in fig. 4, the method includes the following steps:

1.2.1 Adopting an improved K-means clustering method to realize the selection of optimal clustering number, and dividing and label setting the categories of the input data matrix;

1.2.2 Performing hypersphere training on each category of data by adopting an SVDD method, and extracting the data characteristics of each hypersphere cluster, namely different working conditions;

1.2.3 The super sphere in each segment of data is subjected to visual analysis, the super sphere with less data quantity in the super sphere and abnormal visualization of the super sphere is removed, and the rest normal super spheres are reserved;

1.2.4 All the data are substituted into the reserved supersphere again, and the representative optimal parameters of different working conditions are extracted by adopting a median method and are used as typical process parameter points.

In some embodiments, in the step 1.2.1), because the data size in the refining device is huge and the dimension is high, a suitable cluster number cannot be set manually, so in this embodiment, the optimal cluster number K is determined first by the improved K-means clustering algorithm, and clustering is performed according to the optimal cluster number K.

The specific method for selecting the optimal cluster number K is as follows:

(1) determining a value set of K and the maximum iteration number $t_{\max}$;

(2) iterating for each value of K, updating the centroid positions of the clusters, and recording the loss function at the end of the iteration as $J_{\min}$;

(3) comparing the loss functions for all K values in the same coordinate system;

(4) observing the plot: the K value at which the loss function value shows an obvious inflection point is the optimal cluster number.
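By way of non-limiting illustration, the inflection point may also be located numerically. The sketch below assumes that the final loss for each K has already been recorded, and it substitutes a second-difference heuristic for the visual reading of the plot; this heuristic and the function name are assumptions and are not part of the described method.

```python
def pick_elbow(losses):
    """Pick the K at which the loss curve bends most sharply.

    losses: dict K -> final loss for that K (consecutive integer K values)
    The bend is measured by the second difference
    J(K-1) - 2*J(K) + J(K+1), a heuristic stand-in for visual inspection.
    """
    ks = sorted(losses)
    if len(ks) < 3:
        raise ValueError("need at least three K values to locate an inflection")
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = losses[prev_k] - 2 * losses[k] + losses[next_k]
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k

# Hypothetical loss values: the curve flattens after K = 3.
losses = {1: 1000.0, 2: 700.0, 3: 150.0, 4: 120.0, 5: 100.0, 6: 90.0}
optimal_k = pick_elbow(losses)
```

For these hypothetical losses the largest bend occurs at K = 3. A numerical pick of this kind is best treated as a suggestion to be confirmed against the plotted curve.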

The specific implementation steps of the K-means algorithm are as follows:

(1) determining an optimal cluster number K by the method;

(2) k objects are selected as initial clustering centers;

(3) calculating the distance from each clustering object to the clustering center;

(4) updating the clustering center until the clustering center is not changed any more or the maximum iteration number is reached, and terminating the algorithm;

(5) the data contained in each class is saved and labeled.

In some embodiments, the loss function J in the K-means algorithm is defined as the sum of squared errors between each sample and the center of the cluster to which it belongs. If the difference between J in two successive iterations is smaller than a certain threshold, the iteration ends and the final clustering result is obtained, namely:

$$\left| J^{(t)} - J^{(t-1)} \right| < \varepsilon$$

wherein t is the iteration number and $\varepsilon$ is a set threshold. The iteration and termination modes are as follows:

Begin

Repeat

Compute and update the cluster centers

Calculate the distance from each clustering object to each cluster center

Until the cluster centers no longer change or the maximum number of iterations is reached

End
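By way of non-limiting illustration, the iteration and termination described above may be sketched in plain Python as follows. The sketch assumes a Euclidean distance and takes the first K objects as the initial cluster centers, which the method does not specify; the function and parameter names are hypothetical.

```python
import math

def kmeans(data, k, max_iter=100, eps=1e-6):
    """Cluster `data` (list of tuples) into k classes.

    Iteration stops when the sum of squared errors J changes by less than
    eps between two iterations, or when max_iter (>= 1) is reached.
    Initial centres are the first k objects (an assumption).
    """
    centers = [tuple(p) for p in data[:k]]
    labels, previous_j = [0] * len(data), None
    for _ in range(max_iter):
        # Assign each object to its nearest cluster centre.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in data]
        # Update each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
        # Loss J: sum of squared distances to the assigned centre.
        j = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(data, labels))
        if previous_j is not None and abs(previous_j - j) < eps:
            break
        previous_j = j
    return labels, centers, j

# Hypothetical two-dimensional data with two well-separated groups.
data = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers, loss = kmeans(data, k=2)
```

The returned labels correspond to the class labels that are saved for each object, and the returned loss is the value that would be recorded for this K when comparing cluster numbers.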

And establishing the SVDD super-sphere model by using model parameters contained in each class after K-means clustering.

In some embodiments, in step 1.2.2) above, the basic idea of SVDD is to construct a closed boundary in the form of a hypersphere that contains as many of the positive samples in the training set as possible, so as to achieve maximum separation of positive and negative samples. The core problem of SVDD is to seek an optimal boundary that achieves the best detection effect.

SVDD maps the data from the original space to a feature space through a nonlinear transformation function $\phi: x \rightarrow F$, and finds the hypersphere of minimum volume in the feature space.

Common kernel functions include: linear kernel functions, polynomial kernel functions, gaussian kernel functions, and sigmoid kernel functions.

As shown in FIG. 5, the SVDD is optimized by adopting a Gaussian kernel function method, training data is mapped to a higher-dimensional space to calculate a hypersphere, and the Gaussian kernel function has the following formula:

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{s^2}\right)$$

in some embodiments, the Lagrangian coefficient is satisfied by 0 < α in all normal training samples _i Samples < C are called support vectors, alpha _i Is x _i The corresponding Lagrangian coefficient, C, is a constant. The sphere center and the radius R of the super sphere are calculated through calculation, and the sphere center distance d from the test sample to the super sphere where the test sample is located is calculated.

If d ≤ kR (k ≥ 1), the test sample lies on or inside the hypersphere and is a normal sample; otherwise, it is an abnormal sample.
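As a minimal sketch of this decision rule, the following Python code evaluates the kernel-space distance from a test sample to a hypersphere center and applies d ≤ kR. It assumes the positive-only form of SVDD, in which the center is the α-weighted sum of the mapped training samples, and it assumes the coefficients `alphas` have already been obtained from training. All function names are hypothetical.

```python
import math

def gaussian_kernel(x_i, x_j, s):
    """Gaussian kernel exp(-||x_i - x_j||^2 / s^2) with width s."""
    sq = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq / s ** 2)

def center_distance(z, samples, alphas, s):
    """Kernel-space distance from z to the center sum_i alpha_i * phi(x_i).

    Assumes positive-only SVDD; alphas come from a trained model.
    """
    cross = sum(a * gaussian_kernel(x, z, s) for a, x in zip(alphas, samples))
    const = sum(a_i * a_j * gaussian_kernel(x_i, x_j, s)
                for a_i, x_i in zip(alphas, samples)
                for a_j, x_j in zip(alphas, samples))
    d2 = gaussian_kernel(z, z, s) - 2.0 * cross + const
    return math.sqrt(max(d2, 0.0))  # guard against tiny negative round-off

def is_normal(d, radius, k=1.0):
    """Decision rule from the text: normal if d <= k * R, with k >= 1."""
    return d <= k * radius
```

With a Gaussian kernel, K(z, z) is always 1, so the distance depends only on how strongly z correlates with the support vectors.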

In actual training, a small number of negative samples should be added to the positive-sample training set to prevent overfitting. Assume that the labels of the positive and negative samples in the training set are, respectively:

With negative samples present, the way in which the center and radius of the hypersphere and the distance from a test sample to the center are calculated changes accordingly.

The center and radius of the hypersphere are calculated as follows:

The distance from a test sample x_t to the center of the hypersphere is:
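For orientation, the standard SVDD formulation with negative examples is sketched below. This is an assumption: the expressions used in this embodiment are not reproduced in the text and may differ in notation or detail. Here y_i denotes the sample label, a the hypersphere center, and x_k any support vector with 0 < α_k < C; these symbols other than α_i, C, R, K and x_t are introduced only for this sketch.

```latex
% Labels (assumed convention): positive samples +1, negative samples -1
y_i = \begin{cases} +1, & x_i \text{ is a positive sample} \\ -1, & x_i \text{ is a negative sample} \end{cases}

% Center of the hypersphere in feature space, with \sum_i \alpha_i y_i = 1
a = \sum_i \alpha_i y_i \, \phi(x_i)

% Squared radius, evaluated at any support vector x_k
R^2 = K(x_k, x_k) - 2 \sum_i \alpha_i y_i K(x_i, x_k)
      + \sum_i \sum_j \alpha_i \alpha_j y_i y_j K(x_i, x_j)

% Squared distance from a test sample x_t to the center
d^2 = K(x_t, x_t) - 2 \sum_i \alpha_i y_i K(x_i, x_t)
      + \sum_i \sum_j \alpha_i \alpha_j y_i y_j K(x_i, x_j)
```

Setting every y_i to +1 recovers the positive-only case.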

In some embodiments, once hypersphere training is complete, characteristics such as the center and radius are obtained. A median method is then applied: all preprocessed data are substituted into the hypersphere, the distance between each data record and the center is calculated, the records are sorted by distance, and the process parameters at the timestamp of the median record are taken as the typical process point representing the characteristics of that hypersphere.
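The median method can be illustrated as follows. This is a sketch under stated assumptions: records are `(timestamp, vector)` tuples, the distance defaults to the Euclidean distance (the embodiment would use the distance to the hypersphere center in kernel space), and for an even number of records the upper of the two middle records is taken. The name `typical_point` is hypothetical.

```python
import math

def typical_point(records, center, distance=math.dist):
    """Return the record whose distance to the center is the median.

    records: list of (timestamp, parameter_vector) tuples.
    distance: any callable taking (vector, center); Euclidean by default.
    """
    ranked = sorted(records, key=lambda rec: distance(rec[1], center))
    return ranked[len(ranked) // 2]
```

Passing a kernel-space distance function as `distance` keeps the selection logic unchanged.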

Preferably, step 1.2.3) includes the following steps:

1.2.3.1) Draw a visual distribution graph of the hyperspheres for each segment of data;

1.2.3.2) Determine from the visualization whether any hypersphere should be discarded. There are two possible reasons for discarding: (1) the hypersphere contains too little data; (2) the visual distribution of the hypersphere is abnormal, that is, the distribution is widely dispersed or most parameters are 0;

1.2.3.3) Record the retained hyperspheres.

Preferably, as shown in FIG. 6, step 1.2.4) includes the following steps:

1.2.4.1) Calculate the median of the training data inside the hypersphere;

1.2.4.2) Substitute the test set data into the hypersphere and calculate the distance between each test data record and the median of the hypersphere;

1.2.4.3) Identify the data record closest to the median;

1.2.4.4) Extract that record as the typical process parameter point, and output the quality indexes, the corresponding working condition data, and the corresponding ideal process bit number points.

Preferably, in step 1.4), artificial mechanism fusion and screening are performed: the extracted typical process points are matched with their timestamps, working conditions and quality data, and the process parameter points with better quality are selected and stored in the excellent parameter recommendation table. Artificial mechanism fusion screens the typical process points in two main steps:

(1) Screening for abnormal product quality indexes and bit number data: if the product quality fluctuates greatly near the timestamp of a process point, the typical process point is discarded; if the bit number data of the process point are abnormal, that is, 0 or negative, the typical process point is also discarded;

(2) Comparing the similarity of typical process points: if the bit number data of two typical process points are highly similar, their quality indexes are examined further. If the quality indexes differ, the bit number data near the timestamps of the two process points are checked for large fluctuations. If only one of the process points shows large fluctuations, that process point is discarded; if both or neither show large fluctuations, the process point is discarded.

Finally, typical process points with poor product quality are removed from the retained data to form the final excellent parameter recommendation table.
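Screening rule (1) and the final quality filter can be sketched as a simple predicate. The embodiment performs this screening by manual inspection, so the numeric criteria here are assumptions: fluctuation is measured by the population standard deviation, and `max_quality_std` and `max_chroma` are hypothetical thresholds that would have to be chosen for the plant in question. The similarity comparison of rule (2) is not covered.

```python
from statistics import pstdev

def keep_point(values, quality_window, max_quality_std, max_chroma):
    """Decide whether a typical process point survives screening.

    values: bit number data of the process point.
    quality_window: chromaticity measurements near the point's timestamp.
    max_quality_std, max_chroma: hypothetical thresholds (assumptions).
    """
    # Rule (1): bit number data equal to 0 or negative are abnormal.
    if any(v <= 0 for v in values):
        return False
    # Rule (1): large quality fluctuation near the timestamp.
    if pstdev(quality_window) > max_quality_std:
        return False
    # Final step: drop points whose product quality is poor
    # (lower chromaticity is treated as better).
    mean_chroma = sum(quality_window) / len(quality_window)
    return mean_chroma <= max_chroma
```

A point that passes all three checks is a candidate for the excellent parameter recommendation table.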

Preferably, as shown in FIG. 7, step 2) includes the following steps:

2.1) Acquire the current DCS process bit number data through an interface and identify the current working condition;

2.2) Obtain the selectable excellent process parameter points for the current operating condition from the excellent parameter recommendation table;

2.3) Calculate the distance between the current operating parameters and every selectable excellent process parameter point under that working condition, extract the excellent process parameter point with the minimum distance, and output it as a hot standby.
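Steps 2.2) and 2.3) amount to a nearest-neighbor lookup within one working condition. The sketch below assumes the recommendation table is a dictionary mapping a working-condition key to a list of parameter vectors, and it assumes Euclidean distance without scaling; neither detail is specified in the text, and the name `recommend` is hypothetical.

```python
import math

def recommend(current, table, condition):
    """Return the excellent parameter point closest to the current operating point.

    current: current bit number data as a vector.
    table: dict mapping working condition -> list of excellent parameter vectors.
    condition: identified current working condition.
    """
    candidates = table.get(condition, [])
    if not candidates:
        return None  # no recommendation stored for this working condition
    return min(candidates, key=lambda point: math.dist(current, point))
```

In practice the variables would likely need normalization before distances are compared, since flow, level and temperature have different units.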

Preferably, as shown in FIG. 8, step 3) includes the following steps:

3.1) Acquire the DCS bit number data set for a preset time period (for example, the last month) and preprocess it to obtain the latest data set;

3.4) Record the latest data set and form a new input data matrix, perform hypersphere cluster training by the same method as in step 1), and store the extracted process parameter points with better quality in the excellent parameter recommendation table.

The invention improves product quality by optimizing key process parameters and provides a basis for parameter selection under different production plans and working conditions. It makes reasonable use of data that are large in volume and difficult to analyze, and it ensures that the hypersphere cluster is effectively expanded and updated so that it remains complete and comprehensive, covering the various working conditions and optimization points and thereby ensuring the long-term effectiveness of the method. The invention has the advantages of strong practicability and wide applicability, and has good application prospects and commercial value.

Example 2

The reaction kettle is one of the most complex production devices in a chemical process, so this embodiment takes a reaction kettle as an example to illustrate the implementation steps of the method. In the operation of a reaction kettle, product quality is jointly affected by multiple materials in multiple loops. In actual production, adjustments to the production plan and changes in working conditions make it impractical to set the parameter values of the many process variables by manual experience for every production plan and working condition, so product quality is difficult to guarantee. At the same time, the reaction clear liquid is periodically sampled and assayed on site, and data are collected for the key process variables of the key loops. This provides the data needed for big data process optimization and allows a refining enterprise to better guarantee product quality in a data-driven way.

For a certain ammoximation reaction kettle in a refinery, the reaction liquid discharged from the kettle is assayed. Chromaticity, selectivity, conversion rate and similar indexes serve as the basis for judging product quality, and the concentrations of hydrogen peroxide and catalyst serve as the basis for judging process load and working condition. The main key process variables of the reaction kettle comprise 12 measured output values from flow, liquid level, temperature and other loops. For offline training, data of 3-5 months were collected for these 12 key variables at a sampling interval of 60 s.

After data sampling is complete, the data must first be preprocessed. The effective sample length for the 12 key variables is 123769. The values of the 12 key variables at the same time instant are combined into one data vector, and these vectors form the initial data matrix for big data process optimization.

The offline training part of big data process optimization consists of two main steps: algorithm training, followed by mechanism fusion and screening.

First, the data are divided into reasonable segments. Once the number of segments is determined, K-means clustering is performed as the first stage of algorithm training. The K-means algorithm requires an optimal number of clusters to achieve the best clustering effect and to ensure that the algorithm is effective and accurate. If the number of data samples is small, the number of clusters can be chosen from manual experience; if the number of data samples is very large, the optimal number of clusters is chosen by the inflection point method. The sum of squared errors over all classes is calculated for different numbers of clusters; as the number of clusters increases, each class contains fewer points and the sum of squared errors decreases. The slope of the curve is observed, and the number of clusters at which an obvious inflection point appears is taken as the optimal value. Taking the 12-dimensional data matrix formed above as an example, the data are divided into 5 segments; for the second segment, the curve used to select the optimal number of clusters is shown in FIG. 9, where the ordinate is the within-cluster sum of squared errors and the abscissa is the number of clusters.
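In the embodiment the inflection point is read from the curve by eye and K is entered manually. As a rough automated stand-in, and purely as an assumption, the sharpest bend can be located with the largest second difference of the sum-of-squared-errors curve. The function name `elbow_k` is hypothetical.

```python
def elbow_k(ks, sse):
    """Pick the cluster number at which the SSE curve bends most sharply.

    ks: candidate cluster numbers in increasing order.
    sse: within-cluster sum of squared errors for each entry of ks.
    Heuristic (assumption): largest drop-before minus drop-after.
    """
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(ks) - 1):
        drop_before = sse[i - 1] - sse[i]
        drop_after = sse[i] - sse[i + 1]
        bend = drop_before - drop_after
        if bend > best_bend:
            best_k, best_bend = ks[i], bend
    return best_k
```

This heuristic can disagree with visual judgment on noisy curves, so the selected value is best checked against the plot.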

The curve shows an obvious inflection point as the number of clusters goes from 4 to 5, so the optimal number of clusters for this control loop is 5. The optimal number of clusters K = 5 is entered manually, which completes the first stage of algorithm training. The second stage, training of the SVDD hypersphere models, then begins.

The classes of model parameters produced by the K-means clustering algorithm are labeled and sorted in descending order of the number of model parameters each contains, and the SVDD hypersphere models are then established. All data in the above data matrix are grouped into 5 classes, so 5 SVDD hypersphere models need to be built. First, the SVDD hypersphere model parameters are determined: a Gaussian kernel function is selected as the SVDD kernel function, the width of the kernel function is 12, and the hypersphere threshold is 1.5. Second, during SVDD hypersphere modeling, a small number of negative samples must be added to the positive samples to prevent overfitting. Therefore, for the K-th hypersphere, two samples are taken from each of the other hyperspheres as negative samples, so that the K-th hypersphere is trained on a combined training set of positive and negative samples.
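The construction of the combined training set can be sketched as follows. The sketch assumes the clusters are held in a dictionary keyed by class label, that negative samples are drawn at random (the text does not say how the two samples are chosen), and that positive and negative samples are labeled +1 and -1. The name `build_training_set` is hypothetical.

```python
import random

def build_training_set(clusters, target, n_neg=2, seed=0):
    """Combine one cluster's samples with a few negatives from every other cluster.

    clusters: dict mapping class label -> list of sample vectors.
    target: label of the cluster whose hypersphere is being trained.
    n_neg: negatives taken from each other cluster (two in the embodiment).
    Returns a list of (vector, label) pairs with labels +1 / -1 (assumed).
    """
    rng = random.Random(seed)
    samples = [(x, +1) for x in clusters[target]]
    for label, members in clusters.items():
        if label == target:
            continue
        picked = rng.sample(members, min(n_neg, len(members)))
        samples.extend((x, -1) for x in picked)
    return samples
```

With 5 classes this yields 8 negative samples per hypersphere, a small fraction of the positive set.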

After hypersphere training is complete, visual analysis is required. The visual distribution of the hyperspheres for segment 5_2 is shown in FIG. 10; the hyperspheres in this segment are compact and reasonably distributed, so they can be retained.

In addition, visualizations drawn from some of the training results illustrate cases in which hyperspheres must be discarded. As shown in FIG. 11, hypersphere No. 3 of segment 3_2 exhibits two distributions in training and is widely dispersed, so it is discarded; hyperspheres No. 5 and No. 6 clearly contain few data samples, so they are also discarded, and the other hyperspheres can be retained. As shown in FIG. 12, the visualization of hypersphere No. 2 of segment 3_3 is clearly abnormal: it contains many records in which the bit number data are 0 and is widely dispersed, so this hypersphere is discarded and the other hyperspheres can be retained.

Taking the data of segment 5_2 as an example, the test set is substituted into each retained SVDD hypersphere and analyzed according to the distance between the test data and the center of the hypersphere. The median of the test data vectors contained in each hypersphere is computed, and the test data vector closest to that median is found by traversal. The time, position and specific values of this most representative data vector can then be extracted; the results are shown in Table 1.

Table 1. Example of algorithm training results for segment 5_2

This completes algorithm training; the mechanism fusion and screening part follows.

It should be noted that in an actual refining device, the effect of adjusting the parameters of the key process variables is not reflected in the output quality indexes immediately but only after a certain time delay. The delay differs from one reaction kettle to another and for most reaction kettles ranges from about 4 hours to 48 hours. Therefore, during artificial mechanism fusion and screening, the assay results for the quality indexes and working condition indexes are averaged over the delay time following the timestamp of each process point extracted during algorithm training, and this average is checked against the conditions for an excellent parameter point to decide whether the point is stored in the recommendation table. In this example the actual delay may reach 24-48 hours, so the average of the quality index and working condition index assay results within 48 hours after the timestamp is used to determine whether the point is valid.
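The delayed averaging can be sketched with the standard `datetime` module. The representation of assay results as `(datetime, value)` pairs, the half-open window boundaries, and the name `delayed_mean` are assumptions made for this sketch.

```python
from datetime import datetime, timedelta

def delayed_mean(lab_results, point_time, delay_hours=48):
    """Average the assay values in the delay window after a process point.

    lab_results: list of (datetime, value) assay results.
    point_time: timestamp of the extracted typical process point.
    delay_hours: process delay; 48 h is the value used in this example.
    The window (point_time, point_time + delay] is an assumed convention.
    """
    end = point_time + timedelta(hours=delay_hours)
    window = [value for t, value in lab_results if point_time < t <= end]
    if not window:
        return None  # no assay available inside the delay window
    return sum(window) / len(window)
```

The same function can be applied separately to each quality index and each working condition index.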

Specifically, for the typical data sampling points obtained during algorithm training from the retained hyperspheres of segment 5_2, the product quality indexes (chromaticity, selectivity and conversion rate) and the working condition indexes (hydrogen peroxide and catalyst concentrations) are extracted separately, as shown in Table 2.

Table 2. Product quality indexes and working conditions corresponding to the algorithm training results for segment 5_2

Mechanism fusion and screening can now be carried out. First, the quality indexes near all timestamps are examined. The quality indexes near the typical process point extracted from hypersphere No. 4 fluctuate greatly, so the quality of the extracted parameters is difficult to guarantee and the point is discarded directly. Next, the remaining 4 typical process points are checked for bit number data equal to zero; this does not occur in this segment of data, so these 4 parameter points are retained for now.

Next, similarity is compared: the bit number data of hyperspheres No. 2 and No. 3 are similar and their quality indexes are also similar, so both are retained. Finally, the quality indexes are examined and parameter points with poor quality are removed. In this example, the quality index of the typical process point extracted from hypersphere No. 5 is 181.18, which is poor (lower chromaticity is better, and selectivity and conversion rate above 99.5 indicate good product quality), so this typical process point is discarded. The retained typical process points No. 1, 2 and 3 are stored in the recommendation table as excellent parameter points. The table shows that the quality index of parameter point No. 1 is about 134.86 and that the quality indexes of points No. 2 and No. 3 are also at a good level. The quality indexes and bit number data fluctuate little near the timestamps, which supports the reliability of the extracted excellent process parameters and the rationality and effectiveness of the method.

The hyperspheres, the data inside them and the recommendation data for the different working conditions are stored in a database for use by the online updating and optimization part. The database is connected to the enterprise real-time database; the current working condition is identified and the process bit number data are extracted; the excellent process point parameters stored for the current working condition are obtained from the recommendation table; the distance between the current bit number data and each excellent process parameter point is calculated; and the excellent parameter point with the minimum distance is selected and output as a hot standby.

In addition, during long-term use, the hypersphere data characteristics and the recommendation table are updated once a month through an online learning function. The DCS bit number data set for one month is selected and preprocessed, each record is checked in turn to determine which hypersphere it falls into, and the records that fall into a hypersphere are stored there as updates. The existing hyperspheres are retrained to update their characteristics and extract the latest typical process points, and the optimal process parameter points are then extracted by artificial mechanism fusion and screening to update and expand the recommendation table. If most of the data set from the past month does not fall into any existing hypersphere, the data are recorded, a new data matrix is formed, and training is performed in the same way as offline training: the data matrix is clustered again to obtain the number of clusters, SVDD characteristics are extracted for each class, the data from the past period that did not fall into any existing hypersphere are substituted in, representative process parameter points are extracted by the median method, and artificial mechanism fusion and screening are carried out by matching timestamps, corresponding working conditions and quality data, after which the process parameter points with better quality are selected and stored in the recommendation table.
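The choice between retraining the existing hyperspheres and training new ones can be sketched as a coverage check. This is a simplification under stated assumptions: each hypersphere is represented by a `(center, radius)` pair, membership is tested with the Euclidean distance rather than the kernel-space distance, and "most" is read as a coverage below a hypothetical `min_coverage` of 0.5.

```python
import math

def coverage(new_data, spheres, k=1.0):
    """Fraction of new records that fall inside at least one hypersphere.

    new_data: list of preprocessed bit number data vectors.
    spheres: list of (center, radius) pairs.
    k: tolerance factor in the rule d <= k * R.
    """
    if not new_data:
        return 0.0
    inside = sum(
        any(math.dist(x, center) <= k * radius for center, radius in spheres)
        for x in new_data
    )
    return inside / len(new_data)

def update_mode(new_data, spheres, k=1.0, min_coverage=0.5):
    """Return 'retrain_existing' or 'train_new' (min_coverage is an assumption)."""
    if coverage(new_data, spheres, k) >= min_coverage:
        return "retrain_existing"
    return "train_new"
```

Only the records that fall outside every hypersphere would be passed to the new round of clustering and SVDD training.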

The refining device process optimization method and system based on big data have particular advantages in handling multivariable, high-volume data. Although the causes affecting product quality in a chemical process are numerous, cannot be analyzed directly by hand and are usually adjusted only by experience, the K-means clustering method yields a good classification scheme through unsupervised iteration. In addition, based on the classification result, the SVDD (support vector data description) hypersphere training method encloses as many of the positive training samples as possible, separates positive and negative samples to the greatest extent, and seeks the optimal boundary so that the data are divided as accurately as possible. Notably, in actual training a small number of negative samples are added to the positive-sample training set to prevent overfitting, which further supports the accuracy of the data analysis. Visual analysis is also added, so that the distribution of the SVDD hyperspheres is displayed more clearly and a reliable basis is provided for further data mining and hypersphere screening.

After algorithm training is complete, artificial mechanism fusion and screening are carried out to ensure the quality of the selected typical process parameter points. The quality indexes and bit number data of the typical process points are screened again from three aspects, namely data abnormality, data fluctuation and similarity comparison, to obtain the final optimal parameter points. This further improves the effectiveness and rationality of the overall technical scheme and lays a solid foundation for guiding actual production.

The online updating and optimization part also provides an optimization scheme in real time and periodically updates and expands the data, so that the hypersphere cluster remains complete and comprehensive, covers the various working conditions and optimization points, and keeps the method effective in the long term.

In summary, most industrial data are currently accumulated without being used, with low utilization and considerable difficulty in exploiting them. The big data process optimization algorithm puts such data to effective use, helping the process deliver higher-quality products and improving economic benefit.

Compared with the prior art, the invention has the following advantages:

Traditional process optimization finds the optimal operating conditions by building a mechanism model and deriving a mathematical model of the process. Although highly accurate, this approach is very difficult to implement: real process mechanisms are complex and involve many variable factors, an accurate mathematical model often cannot be given, and the problems that can be solved are relatively narrow. In addition, most common data-driven optimization methods are one-off calculations, which can hardly support long-term autonomous optimization by an enterprise when production scheduling plans change frequently. In contrast, the advantages of the present invention are summarized as follows:

(1) The invention does not need an accurate mathematical model and does not need to consider the problem of model mismatch;

(2) The online updating and optimization part ensures that the various optimization points under the various working conditions are covered, that the hypersphere cluster is effectively updated and expanded and remains complete and comprehensive, and that long-term autonomous optimization by the enterprise stays effective;

(3) The invention has strong adaptability in various environments and processes, and has the capability of continuous improvement and deep training optimization;

(4) According to the invention, massive historical time sequence data are effectively utilized, deep analysis and mining are carried out on the historical time sequence data, valuable information can be rapidly obtained, and a technological parameter optimization scheme for popularization is formed;

(5) The method combines a visual analysis method, more intuitively displays the data characteristics and provides a reliable basis for further data mining and data screening.

Example 3

Corresponding to Embodiment 1 above, which provides a refining device process optimization method based on big data, this embodiment provides a refining device process optimization system based on big data. The system provided in this embodiment can implement the method of Embodiment 1, and may be implemented in software, hardware, or a combination of the two. For example, the system may include integrated or separate functional modules or functional units that perform the corresponding steps of the method of Embodiment 1. Since the system of this embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, refer to the description of Embodiment 1. The system described here is illustrative only.

The refining device process optimization system based on big data provided in this embodiment includes:

In some embodiments, the system further includes an online updating module, configured to acquire a working condition and process bit number data set in a preset time period, expand and update the super-sphere cluster based on the acquired data set, and obtain an updated excellent parameter recommendation table for different working conditions.

Example 4

This embodiment provides a processing device corresponding to the refining device process optimization method based on big data provided in Embodiment 1. The processing device may be a client-side device, for example a mobile phone, notebook computer, tablet computer or desktop computer, that executes the method of Embodiment 1.

The processing device comprises a processor, a memory, a communication interface and a bus; the processor, the memory and the communication interface are connected through the bus and communicate with one another over it. The memory stores a computer program that can run on the processor, and when the processor runs the computer program, the refining device process optimization method based on big data provided in Embodiment 1 is executed.

In some embodiments, the memory may be a high-speed random access memory (RAM), and may also include non-volatile memory, such as at least one disk memory.

In other embodiments, the processor may be a Central Processing Unit (CPU), a Digital Signal Processor (DSP), or other general purpose processor, which is not limited herein.

Example 5

The refining device process optimization method based on big data of Embodiment 1 may be embodied as a computer program product, which may include a computer readable storage medium carrying computer readable program instructions for performing the method of Embodiment 1.

The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any combination of the preceding.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that modifications and equivalent substitutions may be made to the specific embodiments without departing from the spirit and scope of the invention, and that all such modifications and substitutions are intended to be covered by the claims.

Claims

1. A refining device process optimization method based on big data is characterized by comprising the following steps:

2. The process optimization method for the refining device based on big data as set forth in claim 1, wherein the excellent parameter recommendation table under different working conditions is obtained by training K-means and SVDD super-spheres by using historical data of the refining device in an enterprise, and comprises the following steps:

1.1 Collecting historical data of an enterprise refining device, preprocessing the historical data, segmenting the data, and forming an input data matrix;

3. The process optimization method for the refining device based on big data as set forth in claim 2, wherein in the step 1.1), the steps of collecting and preprocessing the history data of the enterprise refining device, and forming the input data matrix after segmenting the data include:

determining an initial segmentation number and segmenting the data;

4. The process optimization method of the refining device based on big data as set forth in claim 2, wherein in the step 1.2), for each segment of input data matrix of data, after classifying by adopting an improved K-means clustering method, performing hypersphere training by using an SVDD method, and discarding abnormal hyperspheres by combining with visual analysis to obtain typical process parameter points with different hyperspheres and time stamps, the method comprises the following steps:

5. The process optimization method of the refining device based on big data as set forth in claim 2, wherein in the step 1.4), based on the obtained typical process parameter points, the artificial mechanism fusion and screening are performed, that is, by matching the extracted typical process points with time stamps, working conditions and quality data, the process parameter points with quality meeting the preset requirements are selected and stored in an excellent parameter recommendation table, and the method comprises the following steps:

6. The process optimization method for the refining device based on big data as set forth in claim 1, wherein selecting suitable excellent parameter points for output from the pre-established excellent parameter recommendation table of different working conditions, based on the current working condition and the process position data, and optimizing the process parameters of the refining device under the current working condition, includes:

7. The process optimization method for the refining device based on big data as set forth in claim 2, wherein the method further comprises the steps of: acquiring working condition and process position number data sets in a preset time period, expanding and updating the hypersphere cluster based on the acquired data sets, and obtaining updated excellent parameter recommendation tables of different working conditions; comprising the following steps:

8. A process optimization system for a refining device based on big data, comprising:

The data acquisition module is used for acquiring current working condition and process position data of the refining device from the upper cloud platform or the DCS;

9. A computer readable storage medium storing one or more programs, wherein the one or more programs comprise instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-7.

10. A computing device, comprising: one or more processors, memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-7.