CN114661802B

CN114661802B - Efficient collection and analysis system and method for factory equipment data

Info

Publication number: CN114661802B
Application number: CN202210087098.3A
Authority: CN
Inventors: 张华成; 刘建明
Original assignee: Guilin University of Electronic Technology
Current assignee: Guilin University of Electronic Technology
Priority date: 2022-01-25
Filing date: 2022-01-25
Publication date: 2024-04-05
Anticipated expiration: 2042-01-25
Also published as: CN114661802A

Abstract

The invention relates to a system and method for efficiently collecting and analyzing factory equipment data, which address the technical problem of low transmission efficiency. The system comprises a sensor that collects raw factory equipment data and an intermediate node connected to the sensor. The intermediate node is loaded with a data preprocessing program that: extracts features of the data to be preprocessed and searches the intermediate node historical transmission database for similar data, which serves as reference data; if no similar data can be matched, defines the data to be preprocessed as the data to be transmitted; calculates a transformation compensation sequence with the reference data as the source and the data to be preprocessed as the target; defines the data to be transmitted using the similar data label and the transformation compensation sequence; compares the size of the data to be preprocessed with the size of the data to be transmitted, checking whether the former is smaller; and compares the size of the data to be transmitted with a transmission threshold. This technical scheme solves the problem well and can be used for data acquisition from factory equipment.

Description

Efficient collection and analysis system and method for factory equipment data

Technical Field

The invention relates to the field of electronic intelligent manufacturing, in particular to a system and a method for efficiently collecting and analyzing factory equipment data.

Background

Real-time collection and transmission of equipment state data is a key link in equipment operation and fault diagnosis, and the extraction and fusion of data from the various heterogeneous data sources within an enterprise is the basis for analyzing and calculating enterprise business intelligence (BI) indicators. However, good methods are still lacking for collecting and extracting data efficiently and for representing and aggregating heterogeneous, structured, and unstructured data.

The invention provides a high-efficiency collection and analysis system and method for factory equipment data, which can solve the technical problem of low transmission efficiency.

Disclosure of Invention

The invention aims to solve the technical problems of low transmission efficiency and high overhead in the prior art. It provides a new system for efficient collection and analysis of factory equipment data, characterized by high transmission efficiency.

In order to solve the technical problems, the technical scheme adopted is as follows:

A system for efficient collection and analysis of factory equipment data, comprising: a sensor for collecting raw factory equipment data, and an intermediate node connected to the sensor; the intermediate node is loaded with a data preprocessing program, and the data preprocessing program comprises:

step 1, an intermediate node receives data to be preprocessed;

step 2, extracting characteristics of the data to be preprocessed, searching and matching similar data in an intermediate node historical transmission database, and recording similar data labels, wherein the similar data is used as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the step 5;

step 3, calculating a transformation compensation sequence by taking reference data as a source and data to be preprocessed as a target;

step 4, defining data to be transmitted by using similar data labels and transformation compensation sequences; comparing the size of the data to be preprocessed with the size of the data to be transmitted, executing the step 5 if the data to be preprocessed is smaller than the data to be transmitted, otherwise, updating the data to be preprocessed into the data to be transmitted;

step 5, comparing the size of the data to be transmitted with the transmission threshold; if the data to be transmitted is smaller than the transmission threshold, returning to step 1, otherwise executing step 6;

step 6, the intermediate node performs compression coding on the data to be transmitted and outputs it.
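The control flow of steps 1 to 6 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the segment framing (`frame`, the `RAW`/`DELTA` flags), the byte-level sizes, the 4-byte label, and the use of `zlib` for the compression coding are all assumptions, and the two callables `find_similar` and `encode_delta` are hypothetical placeholders for steps 2 and 3. Step 4 is read here in the sense of the working principle, namely that the smaller of the two representations is the one transmitted.

```python
import struct
import zlib
from typing import Callable, List, Optional, Tuple

RAW, DELTA = 0, 1  # hypothetical flags so a receiver can tell segment kinds apart


def frame(kind: int, payload: bytes) -> bytes:
    """Prefix a segment with its kind and length (an assumed wire format)."""
    return struct.pack(">BI", kind, len(payload)) + payload


class IntermediateNode:
    def __init__(
        self,
        threshold: int,
        find_similar: Callable[[bytes], Optional[Tuple[int, bytes]]],
        encode_delta: Callable[[bytes, bytes], bytes],
    ) -> None:
        self.threshold = threshold        # transmission threshold, in bytes here
        self.find_similar = find_similar  # step 2: returns (label, reference) or None
        self.encode_delta = encode_delta  # step 3: reference + target -> sequence
        self.pending: List[bytes] = []    # segments waiting to reach the threshold

    def receive(self, raw: bytes) -> Optional[bytes]:
        """Step 1: accept new data; return a compressed packet once enough is queued."""
        match = self.find_similar(raw)
        if match is None:
            segment = frame(RAW, raw)     # no similar data: send the data itself
        else:
            label, reference = match
            candidate = struct.pack(">I", label) + self.encode_delta(reference, raw)
            # Step 4: keep whichever representation is smaller.
            if len(candidate) < len(raw):
                segment = frame(DELTA, candidate)
            else:
                segment = frame(RAW, raw)
        self.pending.append(segment)
        queued = b"".join(self.pending)
        if len(queued) < self.threshold:  # step 5: below threshold, wait for more
            return None
        self.pending.clear()
        return zlib.compress(queued)      # step 6: compression coding and output
```

Calling `receive` repeatedly returns `None` until the queued segments reach the threshold, at which point a single compressed packet is produced and the queue is emptied.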

The working principle of the invention is as follows: the sensors in a given area are treated as a domain, and an intermediate node is arranged in the domain to process the data and carry out subsequent transmission. On this basis, newly collected data is compared for similarity with the historical data of the intermediate node; once similar data is identified, the similarity parameters relating the newly collected data to the historical data (namely, the transformation compensation sequence) are transmitted directly as the parameters to be transmitted. However, to avoid wasted overhead that would reduce transmission efficiency, the sizes of the data segment before and after processing must be compared, and the new data segment is used for transmission only if it is smaller. After reception, the receiving end correspondingly judges whether a similarity parameter (namely, the transformation compensation sequence) is present and, if so, recovers the data by the inverse operation. In this scheme, data already transmitted to the receiving end is stored there; if similar data exists, the new data need not be retransmitted and only the transformation relation between the two is transmitted, which can greatly save transmission bandwidth and improve transmission efficiency.

In the above solution, for optimization, step 2 further includes:

step 2.1, defining a feature extraction frame, wherein the width of the feature extraction frame is w;

step 2.2, traversing the data to be preprocessed by using a feature extraction frame to finish feature extraction;

step 2.3, matching the extracted features against the features of the historical data in the intermediate node historical transmission database, and defining historical data whose matching rate is greater than a preset threshold as similar data.

The width w of the feature extraction frame is preset according to the specific situation, balancing precision against efficiency.
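Steps 2.1 to 2.3 can be illustrated with a small sketch. The text does not say which feature is computed inside each frame or how the matching rate is defined, so the choices below are assumptions made only for illustration: the feature is the mean of each frame, and the matching rate is the fraction of frame features that agree within a tolerance `tol`. The names `extract_features`, `matching_rate`, and `find_similar` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def extract_features(data: Sequence[float], w: int, step: Optional[int] = None) -> List[float]:
    """Steps 2.1-2.2: slide a frame of width w over the data; feature = frame mean (assumed)."""
    if w <= 0:
        raise ValueError("frame width w must be positive")
    step = w if step is None else step
    return [sum(data[i:i + w]) / w for i in range(0, len(data) - w + 1, step)]


def matching_rate(a: Sequence[float], b: Sequence[float], tol: float) -> float:
    """Fraction of frame features that agree within tol (an assumed definition)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    hits = sum(1 for x, y in zip(a, b) if abs(x - y) <= tol)
    return hits / longest


def find_similar(
    features: Sequence[float],
    history: Dict[str, Sequence[float]],
    tol: float,
    min_rate: float,
) -> Optional[str]:
    """Step 2.3: return the label of the best match whose rate exceeds min_rate."""
    best_label, best_rate = None, min_rate
    for label, stored in history.items():
        rate = matching_rate(features, stored, tol)
        if rate > best_rate:
            best_label, best_rate = label, rate
    return best_label
```

A larger w yields fewer features and a faster search at the cost of resolution, which is the precision versus efficiency trade-off mentioned above.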

Further, where the data to be preprocessed in step 2 is analog data, step 2 includes:

step (1), acquiring a time domain curve of data to be preprocessed, defining a self-adaptive time window, wherein one side of the time window is a left endpoint or a right endpoint, the time window comprises a maximum value and a minimum value, a vertical line defining the left endpoint is a first central line, and a mean value of the vertical lines defining the left endpoint is a second central line; respectively taking the first central line and the second central line as symmetry planes, and symmetrically calculating real-time characteristic time sequence signals in a time window to obtain a new real-time characteristic curve;

step (2), detecting N maxima or minima from the new characteristic curve, denoted as $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;

step (3), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;

step (4), defining the width of the feature extraction frame as w and the moving speed of the feature extraction frame as v;

$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;

where p is a preset ratio of the width of the feature extraction frame to the width of the feature curve, and $1 \le i \le N$;

step (5), determining a peak threshold range $(V_1, V_2)$; determining a time interval threshold range $(T_1, T_2)$ from a longitudinal scan;

step (6), defining the area formed by the peak threshold range $(V_1, V_2)$ and the time interval threshold range $(T_1, T_2)$ as the trusted area of the standard feature points;

step (7), updating and defining the curve formed by the trusted areas of the standard feature points as the optimized feature curve function $T = (s_1, s_2, \ldots, s_n)$, where n is the length of the corrected characteristic curve;

step (8), for the historical data features $S = (s_1, s_2, \ldots, s_m)$ in the intermediate node historical transmission database that have passed through steps (1) to (7), where m is the feature length of the historical data, calculating the consistency contrast using the DTW algorithm; a consistency contrast lower than a preset threshold is judged inconsistent, and otherwise it is judged consistent.

In this preferred scheme, the analog signal curve is handled by identifying trusted points and reconstructing the curve after removing untrusted points, which improves precision and also reduces the difficulty of the consistency judgment.
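The DTW comparison in the last step can be sketched as below. This is the textbook dynamic time warping recurrence with an absolute-difference local cost, not necessarily the variant the inventors used. DTW itself yields a distance (smaller means more alike), whereas the text judges a contrast below the threshold to be inconsistent, so the mapping `1 / (1 + distance)` from distance to a similarity-like contrast is an assumption introduced here; `dtw_distance` and `is_consistent` are hypothetical names.

```python
import math
from typing import Sequence


def dtw_distance(t: Sequence[float], s: Sequence[float]) -> float:
    """Classic O(n*m) dynamic time warping distance between two feature curves."""
    n, m = len(t), len(s)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(t[i - 1] - s[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def is_consistent(t: Sequence[float], s: Sequence[float], threshold: float) -> bool:
    """Judge consistency; contrast = 1 / (1 + DTW distance) is an assumed mapping."""
    contrast = 1.0 / (1.0 + dtw_distance(t, s))
    return contrast >= threshold
```

Because DTW tolerates curves of different lengths n and m, the optimized feature curve T and a stored history feature S can be compared without resampling.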

Further, step 3 includes:

step 3.1, determining source data sub-elements in the source data and target data sub-elements in the target data, taking the feature extraction frame as the unit;

step 3.2, defining the transformation equation as $e = d + \eta \times s$ and the compensation equation as $I'' = \alpha I' + \beta$;

where $\eta$ is a preset weight value, $I'$ represents a source data sub-element after deformation, and $I''$ represents a source data sub-element after amplitude compensation; s is the connectivity of adjacent source data sub-elements, with $s = 0$ representing non-connectivity and $s = 1$ representing connectivity;

step 3.3, determining the distance d from the source data sub-element to the target data sub-element, and the amplitude compensation coefficients $\alpha$ and $\beta$;

step 3.4, defining the source data sub-element label, the target data sub-element label, the distance transformation parameter d, the connectivity s, and the amplitude transformation parameters $\alpha$ and $\beta$ as the transformation compensation sequence.
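One entry of the transformation compensation sequence, together with the two equations of step 3.2, might look like the following sketch. The text does not state how α and β are determined, so the ordinary least-squares fit in `fit_amplitude` is an assumption, as are the field names of `CompensationEntry` and the treatment of d as a plain number.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class CompensationEntry:
    """One element of the transformation compensation sequence (field names assumed)."""
    source_label: int
    target_label: int
    d: float      # distance from source sub-element to target sub-element
    s: int        # connectivity of adjacent source sub-elements: 0 or 1
    alpha: float  # amplitude compensation gain
    beta: float   # amplitude compensation offset


def transform_cost(d: float, s: int, eta: float) -> float:
    """Transformation equation e = d + eta * s."""
    return d + eta * s


def fit_amplitude(deformed: Sequence[float], target: Sequence[float]) -> Tuple[float, float]:
    """Least-squares alpha, beta with alpha * I' + beta close to the target (assumed rule)."""
    n = len(deformed)
    mean_src = sum(deformed) / n
    mean_tgt = sum(target) / n
    var = sum((x - mean_src) ** 2 for x in deformed)
    if var == 0:
        return 0.0, mean_tgt  # constant source: only an offset can be fitted
    cov = sum((x - mean_src) * (y - mean_tgt) for x, y in zip(deformed, target))
    alpha = cov / var
    return alpha, mean_tgt - alpha * mean_src


def compensate(deformed: Sequence[float], alpha: float, beta: float) -> List[float]:
    """Compensation equation I'' = alpha * I' + beta, applied element by element."""
    return [alpha * x + beta for x in deformed]
```

When a target sub-element is an exact scaled and offset copy of the reference, six small parameters replace the whole sub-element, which is where the bandwidth saving comes from.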

Further, the data preprocessing program further includes:

step 7, the intermediate node judges whether the intermediate node database has overflowed, and if so, evicts data from the intermediate node historical transmission database in the order in which it was stored.
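Eviction in storage order amounts to a first-in, first-out policy. A minimal sketch, assuming the overflow condition is a fixed entry count (`capacity`) and using the hypothetical class name `HistoryStore`:

```python
from collections import OrderedDict
from typing import Hashable, List, Optional


class HistoryStore:
    """Bounded history database that evicts the oldest stored entry first."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, bytes]" = OrderedDict()

    def add(self, label: Hashable, data: bytes) -> None:
        self._items[label] = data
        # Step 7: on overflow, drop entries in the order they were stored.
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def get(self, label: Hashable) -> Optional[bytes]:
        return self._items.get(label)

    def labels(self) -> List[Hashable]:
        return list(self._items)
```

Since the receiving end reconstructs data from the same references, a real deployment would need both ends to evict in step; that coordination is outside this sketch.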

The invention also provides a high-efficiency collection and analysis method for the plant equipment data, which is based on the high-efficiency collection and analysis system for the plant equipment data, and comprises the following steps:

step one, an intermediate node receives factory equipment data collected by a sensor and defines the factory equipment data as data to be preprocessed;

step two, the intermediate node extracts the characteristics of the data to be preprocessed, searches and matches similar data in an intermediate node historical transmission database, records similar data labels, and uses the similar data as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the fifth step;

step three, calculating a transformation compensation sequence by taking reference data as a source and data to be preprocessed as a target;

step four, defining data to be transmitted by using similar data labels and transformation compensation sequences; comparing the size of the data to be preprocessed with the size of the data to be transmitted, executing the fifth step if the data to be preprocessed is smaller than the data to be transmitted, otherwise, updating the data to be preprocessed into the data to be transmitted;

step five, comparing the size of the data to be transmitted with the transmission threshold; if the data to be transmitted is smaller than the transmission threshold, returning to step one, otherwise executing step six;

step six, the intermediate node compresses and codes the data to be transmitted and outputs the data to the data receiving end;

step seven, after receiving the data, the data receiving end decodes the data and then segments the data, judging whether the data segment has a transformation compensation sequence, if so, executing the step eight, otherwise, executing the step nine;

step eight, determining reference data according to the transformation compensation sequence parameters, and solving data to be preprocessed;

step nine, carrying out subsequent processing on the data.
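The receiving side (steps seven to nine) can be sketched as follows. The wire format is an assumption made for illustration: each segment carries a one-byte kind flag and a four-byte length, delta segments begin with a four-byte reference label, and `zlib` stands in for the unspecified compression coding. `apply_delta` is a hypothetical callable that performs the inverse of the transformation compensation.

```python
import struct
import zlib
from typing import Callable, Dict, Iterator, List, Tuple

RAW, DELTA = 0, 1  # assumed segment flags


def split_segments(stream: bytes) -> Iterator[Tuple[int, bytes]]:
    """Step seven: cut the decoded stream into (kind, payload) segments."""
    offset = 0
    while offset < len(stream):
        kind, size = struct.unpack_from(">BI", stream, offset)
        offset += 5
        yield kind, stream[offset:offset + size]
        offset += size


def receive(
    packet: bytes,
    history: Dict[int, bytes],
    apply_delta: Callable[[bytes, bytes], bytes],
) -> List[bytes]:
    recovered = []
    for kind, payload in split_segments(zlib.decompress(packet)):
        if kind == DELTA:
            # Step eight: look up the reference data and invert the transformation.
            (label,) = struct.unpack_from(">I", payload)
            data = apply_delta(history[label], payload[4:])
        else:
            data = payload  # no compensation sequence: the data arrived as-is
        recovered.append(data)  # step nine: hand over for subsequent processing
    return recovered
```

The sketch only reads from `history`; storing newly received data there and assigning it labels is left out because the text does not specify how labels are allocated.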

Further, the second step includes:

step (A), defining a feature extraction frame, wherein the width of the feature extraction frame is w;

step (B), traversing the data to be preprocessed using the feature extraction frame to complete feature extraction;

step (C), matching the extracted features against the features of the historical data in the intermediate node historical transmission database, and defining historical data whose matching rate is greater than a preset threshold as similar data.

Further, where the data to be preprocessed in step two is analog data, step two includes:

step (a), acquiring a time domain curve of data to be preprocessed, defining a self-adaptive time window, wherein one side of the time window is a left endpoint or a right endpoint, the time window comprises a maximum value and a minimum value, a vertical line defining the left endpoint is a first central line, and a mean value of the vertical lines defining the left endpoint is a second central line; respectively taking the first central line and the second central line as symmetry planes, and symmetrically calculating real-time characteristic time sequence signals in a time window to obtain a new real-time characteristic curve;

step (b), detecting N maxima or minima from the new characteristic curve, denoted as $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;

step (c), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;

step (d), defining the width of the feature extraction frame as w and the moving speed of the feature extraction frame as v;

$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;

step (e), determining a peak threshold range $(V_1, V_2)$; determining a time interval threshold range $(T_1, T_2)$ from a longitudinal scan;

step (f), defining the area formed by the peak threshold range $(V_1, V_2)$ and the time interval threshold range $(T_1, T_2)$ as the trusted area of the standard feature points;

step (g), updating and defining the curve formed by the trusted areas of the standard feature points as the optimized feature curve function $T = (s_1, s_2, \ldots, s_n)$, where n is the length of the corrected characteristic curve;

step (h), for the historical data features $S = (s_1, s_2, \ldots, s_m)$ in the intermediate node historical transmission database that have passed through steps (a) to (g), where m is the feature length of the historical data, calculating the consistency contrast using the DTW algorithm; a consistency contrast lower than the preset threshold is judged inconsistent, and otherwise it is judged consistent.
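The extremum detection, interval calculation, and adaptive frame width of the list above can be illustrated with a short sketch. It handles maxima only and uses a simple neighbour comparison to find them; that detection rule and the function names are assumptions, while `frame_width` follows the formula for w directly.

```python
from typing import List, Sequence, Tuple


def local_maxima(values: Sequence[float], times: Sequence[float]) -> List[Tuple[float, float]]:
    """Detect local maxima as (v_i, t_i) pairs using a neighbour comparison (assumed rule)."""
    return [
        (values[i], times[i])
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]


def extremum_intervals(extrema: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Pair each extremum after the first with the time elapsed since the previous one."""
    return [(v1, t1 - t0) for (_, t0), (v1, t1) in zip(extrema, extrema[1:])]


def frame_width(intervals: Sequence[Tuple[float, float]], p: float) -> float:
    """w = (max(dt_i) - min(dt_i)) * p, with p the preset ratio."""
    dts = [dt for _, dt in intervals]
    return (max(dts) - min(dts)) * p
```

Note that the formula yields w = 0 for a perfectly periodic signal, where all intervals are equal; a practical implementation would presumably need a lower bound on w, which the text does not discuss.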

The beneficial effects of the invention are as follows: the sensors in a given area are treated as a domain, and an intermediate node is arranged in the domain to process the data and carry out subsequent transmission. On this basis, newly collected data is compared for similarity with the historical data of the intermediate node; once similar data is identified, the similarity parameters relating the newly collected data to the historical data (namely, the transformation compensation sequence) are transmitted directly as the parameters to be transmitted. However, to avoid wasted overhead that would reduce transmission efficiency, the sizes of the data segment before and after processing must be compared, and the new data segment is used for transmission only if it is smaller. After reception, the receiving end correspondingly judges whether a similarity parameter (namely, the transformation compensation sequence) is present and, if so, recovers the data by the inverse operation. In this scheme, data already transmitted to the receiving end is stored there; if similar data exists, the new data need not be retransmitted and only the transformation relation between the two is transmitted, which can greatly save transmission bandwidth and improve transmission efficiency.

Drawings

The invention will be further described with reference to the drawings and examples.

FIG. 1 is a schematic diagram of a system for efficient collection and analysis of plant data.

FIG. 2 is a schematic flow chart of a data preprocessing procedure.

Detailed Description

The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

Example 1

This embodiment provides a system for efficient collection and analysis of factory equipment data. As shown in FIG. 1, the system comprises a sensor for collecting raw factory equipment data and an intermediate node connected to the sensor. The intermediate node is loaded with a data preprocessing program which, as shown in FIG. 2, includes:

step 1, an intermediate node receives data to be preprocessed;

In this embodiment, the sensors in a given area are treated as a domain, and an intermediate node is arranged in the domain to process the data and carry out subsequent transmission. On this basis, newly collected data is compared for similarity with the historical data of the intermediate node; once similar data is identified, the similarity parameters relating the newly collected data to the historical data (namely, the transformation compensation sequence) are transmitted directly as the parameters to be transmitted. However, to avoid wasted overhead that would reduce transmission efficiency, the sizes of the data segment before and after processing must be compared, and the new data segment is used for transmission only if it is smaller. After reception, the receiving end correspondingly judges whether a similarity parameter (namely, the transformation compensation sequence) is present and, if so, recovers the data by the inverse operation. In this scheme, data already transmitted to the receiving end is stored there; if similar data exists, the new data need not be retransmitted and only the transformation relation between the two is transmitted, which can greatly save transmission bandwidth and improve transmission efficiency.

Preferably, step 2 comprises:

Preferably, in step 2, the data to be preprocessed is analog data, including:

step (2), detecting N maxima or minima from the new characteristic curve, denoted as $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;

$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;

Preferably, step 3 comprises:

Preferably, the data preprocessing program further includes:

The embodiment also provides a method for efficiently collecting and analyzing the plant equipment data, which is based on the system for efficiently collecting and analyzing the plant equipment data, and comprises the following steps:

step nine, carrying out subsequent processing on the data.

Preferably, the second step includes:

Preferably, the data to be preprocessed in the second step is analog data, including:

step (c), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;

$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;

step (h), for the historical data features $S = (s_1, s_2, \ldots, s_m)$ in the intermediate node historical transmission database that have passed through steps (a) to (g), where m is the feature length of the historical data, calculating the consistency contrast using the DTW algorithm; a consistency contrast lower than a preset threshold is judged inconsistent, and otherwise it is judged consistent.

This embodiment also provides collection and aggregation of multi-source operation and maintenance data sources: Web Services technology is combined with a data warehouse to integrate distributed, heterogeneous PDM, ERP, and MES multi-source operation and maintenance data resources. Each heterogeneous data source (PDM, ERP, and MES) is encapsulated through Web Services, which solves the problem of interoperation among the heterogeneous data sources. Without affecting the autonomy of each data source, the PDM, ERP, and MES source data is extracted, transformed, cleaned, and loaded (ETL) into a data warehouse. The data warehouse is thereby established and uniformly manages the data of the heterogeneous data sources, achieving data integration.

Architecture design of the collection and aggregation system for multi-source operation and maintenance data sources

This embodiment proposes integrating the data of the heterogeneous multi-source operation and maintenance data sources (PDM, ERP, and MES) on the basis of a data warehouse. To solve the communication problem among the PDM, ERP, and MES heterogeneous data sources, platform-independent and language-independent Web Services technology is adopted to realize communication among data sources dispersed across different networks. Heterogeneity of the data sources necessarily causes heterogeneity of the data, and semantic heterogeneity among the data of different sources is especially prominent, so the differences among the data are masked by means of the XML standard and schema mapping of the data sources. Taking into account the requirements and data characteristics of the collection and aggregation system, the overall system environment, and the current state of the technology, the system architecture is divided into three layers, which are, from bottom to top: a data source layer, a data integration layer, and a data warehouse layer.

(1) Data source layer. The data source layer is a key layer of the whole system. The data sources are the objects of data integration; the multi-source operation and maintenance data sources are provided by the PDM, ERP, and MES systems, and the different data sources supply data resources related to product design, process, and manufacture. Collecting and integrating data from as many aspects as possible achieves the purpose of data integration. The data provided by the data sources mainly comprises material inventory and consumption data, equipment and tooling status data, personnel data, production process data, production plan data, and the like, which can effectively serve OPM product design and manufacture.

(2) Data integration layer. The data integration layer is the most critical layer of the collection and aggregation system for multi-source operation and maintenance data sources and is the core of the whole system. It mainly comprises two modules: the Web Services data access module and the Quartz scheduling management module. The system acquires data resources by calling different Web services. Because the data content is encapsulated in an XML document, XML Schema must first be used to verify that the structure and data content of the XML document conform to the predefined definition. Data that passes verification is parsed and loaded into the data warehouse. Quartz is mainly used to register the accessed Web services, generate a task for each service, set a periodic calling time for each task, and restart the Quartz service, so that the Web services are called periodically to obtain updated data from the data sources and the data in the data sources and the data warehouse stay consistent.

(3) Data warehouse layer. The data warehouse layer holds the results of collecting and aggregating the multi-source operation and maintenance data sources. It mainly stores data that comes from the different data sources and meets the requirements of the data integration subjects, and it provides a continuous supply of data resources for the design and manufacture of OPM products.

Data warehouse module architecture design

The data warehouse integrates multiple data sources (PDM, ERP, and MES). With system scalability in mind, it adopts a flexible, extensible multi-layer architecture that can support large-scale data storage and also supports expansion of the system well. The architecture is critical to building the data warehouse module and is divided into two parts: the data warehouse logical architecture and the data warehouse physical architecture.

(1) Data warehouse logic architecture design

The data warehouse logical architecture consists of four layers: source systems, DW (Data Warehouse), DM (Data Mart), and reporting and analysis. The DW layer can be subdivided into two parts: EDW (Enterprise Data Warehouse) and ODS (Operational Data Store). Each layer is copied and transformed from the layer below it. The design of the DW layer is based on analysis of business logic, while the design of the DM layer is oriented to specific topics.

(2) Data warehouse physical architecture design

EDW, ODS, and DM in the data warehouse are designed to meet different requirements: the EDW stores all historical data, the ODS stores data over a period of time, and the DM is an aggregation of the detailed data in the underlying data warehouse. They also differ in their data processing requirements, so separate storage schemas are created for them at design time.

Web Services data access module design

The Web Services data access module is a core module of the collection and aggregation system for multi-source operation and maintenance data sources and is the key to successfully collecting and integrating data. Communication with the heterogeneous data sources is established through Web services, and data is exchanged over the SOAP protocol. After the acquired XML document data has been verified and parsed, SQL statements are generated, and the application stores the data in the data warehouse by calling the JDBC interface, completing data integration.
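The verify, parse, and load path of this module can be illustrated in miniature. The embodiment itself uses XML Schema validation and JDBC; the sketch below substitutes a simple required-field check and an in-memory SQLite database so that it stays self-contained, and the record layout (`device_id`, `timestamp`, `status`) and table name are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import List, Tuple

REQUIRED = ("device_id", "timestamp", "status")  # hypothetical record fields


def parse_records(xml_text: str) -> List[Tuple[str, ...]]:
    """Parse <record> elements, keeping only those with every required field present."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter("record"):
        values = [record.findtext(name) for name in REQUIRED]
        # Stand-in for XML Schema validation: reject incomplete records.
        if any(v is None or not v.strip() for v in values):
            continue
        rows.append(tuple(v.strip() for v in values))
    return rows


def load(conn: sqlite3.Connection, rows: List[Tuple[str, ...]]) -> int:
    """Insert validated rows using parameterized SQL; returns the number loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS equipment_status "
        "(device_id TEXT, timestamp TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO equipment_status VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Parameterized statements are used rather than string-built SQL so that values taken from the XML document cannot alter the statement itself.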

While the foregoing describes illustrative embodiments of the present invention so that those skilled in the art may understand it, the invention is not limited to the scope of these specific embodiments. To those skilled in the art, as long as the various changes fall within the spirit and scope of the invention as defined by the appended claims, all innovations making use of the inventive concept are protected.

Claims

1. A system for efficient collection and analysis of factory equipment data, characterized in that: the system comprises a sensor for collecting raw factory equipment data and an intermediate node connected to the sensor; the intermediate node is loaded with a data preprocessing program, and the data preprocessing program comprises:

step 1, an intermediate node receives data to be preprocessed;

2. The plant data efficient collection and analysis system of claim 1, wherein: the step 2 comprises the following steps:

3. The plant data efficient collection and analysis system of claim 1, wherein: in step 2, the data to be preprocessed is analog data, and step 2 includes:

$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;

step (8), for the historical data features $S = (s_1, s_2, \ldots, s_m)$ in the intermediate node historical transmission database that have passed through steps (1) to (7), where m is the feature length of the historical data, calculating the consistency contrast using the DTW algorithm; a consistency contrast lower than a preset threshold is judged inconsistent, and otherwise it is judged consistent.

4. The plant data efficient collection and analysis system according to claim 2 or 3, wherein: step 3 includes:

step 3.4, defining the source data sub-element label, the target data sub-element label, the distance transformation parameter d, the connectivity s, and the amplitude transformation parameters $\alpha$ and $\beta$ as the transformation compensation sequence.

5. The plant data efficient collection and analysis system of claim 1, wherein: the data preprocessing program further includes:

6. A method for efficiently collecting and analyzing factory equipment data is characterized in that: the high-efficiency collection and analysis method for the plant equipment data is based on the high-efficiency collection and analysis system for the plant equipment data according to any one of claims 1 to 5, and comprises the following steps:

step two, the intermediate node extracts the characteristics of the data to be preprocessed, searches and matches similar data in the intermediate node historical transmission database, records similar data labels, and uses the similar data as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the step five;

step nine, carrying out subsequent processing on the data.

7. The method for efficient collection and analysis of plant data according to claim 6, wherein:

the second step comprises:

8. The method for efficient collection and analysis of plant data according to claim 6, wherein: in step two, the data to be preprocessed is analog data, and step two includes:

$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;

step (h), for the historical data features $S = (s_1, s_2, \ldots, s_m)$ in the intermediate node historical transmission database that have passed through steps (a) to (g), where m is the feature length of the historical data, calculating the consistency contrast using the DTW algorithm; a consistency contrast lower than a preset threshold is judged inconsistent, and otherwise it is judged consistent.