CN114661802B - Efficient collection and analysis system and method for factory equipment data - Google Patents

Info

Publication number
CN114661802B
Authority
CN
China
Prior art keywords
data
preprocessed
defining
transmitted
intermediate node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210087098.3A
Other languages
Chinese (zh)
Other versions
CN114661802A (en
Inventor
张华成
刘建明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN202210087098.3A
Publication of CN114661802A
Application granted
Publication of CN114661802B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention relates to an efficient collection and analysis system and method for factory equipment data, which solve the technical problem of low transmission efficiency. The system comprises a sensor for collecting raw factory equipment data and an intermediate node connected to the sensor. The intermediate node is loaded with a data preprocessing program, which comprises: extracting features of the data to be preprocessed and searching an intermediate node historical transmission database for similar data, the similar data being used as reference data; if no similar data can be found and matched, defining the data to be preprocessed as the data to be transmitted; calculating a transformation compensation sequence with the reference data as the source and the data to be preprocessed as the target; defining the data to be transmitted using the similar data label and the transformation compensation sequence; comparing the size of the data to be preprocessed with the size of the data to be transmitted; and comparing the size of the data to be transmitted with a transmission threshold. The technical scheme solves the problem well and can be used for data acquisition of factory equipment.

Description

Efficient collection and analysis system and method for factory equipment data
Technical Field
The invention relates to the field of electronic intelligent manufacturing, in particular to a system and a method for efficiently collecting and analyzing factory equipment data.
Background
The real-time collection and transmission of equipment state data are key links in equipment operation and fault diagnosis, and the extraction and fusion of data from the various heterogeneous data sources in an enterprise are the basis for analyzing and calculating enterprise business intelligence (BI) indexes. However, a good method is still lacking for collecting and extracting data efficiently and for representing and aggregating heterogeneous, structured, and unstructured data.
The invention provides a high-efficiency collection and analysis system and method for factory equipment data, which can solve the technical problem of low transmission efficiency.
Disclosure of Invention
The invention aims to solve the technical problems of low transmission efficiency and high overhead in the prior art, and provides a new efficient collection and analysis system for plant equipment data that is characterized by high transmission efficiency.
In order to solve the technical problems, the technical scheme adopted is as follows:
an efficient collection and analysis system for plant equipment data, comprising: a sensor for collecting the raw data of the factory equipment, and an intermediate node connected to the sensor; the intermediate node is loaded with a data preprocessing program, and the data preprocessing program comprises:
step 1, an intermediate node receives data to be preprocessed;
step 2, extracting characteristics of the data to be preprocessed, searching and matching similar data in an intermediate node historical transmission database, and recording similar data labels, wherein the similar data is used as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the step 5;
step 3, calculating a transformation compensation sequence by taking reference data as a source and data to be preprocessed as a target;
step 4, defining data to be transmitted by using similar data labels and transformation compensation sequences; comparing the size of the data to be preprocessed with the size of the data to be transmitted, executing the step 5 if the data to be preprocessed is smaller than the data to be transmitted, otherwise, updating the data to be preprocessed into the data to be transmitted;
step 5, judging the size of the data to be transmitted and the transmission threshold value, returning to execute the step 1 if the data to be transmitted is smaller than the transmission threshold value, otherwise executing the step 6;
step 6, the intermediate node performs compression coding on the data to be transmitted and outputs the data.
The working principle of the invention is as follows: the sensors in a certain area are set as a domain, and an intermediate node is arranged in the domain to process data and carry out subsequent transmission. On this basis, the invention judges the similarity between newly collected data and the historical data of the intermediate node and, once similar data is found, directly transmits the similarity parameters between the two (namely the transformation compensation sequence) as the parameters to be transmitted. However, to avoid wasted overhead that would reduce transmission efficiency, the sizes of the data segments before and after processing must be compared at this point, and the new data segment is used for transmission only when it is smaller. After receiving, the receiving end correspondingly judges whether the similarity parameters (namely the transformation compensation sequence) are present and recovers the data through the inverse operation. In this scheme, data that has been transmitted to the receiving end is stored there; if similar data exists, the new data need not be retransmitted and only the transformation relation between the two is transmitted, which can greatly save transmission bandwidth and improve transmission efficiency.
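The size comparison described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `choose_payload`, the JSON encoding of the label and sequence, and the reading that the smaller of the two representations is the one transmitted are all assumptions made for the example.

```python
import json


def choose_payload(raw: bytes, reference_label, compensation_sequence):
    """Return (kind, payload), using the delta form only when it is smaller."""
    if reference_label is None:
        # No similar history entry was found: send the raw data unchanged.
        return "raw", raw
    # Candidate payload: similar-data label plus transformation compensation
    # sequence. JSON is a stand-in encoding chosen only for this sketch.
    delta = json.dumps(
        {"ref": reference_label, "seq": compensation_sequence}
    ).encode("utf-8")
    # Assumed reading of the comparison: keep whichever form is smaller.
    if len(delta) < len(raw):
        return "delta", delta
    return "raw", raw
```

Comparing encoded byte lengths, rather than element counts, keeps the decision tied to what would actually go over the link.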
As an optimization of the above solution, further, step 2 includes:
step 2.1, defining a feature extraction frame, wherein the width of the feature extraction frame is w;
step 2.2, traversing the data to be preprocessed by using a feature extraction frame to finish feature extraction;
step 2.3, matching the extracted features with the features of the historical data in the historical transmission database of the intermediate node, and defining the historical data with a matching rate larger than a preset threshold value as similar data.
The width w of the feature extraction frame is preset according to the specific situation, balancing precision against efficiency.
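Steps 2.1 to 2.3 can be illustrated with a short sketch. The patent does not say which features a frame yields or how the matching rate is computed, so the per-frame summary (mean and peak-to-peak range), the tolerance `tol`, and the names `extract_features`, `matching_rate`, and `find_similar` are hypothetical choices made only for this example.

```python
def extract_features(samples, w):
    """Slide a non-overlapping frame of width w; summarize each frame."""
    feats = []
    for start in range(0, len(samples) - w + 1, w):
        frame = samples[start:start + w]
        # Assumed features: frame mean and peak-to-peak range.
        feats.append((sum(frame) / w, max(frame) - min(frame)))
    return feats


def matching_rate(a, b, tol=0.1):
    """Fraction of frames whose features agree within tol (assumed metric)."""
    if min(len(a), len(b)) == 0:
        return 0.0
    hits = sum(
        1
        for (m1, r1), (m2, r2) in zip(a, b)
        if abs(m1 - m2) <= tol and abs(r1 - r2) <= tol
    )
    return hits / max(len(a), len(b))


def find_similar(samples, history, w, threshold=0.8):
    """Return the label of the best history entry above threshold, or None."""
    feats = extract_features(samples, w)
    best_label, best_rate = None, threshold
    for label, hist_samples in history.items():
        rate = matching_rate(feats, extract_features(hist_samples, w))
        if rate > best_rate:  # "matching rate larger than a preset threshold"
            best_label, best_rate = label, rate
    return best_label
```

A larger w gives fewer, coarser features (faster matching, lower precision), which is the trade-off the text refers to.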
Further, where the data to be preprocessed in step 2 is analog data, step 2 includes:
step (1), acquiring a time domain curve of data to be preprocessed, defining a self-adaptive time window, wherein one side of the time window is a left endpoint or a right endpoint, the time window comprises a maximum value and a minimum value, a vertical line defining the left endpoint is a first central line, and a mean value of the vertical lines defining the left endpoint is a second central line; respectively taking the first central line and the second central line as symmetry planes, and symmetrically calculating real-time characteristic time sequence signals in a time window to obtain a new real-time characteristic curve;
step (2), detecting N maxima or minima from the new characteristic curve, denoted $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;
step (3), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;
step (4), defining the width of the feature extraction frame as w and the moving speed of the feature extraction frame as v:
$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;
wherein p is a preset ratio of the width of the feature extraction frame to the width of the characteristic curve, and $1 \le i \le N$;
step (5), determining a peak threshold range $(V_1, V_2)$; determining a time interval threshold range $(T_1, T_2)$ from a longitudinal scan;
step (6), defining the region formed by the peak threshold range $(V_1, V_2)$ and the time interval threshold range $(T_1, T_2)$ as the trusted region of the standard feature points;
step (7), updating the curve formed by the trusted regions of the standard feature points and defining it as the optimized characteristic curve function $T = (s_1, s_2, \ldots, s_n)$, where n is the length of the corrected characteristic curve;
step (8), for the historical data features $S = (s_1, s_2, \ldots, s_m)$ in the intermediate node historical transmission database that have passed through steps (1) to (7), where m is the feature length of the historical data, calculating the consistency contrast using the DTW algorithm; if the consistency contrast is lower than a preset threshold value, the data are judged inconsistent, and otherwise they are judged consistent.
In the preferred scheme, for the analog signal curve, the mode of gathering and judging the trusted points and reconstructing the curve after removing the untrusted points is adopted, so that the precision is improved. The difficulty of consistency judgment is also reduced.
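The extremum detection, interval calculation, and frame-width formula in the steps above can be sketched as follows. The helper names and the use of strict interior local maxima are assumptions for illustration; the adaptive time window and the symmetry calculation of step (1) are not reproduced here.

```python
def local_maxima(values, times):
    """Return (v_i, t_i) for each strict interior local maximum."""
    return [
        (values[k], times[k])
        for k in range(1, len(values) - 1)
        if values[k - 1] < values[k] > values[k + 1]
    ]


def extremum_intervals(extrema):
    """Pair each extremum after the first with the time since the previous one."""
    return [
        (extrema[i][0], extrema[i][1] - extrema[i - 1][1])
        for i in range(1, len(extrema))
    ]


def frame_width(intervals, p):
    """w = (max(dt_i) - min(dt_i)) * p, with p a preset ratio."""
    dts = [dt for _, dt in intervals]
    return (max(dts) - min(dts)) * p
```

For example, extrema at times 1, 3, 6, and 10 give intervals 2, 3, and 4, so with p = 0.5 the frame width is (4 - 2) × 0.5 = 1.0.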
Further, step 3 includes:
step 3.1, determining source data sub-elements in the source data and target data sub-elements in the target data, taking the feature extraction frame as the unit;
step 3.2, defining the transformation equation as $E = d + \eta \times s$ and the compensation equation as $I'' = \alpha I' + \beta$;
wherein $\eta$ is a preset weight value, $I'$ represents a source data sub-element after deformation, and $I''$ represents a source data sub-element after amplitude compensation; s is the connectivity of adjacent source data sub-elements, with $s = 0$ representing non-connectivity and $s = 1$ representing connectivity;
step 3.3, determining the distance d from the source data sub-element to the target data sub-element, and the amplitude compensation coefficients $\alpha$ and $\beta$;
step 3.4, defining the source data sub-element label, the target data sub-element label, the distance transformation parameter d, the connectivity s, and the amplitude transformation parameters $\alpha$ and $\beta$ as the transformation compensation sequence.
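One way to obtain the amplitude compensation coefficients of step 3.3 is an ordinary least-squares fit of the compensation equation (compensated value = α × source value + β) to the target sub-element. The patent does not state how α and β are determined, so the least-squares choice and the names `fit_amplitude` and `compensate` are assumptions of this sketch.

```python
def fit_amplitude(source, target):
    """Least-squares alpha, beta so that alpha * source + beta approximates target."""
    n = len(source)
    mean_s = sum(source) / n
    mean_t = sum(target) / n
    var_s = sum((x - mean_s) ** 2 for x in source)
    if var_s == 0:
        # Flat source sub-element: only an offset can be fitted.
        return 1.0, mean_t - mean_s
    cov = sum((x - mean_s) * (y - mean_t) for x, y in zip(source, target))
    alpha = cov / var_s
    beta = mean_t - alpha * mean_s
    return alpha, beta


def compensate(source, alpha, beta):
    """Apply the compensation equation to every sample of a sub-element."""
    return [alpha * x + beta for x in source]
```

When the target is an exact scaled and shifted copy of the source, two numbers replace the whole sub-element, which is where the bandwidth saving comes from.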
Further, the data preprocessing program further includes:
step 7, the intermediate node judges whether the intermediate node database has overflowed and, if so, removes data from the intermediate node historical transmission database in the order in which it was stored.
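The overflow handling of step 7 can be sketched with a bounded store that evicts entries in insertion order. Reading "warehousing sequence" as oldest-first eviction is an assumption, as are the class name `HistoryStore` and the entry-count capacity (a real node might bound bytes instead).

```python
from collections import OrderedDict


class HistoryStore:
    """Bounded history database that drops the oldest entries on overflow."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def add(self, label, data):
        self._items[label] = data
        # Overflow check: evict in the order entries were stored.
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def get(self, label):
        return self._items.get(label)

    def labels(self):
        return list(self._items)
```

Because the receiving end must resolve the same labels, any eviction policy on the intermediate node would need a matching policy on the receiver; that coordination is outside this sketch.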
The invention also provides a high-efficiency collection and analysis method for the plant equipment data, which is based on the high-efficiency collection and analysis system for the plant equipment data, and comprises the following steps:
step one, an intermediate node receives factory equipment data collected by a sensor and defines the factory equipment data as data to be preprocessed;
step two, the intermediate node extracts the characteristics of the data to be preprocessed, searches and matches similar data in an intermediate node historical transmission database, records similar data labels, and uses the similar data as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the fifth step;
step three, calculating a transformation compensation sequence by taking reference data as a source and data to be preprocessed as a target;
step four, defining data to be transmitted by using similar data labels and transformation compensation sequences; comparing the size of the data to be preprocessed with the size of the data to be transmitted, executing the fifth step if the data to be preprocessed is smaller than the data to be transmitted, otherwise, updating the data to be preprocessed into the data to be transmitted;
step five, judging the sizes of the data to be transmitted and the transmission threshold value, returning to execute the step one if the data to be transmitted is smaller than the transmission threshold value, otherwise executing the step six;
step six, the intermediate node compresses and codes the data to be transmitted and outputs the data to the data receiving end;
step seven, after receiving the data, the data receiving end decodes the data and then segments the data, judging whether the data segment has a transformation compensation sequence, if so, executing the step eight, otherwise, executing the step nine;
step eight, determining the reference data according to the transformation compensation sequence parameters, and recovering the data to be preprocessed;
step nine, carrying out subsequent processing on the data.
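Steps seven and eight on the receiving end can be sketched as follows. The segment layout is hypothetical: a segment is a dictionary that carries either `raw` samples or a reference label `ref` plus a simplified sequence `seq` of `(source_offset, length, alpha, beta)` entries. The patent's sequence also carries sub-element labels, a distance d, and connectivity s, which are omitted here.

```python
def recover(segment, receiver_history):
    """Rebuild the original samples from one decoded segment."""
    if "seq" not in segment:
        # No transformation compensation sequence: the segment is raw data.
        return list(segment["raw"])
    reference = receiver_history[segment["ref"]]
    data = []
    for src_start, length, alpha, beta in segment["seq"]:
        block = reference[src_start:src_start + length]
        # Inverse operation: re-apply the amplitude compensation to the reference.
        data.extend(alpha * x + beta for x in block)
    return data
```

A usage example: with reference `[1.0, 2.0, 3.0, 4.0]` stored under `"h1"`, the sequence `[(0, 2, 2.0, 0.0), (2, 2, 1.0, 1.0)]` yields `[2.0, 4.0, 4.0, 5.0]`.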
Further, the second step includes:
step (A), defining a feature extraction frame, wherein the width of the feature extraction frame is w;
step (B), traversing the data to be preprocessed by using the feature extraction frame to finish feature extraction;
step (C), matching the extracted features with the features of the historical data in the historical transmission database of the intermediate node, and defining the historical data with a matching rate larger than a preset threshold value as similar data.
Further, where the data to be preprocessed in step two is analog data, step two includes:
step (a), acquiring a time domain curve of data to be preprocessed, defining a self-adaptive time window, wherein one side of the time window is a left endpoint or a right endpoint, the time window comprises a maximum value and a minimum value, a vertical line defining the left endpoint is a first central line, and a mean value of the vertical lines defining the left endpoint is a second central line; respectively taking the first central line and the second central line as symmetry planes, and symmetrically calculating real-time characteristic time sequence signals in a time window to obtain a new real-time characteristic curve;
step (b), detecting N maxima or minima from the new characteristic curve, denoted $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;
step (c), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;
step (d), defining the width of the feature extraction frame as w and the moving speed of the feature extraction frame as v:
$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;
wherein p is a preset ratio of the width of the feature extraction frame to the width of the characteristic curve, and $1 \le i \le N$;
step (e), determining a peak threshold range $(V_1, V_2)$; determining a time interval threshold range $(T_1, T_2)$ from a longitudinal scan;
step (f), defining the region formed by the peak threshold range $(V_1, V_2)$ and the time interval threshold range $(T_1, T_2)$ as the trusted region of the standard feature points;
step (g), updating the curve formed by the trusted regions of the standard feature points and defining it as the optimized characteristic curve function $T = (s_1, s_2, \ldots, s_n)$, where n is the length of the corrected characteristic curve;
step (h), for the historical data features $S = (s_1, s_2, \ldots, s_m)$ in the intermediate node historical transmission database that have passed through steps (a) to (g), where m is the feature length of the historical data, calculating the consistency contrast using the DTW algorithm; if the consistency contrast is lower than the preset threshold value, the data are judged inconsistent, and otherwise they are judged consistent.
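The DTW comparison in the last step can be sketched with the classic dynamic-programming recurrence. The patent does not define how the "consistency contrast" is derived from DTW; mapping the DTW distance to a similarity score 1 / (1 + distance), so that a score below the threshold means inconsistent, is an assumption of this example, as are the function names.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def is_consistent(t, s, threshold):
    """Assumed consistency score: 1 / (1 + DTW distance), compared to threshold."""
    similarity = 1.0 / (1.0 + dtw_distance(t, s))
    return similarity >= threshold
```

DTW tolerates the two feature curves having different lengths n and m, which is why it suits comparing a new curve T against stored history features S.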
The invention has the following beneficial effects: the sensors in a certain area are set as a domain, and an intermediate node is arranged in the domain to process data and carry out subsequent transmission. On this basis, the invention judges the similarity between newly collected data and the historical data of the intermediate node and, once similar data is found, directly transmits the similarity parameters between the two (namely the transformation compensation sequence) as the parameters to be transmitted. However, to avoid wasted overhead that would reduce transmission efficiency, the sizes of the data segments before and after processing must be compared at this point, and the new data segment is used for transmission only when it is smaller. After receiving, the receiving end correspondingly judges whether the similarity parameters (namely the transformation compensation sequence) are present and recovers the data through the inverse operation. In this scheme, data that has been transmitted to the receiving end is stored there; if similar data exists, the new data need not be retransmitted and only the transformation relation between the two is transmitted, which can greatly save transmission bandwidth and improve transmission efficiency.
Drawings
The invention will be further described with reference to the drawings and examples.
FIG. 1 is a schematic diagram of a system for efficient collection and analysis of plant data.
FIG. 2 is a schematic flow chart of a data preprocessing procedure.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1
This embodiment provides an efficient collection and analysis system for plant equipment data. As shown in fig. 1, the system comprises: a sensor for collecting the raw data of the factory equipment, and an intermediate node connected to the sensor; the intermediate node is loaded with a data preprocessing program which, as shown in fig. 2, comprises:
step 1, an intermediate node receives data to be preprocessed;
step 2, extracting characteristics of the data to be preprocessed, searching and matching similar data in an intermediate node historical transmission database, and recording similar data labels, wherein the similar data is used as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the step 5;
step 3, calculating a transformation compensation sequence by taking reference data as a source and data to be preprocessed as a target;
step 4, defining data to be transmitted by using similar data labels and transformation compensation sequences; comparing the size of the data to be preprocessed with the size of the data to be transmitted, executing the step 5 if the data to be preprocessed is smaller than the data to be transmitted, otherwise, updating the data to be preprocessed into the data to be transmitted;
step 5, judging the size of the data to be transmitted and the transmission threshold value, returning to execute the step 1 if the data to be transmitted is smaller than the transmission threshold value, otherwise executing the step 6;
step 6, the intermediate node performs compression coding on the data to be transmitted and outputs the data.
In this embodiment, the sensors in a certain area are set as a domain, and an intermediate node is arranged in the domain to process data and carry out subsequent transmission. On this basis, the invention judges the similarity between newly collected data and the historical data of the intermediate node and, once similar data is found, directly transmits the similarity parameters between the two (namely the transformation compensation sequence) as the parameters to be transmitted. However, to avoid wasted overhead that would reduce transmission efficiency, the sizes of the data segments before and after processing must be compared at this point, and the new data segment is used for transmission only when it is smaller. After receiving, the receiving end correspondingly judges whether the similarity parameters (namely the transformation compensation sequence) are present and recovers the data through the inverse operation. In this scheme, data that has been transmitted to the receiving end is stored there; if similar data exists, the new data need not be retransmitted and only the transformation relation between the two is transmitted, which can greatly save transmission bandwidth and improve transmission efficiency.
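Steps 5 and 6 of this embodiment can be sketched as a buffer that holds payloads until a transmission threshold is reached and then emits one compressed packet. Two things here are assumptions: reading "return to step 1" as accumulating further data before sending, and using `zlib` as a stand-in for the unspecified compression coding. The class name `TransmitBuffer` and the 4-byte length prefix are likewise hypothetical.

```python
import zlib


class TransmitBuffer:
    """Accumulates payloads; emits a compressed packet at the threshold."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes
        self._pending = bytearray()

    def push(self, payload: bytes):
        """Add one payload; return a packet once the threshold is met, else None."""
        # Length-prefix each payload so the receiver can split the packet again.
        self._pending += len(payload).to_bytes(4, "big") + payload
        if len(self._pending) < self.threshold_bytes:
            return None  # Below the threshold: wait for more data.
        packet = zlib.compress(bytes(self._pending))
        self._pending.clear()
        return packet
```

Batching small payloads this way amortizes per-packet overhead and gives the compressor more redundancy to exploit, at the cost of added latency.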
Preferably, step 2 comprises:
step 2.1, defining a feature extraction frame, wherein the width of the feature extraction frame is w;
step 2.2, traversing the data to be preprocessed by using a feature extraction frame to finish feature extraction;
step 2.3, matching the extracted features with the features of the historical data in the historical transmission database of the intermediate node, and defining the historical data with a matching rate larger than a preset threshold value as similar data.
The width w of the feature extraction frame is preset according to the specific situation, balancing precision against efficiency.
Preferably, where the data to be preprocessed in step 2 is analog data, step 2 includes:
step (1), acquiring a time domain curve of data to be preprocessed, defining a self-adaptive time window, wherein one side of the time window is a left endpoint or a right endpoint, the time window comprises a maximum value and a minimum value, a vertical line defining the left endpoint is a first central line, and a mean value of the vertical lines defining the left endpoint is a second central line; respectively taking the first central line and the second central line as symmetry planes, and symmetrically calculating real-time characteristic time sequence signals in a time window to obtain a new real-time characteristic curve;
step (2), detecting N maxima or minima from the new characteristic curve, denoted $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;
step (3), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;
step (4), defining the width of the feature extraction frame as w and the moving speed of the feature extraction frame as v:
$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;
wherein p is a preset ratio of the width of the feature extraction frame to the width of the characteristic curve, and $1 \le i \le N$;
step (5), determining a peak threshold range $(V_1, V_2)$; determining a time interval threshold range $(T_1, T_2)$ from a longitudinal scan;
step (6), defining the region formed by the peak threshold range $(V_1, V_2)$ and the time interval threshold range $(T_1, T_2)$ as the trusted region of the standard feature points;
step (7), updating the curve formed by the trusted regions of the standard feature points and defining it as the optimized characteristic curve function $T = (s_1, s_2, \ldots, s_n)$, where n is the length of the corrected characteristic curve;
step (8), for the historical data features $S = (s_1, s_2, \ldots, s_m)$ in the intermediate node historical transmission database that have passed through steps (1) to (7), where m is the feature length of the historical data, calculating the consistency contrast using the DTW algorithm; if the consistency contrast is lower than a preset threshold value, the data are judged inconsistent, and otherwise they are judged consistent.
In the preferred scheme, for the analog signal curve, the mode of gathering and judging the trusted points and reconstructing the curve after removing the untrusted points is adopted, so that the precision is improved. The difficulty of consistency judgment is also reduced.
Preferably, step 3 comprises:
step 3.1, determining source data sub-elements in the source data and target data sub-elements in the target data, taking the feature extraction frame as the unit;
step 3.2, defining the transformation equation as $E = d + \eta \times s$ and the compensation equation as $I'' = \alpha I' + \beta$;
wherein $\eta$ is a preset weight value, $I'$ represents a source data sub-element after deformation, and $I''$ represents a source data sub-element after amplitude compensation; s is the connectivity of adjacent source data sub-elements, with $s = 0$ representing non-connectivity and $s = 1$ representing connectivity;
step 3.3, determining the distance d from the source data sub-element to the target data sub-element, and the amplitude compensation coefficients $\alpha$ and $\beta$;
step 3.4, defining the source data sub-element label, the target data sub-element label, the distance transformation parameter d, the connectivity s, and the amplitude transformation parameters $\alpha$ and $\beta$ as the transformation compensation sequence.
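The six fields that step 3.4 groups into the transformation compensation sequence can be held in a small record type, sketched below. The class name `CompensationEntry`, the field types, and the method `transform_value` are hypothetical; the method simply evaluates the transformation equation E = d + η × s from step 3.2 and does not claim to capture how the patent uses E.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompensationEntry:
    """One entry of the transformation compensation sequence (step 3.4)."""

    source_label: int   # label of the source data sub-element
    target_label: int   # label of the target data sub-element
    d: float            # distance transformation parameter
    s: int              # connectivity: 0 = not connected, 1 = connected
    alpha: float        # amplitude compensation gain
    beta: float         # amplitude compensation offset

    def transform_value(self, eta: float) -> float:
        """Evaluate E = d + eta * s for a preset weight eta."""
        return self.d + eta * self.s
```

Each entry is a fixed, small number of values regardless of how many samples the sub-element holds, which is what makes the sequence cheaper to send than the samples themselves when a good reference exists.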
Preferably, the data preprocessing program further includes:
step 7, the intermediate node judges whether the intermediate node database has overflowed and, if so, removes data from the intermediate node historical transmission database in the order in which it was stored.
The embodiment also provides a method for efficiently collecting and analyzing the plant equipment data, which is based on the system for efficiently collecting and analyzing the plant equipment data, and comprises the following steps:
step one, an intermediate node receives factory equipment data collected by a sensor and defines the factory equipment data as data to be preprocessed;
step two, the intermediate node extracts the characteristics of the data to be preprocessed, searches and matches similar data in an intermediate node historical transmission database, records similar data labels, and uses the similar data as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the fifth step;
step three, calculating a transformation compensation sequence by taking reference data as a source and data to be preprocessed as a target;
step four, defining data to be transmitted by using similar data labels and transformation compensation sequences; comparing the size of the data to be preprocessed with the size of the data to be transmitted, executing the fifth step if the data to be preprocessed is smaller than the data to be transmitted, otherwise, updating the data to be preprocessed into the data to be transmitted;
step five, judging the sizes of the data to be transmitted and the transmission threshold value, returning to execute the step one if the data to be transmitted is smaller than the transmission threshold value, otherwise executing the step six;
step six, the intermediate node compresses and codes the data to be transmitted and outputs the data to the data receiving end;
step seven, after receiving the data, the data receiving end decodes the data and then segments the data, judging whether the data segment has a transformation compensation sequence, if so, executing the step eight, otherwise, executing the step nine;
step eight, determining reference data according to the transformation compensation sequence parameters, and solving data to be preprocessed;
and step nine, carrying out subsequent processing on the data.
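The nine steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class `IntermediateNode`, the helpers `encoded_size` and `decode`, the JSON segment layout, and the use of `zlib` for the compression coding are all hypothetical choices. Step four is read here as "send the label plus compensation sequence only when it is smaller than the original data", and step five as "keep buffering until the pending data reaches the transmission threshold".

```python
import json
import zlib


def encoded_size(obj):
    """Size in bytes of the JSON encoding (assumed size measure)."""
    return len(json.dumps(obj).encode("utf-8"))


class IntermediateNode:
    """Hypothetical intermediate node following steps one to six."""

    def __init__(self, threshold_bytes, find_similar, compute_compensation):
        self.threshold_bytes = threshold_bytes
        # find_similar(samples) -> (label, reference) or None   (step two)
        self.find_similar = find_similar
        # compute_compensation(reference, samples) -> JSON-able (step three)
        self.compute_compensation = compute_compensation
        self.pending = []

    def receive(self, samples):
        raw = {"kind": "raw", "data": list(samples)}          # step one
        match = self.find_similar(samples)                    # step two
        if match is None:
            segment = raw
        else:
            label, reference = match
            comp = self.compute_compensation(reference, samples)
            delta = {"kind": "delta", "label": label, "comp": comp}
            # Step four: keep whichever representation is smaller.
            segment = delta if encoded_size(delta) < encoded_size(raw) else raw
        self.pending.append(segment)
        # Step five: wait for more data while below the threshold.
        if encoded_size(self.pending) < self.threshold_bytes:
            return None
        # Step six: compress, encode and output.
        payload = zlib.compress(json.dumps(self.pending).encode("utf-8"))
        self.pending = []
        return payload


def decode(payload, lookup_reference, apply_compensation):
    """Receiving end, steps seven to nine."""
    segments = json.loads(zlib.decompress(payload).decode("utf-8"))
    restored = []
    for seg in segments:
        if seg["kind"] == "delta":                            # step eight
            reference = lookup_reference(seg["label"])
            restored.append(apply_compensation(reference, seg["comp"]))
        else:
            restored.append(seg["data"])
    return restored                                           # step nine
```

The receiver can only resolve a label if it holds the same reference data as the intermediate node, which is why both sides keep a store of previously transmitted data.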
Preferably, the second step includes:
defining a feature extraction frame, wherein the width of the feature extraction frame is w;
traversing the data to be preprocessed by using a feature extraction frame to finish feature extraction;
matching the extracted features with the features of the historical data in the historical transmission database of the intermediate node, and defining the historical data with the matching rate larger than a preset threshold value as similar data.
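A minimal sketch of this frame-based matching is given below. The text does not specify which features are extracted or how the matching rate is computed, so the per-frame features (mean and peak-to-peak range), the non-overlapping stride, the tolerance `tol`, and the function names are all assumptions made for illustration.

```python
def extract_features(data, w):
    """Traverse the data with a frame of width w (non-overlapping stride
    assumed) and return one (mean, range) feature pair per frame."""
    frames = (data[i:i + w] for i in range(0, len(data) - w + 1, w))
    return [(sum(f) / len(f), max(f) - min(f)) for f in frames]


def matching_rate(features, history_features, tol=0.1):
    """Fraction of frames whose features agree within tol (assumed metric)."""
    if not features or len(features) != len(history_features):
        return 0.0
    hits = sum(
        1
        for (m1, r1), (m2, r2) in zip(features, history_features)
        if abs(m1 - m2) <= tol and abs(r1 - r2) <= tol
    )
    return hits / len(features)


def find_similar(data, history, w, rate_threshold=0.8):
    """Return the label of the best historical entry whose matching rate
    is larger than rate_threshold, or None if there is no such entry."""
    features = extract_features(data, w)
    best_label, best_rate = None, rate_threshold
    for label, past in history.items():
        rate = matching_rate(features, extract_features(past, w))
        if rate > best_rate:
            best_label, best_rate = label, rate
    return best_label
```

Here `history` stands in for the intermediate node historical transmission database, keyed by the similar data label.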
Preferably, the data to be preprocessed in the second step is analog data, including:
step (a), acquiring a time domain curve of data to be preprocessed, defining a self-adaptive time window, wherein one side of the time window is a left endpoint or a right endpoint, the time window comprises a maximum value and a minimum value, a vertical line defining the left endpoint is a first central line, and a mean value of the vertical lines defining the left endpoint is a second central line; respectively taking the first central line and the second central line as symmetry planes, and symmetrically calculating real-time characteristic time sequence signals in a time window to obtain a new real-time characteristic curve;
step (b), detecting N maxima or minima from the new characteristic curve and denoting them as $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;
step (c), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;
step (d), defining the width of the feature extraction frame as w and the moving speed of the feature extraction frame as v;
$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;
wherein p is a preset ratio of the width of the feature extraction frame to the width of the feature curve, and $1 \le i \le N$;
step (e), determining a peak threshold range $(V_1, V_2)$; determining a time interval threshold range $(T_1, T_2)$ from a longitudinal scan;
step (f), defining the area formed by the peak threshold range $(V_1, V_2)$ and the time interval threshold range $(T_1, T_2)$ as a trusted area of the standard feature points;
step (g), updating the curve composed of the trusted areas of the standard feature points and defining it as an optimized feature curve function $T = (s_1, s_2, \ldots, s_n)$, wherein n is the length of the corrected characteristic curve;
step (h), taking the historical data feature $S = (s_1, s_2, \ldots, s_m)$ obtained by applying steps (a) to (g) to the historical data in the intermediate node historical transmission database, wherein m is the feature length of the historical data, and calculating the consistency contrast between T and S by using a DTW algorithm; if the consistency contrast is lower than a preset threshold value, the data are judged to be inconsistent; otherwise, they are judged to be consistent.
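Two of the steps above lend themselves to a short sketch: the frame width computed from the extremum intervals, and the DTW comparison of two feature sequences of different lengths. The code below is an illustration under assumptions, not the patented procedure. It detects maxima only, uses the absolute difference as the local DTW cost, and maps the DTW distance to a score with `1 / (1 + distance)`; the text does not say how the "consistency contrast" is derived from DTW, so that mapping and all function names are hypothetical.

```python
def local_maxima(values, times):
    """Return (value, time) pairs at interior local maxima."""
    return [
        (values[i], times[i])
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]


def frame_width(extrema, p):
    """w = (max(dt) - min(dt)) * p over intervals between adjacent extrema."""
    dts = [t2 - t1 for (_, t1), (_, t2) in zip(extrema, extrema[1:])]
    return (max(dts) - min(dts)) * p


def dtw_distance(t, s):
    """Classic dynamic time warping distance with |a - b| as local cost."""
    n, m = len(t), len(s)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(t[i - 1] - s[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def consistency(t, s):
    """Assumed mapping from DTW distance to a score in (0, 1]."""
    return 1.0 / (1.0 + dtw_distance(t, s))
```

Because DTW aligns sequences by stretching them in time, the lengths n and m of T and S do not need to be equal.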
The embodiment also adopts the collection and aggregation of multi-source operation and maintenance data sources: Web Services technology is combined with a data warehouse to integrate the distributed, heterogeneous PDM, ERP and MES multi-source operation and maintenance data resources. Each heterogeneous data source (PDM, ERP and MES) is encapsulated through Web Services, which solves the interoperability problem among the heterogeneous data sources. Without affecting the autonomy of each data source, the PDM, ERP and MES source data is extracted, transformed, cleaned and loaded (ETL) into the data warehouse, so that the data warehouse is established, the data of the heterogeneous data sources is managed uniformly through the data warehouse, and data integration is achieved.
Acquisition aggregation system architecture design for multi-source operation and maintenance data source
The embodiment proposes integrating the data of the heterogeneous multi-source operation and maintenance data sources (PDM, ERP and MES) on the basis of a data warehouse. To solve the communication problem among the heterogeneous PDM, ERP and MES data sources, platform-independent and language-independent Web Services technology is adopted to realize communication among data sources dispersed across different networks. Heterogeneous data sources necessarily produce heterogeneous data, and the semantic heterogeneity among the data is especially prominent, so the differences among the data are shielded by means of the XML standard and schema mapping of the data sources. Taking into account the requirements and data characteristics of the collection and aggregation system for multi-source operation and maintenance data sources, together with the overall system environment and the current state of the technology, the system architecture is divided into three layers, which are, from bottom to top: a data source layer, a data integration layer and a data warehouse layer.
(1) Data source layer. The data source layer is a key layer of the whole system. The data sources are the objects of data integration; the multi-source operation and maintenance data sources are provided by the PDM, ERP and MES systems, and the different data sources provide various data resources related to product design, process and manufacture. Data integration is achieved by collecting and integrating as many aspects of this data as possible. The data provided by the data sources mainly comprises material inventory and consumption data, equipment and tooling status data, personnel data, production process data, production plan data and the like, which can effectively serve OPM product design and manufacture.
(2) Data integration layer. The data integration layer is the most critical layer of the collection and aggregation system for multi-source operation and maintenance data sources and is the core of the whole system. It mainly comprises two modules: the Web Services data access module and the Quartz scheduling management module. The system acquires data resources by calling different Web services. The data contents are encapsulated in XML documents, so XML Schema is first used to verify whether the structure and data contents of each XML document conform to the predefined definition. Data that passes verification is parsed and loaded into the data warehouse. Quartz is mainly used for registering the accessed Web services, generating a task for each service, setting a periodic calling time for each task, and restarting the Quartz service, so that the Web services are called periodically to acquire updated data from the data sources and the data of the data sources and the data warehouse are kept consistent.
(3) Data warehouse layer. The data warehouse layer holds the results of collecting and aggregating the multi-source operation and maintenance data sources. It mainly stores data that comes from different data sources and meets the requirements of the data integration subjects, and provides continuous data resources for the design and manufacture of OPM products.
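The periodic calling described for the data integration layer can be illustrated with a small scheduler. Quartz is a Java library; the sketch below uses Python's standard `sched` module purely as a stand-in, and the class `PollingRegistry`, its parameters, and the `until` cut-off are hypothetical names introduced for this example.

```python
import sched
import time


class PollingRegistry:
    """Registers data-source fetchers as periodic tasks (illustrative only)."""

    def __init__(self, timefunc=time.monotonic, delayfunc=time.sleep):
        self._timefunc = timefunc
        self._scheduler = sched.scheduler(timefunc, delayfunc)

    def register(self, name, fetch, period, load, until):
        """Call fetch() every `period` seconds and hand the result to load();
        stop rescheduling once the next run would fall after `until`."""

        def run():
            load(name, fetch())
            if self._timefunc() + period <= until:
                self._scheduler.enter(period, 1, run)

        self._scheduler.enter(period, 1, run)

    def run(self):
        """Block until every scheduled call has been executed."""
        self._scheduler.run()
```

In this sketch `fetch` plays the role of a Web service call against PDM, ERP or MES, and `load` plays the role of writing the refreshed data into the data warehouse.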
Data warehouse module architecture design
The data warehouse integrates multiple data sources (PDM, ERP and MES). With system scalability in mind, it adopts a flexible and extensible multi-layer architecture that supports large-scale data storage and accommodates system expansion. The architecture of the data warehouse is critical in constructing the data warehouse module and is divided into two parts: the data warehouse logical architecture and the data warehouse physical architecture.
(1) Data warehouse logic architecture design
The data warehouse logical architecture consists of four layers: source system, DW (Data Warehouse), DM (Data Mart), and report and analysis. The DW layer can be subdivided into two parts: EDW (Enterprise Data Warehouse) and ODS (Operational Data Store). The data in each layer is copied and transformed from the layer beneath it. The design of the DW layer is based on analysis of business logic, while the design of the DM layer is directed at different specific topics.
(2) Data warehouse physical architecture design
The EDW, ODS and DM in the data warehouse are designed to meet different requirements: the EDW stores all historical data, the ODS stores data covering a period of time, and the DM is an aggregate of the detailed data of the underlying data warehouse. They also differ in their data processing requirements, so separate storage schemas are created for them at design time.
Web Services data access module design
The Web Services data access module is a core module of the collection and aggregation system for multi-source operation and maintenance data sources and is the key to successfully collecting and integrating data. Communication with the heterogeneous data sources is established through Web services, and data is exchanged over the SOAP protocol. After the acquired XML document data is verified and parsed, SQL statements are generated, and the application program stores the data into the data warehouse by calling the JDBC interface, completing data integration.
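The verify, parse and load sequence can be sketched as follows. This is an illustration under assumptions rather than the module itself: the record layout, the field names in `REQUIRED_FIELDS`, and the table `equipment_status` are hypothetical; a simple required-field check stands in for XML Schema validation, which the Python standard library does not provide; and `sqlite3` stands in for a warehouse reached through JDBC.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical record layout; the source does not define the XML structure.
REQUIRED_FIELDS = ("equipment_id", "status", "recorded_at")


def parse_and_validate(xml_text):
    """Parse the XML document and check that every <record> carries the
    required fields (a stand-in for XML Schema verification)."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.findall("record"):
        values = []
        for field in REQUIRED_FIELDS:
            text = record.findtext(field)
            if text is None or not text.strip():
                raise ValueError(f"record is missing field {field!r}")
            values.append(text.strip())
        rows.append(tuple(values))
    return rows


def load_into_warehouse(conn, rows):
    """Insert the verified rows using parameterized SQL statements."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS equipment_status "
        "(equipment_id TEXT, status TEXT, recorded_at TEXT)"
    )
    conn.executemany("INSERT INTO equipment_status VALUES (?, ?, ?)", rows)
    conn.commit()
```

Parameterized statements are used instead of string-built SQL so that values taken from the XML payload cannot alter the statement itself.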
In the embodiment, the sensors in a certain area are grouped into a domain, and an intermediate node is arranged in the domain to process data and carry out subsequent transmission. On this basis, the invention judges the similarity between newly collected data and the historical data of the intermediate node; once similar data is found, the similarity parameters between the two (namely the transformation compensation sequence) are transmitted as the parameters to be transmitted instead of the data itself. However, to avoid wasted overhead that would reduce transmission efficiency, the sizes of the data segments before and after processing must be compared, and the new data segment is used for transmission only if it is smaller. After receiving, the receiving end judges whether the similarity parameters (namely the transformation compensation sequence) are present and, if so, recovers the data through the inverse operation. In this scheme, data that has already been transmitted to the receiving end is stored there; if similar data exists, the new data does not need to be transmitted in full, and only the transformation relation between the two is transmitted, which can greatly save transmission bandwidth and improve transmission efficiency.
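To make the idea of transmitting only a transformation relation concrete, the sketch below covers the amplitude part alone, using the affine compensation form with coefficients α and β that appears in claim 4. It is a simplified assumption-laden example: the coefficients are fitted by ordinary least squares, the distance and connectivity parameters are ignored, and the function names are hypothetical. Reconstruction is exact only when the new data really is an affine function of the reference data.

```python
def fit_amplitude_compensation(reference, target):
    """Least-squares fit of target ~ alpha * reference + beta (assumed method)."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_t = sum(target) / n
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_r == 0:
        # Flat reference: only an offset can be fitted.
        return 0.0, mean_t
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(reference, target))
    alpha = cov / var_r
    beta = mean_t - alpha * mean_r
    return alpha, beta


def reconstruct(reference, alpha, beta):
    """Receiving end: rebuild the data from stored reference data."""
    return [alpha * r + beta for r in reference]
```

In this example the sender would transmit the reference label together with the two numbers `alpha` and `beta` in place of the full sample sequence.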
While the foregoing describes illustrative embodiments of the present invention so that those skilled in the art may understand it, the present invention is not limited to the scope of these specific embodiments. To those of ordinary skill in the art, as long as the various changes are within the spirit and scope of the present invention as defined by the appended claims, all inventions and creations utilizing the inventive concept fall within the scope of protection.

Claims (8)

1. An efficient collection and analysis system for factory equipment data, characterized in that: the system comprises a sensor for collecting raw factory equipment data and an intermediate node connected with the sensor; the intermediate node is loaded with a data preprocessing program, and the data preprocessing program comprises:
step 1, an intermediate node receives data to be preprocessed;
step 2, extracting characteristics of the data to be preprocessed, searching and matching similar data in an intermediate node historical transmission database, and recording similar data labels, wherein the similar data is used as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the step 5;
step 3, calculating a transformation compensation sequence by taking reference data as a source and data to be preprocessed as a target;
step 4, defining data to be transmitted by using similar data labels and transformation compensation sequences; comparing the size of the data to be preprocessed with the size of the data to be transmitted, executing the step 5 if the data to be preprocessed is smaller than the data to be transmitted, otherwise, updating the data to be preprocessed into the data to be transmitted;
step 5, judging the size of the data to be transmitted and the transmission threshold value, returning to execute the step 1 if the data to be transmitted is smaller than the transmission threshold value, otherwise executing the step 6;
step 6, the intermediate node performs compression coding on the data to be transmitted and outputs the data.
2. The plant data efficient collection and analysis system of claim 1, wherein: the step 2 comprises the following steps:
step 2.1, defining a feature extraction frame, wherein the width of the feature extraction frame is w;
step 2.2, traversing the data to be preprocessed by using a feature extraction frame to finish feature extraction;
step 2.3, matching the extracted features with the features of the historical data in the historical transmission database of the intermediate node, and defining the historical data with the matching rate larger than a preset threshold value as similar data.
3. The plant data efficient collection and analysis system of claim 1, wherein: in step 2, the data to be preprocessed is analog data, including:
step (1), acquiring a time domain curve of data to be preprocessed, defining a self-adaptive time window, wherein one side of the time window is a left endpoint or a right endpoint, the time window comprises a maximum value and a minimum value, a vertical line defining the left endpoint is a first central line, and a mean value of the vertical lines defining the left endpoint is a second central line; respectively taking the first central line and the second central line as symmetry planes, and symmetrically calculating real-time characteristic time sequence signals in a time window to obtain a new real-time characteristic curve;
step (2), detecting N maxima or minima from the new characteristic curve and denoting them as $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;
step (3), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;
step (4), defining the width of the feature extraction frame as w and the moving speed of the feature extraction frame as v;
$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;
wherein p is a preset ratio of the width of the feature extraction frame to the width of the feature curve, and $1 \le i \le N$;
step (5), determining a peak threshold range $(V_1, V_2)$; determining a time interval threshold range $(T_1, T_2)$ from a longitudinal scan;
step (6), defining the area formed by the peak threshold range $(V_1, V_2)$ and the time interval threshold range $(T_1, T_2)$ as a trusted area of the standard feature points;
step (7), updating the curve composed of the trusted areas of the standard feature points and defining it as an optimized feature curve function $T = (s_1, s_2, \ldots, s_n)$, wherein n is the length of the corrected characteristic curve;
step (8), taking the historical data feature $S = (s_1, s_2, \ldots, s_m)$ obtained by applying steps (1) to (7) to the historical data in the intermediate node historical transmission database, wherein m is the feature length of the historical data, and calculating the consistency contrast between T and S by using a DTW algorithm; if the consistency contrast is lower than a preset threshold value, the data are judged to be inconsistent; otherwise, they are judged to be consistent.
4. The plant data efficient collection and analysis system according to claim 2 or 3, wherein: step 3 comprises:
step 3.1, determining source data sub-elements in source data and target data sub-elements in target data by taking a feature extraction frame as a unit;
step 3.2, defining the transformation equation as $E = d + \eta \times s$ and the compensation equation as $I'' = \alpha I' + \beta$;
wherein $\eta$ is a preset weight value, $I'$ represents a source data sub-element after deformation, and $I''$ represents a source data sub-element after amplitude compensation; s is the connectivity of adjacent source data sub-elements, s = 0 representing non-connectivity and s = 1 representing connectivity;
step 3.3, determining the distance d from the source data sub-element to the target data sub-element, and the amplitude compensation coefficients $\alpha$ and $\beta$;
step 3.4, defining the source data sub-element label, the target data sub-element label, the distance transformation parameter d, the connectivity s, and the amplitude transformation parameters $\alpha$ and $\beta$ as the transformation compensation sequence.
5. The plant data efficient collection and analysis system of claim 1, wherein: the data preprocessing program further includes:
step 7, the intermediate node judges whether the intermediate node database overflows, and if so, deletes stored data from the intermediate node historical transmission database in the order in which it was stored.
6. A method for efficiently collecting and analyzing factory equipment data is characterized in that: the high-efficiency collection and analysis method for the plant equipment data is based on the high-efficiency collection and analysis system for the plant equipment data according to any one of claims 1 to 5, and comprises the following steps:
step one, an intermediate node receives factory equipment data collected by a sensor and defines the factory equipment data as data to be preprocessed;
step two, the intermediate node extracts the characteristics of the data to be preprocessed, searches and matches similar data in the intermediate node historical transmission database, records similar data labels, and uses the similar data as reference data; if the similar data cannot be searched and matched, defining the data to be preprocessed as the data to be transmitted, and directly executing the step five;
step three, calculating a transformation compensation sequence by taking reference data as a source and data to be preprocessed as a target;
step four, defining data to be transmitted by using similar data labels and transformation compensation sequences; comparing the size of the data to be preprocessed with the size of the data to be transmitted, executing the fifth step if the data to be preprocessed is smaller than the data to be transmitted, otherwise, updating the data to be preprocessed into the data to be transmitted;
step five, judging the sizes of the data to be transmitted and the transmission threshold value, returning to execute the step one if the data to be transmitted is smaller than the transmission threshold value, otherwise executing the step six;
step six, the intermediate node compresses and codes the data to be transmitted and outputs the data to the data receiving end;
step seven, after receiving the data, the data receiving end decodes the data and then segments the data, judging whether the data segment has a transformation compensation sequence, if so, executing the step eight, otherwise, executing the step nine;
step eight, determining reference data according to the transformation compensation sequence parameters, and solving data to be preprocessed;
and step nine, carrying out subsequent processing on the data.
7. The method for efficient collection and analysis of plant data according to claim 6, wherein:
the second step comprises:
defining a feature extraction frame, wherein the width of the feature extraction frame is w;
traversing the data to be preprocessed by using a feature extraction frame to finish feature extraction;
matching the extracted features with the features of the historical data in the historical transmission database of the intermediate node, and defining the historical data with the matching rate larger than a preset threshold value as similar data.
8. The method for efficient collection and analysis of plant data according to claim 6, wherein: in the second step, the data to be preprocessed is analog data, including:
step (a), acquiring a time domain curve of data to be preprocessed, defining a self-adaptive time window, wherein one side of the time window is a left endpoint or a right endpoint, the time window comprises a maximum value and a minimum value, a vertical line defining the left endpoint is a first central line, and a mean value of the vertical lines defining the left endpoint is a second central line; respectively taking the first central line and the second central line as symmetry planes, and symmetrically calculating real-time characteristic time sequence signals in a time window to obtain a new real-time characteristic curve;
step (b), detecting N maxima or minima from the new characteristic curve and denoting them as $\{(v_i, t_i) \mid i = 0, 1, \ldots, N\}$, where N is a natural number greater than 3;
step (c), calculating the time difference between adjacent maxima or minima to obtain an extremum interval database $\{(v_i, \Delta t_i) \mid i = 1, 2, \ldots, N\}$;
step (d), defining the width of the feature extraction frame as w and the moving speed of the feature extraction frame as v;
$w = (\max(\Delta t_i) - \min(\Delta t_i)) \times p$;
wherein p is a preset ratio of the width of the feature extraction frame to the width of the feature curve, and $1 \le i \le N$;
step (e), determining a peak threshold range $(V_1, V_2)$; determining a time interval threshold range $(T_1, T_2)$ from a longitudinal scan;
step (f), defining the area formed by the peak threshold range $(V_1, V_2)$ and the time interval threshold range $(T_1, T_2)$ as a trusted area of the standard feature points;
step (g), updating the curve composed of the trusted areas of the standard feature points and defining it as an optimized feature curve function $T = (s_1, s_2, \ldots, s_n)$, wherein n is the length of the corrected characteristic curve;
step (h), taking the historical data feature $S = (s_1, s_2, \ldots, s_m)$ obtained by applying steps (a) to (g) to the historical data in the intermediate node historical transmission database, wherein m is the feature length of the historical data, and calculating the consistency contrast between T and S by using a DTW algorithm; if the consistency contrast is lower than a preset threshold value, the data are judged to be inconsistent; otherwise, they are judged to be consistent.
CN202210087098.3A 2022-01-25 2022-01-25 Efficient collection and analysis system and method for factory equipment data Active CN114661802B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210087098.3A CN114661802B (en) 2022-01-25 2022-01-25 Efficient collection and analysis system and method for factory equipment data

Publications (2)

Publication Number Publication Date
CN114661802A CN114661802A (en) 2022-06-24
CN114661802B true CN114661802B (en) 2024-04-05

Family

ID=82025865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210087098.3A Active CN114661802B (en) 2022-01-25 2022-01-25 Efficient collection and analysis system and method for factory equipment data

Country Status (1)

Country Link
CN (1) CN114661802B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104091070A (en) * 2014-07-07 2014-10-08 北京泰乐德信息技术有限公司 Rail transit fault diagnosis method and system based on time series analysis
US10650621B1 (en) * 2016-09-13 2020-05-12 Iocurrents, Inc. Interfacing with a vehicular controller area network
CN113934720A (en) * 2021-10-18 2022-01-14 北京八分量信息科技有限公司 Data cleaning method and equipment and computer storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A New Image Encryption Algorithm Based on 2D-LSIMM Chaotic Map;Zhang, Huacheng等;2020 12TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI);20200826;全文 *
数字化制造设备引导数据传输准确性矫正仿真;牧涛;;计算机仿真;20200415(第04期);全文 *

Also Published As

Publication number Publication date
CN114661802A (en) 2022-06-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant