KR20180026301A - Big Data Analysis System for Smart Factory - Google Patents

Big Data Analysis System for Smart Factory

Info

Publication number
KR20180026301A
KR20180026301A (application KR1020160113506A)
Authority
KR
South Korea
Prior art keywords
data
unit
collected
file
sorting
Prior art date
Application number
KR1020160113506A
Other languages
Korean (ko)
Other versions
KR101892351B1 (en)
Inventor
양원모
이성용
구경호
서승현
양남건
김우겸
강진웅
Original Assignee
주식회사 포스코아이씨티
Priority date
Filing date
Publication date
Application filed by 주식회사 포스코아이씨티 filed Critical 주식회사 포스코아이씨티
Priority to KR1020160113506A (KR101892351B1)
Priority to US15/693,158 (US11079728B2)
Priority to CN201710780183.7A (CN107798460B)
Publication of KR20180026301A
Application granted
Publication of KR101892351B1

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B 19/00: Programme-control systems
    • G05B 19/02: Programme-control systems electric
    • G05B 19/418: Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/22: Indexing; Data structures therefor; Storage structures
    • G06F 16/2219: Large Object storage; Management thereof
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Manufacturing & Machinery (AREA)
  • Quality & Reliability (AREA)
  • Automation & Control Theory (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)

Abstract

A big data analysis system for a smart factory according to an embodiment of the present invention can store a large amount of data collected in a continuous process in a big data storage based on a distributed file system. The system processes collected data from a continuous process in which a plurality of processes are connected, and includes: a first sorting data fetching unit for reading the load data of each process from among the data collected in the continuous process; a file generation unit for generating files from the load data; and a big data storage in which the files generated by the file generation unit are stored.

Description

Big Data Analysis System for Smart Factory

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to factory data processing, and more particularly to the real-time processing of large volumes of data generated in a continuous process.

A production method in which a plurality of processes for producing a finished product from raw material are performed in succession, with the outputs of the individual processes mixed with each other or the output of one process changed in state and supplied to the subsequent process, is called a continuous process production method. The steel industry, the energy industry, the paper industry, and the oil refining industry are representative industries to which the continuous process production method is applied.

For example, the steel industry consists of a plurality of processes such as an ironmaking process, a steelmaking process, a casting process, and a rolling process. The ironmaking process produces molten iron by melting iron ore in a blast furnace with the heat generated by burning coal. The steelmaking process removes impurities from the molten metal: the molten iron is charged into a converter and the impurities are removed by blowing oxygen. The casting process solidifies the liquid steel: molten steel from which the impurities have been removed is poured into a mold and is cooled and solidified while passing through a continuous casting machine to form an intermediate material such as a slab, bloom, or billet. The rolling process turns these intermediates into steel sheet or wire: the slabs, blooms, or billets produced in the casting process are passed between rolls and stretched or thinned to produce steel sheets.

In continuous process production, unlike industries that use a single-process production method, the raw material or intermediate material moves at high speed, so the data collection cycle is short, the data volume is large, and the data is exposed to noise and similar disturbances. Measurement errors therefore tend to occur frequently, intermediate materials can become mixed with one another, and the position of the material can shift depending on the working method.

Accordingly, industries that apply a continuous process to their production method require a system capable of processing a large amount of data in real time and analyzing the relationships between the data generated in each process.

However, a typical factory data processing system, such as the one disclosed in Korean Patent Laid-Open Publication No. 10-2015-0033847 (titled "Digital Factory Production Capacity Management System Reflecting Real Time Factory Situation", published on May 20), is designed to process and analyze data generated in a single process. Such a system can neither process the large amount of data generated in a continuous process in real time nor analyze the relationships between the data generated in each process.

It is a technical object of the present invention to provide a big data analysis system for a smart factory that can store a large amount of data collected in a continuous process in a big data storage based on a distributed file system.

Another object of the present invention is to provide a big data analysis system for a smart factory that can divide the data collected in a continuous process into load data and no-load data and store them in a distributed file system.

It is another technical object of the present invention to provide a big data analysis system for a smart factory that can divide the data collected in a continuous process into units of a predetermined number and store them.

Another object of the present invention is to provide a big data analysis system for a smart factory capable of processing the data collected in a continuous process into files in parallel.

In order to achieve the above objects, a big data analysis system for a smart factory according to an embodiment of the present invention processes collected data from a continuous process in which a plurality of processes are connected, and includes: a first sorting data fetching unit for reading the load data of each process from among the data collected in the continuous process; a file generation unit for generating files from the load data; and a big data storage in which the files generated by the file generation unit are stored.

According to the present invention, since a large amount of data collected in a continuous process is stored in a big data storage based on a distributed file system, the collected micro data can be processed in real time.

In addition, according to the present invention, since the data collected in the continuous process is divided into load data and no-load data before being stored in the distributed file system, file retrieval speed is improved, and because the no-load data does not need to be scanned, query execution time can be shortened.

In addition, according to the present invention, since the data collected in the continuous process is divided and stored in units of a predetermined number, out-of-memory errors in the memory queue that temporarily stores the data generated in the continuous process can be prevented.

Further, according to the present invention, by providing a plurality of file generation units that process the data collected in the continuous process into files, the file creation work can be performed in parallel and the processing speed can be further improved.
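
As a rough illustration of the storage flow described above (not part of the original disclosure), the following Python sketch splits sorted records into load and no-load groups, batches them by a predetermined number of records, and writes each batch as a separate file so that several file generation workers can run in parallel; the field names, batch size, and the local directory standing in for the distributed file system are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import json

BATCH_SIZE = 1000  # assumed "predetermined number" of records per file

def split_load_no_load(records):
    """Separate records that carry a material_id (load data) from those that do not (no-load data)."""
    load = [r for r in records if r.get("material_id")]
    no_load = [r for r in records if not r.get("material_id")]
    return load, no_load

def write_file(batch, directory, index):
    """Write one batch of records as a single file; a local directory stands in for the big data storage."""
    path = Path(directory) / f"part-{index:05d}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(batch))
    return str(path)

def store(records, base_dir="bigdata"):
    load, no_load = split_load_no_load(records)
    with ThreadPoolExecutor(max_workers=4) as pool:  # parallel file generation workers
        futures = []
        for name, group in (("load", load), ("no_load", no_load)):
            for i in range(0, len(group), BATCH_SIZE):
                futures.append(pool.submit(write_file, group[i:i + BATCH_SIZE],
                                           f"{base_dir}/{name}", i // BATCH_SIZE))
        return [f.result() for f in futures]
```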

FIG. 1 is a diagram illustrating a smart factory architecture including a distributed parallel processing system for processing data for a continuous process in real time according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating a configuration of an interface system according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating a configuration of an interface system including a plurality of interface processing units and a plurality of queue storage units.
FIG. 4 is a block diagram of a distributed parallel processing system according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a configuration of a distributed parallel processing system including a plurality of real-time processing units and a plurality of memory units.
FIG. 6 is a conceptual diagram illustrating a distributed parallel processing method of data mapping and sorting operations.
FIG. 7 is a diagram specifically illustrating a configuration of a big data analysis system according to an embodiment of the present invention.
FIG. 8 is a diagram specifically illustrating a configuration of a big data analysis system according to another embodiment of the present invention.
FIG. 9 is a diagram showing an example of load data and no-load data.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

The meaning of the terms described herein should be understood as follows.

Terms such as "first" and "second" are used to distinguish one element from another, and the scope of rights should not be limited by these terms. Singular expressions are to be understood to include plural expressions unless the context clearly dictates otherwise.

It should be understood that the terms "comprises" or "having" do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, or combinations thereof.

It should be understood that the term "at least one" includes all possible combinations of one or more of the related items. For example, "at least one of the first item, the second item, and the third item" means not only the first item, the second item, or the third item individually, but also any combination of two or more of the first item, the second item, and the third item.

FIG. 1 is a diagram illustrating a smart factory architecture including a distributed parallel processing system that processes data for a continuous process in real time according to an embodiment of the present invention.

Referring to FIG. 1, the smart factory architecture according to the present invention is composed of layers such as a data collection device 1, a network 2, a smart factory platform 1000, and an application system 3.

The data collection device 1 collects the data generated in the continuous process. In one embodiment, a continuous process is a process in which a plurality of processes for producing a finished product from raw materials are performed in succession, with the outputs of the individual processes mixed with each other or the output of a specific process changed in state and supplied to a subsequent process. The steel process is a representative example of such a continuous process. Hereinafter, for convenience of explanation, it is assumed that the continuous process is a steel process.

Since the steel manufacturing process is composed of various processes such as an ironmaking process, a steelmaking process, a casting process, and a rolling process, the data collection device 1 collects the micro data generated in each of these processes. Here, micro data refers to the raw data collected through the various sensors and the like. Hereinafter, micro data will be referred to as collected data for convenience of explanation.

To this end, the data collection device 1 comprises various instruments, sensors, actuators, and the like for collecting data. The data collection device 1 may further include a PC, a PLC (Programmable Logic Controller), and a DCS (Distributed Control System) for integrating or controlling the data collected by the instruments, sensors, and actuators.

The network 2 transfers the large amount of data collected by the data collection device 1 to the smart factory platform 1000. In one embodiment, the network 2 according to the present invention may include but is not limited to a network cable, a gateway, a router, or a wireless access point (AP).

The smart factory platform 1000 receives the large amount of collected data gathered by the data collection device 1 via the network 2 and processes it in real time. Based on the collected data, the smart factory platform 1000 not only determines in real time whether there are abnormalities in facilities or materials, but also stores the collected data in a big data storage (not shown) for big data analysis and provides inquiry and analysis services on the stored data.

In one embodiment, the smart factory platform 1000 according to the present invention includes an interface system 100, a distributed parallel processing system 200, and a big data analysis system 300, as shown in FIG. 1. In addition, the smart factory platform 1000 may further include a service system 400, a management system 500, and a security system 500.

The interface system 100 provides connection means for heterogeneous devices at Level 0 to Level 2 through various protocols, and pre-processes the collected data gathered by the data collection device 1 in order to standardize it.

Hereinafter, the interface system 100 according to the present invention will be described in detail with reference to FIG. 2.

FIG. 2 is a diagram illustrating a configuration of the interface system 100 according to an embodiment of the present invention. As shown in FIG. 2, the interface system 100 according to the present invention includes an interface processing unit 110 and a queue storage unit 120.

The interface processing unit 110 pre-processes the collected data so that the data collected from the continuous process can be processed. In one embodiment, the interface processing unit 110 may pre-process the collected data by standardizing it. To this end, the interface processing unit 110 includes a receiving unit 111, a parsing unit 112, a standardization unit 113, a filtering unit 114, and a transmission unit 115, as shown in FIG. 2.

The receiving unit 111 receives the collected data gathered by the sensors included in the data collection device 1. In one embodiment, the receiving unit 111 receives data using one or more communication methods; for example, it may use various communication methods such as iBA, OLE for Process Control (OPC), and TCP/IP. That is, since the communication methods used by the gateways and other components constituting the network 2 may vary according to the types of sensors included in the data collection device 1, the receiving unit 111 according to the embodiment of the present invention supports all of the communication methods needed to communicate with the various networks 2.

The parsing unit 112 parses the collected data received via the receiving unit 111.

Specifically, the collected data gathered by the data collection device 1 may have a structure in which a group ID containing a plurality of item IDs, a collection time, and a plurality of measured values is repeated. Here, an item ID identifies the measured property, that is, it indicates which property of a facility, material, or product is measured during the continuous process, such as temperature or humidity. A group ID can be a representative value for a certain plant, grouping several items by position or by process. However, the data structure in the embodiment of the present invention is not limited to this, and the collection time may be included in the group ID itself.

When the collected data consists only of a group ID, a collection time, and a plurality of measured values repeated without any further distinction, the distributed parallel processing system 200 must separately analyze the group ID, the item IDs, and the plurality of measured values, so real-time analysis such as detecting equipment abnormalities or product quality problems may become difficult.

Thus, for the linkage processing of the collected data gathered from the continuous process, the parsing unit 112 parses the collected data into meaningful units based on a preset layout.

In one embodiment, the parsing unit 112 parses the collected data by group ID and matches each of the plurality of item IDs included in the group with its corresponding measured value, so that each parsed record consists of a single item ID, a collection time, and a single measured value.
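
A minimal parsing sketch in Python is given below; the raw record layout (a group ID and collection time followed by parallel lists of item IDs and values) and all field names are assumptions made for illustration, since the actual layout is defined in the message layout storage unit 117.

```python
def parse_group_record(raw):
    """Split one group-structured record into per-item records of (item ID, collection time, value)."""
    parsed = []
    for item_id, value in zip(raw["item_ids"], raw["values"]):
        parsed.append({
            "group_id": raw["group_id"],
            "item_id": item_id,                   # single item ID
            "collected_at": raw["collected_at"],  # collection time
            "value": value,                       # single measured value
        })
    return parsed

records = parse_group_record({
    "group_id": "G-ROLLING-01",
    "collected_at": "2016-09-02T15:01:11.000",
    "item_ids": ["TEMP_1", "SPEED_1"],
    "values": [1012.5, 3.2],
})
```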

According to this embodiment, the parsing unit 112 may parse the collected data with reference to the message layout storage unit 117, in which the message layout describing the collected data format is defined. The message layout storage unit 117 may be included in the interface processing unit 110 or provided in another apparatus as a separate component. However, the present invention is not limited to this, and the information about the message layout may instead be held in the parsing unit 112.

The standardization unit 113 standardizes the collected data parsed by the parsing unit 112. In one embodiment, for each piece of parsed data consisting of a single item ID, a collection time, and a single measured value received from the parsing unit 112, the standardization unit 113 changes the item ID to a standard item ID and unifies the unit and the number of digits of the measured value, thereby standardizing the parsed data.

Specifically, even when the sensors or actuators included in the data collection device 1 measure the same property, they can have different item IDs depending on the manufacturer and the factory in which they are installed. If the parsed data with these differing item IDs were transmitted to the distributed parallel processing system 200 as they are, a separate step of interpreting the non-standardized item IDs and the plurality of measured values would be required, and the performance of the distributed parallel processing system 200, which monitors product quality and the like, could be degraded. Therefore, the standardization unit 113 according to the embodiment of the present invention changes the item ID included in each piece of collected data to a standard item ID so that data measuring the same property has the same item ID.

In this manner, the standardization unit 113 pre-processes the collected data so that measured values of the same property share the same standard item ID, which allows the collected data gathered from the continuous process to be linked and processed on the basis of each standard item ID.

In addition, the format of the measured values included in the collected data may differ depending on the type of data collection device 1, such as the sensor or actuator used. To process collected data of these differing formats gathered from the continuous process, an additional step of converting data units and lengths for each device type would be needed, which makes real-time linkage processing impossible. Accordingly, the standardization unit 113 according to the embodiment of the present invention standardizes the collected data so that the distributed parallel processing system 200 can process a large amount of data in real time.

The standardization unit 113 can standardize the item ID, the unit of the measured value, and the number of digits by referring to the standard conversion reference storage unit 118, which may be included in the interface processing unit 110 or provided in another apparatus as a separate component. Since the present invention is not limited to this, the standardization unit 113 may itself hold the information on the standard conversion criteria used for standardization.
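
The standardization step can be pictured with the short sketch below; the conversion table stands in for the standard conversion reference storage unit 118, and the item IDs, unit conversions, and digit counts in it are invented examples rather than values from the disclosure.

```python
# Illustrative stand-in for the standard conversion reference storage unit 118.
STANDARD_MAP = {
    "TEMP_1":  {"std_id": "STD_TEMP_C", "convert": lambda v: v,              "digits": 1},
    "TEMP_F3": {"std_id": "STD_TEMP_C", "convert": lambda v: (v - 32) / 1.8, "digits": 1},
}

def standardize(record):
    ref = STANDARD_MAP.get(record["item_id"])
    if ref is None:
        return record  # no standard defined; pass the record through unchanged
    return {
        **record,
        "item_id": ref["std_id"],                        # vendor item ID -> standard item ID
        "value": round(ref["convert"](record["value"]),  # unify the unit
                       ref["digits"]),                   # unify the number of digits
    }
```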

The filtering unit 114 determines, according to a predetermined filtering criterion, whether to store the data standardized by the standardization unit 113 in the queue storage unit 120. For example, a class may be set in advance according to the type of collected data, and the filtering unit 114 can decide whether to store the data in the queue storage unit 120 according to its class. In one embodiment, the class may be determined based on the importance of the standardized item data, but the present invention is not limited thereto.

The filtering unit 114 may filter the standardized data with reference to the filtering criterion storage unit 119, which stores the criteria for selecting the data that needs to be stored in the queue storage unit 120. The filtering criterion storage unit 119 may be included in the interface processing unit 110 or provided in another apparatus as a separate component. However, since the present invention is not limited to this, the information on the filtering criteria may instead be held in the filtering unit 114.
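
A class-based filter of the kind described above might look like the following sketch; the class table and the rule that only items up to a given class are queued are assumptions for illustration.

```python
# Illustrative importance classes per standard item ID (1 = most important).
ITEM_CLASS = {"STD_TEMP_C": 1, "STD_HUMIDITY": 3}
MAX_QUEUED_CLASS = 2  # only classes 1..2 are stored in the queue storage unit

def should_queue(record):
    """Return True if the standardized record is important enough to be queued."""
    return ITEM_CLASS.get(record["item_id"], MAX_QUEUED_CLASS + 1) <= MAX_QUEUED_CLASS
```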

The transmission unit 115 stores the data filtered by the filtering unit 114 in a queue of the queue storage unit 120. In one embodiment, the transmission unit 115 stores the filtered data in the queue 121 of the queue storage unit 120 by group ID or by standard item ID.

That is, in the embodiment of the present invention, the collected data of each process, which arrive in differing formats, are parsed and standardized into a common format, and the standardized data are stored by group ID or standard item ID so that related data can be looked up together. As a result, the collected data gathered from the continuous process can be processed in real time.

In one embodiment, there may be a plurality of queue storage units 120. According to this embodiment, when the transmission unit 115 stores the preprocessed collected data in any one of the plurality of queue storage units 120, the same data is also copied to and stored in the remaining queue storage units 120. At this time, the transmission unit 115 may store the collected data in the queue storage unit 120 with the smallest load, taking the loads of the plurality of queue storage units 120 into consideration.

In addition, the transmission unit 115 may determine whether to store the collected data according to the operation mode of the plurality of queue storage units 120. More specifically, the transmission unit 115 can stop storing data when the operation mode of the plurality of queue storage units 120 is the standby mode. At this time, the operation mode of the plurality of queue storage units 120 may be determined based on the number of queue storage units 120 that are operating normally among the plurality of queue storage units 120.

Meanwhile, when the operation mode of the plurality of queue storage units 120 is the standby mode, the receiving unit 111 also stops receiving the collected data. That is, if the receiving unit 111 were to keep receiving collected data from the data collection device 1 even though the queue storage unit 120 is operating abnormally and the collected data cannot be stored in real time, a failure of the interface processing unit 110 would result and the collected data processing operation would fail.

Accordingly, in the embodiment of the present invention, when the operation mode of the queue storage unit 120 is the standby mode, the interface processing unit 110 prevents a failure by stopping the reception and storage of the collected data, and when the queue storage unit 120 returns to normal operation, the reception and storage of the collected data are resumed.

The interface processing unit 110 according to the embodiment of the present invention may further include at least one of a collected data merging unit 116, a message layout storage unit 117, a standard conversion reference storage unit 118, and a filtering criterion storage unit 119, as shown in FIG. 2.

The collected data merging unit 116 merges the collected data and transfers it to the parsing unit 112 in order to improve processing performance for the continuous process. In one embodiment, the collected data merging unit 116 merges the collected data received via the receiving unit 111 at certain time intervals (e.g., 0.1 second, 1 second, or 1 minute). That is, due to the nature of the continuous process, the collected data may be delivered in very short periods (e.g., 5 ms to 20 ms), and if the parsing unit 112 parsed each piece individually as it arrived, the real-time processing performance of the system could deteriorate.

Accordingly, the collected data merging unit 116 transfers the collected data required for real-time monitoring directly to the parsing unit 112 without merging, and merges the remaining collected data at predetermined time intervals, for example into file form, before transferring it to the parsing unit 112. Whether collected data is necessary for real-time monitoring can be set according to its importance; for example, collected data gathered from facilities or materials that require immediate action when an abnormality occurs can be classified as requiring real-time monitoring.
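
The merge behaviour can be sketched as follows: records flagged for real-time monitoring bypass the buffer, while the rest are flushed to the parser at a fixed interval. The one-second interval, the importance flag, and the callback interface are illustrative assumptions.

```python
import time

class CollectedDataMerger:
    """Sketch of the collected data merging unit 116."""

    def __init__(self, parse, interval_s=1.0):
        self.parse = parse            # callback standing in for the parsing unit 112
        self.interval_s = interval_s  # merge interval (e.g. 0.1 s, 1 s, 1 min)
        self.buffer = []
        self.last_flush = time.monotonic()

    def on_data(self, record, realtime):
        if realtime:                  # data needed for real-time monitoring
            self.parse([record])      # forward immediately, without merging
            return
        self.buffer.append(record)
        if time.monotonic() - self.last_flush >= self.interval_s:
            self.parse(self.buffer)   # forward the merged batch
            self.buffer = []
            self.last_flush = time.monotonic()
```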

The message layout storage unit 117 stores the message layouts that the parsing unit 112 refers to; these can serve as the basis for interpreting data transmitted in a binary format or according to a separate data transmission standard.

The standard conversion reference storage unit 118 stores the standard item IDs obtained by standardizing the item IDs of the various sensors constituting the data collection device 1, together with the reference unit and the number of digits corresponding to each standard item ID. Ideally, the data arriving from the various sensors would already carry standardized item IDs, units, and digit counts at the time the large amount of data is collected; in practice, however, these differ according to the characteristics of each process. Therefore, in the embodiment of the present invention, the standard item IDs standardizing the item IDs of the various sensors, and the reference unit and number of digits corresponding to each standard item ID, are stored in advance so that the data can be used efficiently in later analysis.

According to this embodiment, the above-described standardization unit 113 can change the item ID of the parsed data to the standard item ID by referring to the standard conversion reference storage unit 118, and can unify the unit and the number of digits.

The filtering criterion storage unit 119 stores the criteria for selecting, from among the standardized data, the data that needs to be stored in the queue storage unit 120, and the filtering unit 114 described above can filter the standardized data to be stored in the queue storage unit 120 by referring to the filtering criterion storage unit 119.

The queue storage unit 120 includes a queue 121 as an area for temporarily storing data processed in the interface processing unit 110 before real-time processing.

The queue 121 is a store that holds the data preprocessed by the interface processing unit 110 for a certain period of time, and it may store the data on disk rather than in memory in order to prevent data loss. The storage space of the queue 121 may be divided into topics, and a plurality of partitions within the same topic may be processed in parallel.

In one embodiment, the distributed parallel processing system 200 may be assigned a unique group ID for each data group it fetches from the queue storage unit 120, the data fetch address may be managed for each unique group ID, and the data may be stored and served in queue form so that it is written and read sequentially.

In the embodiment described above, the collected data is preprocessed through one interface processing unit 110 and one queue storage unit 120. In a modified embodiment, however, as shown in FIG. 3, the interface system 100 may include a plurality of interface processing units 110 and a plurality of queue storage units 120.

According to this embodiment, interface processing units 110 can be added as required by the scale of the data collection device 1 and the physical layout of the factory, and each interface processing unit 110 can be implemented in a redundant structure for high availability (HA).

That is, each interface processing unit 110 is provided as an active server and a backup server; in normal operation the active server runs, and when a failure occurs in the active server, the backup server is automatically activated so that the collected data processing can continue without interruption.

The plurality of queue storage units 120 are implemented in a clustering structure: when data is stored in one queue storage unit 120, it is copied to the other queue storage units 120, so that even if a failure occurs in a particular queue storage unit 120, the service can continue by referring to another queue storage unit 120.

In addition, when the standardization of the collected data is completed, the interface processing unit 110 selects one of the plurality of queue storage units 120 and stores the standardized data in it. The criterion for selecting the queue storage unit 120 may follow various rules, for example selecting the queue storage unit 120 with the lowest load, selecting the queue storage units 120 in turn, or storing in advance, for each sensor from which data is collected, the queue storage unit 120 in which its data should be stored.
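
Two of the selection rules mentioned above are sketched below with assumed data structures: picking the least-loaded queue storage unit, and using a per-sensor assignment stored in advance.

```python
def select_by_load(queue_units):
    """Pick the queue storage unit with the lowest current load."""
    return min(queue_units, key=lambda q: q["load"])

def select_by_sensor(sensor_id, assignment, queue_units):
    """Use a pre-stored mapping from sensor to queue storage unit."""
    return queue_units[assignment[sensor_id]]

queues = [{"name": "Q0", "load": 0.7}, {"name": "Q1", "load": 0.2}]
assert select_by_load(queues)["name"] == "Q1"
assert select_by_sensor("TEMP_1", {"TEMP_1": 0}, queues)["name"] == "Q0"
```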

Referring back to FIG. 1, the distributed parallel processing system 200 maps process identifiers to the data standardized by the interface system 100 and performs linkage processing of the collected data gathered in each process. Hereinafter, the distributed parallel processing system 200 according to the present invention will be described in more detail with reference to FIG. 4.

FIG. 4 is a diagram illustrating a configuration of the distributed parallel processing system 200 according to an embodiment of the present invention. As shown in FIG. 4, the distributed parallel processing system 200 according to the present invention includes a real-time processing unit 210 and a memory unit 220.

The real-time processing unit 210 generates mapping data by mapping process identifiers to the data standardized by the interface system 100, and sorts the mapping data so that data across areas such as operation, facility, and quality can be linked and analyzed.

As shown in FIG. 4, the real-time processing unit 210 includes a fetch performing module 211, a loading performing module 212, a process mapping performing module 213, a data correction execution module 215, and a data sorting performing module 216. The real-time processing unit 210 may further include an equipment abnormality detection execution module 217 and a quality abnormality detection execution module 218.

In one embodiment, the plurality of execution modules 211 to 218 shown in FIG. 4 may be implemented as applications that are distributed to the real-time processing unit 210 and implement their respective functions; after a work space is created in the real-time processing unit 210, a plurality of threads are created to perform the functions assigned to each of the execution modules 211 to 218.

Hereinafter, functions of each of the plurality of execution modules 211 to 218 will be described in detail.

The fetch performing module 211 reads the standardized data from the queue 121 of the interface system 100 and stores it in the collected data storage unit 221 of the memory unit 220. In one embodiment, the fetch performing module 211 holds the location information of each of the plurality of queues 121 included in the queue storage unit 120 and can read the data from them.

At this time, since the interface processing unit 110 stores the collected data in the queue 121 by group ID or standard item ID for the linkage processing of the data collected from the continuous process, the fetch performing module 211 can likewise read the data by group ID or standard item ID.

The loading performing module 212 loads the data stored in the collected data storing unit 221 and transfers the loaded data to the process mapping executing module 213.

The process mapping performing module 213 maps, to the standardized data transferred from the loading performing module 212, a process identifier identifying the process in which the corresponding data was generated, and transfers the resulting mapping data to the data correction execution module 215.

In one embodiment, the process mapping module 213 may map at least one of a facility identifier of a facility performing each process or a material identifier of a material processed by the facility to standardized data as a process identifier. To this end, the mapping performing module 213 may include a facility mapping performing module 213a and a material mapping performing module 213b.

The facility mapping performing module 213a maps, to the standardized data loaded by the loading performing module 212, the facility identifier of the facility from which the corresponding data was collected. In one embodiment, the facility identifier may be a facility number assigned to each facility.

In one embodiment, the facility mapping performing module 213a may extract the facility identifier of the facility where the collected data is generated based on the collection time of the collected data and the property information of the sensor that collected the collected data. At this time, the attribute information of the sensor may include the standard item ID as the information for identifying the measured attribute as described above, but the present invention is not limited thereto.

That is, as described above, each piece of data standardized by the interface system 100 consists of a standard item ID, a collection time, and a single measured value with a standardized unit and number of digits. Based on the collection time of the collected data, the facility mapping performing module 213a can determine which facility was operating at that time, and since the attribute information of the sensor includes information on the facility measured by that sensor, the facility mapping performing module 213a can extract the facility identifier of the facility measured by the sensor with the given standard item ID at the given time.

As a result, the mapping result data to which the equipment identifier is mapped includes the standard item ID, the collected time, the equipment identifier, and the single measurement value.

The material mapping performing module 213b maps, to the mapping data to which the facility identifier has been mapped, the material identifier of the material processed by the facility corresponding to that facility identifier. In one embodiment, the material identifier may be an identifier assigned to each material. In the embodiment described above, the material mapping performing module 213b maps the material identifier to the mapping data to which the facility identifier has already been mapped; however, in a modified embodiment, the material mapping performing module 213b may map the material identifier directly to the standardized data transferred from the loading performing module 212.

The material mapping performing module 213b can extract the material identifier of the material processed by the facility corresponding to the facility identifier mapped to the mapping data, based on the work instruction information for each process.

For example, when a first material having a first material identifier is produced by a first facility performing the first process, the material mapping performing module 213b further maps the first material identifier to the mapping data to which the facility identifier of the first facility has been mapped. Likewise, when a second material having a second material identifier is produced by a second facility performing the second process, the material mapping performing module 213b further maps the second material identifier to the mapping data to which the facility identifier of the second facility has been mapped.

Since the data to which the facility identifier has been mapped by the facility mapping performing module 213a consists of the standard item ID, the collection time, the facility identifier, and a single measured value with a standardized unit and number of digits, the material mapping performing module 213b outputs mapping data comprising the standard item ID, the collection time, the facility identifier, the material identifier, and the single measured value.
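
The facility and material mapping can be pictured with the sketch below. The lookup tables stand in for the sensor attribute information storage unit 219c and the work instruction information storage unit 219b, and their contents (identifiers, time windows) are invented for illustration.

```python
# Assumed lookups standing in for the sensor attribute and work instruction stores.
SENSOR_TO_FACILITY = {"STD_TEMP_C": "FAC-ROLL-01"}
WORK_ORDERS = {"FAC-ROLL-01": [("2016-09-02T15:00", "2016-09-02T16:00", "MAT-12345")]}

def map_facility(record):
    """Add the facility identifier of the facility measured by this sensor."""
    return {**record, "facility_id": SENSOR_TO_FACILITY[record["item_id"]]}

def map_material(record):
    """Add the material identifier if a work order covers the collection time (load data)."""
    for start, end, material_id in WORK_ORDERS.get(record["facility_id"], []):
        if start <= record["collected_at"] <= end:
            return {**record, "material_id": material_id}
    return record  # no material being processed -> no-load data
```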

In this way, the distributed parallel processing system 200 according to the embodiment of the present invention maps a process identifier including at least one of a facility identifier and a material identifier to the standardized data, so that the process during which each piece of data was collected can be identified.

In addition, since the collected data is standardized through the interface system 100, the distributed parallel processing system 200 can map the facility identifier and the material identifier directly to standardized data that already has a structured form.

Meanwhile, the collected data gathered through the data collection device 1 can include load data, which is data collected while a facility is processing a material, and no-load data, which is data collected while the facility is not processing any material.

In the case of no-load data, since there is no material identifier to be mapped onto the collected data to which the facility identifier has been mapped, the material mapping performing module 213b can store the mapping data with only the facility identifier directly in the sorting data storage unit 222. In this case, the load data and the no-load data may be stored separately in the sorting data storage unit 222.

The data correction execution module 215 corrects the mapping data by adding missing data when there is missing data among the mapping data to which the process identifier is mapped. In one embodiment, the data correction performing module 215 may match the collection time included in the mapping data to a predetermined collection period, for correction of the mapping data. That is, since the collected data should be generated at a predetermined collection period through the data collection device 1, the data correction execution module 215 matches the collection time of each of the mapped data to the collection period.

For example, when the collection period is 20 ms, the data collection device 1 should generate collected data every 20 ms. Therefore, the data correction execution module 215 changes the collection time of mapping data collected slightly after 15:01:11 0000 ms back to 15:01:11 0000 ms, and can change the collection time of mapping data whose collection time is 15:01:11 0050 ms to 15:01:11 0040 ms.

If the collection period is 20 ms, collected data should arrive every 20 ms; when some collected data is missing, the interval between consecutive pieces of collected data becomes longer than 20 ms. Therefore, the data correction execution module 215 fills in the missing data using the mapping data collected at the position closest to where the missing data should have been collected, or the mapping data whose collection time is adjacent to the time at which the data went missing.
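
A compact sketch of the correction step is shown below: collection times are snapped to the nearest multiple of the collection period, and gaps are filled with the value of the nearest available sample. Millisecond integer timestamps and the copy-nearest strategy are simplifying assumptions.

```python
PERIOD_MS = 20  # assumed collection period

def snap(ts_ms):
    """Snap a collection time to the nearest multiple of the collection period."""
    return round(ts_ms / PERIOD_MS) * PERIOD_MS

def correct(series):
    """series: {timestamp_ms: value}, possibly with gaps; returns a gapless, period-aligned series."""
    snapped = {snap(t): v for t, v in series.items()}
    start, end = min(snapped), max(snapped)
    corrected = {}
    for t in range(start, end + PERIOD_MS, PERIOD_MS):
        if t in snapped:
            corrected[t] = snapped[t]
        else:  # missing sample: reuse the value collected at the nearest time
            nearest = min(snapped, key=lambda k: abs(k - t))
            corrected[t] = snapped[nearest]
    return corrected

print(correct({40005: 1.0, 40050: 2.0, 40090: 3.0}))
# {40000: 1.0, 40020: 1.0, 40040: 2.0, 40060: 2.0, 40080: 3.0}
```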

The data sorting performing module 216 sorts the mapping data corrected by the data correction execution module 215, or the mapping data to which the process identifier has been mapped, for the linkage processing between the data of the respective processes.

In one embodiment, the data sorting performing module 216 sequentially sorts the mapping data in which the same material identifier is mapped, according to the collection time. That is, the data sorting performing module 216 may sort the mapping data in chronological order in the material unit having the same material identifier for the linkage processing between collected data collected in the continuous process.

The data sorting performing module 216 arranges the mapping data sorted according to the time order on the basis of the position where the corresponding data is collected on the material corresponding to the same material identifier.

For this alignment, the data sorting performing module 216 may determine the position on the material at which each piece of collected data was gathered, using at least one of the length of the material, the movement speed of the material, and the collection period of the collected data. For example, the data sorting performing module 216 may determine the collection position of the data gathered in each period on the material based on the product of the movement speed of the material and the collection period, together with the total length of the material. In this way, the mapping data sorted in time can also be sorted as data measured at successive positions in one direction along the material.
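
For example (values invented for illustration), the position at which each successive sample was taken can be derived from the movement speed and the collection period, as in the short sketch below, which assumes a constant speed.

```python
def collection_positions(n_samples, speed_m_per_s, period_s, material_length_m):
    """Position along the material for each sample, assuming constant movement speed."""
    step = speed_m_per_s * period_s  # distance travelled per collection period
    return [min(i * step, material_length_m) for i in range(n_samples)]

# 5 samples at 3 m/s with a 20 ms collection period are spaced 0.06 m apart
positions = collection_positions(5, 3.0, 0.02, 12.0)  # approximately [0.0, 0.06, 0.12, 0.18, 0.24]
```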

For the linkage processing of the collected data gathered from the first process and the second process, the data sorting performing module 216 calculates the distance between the reference points set at predetermined intervals on each material and the collection position of each piece of mapping data, calculates a measurement value for each reference point on that basis, and generates reference data at each reference point from the calculated measurement values.

The data sorting performing module 216 sequentially sorts the reference data at the reference points, together with the collected data sorted in time, in one direction along the material, and stores the sorted data in the sorting data storage unit 222 of the memory unit 220. In one embodiment, the one direction may be at least one of the longitudinal direction, the width direction, and the thickness direction of the material, but the present invention is not limited thereto.

In the first and second processes, the data may be collected at different collection periods, and the length, width, or thickness of the material passing through the first process may differ from that of the material passing through the second process, so it can be difficult to track, across the processes, how the measured values in a specific region of the material change in relation to one another. Therefore, in the present invention, in order to associate the collected data of the first process with that of the second process, measurement values are calculated at a plurality of reference points set on the material processed in each process, and these calculated values are used to manage the mapping data across the processes.
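
One way to picture the reference data generation is the following sketch; the disclosure states only that the value at each reference point is calculated from the distances to the collection positions, so the linear interpolation used here, as well as the sample format, is purely an illustrative choice.

```python
def reference_data(samples, ref_interval_m, material_length_m):
    """samples: list of (position_m, value) sorted by position; returns (reference position, value) pairs."""
    refs = []
    n_refs = int(material_length_m / ref_interval_m) + 1
    for i in range(n_refs):
        pos = i * ref_interval_m
        left = max((s for s in samples if s[0] <= pos), key=lambda s: s[0], default=samples[0])
        right = min((s for s in samples if s[0] >= pos), key=lambda s: s[0], default=samples[-1])
        if right[0] == left[0]:
            value = left[1]
        else:  # weight the two nearest measurements by their distance to the reference point
            w = (pos - left[0]) / (right[0] - left[0])
            value = left[1] * (1 - w) + right[1] * w
        refs.append((pos, value))
    return refs

print(reference_data([(0.0, 10.0), (0.06, 12.0), (0.12, 11.0)], 0.05, 0.12))
```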

Hereinafter, an example in which the data alignment performing module 216 aligns the reference data in the longitudinal direction on the material will be described in detail.

First reference points are set at predetermined intervals in the longitudinal direction of the first material processed in the first process, and second reference points are set at predetermined intervals in the longitudinal direction of the second material processed in the second process. In this case, the first material identifier corresponding to the first material is mapped to the first reference data at the first reference points, and the second material identifier corresponding to the second material is mapped to the second reference data at the second reference points. Thus, the first reference data and the second reference data are linked based on the first material identifier and the second material identifier through the material family diagram (not shown), in which the material identifiers are mapped for each material.

That is, the material family diagram is organized as a tree of material lineage, and by referring to it, the mapping data of each process can be linked to one another through the material identifiers assigned to the materials generated while sequentially passing through the first and second processes.

The data sorting module 216 stores the mapping data and the reference data aligned in the longitudinal direction of the material in the sorting data storage unit 222 of the memory unit 220 as described above.

In one example, the data sorting performing module 216 may store sorting data whose material identifiers are associated with each other, based on the material family diagram, in the same space of the sorting data storage unit 222. This makes it possible, when linking the collected data, to use the related collected data together from the same storage space.

In addition, the data sorting performing module 216 stores the first sorting data, in which the mapping data is sorted by time, the second sorting data, in which the mapping data is sorted by collection position, and the reference data at the reference points, separately in the sorting data storage unit 222.

That is, by storing the first sorting data, the second sorting data, and the reference data in separate storage locations, the first and second sorting data, which correspond to actually measured values, can be used separately by time and by collection position, and the reference data, which correspond to virtual values calculated from the actual data, can also be used separately; however, the present invention is not limited thereto.

In addition, the data sorting performing module 216 stores an event indicating the completion of sorting in the completion event storage unit 223 so that the sorting result can be used by the big data analysis system 300.

As such, the real-time processing unit 210 maps the process identifier such as the equipment identifier or the material identifier to the collected data, and arranges the mapping data so that the collected data can be processed.

The equipment abnormality detection execution module 217 receives the data to which the facility identifier has been mapped from the facility mapping performing module 213a and determines whether an equipment abnormality has occurred according to a predetermined equipment abnormality criterion. If it determines that an abnormality has occurred in a specific facility, the equipment abnormality detection execution module 217 stores the determination result in the abnormality detection result storage unit 224 of the memory unit 220.

At this time, it may be determined, for example, that an abnormality has occurred in a facility when the collected data gathered over a predetermined period of time exceeds a predetermined reference value, but the present invention is not limited thereto.

The quality abnormality detection execution module 218 loads the sorting data from the sorting data storage unit 222 and determines whether a quality abnormality has occurred according to a predetermined quality abnormality criterion based on the sorting data. If it determines that an abnormality has occurred in the quality of a specific material, the quality abnormality detection execution module 218 stores the determination result in the abnormality detection result storage unit 224 of the memory unit 220.

In one embodiment, the quality abnormality detection execution module 218 generates macro data, to be used as input to the quality abnormality determination formula, through operations such as averaging and error prediction on the collected data, and can determine whether a quality abnormality has occurred based on the result.
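
The two checks can be illustrated with the following sketch; the windowing rule, the use of a simple average as the macro data, and the threshold values are assumptions, since the actual criteria come from the facility information storage unit 219a and the quality determination model storage unit 219d.

```python
def equipment_abnormal(window_values, reference_value):
    """Flag an equipment abnormality when every sample in the window exceeds the reference value."""
    return all(v > reference_value for v in window_values)

def quality_abnormal(sorted_values, lower, upper):
    """Flag a quality abnormality when the macro data (here, the average) leaves the allowed band."""
    macro = sum(sorted_values) / len(sorted_values)
    return not (lower <= macro <= upper)

print(equipment_abnormal([101.2, 103.5, 102.8], reference_value=100.0))            # True
print(quality_abnormal([1012.5, 1013.0, 1011.8], lower=1000.0, upper=1010.0))      # True
```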

In this way, the distributed parallel processing system 200 according to the present invention sorts the standardized data on a per-material basis for linkage analysis between the respective processes and, on the basis of the collected data or the sorting data, makes it possible to predict facility failures in advance.

Meanwhile, the real-time processing unit 210 may further include a facility information storage unit 219a, a work instruction information storage unit 219b, a sensor attribute information storage unit 219c, and a quality determination model storage unit 219d.

The facility information storage unit 219a stores information on the plant and position at which each facility is located, maintenance history information for the facility, the facility information to be mapped to the data, and the equipment abnormality judgment criteria.

The work instruction information storage unit 219b stores the work instruction information from the MES (Manufacturing Execution System), information on the material identifiers generated at a specific facility as the work is carried out, the quality indices for performing the work, and material information, together with the quality abnormality judgment criteria, stored for each material, that are used to judge whether the quality of the material is abnormal.

The sensor attribute information storage unit 219c stores information such as the type, unit, and collection period of the data collected by each sensor, the facility identifier, and the factory/process to which the sensor belongs.

Since the equipment abnormality criteria and the quality abnormality criteria are stored in advance in the quality determination model storage unit 219d, the quality abnormality detection execution module 218 can judge quality abnormalities by referring to the quality determination model storage unit 219d.

The memory unit 220 stores the various data generated by the real-time processing unit 210 and specifically includes a collected data storage unit 221, a sorting data storage unit 222, a completion event storage unit 223, and an abnormality detection result storage unit 224.

The standardized data read from the queue 121 through the fetch performing module 211, before mapping, correction, and sorting, is stored in the collected data storage unit 221 by standard item ID, and the loading performing module 212 loads the standardized data from the collected data storage unit 221 and transfers it to the facility mapping performing module 213a.

The mapping data to which at least one of the facility identifier and the material identifier has been mapped by the facility mapping performing module 213a and the material mapping performing module 213b is stored in the sorting data storage unit 222 in a state sorted on a per-material basis. In one embodiment, the load data, to which both the facility identifier and the material identifier are mapped, and the no-load data, to which only the facility identifier is mapped, may be stored separately in the sorting data storage unit 222.

The completion event storage unit 223 stores a completion event notifying that the data whose collection times have been corrected to the collection period and whose missing values have been filled in has been stored, as a sorting result, in the sorting data storage unit 222. Accordingly, the big data analysis system 300 monitors the completion event storage unit 223 and extracts the sorting data from the sorting data storage unit 222 whenever a new completion event occurs.

Specifically, the completion event includes the event transmission time, the data collection time, key information for reading the data from the sorting data storage unit 222, partition and directory information for the event, and the like. Accordingly, when a new completion event is obtained from the completion event storage unit 223, the big data analysis system 300 uses the key information included in the completion event to read the data corresponding to the completion event from the sorting data storage unit 222, and uses the partition and directory information to determine where the data is to be stored.
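
How the analysis side might consume such an event is sketched below; the field names follow the description above, but the two storage objects (sorting_store, big_data_store) and their methods are assumptions, not APIs from the disclosure.

```python
def handle_completion_event(event, sorting_store, big_data_store):
    """Read the sorting data identified by the event's key and persist it into the big data storage."""
    records = sorting_store.read(event["key"])             # key information for reading the data
    target = f'{event["partition"]}/{event["directory"]}'  # where the data should be stored
    big_data_store.write(target, records)

event = {
    "sent_at": "2016-09-02T15:02:00",       # event transmission time
    "collected_at": "2016-09-02T15:01:11",  # data collection time
    "key": "MAT-12345/rolling",             # key for the sorting data storage unit 222
    "partition": "2016-09-02",
    "directory": "rolling",
}
```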

The abnormality detection result storage unit 224 stores the abnormality detection result of the specific facility detected by the equipment abnormality detection execution module 217 and the abnormality detection result of the specific material detected by the quality abnormality detection execution module 218.

Therefore, according to the present invention, a user can access the abnormality detection result storage unit 224 through a separate abnormality detection monitoring system (not shown) built outside the smart factory platform 1000 and check whether an abnormality has occurred in a specific facility or in the quality of a specific material.

However, the present invention is not limited to this, and when an abnormality occurs in a specific facility or in the quality of a specific material, the result can also be transmitted directly to the abnormality detection monitoring system so that the user can check the abnormality immediately.

In the embodiment described above, the distributed parallel processing system 200 maps and sorts the standardized data through one real-time processing unit 210 and one memory unit 220. In a modified embodiment, however, as shown in FIGS. 4 and 5, the distributed parallel processing system 200 may map and sort the standardized data using a plurality of real-time processing units 210a, 210b, and 210c and a plurality of memory units 220.

Hereinafter, a distributed parallel processing system according to the modified embodiment will be described with reference to FIGS. 4 and 5.

FIG. 5 is a diagram schematically illustrating a configuration of a distributed parallel processing system including a plurality of real-time processing units and a plurality of memory units.

As shown in FIG. 5, the distributed parallel processing system 200 includes a plurality of real-time processing units 210a, 210b, and 210c and a plurality of memory units 220a, 220b, and 220c.

One or more of the execution modules 211 to 218 for mapping and sorting the standardized data are distributed to each of the plurality of real-time processing units 210a, 210b, and 210c. In this way, according to the present invention, each of the plurality of real-time processing units 210a, 210b, and 210c executes only the execution modules 211 to 218 distributed to it, thereby preventing the overload that would result from executing all of the execution modules 211 to 218 in a single real-time processing unit.

That is, at least one of the fetch performing module 211, the loading performing module 212, the facility mapping performing module 213a, the material mapping performing module 213b, the data correction execution module 215, the data sorting performing module 216, the equipment abnormality detection execution module 217, and the quality abnormality detection execution module 218 is distributed to each of the plurality of real-time processing units 210a, 210b, and 210c and processed in parallel, and together with the plurality of memory units 220 the large volume of data transmitted from the interface system 100 is processed in real time.

In this case, a plurality of execution modules performing the same function may be distributed to a single real-time processing unit. For example, a plurality of facility mapping execution modules 213a may be disposed in one of the real-time processing units 210a, 210b, and 210c.

In one embodiment, the plurality of real-time processing units 210a, 210b, and 210c may be configured as a cluster. Because of this clustering structure, if a failure occurs in a specific real-time processing unit, the execution modules that were being executed by the failed unit are executed by another real-time processing unit, so that availability can be secured.
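
Purely as a sketch of the idea described above, and not the patented implementation, the following assigns the execution modules to real-time processing units in round-robin fashion and reassigns a failed unit's modules to the surviving cluster members; the module and unit names are illustrative placeholders.

from collections import defaultdict
from itertools import cycle

MODULES = ["fetch_211", "loading_212", "facility_map_213a", "material_map_213b",
           "correction_215", "sorting_216", "equip_anomaly_217", "quality_anomaly_218"]
UNITS = ["unit_210a", "unit_210b", "unit_210c"]

def distribute(modules, units):
    """Spread the modules over the units so no single unit runs everything."""
    assignment = defaultdict(list)
    for module, unit in zip(modules, cycle(units)):
        assignment[unit].append(module)
    return assignment

def fail_over(assignment, failed_unit):
    """On failure, hand the failed unit's modules to the remaining cluster members."""
    orphans = assignment.pop(failed_unit, [])
    survivors = cycle(sorted(assignment))
    for module in orphans:
        assignment[next(survivors)].append(module)
    return assignment

plan = distribute(MODULES, UNITS)
plan = fail_over(plan, "unit_210b")
for unit, modules in plan.items():
    print(unit, modules)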

Although the distributed parallel processing system 200 includes three real-time processing units 210a, 210b, and 210c in FIG. 5, additional real-time processing units may be provided depending on the required data processing capacity.

The plurality of memory units 220 store the data processed by the plurality of real-time processing units 210a, 210b, and 210c. In an embodiment, the plurality of memory units 220 may have a clustering structure, like the above-described queue storage unit 120, in order to increase processing performance and ensure availability in the event of a failure.

That is, when data is stored in one memory unit 220, the data is also copied and stored in another memory unit 220, so that the data can still be provided even if a failure occurs in a specific memory unit 220.

In one embodiment, the plurality of memory units 220 may be provided in a redundant structure for high availability (HA). That is, each memory unit 220 includes a master instance M and a slave instance S. In this case, the master instance M included in the first memory unit 220a and the slave instance S included in the second memory unit 220b operate as a pair, and the master instance M included in the second memory unit 220b and the slave instance S included in the first memory unit 220a operate as a pair.

In this case, when sorting data is stored in the master instance M of the first memory unit 220a, the sorting data is also copied and stored in the slave instance S of the second memory unit 220b, and when sorting data is stored in the master instance M of the second memory unit 220b, the sorting data is copied and stored in the slave instance S of the first memory unit 220a.
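
The cross-pairing of master and slave instances can be pictured with the small sketch below; it only restates the pairing rule described above, and the class name, method names, and keys are invented for this illustration.

class Instance:
    """A simplified in-memory key-value instance (master M or slave S)."""
    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value):
        self.data[key] = value

# Memory units 220a and 220b each hold a master and a slave instance.
m_a, s_a = Instance("220a.M"), Instance("220a.S")
m_b, s_b = Instance("220b.M"), Instance("220b.S")

# Pairing rule: 220a.M replicates to 220b.S, and 220b.M replicates to 220a.S.
pairs = {m_a: s_b, m_b: s_a}

def replicated_write(master, key, value):
    master.write(key, value)
    pairs[master].write(key, value)   # copy to the paired slave

replicated_write(m_a, "sorted/material-42", [913.2, 914.0])
assert s_b.data == m_a.data           # the paired slave holds the replica

# If 220a.M goes down, the paired slave 220b.S already holds the data and can serve it.
print("replica on", pairs[m_a].name, "=", pairs[m_a].data)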

In one embodiment, the sorting data recorded in the slave instance S may be backed up, for failure recovery, as a script-type file for each piece of data. Here, a script-type file means a file in which the command related to writing or reading the data is stored together with the data itself.
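
A script-type backup of this kind can be imagined as an append-only log in which each write command is recorded next to its data, so that replaying the file rebuilds the instance after a failure. The sketch below is only a rough analogy; the file name, JSON line format, and function names are assumptions of this example, not the actual backup format.

import json

BACKUP = "slave_backup.script"   # hypothetical backup file name

def log_write(key, value, path=BACKUP):
    """Append the write command together with its data, one JSON line per record."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"cmd": "SET", "key": key, "value": value}) + "\n")

def recover(path=BACKUP):
    """Replay the logged commands to rebuild the in-memory data after a failure."""
    store = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if entry["cmd"] == "SET":
                store[entry["key"]] = entry["value"]
    return store

log_write("sorted/material-42/point-001", {"temp": 913.2})
log_write("sorted/material-42/point-002", {"temp": 914.0})
print(recover())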

Meanwhile, if a failure occurs while the master instance M included in the first memory unit 220a is operating, the slave instance S paired with it (the slave instance S of the second memory unit 220b) is automatically activated, so that the operation of the real-time processing unit 210 can continue without interruption.

In one embodiment, the master instance M and the slave instance S of each memory unit 220 are configured in single-threaded form, with separate instances and ports for write operations and read operations.

The master instance M and the slave instance S included in each memory unit 220 each comprise the collection data storage unit 221, the sorting data storage unit 222, the completion event storage unit 223, and the abnormality detection result storage unit 224 described above.

Hereinafter, a method of performing distributed parallel processing of the mapping and sorting operations on the standardized data will be described as an example with reference to FIG. 6.

Referring to FIG. 6, since the fetch performing module 211 is distributed to the first real-time processing unit 210a, the first real-time processing unit 210a executes the fetch performing module 211, which connects to the queue 121, fetches the standardized data, and stores the fetched data in the master instance M of the first memory unit 220a. At this time, the data is also copied and stored in the slave instance S of the second memory unit 220b. In the above description the data is stored in the master instance M of the first memory unit 220a, but the data may instead be stored in the master instance M of the second memory unit 220b, in which case the data is copied and stored in the slave instance S of the first memory unit 220a.

The second real-time processing unit 210b includes the loading execution module 212, the facility mapping execution module 213a, and the material mapping execution module 213b. Therefore, the second real-time processing unit 210b executes the loading performing module 212 to read the data from the slave instance S of the second memory unit 220b or the slave instance S of the first memory unit 220a.

The second real-time processing unit 210b then executes the facility mapping execution module 213a to map the equipment identifier to the data read from the slave instance S of the second memory unit 220b or the slave instance S of the first memory unit 220a, and executes the material mapping execution module 213b to map the material identifier to the data to which the equipment identifier has been mapped.

Since the data correction execution module 215 and the data sorting execution module 216 are distributed to the third real-time processing unit 210c, the third real-time processing unit 210c executes the data correction execution module 215 to correct missing values in the mapped data, and executes the data sorting execution module 216 to sort the corrected mapping data by material and store the sorted data in the master instance M of the second memory unit 220b. At this time, the data is also copied and stored in the slave instance S of the first memory unit 220a. In the above description the sorting data is stored in the master instance M of the second memory unit 220b, but the sorting data may instead be stored in the master instance M of the first memory unit 220a, in which case the data is copied and stored in the slave instance S of the second memory unit 220b.

As described above, the master instance M included in the first memory unit 220a and the slave instance S included in the second memory unit 220b are configured redundantly and operate as a pair, and the master instance M included in the second memory unit 220b and the slave instance S included in the first memory unit 220a likewise operate as a pair.

However, according to this embodiment, since the master instance M and the slave instance S are each implemented as a single thread, when the master instance M of the first memory unit 220a goes down there is a limitation in that the slave instance S of the second memory unit 220b cannot adequately serve both the write operations and the read operations during the downtime, until the master instance M of the first memory unit 220a is restored.

Thus, in a modified embodiment, the memory units 220 may be implemented in a triple structure, as shown in the figure. Specifically, each memory unit 220 according to the modified embodiment includes a master instance M, a first slave instance S1, and a second slave instance S2.

In this case, the master instance M included in the first memory unit 220a operates as a pair with the first slave instances S1 of the second and third memory units 220b and 220c. Accordingly, when data is written to the master instance M included in the first memory unit 220a, the data is also copied and stored in the first slave instances S1 of the second and third memory units 220b and 220c.

The master instance M included in the second memory unit 220b operates as a pair with the first slave instance S1 included in the first memory unit 220a and the second slave instance S2 included in the third memory unit 220c. Accordingly, when data is written to the master instance M included in the second memory unit 220b, the data is also copied and stored in the first slave instance S1 of the first memory unit 220a and the second slave instance S2 of the third memory unit 220c.

The master instance M included in the third memory unit 220c operates as a pair with the second slave instances S2 included in the first memory unit 220a and the second memory unit 220b. Accordingly, when data is written to the master instance M included in the third memory unit 220c, the data is also copied and stored in the second slave instances S2 of the first memory unit 220a and the second memory unit 220b.
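
The triple pairing described above can be summarized as a replication map in which each master's writes are copied to one slave instance in each of the other two memory units. The sketch below only restates that mapping in code, with invented identifiers and a dictionary standing in for the instances.

# Replication map for the triple structure: master -> the two slaves it is paired with.
REPLICA_MAP = {
    "220a.M": ["220b.S1", "220c.S1"],
    "220b.M": ["220a.S1", "220c.S2"],
    "220c.M": ["220a.S2", "220b.S2"],
}

stores = {}
for master, slaves in REPLICA_MAP.items():
    for name in (master, *slaves):
        stores[name] = {}

def write(master, key, value):
    """Write to a master instance and copy the data to both of its paired slaves."""
    stores[master][key] = value
    for slave in REPLICA_MAP[master]:
        stores[slave][key] = value

write("220a.M", "sorted/material-7", [1.2, 1.3])
# Even if 220a.M is lost, two replicas remain, and they live in different memory units.
print([name for name, data in stores.items() if "sorted/material-7" in data])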

Referring back to FIG. 2, the big data analysis system 300 stores the data sorted by the distributed parallel processing system 200 in the big data storage space. In addition, the big data analysis system 300 manages data against loss and provides inquiry of historical data. Hereinafter, the big data analysis system 300 according to the present invention will be described in detail with reference to FIG. 7.

FIG. 7 is a block diagram illustrating a configuration of a big data analysis system according to an embodiment of the present invention.

Referring to FIG. 7, the big data analysis system 300 includes a large-capacity data processing unit 310, a big data storage unit 320, and a query processing unit 330.

The large-capacity data processing unit 310 performs distributed parallel processing of the sorting data and the anomaly detection results, and includes a completion event receiving unit 311, a sorting data fetching unit 312, a memory queue 313, a file generating unit 314, and an anomaly detection data receiving unit 315.

The completion event receiving unit 311 monitors the completion event storage unit 223 included in the distributed parallel processing system 200 and delivers the completion event to the sorting data fetching unit 312 when the completion event is newly stored.

When a completion event is received from the completion event receiving unit 311, the sorting data fetching unit 312 retrieves the sorting data corresponding to the completion event from the sorting data storage unit 222 and stores it in the memory queue 313. In one embodiment, the sorting data fetching unit 312 may use the key information included in the completion event to check which partition and directory of the sorting data storage unit 222 hold the data corresponding to the completion event, and then retrieve the data stored there and store it in the memory queue 313.

The memory queue 313 temporarily stores the data read by the sorting data fetching unit 312 in memory before it is stored in the big data store 320. In one embodiment, if the amount of data is not large and loss prevention is required, the memory queue 313 may be implemented on a file basis.

The file creation unit 314 creates a physical file from the data stored in the memory queue 313 and stores it in the big data storage unit 320. In one embodiment, the file creation unit 314 may change the file format or compress the file when storing it in the big data store 320.
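
As a loose illustration of draining the memory queue into a physical file with a format change and compression, the sketch below converts queued records to CSV and stores them gzip-compressed; CSV and gzip, along with the file name and field names, are assumptions of this example rather than the disclosed formats.

import csv
import gzip
import queue

memory_queue = queue.Queue()
for i in range(3):   # pretend the sorting data fetching unit already queued some records
    memory_queue.put({"material": "M-42", "point": i, "temp": 913.0 + i})

def create_file(mem_q, path="sorted_m42.csv.gz"):
    """Drain the queued records, convert them to CSV, and store them gzip-compressed."""
    rows = []
    while not mem_q.empty():
        rows.append(mem_q.get())
    with gzip.open(path, "wt", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["material", "point", "temp"])
        writer.writeheader()
        writer.writerows(rows)
    return path, len(rows)

print(create_file(memory_queue))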

The anomaly detection data receiving unit 315 monitors the anomaly detection result storage unit 224 included in the distributed parallel processing system 200 and stores the result in the memory queue 313 when a new anomaly detection result is stored.

The big data storage unit 320 stores the files generated by the file creation unit 314. In one embodiment, the big data store 320 may be implemented on a distributed file system basis. For example, the big data store 320 may be implemented as a Hadoop-based distributed file system.

According to this embodiment, the big data store 320 is composed of a master node 320a and data nodes 320b. The master node 320a distributes and stores the large volume of files generated by the large-capacity data processing unit 310 in the data nodes 320b, generates jobs for inquiring the data stored in the data nodes 320b, and manages the metadata.

Here, a job means a unit for processing a query received from the query processing unit 330 in order to inquire the data stored in the data nodes 320b. For example, when a query for data of one table recorded in the data nodes 320b is executed and ten data nodes 320b are found to hold data of that table, jobs for retrieving the data from each of the ten data nodes 320b and a job for consolidating the data acquired from each data node 320b are executed.

The metadata includes the location of a file stored in the data node 320b, the file name, the IDs of the blocks in which the file is stored, and the storage location of each server. For example, when a file is created by the file creation unit 314, the location and file name of the file are stored in the metadata. If the file is larger than the block size and is, for example, divided into five blocks stored on three different servers, the ID of each block and the storage location of each server are additionally stored in the metadata.

When a job for inquiring the data stored in the data nodes 320b is executed, the metadata is used as location information of the data for distributing the job and for loading the data of a specific file.
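
For illustration, the kind of metadata the master node keeps, namely the file path and name, the IDs of the blocks the file was split into, and the servers storing each block, might be modeled as below. This mirrors the description above in spirit only; the data structures, paths, and server names are invented and are not the actual Hadoop NameNode format.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FileMetadata:
    path: str                          # location of the file in the data node
    name: str                          # file name
    blocks: List[str] = field(default_factory=list)                      # block IDs
    block_locations: Dict[str, List[str]] = field(default_factory=dict)  # block ID -> servers

# A file split into five blocks spread over three servers, as in the example above.
meta = FileMetadata(path="/warehouse/load_data/2016/09/02", name="sorted_0001.csv.gz")
servers = ["datanode-1", "datanode-2", "datanode-3"]
for i in range(5):
    block_id = f"blk_{i:04d}"
    meta.blocks.append(block_id)
    meta.block_locations[block_id] = [servers[i % 3]]

def locate(meta: FileMetadata):
    """Location information used when distributing a job that must read this file."""
    return {block: meta.block_locations[block] for block in meta.blocks}

print(locate(meta))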

The large volume of files generated by the large-capacity data processing unit 310 is stored in the data nodes 320b. In one embodiment, the data node 320b may be provided in plurality, and each data node 320b includes a historical data store 322 and a model store 324.

The historical data storage unit 322 included in each data node 320b stores not only the files generated by the file creation unit 314 but also the large volume of collected data gathered in real time by the data collection device 1. In one embodiment, the files generated by the file creation unit 314 may be stored in a separate relational database (RDB).

The model storage unit 324 stores the quality determination model and the anomaly prediction model required for determining the quality of a material or a product through the service system 400.

The query processing unit 330 is a configuration for querying the data stored in the big data store 320 and returning the results, and includes a query receiving unit 332, a query execution unit 336, and a query result transmitting unit 338. The query processing unit 330 may further include a query scheduling unit 334.

Specifically, the query receiving unit 332 receives the query from the user and interprets the received query syntax.

The query execution unit 336 transfers the query received via the query receiving unit 332 to the big data store 320 so that the query is executed, and obtains the query execution result from the big data store 320.

The query result transmitting unit 338 transmits data obtained from the big data store 320 as a result of the query to the user requesting the query.

Meanwhile, when the query received through the query receiving unit 332 is composed of a plurality of sub-queries, the query scheduling unit 334 classifies the received query into the respective sub-queries and transmits them to the query execution unit 336.
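
A rough sketch of the receive, schedule, execute, and return flow described above is given below. Splitting the query text on semicolons and the stubbed execution function are assumptions made only for this example and do not reflect the actual query syntax or store interface.

from typing import List

def receive_query(raw: str) -> List[str]:
    """Query receiving unit: accept the query text and split it into sub-queries."""
    return [q.strip() for q in raw.split(";") if q.strip()]

def schedule(sub_queries: List[str]) -> List[str]:
    """Query scheduling unit: hand the classified sub-queries to the execution unit."""
    return sub_queries  # in this toy version scheduling is simple pass-through ordering

def execute(sub_query: str) -> list:
    """Query execution unit: forward the sub-query to the big data store (stubbed here)."""
    return [f"rows for <{sub_query}>"]

def return_results(results: list) -> list:
    """Query result transmitting unit: return the gathered results to the requesting user."""
    return results

raw = "SELECT * FROM load_data WHERE material='M-42'; SELECT count(*) FROM no_load_data"
results = [execute(q) for q in schedule(receive_query(raw))]
print(return_results(results))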

In the embodiment described above, the sorting data fetching unit 312 of the large-capacity data processing unit 310 reads the data stored in the sorting data storage unit 222 regardless of the type of data and stores it in the memory queue 313. In another embodiment, however, the sorting data fetching unit 312 may read the load data and the no-load data separately in order to improve the processing speed. Hereinafter, the configuration of the big data analysis system according to this other embodiment will be described in more detail with reference to FIG. 8.

FIG. 8 is a diagram illustrating a configuration of a big data analysis system according to another embodiment of the present invention.

Referring to FIG. 8, the big data analysis system 300 includes a large-capacity data processing unit 310, a big data storage unit 320, and a query processing unit 330. The configurations identical to those shown in FIG. 7 are not described again; only the configurations of the large-capacity data processing unit 310 that differ from those shown in FIG. 7 are described below.

The large-capacity data processing unit 310 according to another embodiment of the present invention includes a completion event receiving unit 311, a first sorting data fetching unit 312a, a second sorting data fetching unit 312b, a data dividing unit 312c, a memory queue 313, a plurality of file generating units 314a to 314n, and an anomaly detection data receiving unit 315.

The functions of the completion event receiving unit 311, the memory queue 313, and the abnormality sensing data receiving unit 315 are the same as those shown in FIG. 7, and thus a detailed description thereof will be omitted.

The first sorting data fetching unit 312a reads the load data from the sorting data storage unit 222. Load data refers to collected data gathered while each piece of equipment is processing the material, and both the equipment identifier and the material identifier are mapped to the load data. For example, temperature data or flatness data collected in a roughing mill (RM) or a finishing mill (FM) of the thick plate process while the material is being processed may correspond to load data. As shown in FIG. 10, such load data is characterized by changing values and a large variation width.

The second sorting data fetching unit 312b reads the no-load data from the sorting data storage unit 222. No-load data refers to collected data gathered by each facility while it is not processing any material; only the equipment identifier is mapped to the no-load data, and no material identifier is mapped. For example, temperature data or flatness data collected in a roughing mill (RM) or a finishing mill (FM) of the plate process while no material is being processed may correspond to no-load data. Since the no-load data is measured in a state in which no work is performed, it is characterized by mostly the same value occurring continuously, as shown in the figure.
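
The distinction can be stated compactly: a collected record carrying both an equipment identifier and a material identifier is load data, while a record carrying only an equipment identifier is no-load data. A small sketch follows, with field names and sample values assumed purely for illustration.

records = [
    {"equipment_id": "RM-01", "material_id": "M-42", "temp": 913.2},  # collected while processing
    {"equipment_id": "FM-02", "material_id": None,   "temp": 25.1},   # collected while idle
]

def is_load_data(record: dict) -> bool:
    """Load data is mapped to both identifiers; no-load data only to the equipment identifier."""
    return record.get("material_id") is not None

load = [r for r in records if is_load_data(r)]
no_load = [r for r in records if not is_load_data(r)]
print(len(load), "load record(s),", len(no_load), "no-load record(s)")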

The first sorting data fetching unit 312a and the second sorting data fetching unit 312b described above may be implemented in plural in order to further improve the processing speed.

In one embodiment, the sorting data storage unit 222 may include a first sorting data storage unit 222a for storing load data and a second sorting data storage unit 222b for storing no-load data. According to this embodiment, the first sorting data fetching unit 312a reads the load data from the first sorting data storage unit 222a, and the second sorting data fetching unit 312b reads the no-load data from the second sorting data storage unit 222b.

When the completion event receiving unit 311 receives, from the completion event storage unit 223, an event indicating that storage of load data in the first sorting data storage unit 222a is completed, it transmits the event to the first sorting data fetching unit 312a so that the first sorting data fetching unit 312a reads the load data from the first sorting data storage unit 222a. When the completion event receiving unit 311 receives, from the completion event storage unit 223, an event indicating that storage of no-load data in the second sorting data storage unit 222b is completed, it transmits the event to the second sorting data fetching unit 312b so that the second sorting data fetching unit 312b reads the no-load data from the second sorting data storage unit 222b.

The first sorting data fetching unit 312a may use the key information included in the completion event transmitted from the completion event receiving unit 311 to identify the partition and directory of the first sorting data storage unit 222a in which the data corresponding to the completion event are stored, and read the load data stored there.

Likewise, the second sorting data fetching unit 312b may use the key information included in the completion event transmitted from the completion event receiving unit 311 to identify the partition and directory of the second sorting data storage unit 222b in which the data corresponding to the completion event are stored, and read the no-load data stored there.

In the above-described embodiment, the first sorting data storage unit 222a and the second sorting data storage unit 222b may be implemented as queues. If the first sorting data storage unit 222a and the second sorting data storage unit 222b were implemented in the form of a memory cache, a storage event would be delivered to the completion event receiving unit 311 immediately after being generated, and the event would be lost if the completion event receiving unit 311 happened to be down at that moment. If the first sorting data storage unit 222a and the second sorting data storage unit 222b are implemented in the form of a queue, however, the events are retained, so that even if the completion event receiving unit 311 goes down, it can resume processing from the event it was handling before the failure once it recovers.

Further, according to the above-described embodiment, the file generation units 314a to 314n generate the load data as files and write them into a load data table (not shown) in the historical data storage 322, and generate the no-load data as files and write them into a no-load data table (not shown) in the historical data storage 322. That is, because files generated from load data and files generated from no-load data are recorded in different tables, scanning of the no-load data can be omitted when data retrieval and analysis are required, which improves the processing speed.

The data dividing unit 312c divides the load data read by the first sorting data fetching unit 312a, or the no-load data read by the second sorting data fetching unit 312b, into units of a predetermined number of records and stores them in the memory queue 313.

The reason why the big data analysis system 300 according to another embodiment of the present invention divides the data into units of a predetermined number through the data dividing unit 312c is that, if a large amount of data, for example more than three million records, were stored in the memory queue at once, an out-of-memory (OOM) condition could occur and the system could be shut down. In one embodiment, the data dividing unit 312c divides the load data read by the first sorting data fetching unit 312a, or the no-load data read by the second sorting data fetching unit 312b, into units of a fixed number of records and stores them in the memory queue 313.
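
The splitting step can be pictured as simple chunking before enqueueing, so that only a bounded amount of data sits in the memory queue at any moment. The chunk size and queue bound below are arbitrary values chosen for this sketch, not the values used by the system.

import queue
from typing import Iterable, Iterator, List

def split_into_units(records: Iterable[dict], unit_size: int) -> Iterator[List[dict]]:
    """Yield the records in fixed-size units so they are never enqueued all at once."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == unit_size:
            yield batch
            batch = []
    if batch:
        yield batch

memory_queue: "queue.Queue[List[dict]]" = queue.Queue(maxsize=4)  # bounded to cap memory use

big_read = ({"point": i} for i in range(25))        # stands in for millions of fetched records
for unit in split_into_units(big_read, unit_size=10):
    memory_queue.put(unit)                           # blocks if the queue is full
    print("enqueued unit of", len(unit), "records")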

The file generation units 314a to 314n generate physical files from the data stored in the memory queue 313. As shown in FIG. 9, since the big data analysis system 300 according to another embodiment of the present invention includes a plurality of file generation units 314a to 314n, the file creation jobs can be processed in parallel by the plurality of file generation units 314a to 314n, which speeds up file creation. According to this embodiment, the plurality of file generation units 314a to 314n may be clustered with each other.
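
The effect of running several file generation units at once can be sketched with a thread pool in which each worker writes its own unit of records to a separate file; the pool size, file naming, and CSV output are assumptions of this example only.

import csv
from concurrent.futures import ThreadPoolExecutor

units = [[{"point": j, "temp": 900.0 + j} for j in range(i * 5, i * 5 + 5)] for i in range(4)]

def generate_file(index_and_unit):
    """One 'file generation unit': write its unit of records to its own physical file."""
    index, unit = index_and_unit
    path = f"sorted_part_{index:03d}.csv"
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["point", "temp"])
        writer.writeheader()
        writer.writerows(unit)
    return path

# Several generation units working in parallel, analogous to 314a to 314n.
with ThreadPoolExecutor(max_workers=4) as pool:
    for path in pool.map(generate_file, enumerate(units)):
        print("wrote", path)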

Referring again to FIG. 1, the service system 400 is a configuration for reusing standardized processing processes and business standards as services. The service system 400 serves as a repository of business know-how, facilitates linkage between controls, and invokes and executes analysis models, including a quality determination model or an anomaly prediction model for a material or product, and processes the analysis results.

Since the analysis models are stored in advance in the model storage unit 324 shown in FIGS. 7 and 9, when an execution call event for an analysis model is input, the service system 400 executes the analysis model stored in the model storage unit 324 with the data required by the model and provides the result.

That is, the service system 400 may directly receive and analyze the data processed by the distributed parallel processing system 200, or, when the processed data is stored in the big data store 320 of the big data analysis system 300, may analyze the corresponding data by referring to the big data store 320.

The management system 500 manages configuration files and UI/UX management data for the individual components belonging to the smart factory platform 1000, and provides individual monitoring of each component, management of linkage information between predetermined setting values, and integrated monitoring of the processing performance of the entire system.

The security system 600 performs authentication, authorization and access control for the user, and manages security for the data itself and security for the transmission path.

The application system 3 processes and provides screens and data necessary for the user based on the smart factory platform 1000.

It will be understood by those skilled in the art that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.

It is therefore to be understood that the above-described embodiments are illustrative in all aspects and not restrictive. The scope of the present invention is defined by the appended claims rather than by the detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents are to be construed as being included within the scope of the present invention.

1: data collecting device 2: network
3: Application System 1000: Smart Factory Platform
100: Interface system 200: Distributed parallel processing system
300: Big Data Analysis System 310: Mass Data Processing Unit
320: Data store 330: Query processor

Claims (20)

A big data analysis system for a smart factory for processing collected data collected in a continuous process in which a plurality of processes are connected,
A first sorting data fetch unit for reading load data for each process among the data collected in the continuous process;
A file generation unit for generating the load data as a file; And
And a big data store in which the file generated by the file generating unit is stored.
The method according to claim 1,
Wherein the first sorting data fetching unit reads out, as the load data, the collected data to which the equipment identifier of the facility in which the collected data are generated and the material identifier of the material produced in the facility are mapped.
The method according to claim 1,
Further comprising a second sorting data fetching unit for extracting no-load data among the data collected in the continuous process,
Wherein the second sorting data fetching unit reads out, as the no-load data, the collected data to which the equipment identifier of the facility in which the collected data are generated is mapped.
The method according to claim 2 or 3,
Wherein the facility identifier is extracted based on a time at which the collected data is collected and attribute information of the sensor that collected the collected data.
3. The method of claim 2,
Wherein the material identifier is extracted based on work instruction information for each process.
3. The method of claim 2,
Wherein the first sorting data fetching unit reads out, as the load data, the collected data in which the mapping data, to which the equipment identifier and the material identifier are mapped, are sorted in at least one of a collection time sequence and units of the material processed in the process.
The method according to claim 6,
Wherein the first sorting data fetching unit reads out, as the load data, the collected data that is unit-aligned on the basis of the reference data at each reference point, the reference data being calculated based on the distance between the reference points set at predetermined intervals on the material and the collecting position of the collected data, and
Wherein the collecting position is determined using at least one of a length of the material processed in the process, a moving speed of the material, and a collection period of the collection data.
The method according to claim 6,
The first sorting data fetching unit reads out, as the load data, the collected data in which the mapping data and the reference data are aligned in one direction on the material processed in the process,
The reference data being determined based on a collection point of the collection data and a reference point of the material,
Wherein one direction on the material is at least one of a longitudinal direction of the material, a width direction of the material, and a thickness direction of the material.
The method of claim 3,
Wherein the load data is recorded in a first sorting data store configured in a queue form,
Wherein the no-load data is recorded in a second sorting data store configured in a queue form.
The method according to claim 1,
Further comprising a memory queue in which load data read by said first sorting data fetching unit is temporarily recorded,
Wherein the file generation unit generates load data recorded in the memory queue as a file.
11. The method of claim 10,
Further comprising a data dividing unit for dividing the load data read by the first sorting data fetching unit into a predetermined number of units,
Wherein the data division unit records the load data divided by the predetermined number of units in the memory queue.
The method according to claim 1,
Wherein the file generation unit is implemented in plurality and the plurality of file generation units are clustered so that the plurality of file generation units process the file creation jobs of the load data in parallel.
The method according to claim 1,
The big data store,
A plurality of data nodes in which files generated by the file creation unit are stored; And
And a master node for distributing and storing the file generated by the file generation unit to the plurality of data nodes.
14. The method of claim 13,
The master node,
Wherein when a query for querying a file stored in the plurality of data nodes is received, a job, which is a unit for processing the query, is generated and managed.
14. The method of claim 13,
The master node,
Managing metadata including location information of a file stored in the data node and a file name,
Wherein the location information of the file includes at least one of a storage location of the file in the data node, an ID of a block in which the file is stored, and location information of the data node in which the file is stored.
14. The method of claim 13,
Further comprising a second sorting data fetching unit for extracting no-load data among the data collected in the continuous process,
Wherein the file generation unit generates the load data as a file and stores the file in a first table of the data node, and generates the no-load data as a file and stores the file in a second table of the data node.
The method according to claim 1,
Further comprising a query processing unit for executing a query input from a user to query the big data store and returning a query execution result to a user.
18. The apparatus of claim 17, wherein the query processing unit
A query scheduling unit that classifies the query into the plurality of sub queries if the query received from the user includes a plurality of sub queries;
A query execution unit for transmitting the sub-queries classified by the query scheduling unit to the big data store to execute the sub-query, and obtaining the query execution result from the big data store; And
And a query result transmission unit for returning a query execution result obtained by the query execution unit to a user.
The method according to claim 1,
A second sorting data fetching unit for extracting no-load data among the data collected in the continuous process; And
Further comprising a completion event receiving unit which, when a first completion event informing of the generation of the load data is received, transmits the first completion event to the first sorting data fetching unit so that the first sorting data fetching unit reads the load data, and which, when a second completion event informing of the generation of the no-load data is received, transmits the second completion event to the second sorting data fetching unit so that the second sorting data fetching unit reads out the no-load data.
20. The method of claim 19,
Wherein the first completion event includes a first key corresponding to the load data, and the first sorting data fetching unit uses the first key to identify the partition and the directory of the first sorting data store in which the load data is stored and reads the load data therefrom,
And the second completion event includes a second key corresponding to the no-load data, and the second sorting data fetching unit uses the second key to identify the partition and the directory of the second sorting data store in which the no-load data is stored and reads the no-load data therefrom.
KR1020160113506A 2016-09-01 2016-09-02 Big Data Analysis System for Smart Factory KR101892351B1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
KR1020160113506A KR101892351B1 (en) 2016-09-02 2016-09-02 Big Data Analysis System for Smart Factory
US15/693,158 US11079728B2 (en) 2016-09-01 2017-08-31 Smart factory platform for processing data obtained in continuous process
CN201710780183.7A CN107798460B (en) 2016-09-01 2017-09-01 Intelligent factory platform for processing data obtained in a continuous process

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020160113506A KR101892351B1 (en) 2016-09-02 2016-09-02 Big Data Analysis System for Smart Factory

Publications (2)

Publication Number Publication Date
KR20180026301A true KR20180026301A (en) 2018-03-12
KR101892351B1 KR101892351B1 (en) 2018-08-27

Family

ID=61729023

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020160113506A KR101892351B1 (en) 2016-09-01 2016-09-02 Big Data Analysis System for Smart Factory

Country Status (1)

Country Link
KR (1) KR101892351B1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200021686A (en) 2018-08-21 2020-03-02 유재성 Smart factory system based on health test room for operating personnel
KR20200023882A (en) * 2018-08-27 2020-03-06 스퀘어네트 주식회사 Processing method of process data of smart factory
WO2020106018A1 (en) * 2018-11-20 2020-05-28 부산대학교 산학협력단 Heating furnace monitoring system and method
KR20200059146A (en) * 2018-11-20 2020-05-28 부산대학교 산학협력단 System and method for monitoring heating furnaces in hot press forging factory
KR20200094852A (en) * 2019-01-25 2020-08-10 전자부품연구원 Connected car big data acquisition device, system and method
CN115913809A (en) * 2022-09-26 2023-04-04 重庆长安汽车股份有限公司 Data distribution communication method, system, computer device and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102643413B1 (en) 2021-12-29 2024-03-05 금오공과대학교 산학협력단 Autonomous drone system for smart factory automation management

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150062596A (en) * 2013-11-29 2015-06-08 주식회사 포스코아이씨티 System and method for distributing big data
KR20160015505A (en) * 2014-07-30 2016-02-15 삼성전자주식회사 Method and system for processing a data from equipment

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200021686A (en) 2018-08-21 2020-03-02 유재성 Smart factory system based on health test room for operating personnel
KR20200023882A (en) * 2018-08-27 2020-03-06 스퀘어네트 주식회사 Processing method of process data of smart factory
WO2020106018A1 (en) * 2018-11-20 2020-05-28 부산대학교 산학협력단 Heating furnace monitoring system and method
KR20200059146A (en) * 2018-11-20 2020-05-28 부산대학교 산학협력단 System and method for monitoring heating furnaces in hot press forging factory
KR20200094852A (en) * 2019-01-25 2020-08-10 전자부품연구원 Connected car big data acquisition device, system and method
US11609922B2 (en) 2019-01-25 2023-03-21 Korea Electronics Technology Institute Connected car big data acquisition device, system and method for storing data gathered in a single platform
CN115913809A (en) * 2022-09-26 2023-04-04 重庆长安汽车股份有限公司 Data distribution communication method, system, computer device and storage medium
CN115913809B (en) * 2022-09-26 2024-05-03 重庆长安汽车股份有限公司 Data distribution communication method, system, computer device and storage medium

Also Published As

Publication number Publication date
KR101892351B1 (en) 2018-08-27

Similar Documents

Publication Publication Date Title
KR101892351B1 (en) Big Data Analysis System for Smart Factory
KR101892350B1 (en) Smart factory flatform for processing mass data of continuous process in a real time
KR102210972B1 (en) Smart Factory System for Performing High Speed Big Data Analysis
KR101951526B1 (en) Interface Middleware System for Smart Factory Platform and Method for Operating The Same
CN107798460B (en) Intelligent factory platform for processing data obtained in a continuous process
KR101938946B1 (en) Interface Middleware System for Collecting Mass Data and Method for Collecting Data of Interface Middleware System
KR20180026596A (en) Distributed Parallel Processing System for Processing Data Of Continuous Process In Rea Time
US20120116827A1 (en) Plant analyzing system
KR101892352B1 (en) Middleware system for standardizing data of continuous process
KR20190040162A (en) Distributed Parallel Processing System for Processing Data Of Continuous Process In Real Time
CN102622367B (en) Method for filtering and compressing process data
KR101951527B1 (en) Real time processing system of mass data for managing memory cache automatically
JP7460233B2 (en) Information processing device, information processing method, program, and data structure
JP7210982B2 (en) Information processing device, information processing method, and program
KR102312523B1 (en) Interface Middleware System for Collecting Mass Data
KR102026301B1 (en) Distributed parallel processing system for preventing data loss and method thereof
CN107612721B (en) Management system and method of operation and maintenance data, convergence server and processing server
JP2020057090A (en) data structure
JP3242031B2 (en) FA information management method and system
KR20210086175A (en) Data preprocessing system
TWI808001B (en) Production data integration system and method for bar mill
US10289635B2 (en) Control apparatus and control system aggregating working data of manufacturing cells
JP6916584B2 (en) Plant information management device
CN116701139A (en) Diagnostic test tool loss statistical system and method
JP2023068509A (en) Information processing device, information processing method, and program

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant