CN116389475A - Kafka-based industrial enterprise real-time ubiquitous interconnection method - Google Patents


Info

Publication number
CN116389475A
Authority
CN
China
Prior art keywords
kafka
information
data
layer
industrial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310294544.2A
Other languages
Chinese (zh)
Inventor
王里程
王亚杰
邵光达
魏铭濡
韩日东
熊鑫
黄永梁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anshan Iron And Steel Group Information Industry Co ltd
Original Assignee
Anshan Iron And Steel Group Information Industry Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anshan Iron And Steel Group Information Industry Co ltd filed Critical Anshan Iron And Steel Group Information Industry Co ltd
Priority to CN202310294544.2A priority Critical patent/CN116389475A/en
Publication of CN116389475A publication Critical patent/CN116389475A/en
Pending legal-status Critical Current

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/12 - Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 - Architectures; Arrangements
    • H04L67/30 - Profiles
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 - Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a Kafka-based real-time ubiquitous interconnection method for industrial enterprises, in which an edge computing service provides industrial Internet of things data to Kafka through Kafka Connect, an enterprise management information system provides management information system data to Kafka through Kafka Connect, layer-by-layer reporting of multi-level data is realized based on Kafka, and schema information sharing is realized based on Schema Registry, so that real-time ubiquitous interconnection of industrial enterprises is finally achieved. The advantages of the invention are: the edge computing service provides industrial Internet of things data to Kafka through Kafka Connect; the enterprise management information system provides management information system data to Kafka through Kafka Connect; layer-by-layer reporting of multi-level data is realized based on Kafka; schema information sharing is realized based on Schema Registry.

Description

Kafka-based industrial enterprise real-time ubiquitous interconnection method
Technical Field
The invention relates to the field of industrial Internet of things, in particular to a Kafka-based industrial enterprise real-time ubiquitous interconnection method.
Background
As informatization deepens, the data integration problem of industrial enterprises is becoming increasingly prominent. The two largest types of data sources in an industrial enterprise are industrial Internet of things data, which originates from equipment, and enterprise management information system data, which originates from information systems. Achieving a real-time, reliable, ubiquitous connection between the two is the key to solving this problem.
Traditional enterprise data integration based on ETL has three problems:
1. Insufficient real-time performance. Traditional ETL generally synchronizes data once per day, reducing the impact on the performance of the database being read at the expense of data timeliness. Improving timeliness requires a higher reading frequency, which increases the load on that database.
2. Insufficient support for industrial Internet of things data. Iron and steel enterprises have a large number of devices and measuring points, and production process data is generated at high frequency and in large volume; time-series data is the principal process data in the iron and steel industry. Time-series data is not stored in a traditional relational database, so the traditional ETL approach cannot support the integration of industrial Internet of things data.
3. Insufficient support for layer-by-layer aggregation of multi-level data. Industrial enterprises typically have multiple organizational levels, such as workshop, production line, factory and group, each with its own information system. The traditional approach requires a great deal of development work to support multi-level data aggregation and cannot achieve automatic, low-cost, layer-by-layer aggregation.
Kafka is both a message engine system and a distributed stream processing platform. It supports multiple partitions and multiple replicas, offers high read/write performance, high fault tolerance and horizontal scalability, and is currently a widely used enterprise data bus.
Schema Registry shares the schema information of the message body between message producers and consumers through a registry, so that compact encoding formats such as Avro and Protobuf can be used to reduce the size of the message body, which improves storage capacity, shortens serialization time and improves performance.
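To illustrate why sharing the schema out of band shrinks the message body, the following minimal sketch compares a self-describing JSON encoding with a fixed-layout binary encoding whose field names and types are known to both sides in advance. It uses Python's `struct` module as a stand-in for Avro, not Avro itself, and the record fields and the `SCHEMA` layout are hypothetical.

```python
import json
import struct

# Hypothetical sensor reading; the field names are illustrative only.
reading = {"device_id": 1042, "timestamp_ms": 1679616000000, "temperature": 1523.5}

# Self-describing encoding: every message repeats the field names.
json_bytes = json.dumps(reading).encode("utf-8")

# Schema-based encoding: producer and consumer agree on the layout in advance
# (unsigned 32-bit int, signed 64-bit int, 64-bit float, big-endian),
# so only the values travel in the message body.
SCHEMA = struct.Struct(">Iqd")
FIELDS = ("device_id", "timestamp_ms", "temperature")


def encode(record):
    """Pack the record values in schema order, without field names."""
    return SCHEMA.pack(*(record[name] for name in FIELDS))


def decode(payload):
    """Rebuild the record by pairing the shared field names with the values."""
    return dict(zip(FIELDS, SCHEMA.unpack(payload)))


binary_bytes = encode(reading)
print(len(json_bytes), len(binary_bytes))
```

The binary form carries only the values, so its size does not grow with the length of the field names; a real Avro or Protobuf encoder adds variable-length integers and schema evolution on top of the same idea.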
Kafka Connect is a data integration technology based on Kafka. Kafka Connect provides excellent fault tolerance and scalability because it runs as a distributed service and ensures that all registered and configured connectors are always running. For example, even if one Kafka Connect worker node in a cluster fails, the remaining worker nodes restart any connectors that were running on the failed node, thereby minimizing downtime and removing the need for manual intervention. Kafka Connect tasks are divided into two types, "source" and "sink": a source is responsible for data collection and a sink is responsible for data delivery. Kafka MirrorMaker 2.0 and Debezium are two typical Kafka Connect applications.
Kafka MirrorMaker 2.0 (MM2) aims to make it easier to mirror or replicate topics from one Kafka cluster to another. It uses the Kafka Connect framework to simplify configuration and scaling. It dynamically detects topic changes and keeps the properties of source and target topics synchronized, including offsets and partitions.
CDC (Change Data Capture) is a data synchronization mechanism that continuously monitors changes in the original data system, extracts them and distributes them to downstream systems; by loading data incrementally in near real time, it eliminates bulk data loading.
Debezium is an open source implementation of CDC technology. Debezium connectors are typically deployed to a Kafka Connect service, where one or more connectors are configured to monitor an upstream database and generate a data change event for every change they observe in it. These data change events are written to Kafka, where many different applications can consume them independently.
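As a rough illustration of how a consumer can use such change events, the sketch below replays a short list of events against an in-memory replica. The events follow a simplified form of the Debezium envelope (`op`, `before`, `after`); the table contents, the `id` key and the helper name `apply_change` are hypothetical.

```python
def apply_change(table, event, key="id"):
    """Apply one simplified Debezium-style change event to a dict replica."""
    op = event["op"]
    if op in ("c", "r", "u"):  # create, snapshot read, update
        row = event["after"]
        table[row[key]] = row
    elif op == "d":  # delete: only the "before" image is present
        table.pop(event["before"][key], None)
    else:
        raise ValueError(f"unsupported operation: {op!r}")
    return table


# Hypothetical change events from a production-order table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "planned"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "planned"}},
    {"op": "u", "before": {"id": 1, "status": "planned"},
     "after": {"id": 1, "status": "rolled"}},
    {"op": "d", "before": {"id": 2, "status": "planned"}, "after": None},
]

replica = {}
for event in events:
    apply_change(replica, event)
print(replica)
```

Because each event carries only the changed row, the replica stays current without reloading the whole table, which is the incremental-loading property described above.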
Stream data is data that arrives continuously as a large, fast, time-varying stream, and stream computing refers to the real-time computation of stream data.
MQTT is a lightweight message transport protocol based on the publish/subscribe model. It is designed to minimize network bandwidth and device resource requirements while still ensuring a degree of reliable message delivery, and its design philosophy is to be open, simple, lightweight and easy to implement. The protocol provides one-to-many message distribution that decouples applications, uses TCP/IP for network connectivity, and offers three quality-of-service levels for message publishing, which makes it well suited to machine-to-machine (M2M) communication, the Internet of Things (IoT) and mobile push services.
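The one-to-many distribution can be pictured with a small sketch of MQTT topic-filter matching, in which `+` matches exactly one topic level and a trailing `#` matches all remaining levels. This is a simplified illustration rather than a broker implementation (it ignores the special handling of topics beginning with `$`), and the subscriber names and topics are hypothetical.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last filter level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


def deliver(subscriptions, topic):
    """List the subscribers whose filter matches the published topic."""
    return sorted(name for name, flt in subscriptions.items()
                  if topic_matches(flt, topic))


# Hypothetical subscribers in a plant.
subscriptions = {
    "historian": "plant/#",
    "line1_dashboard": "plant/line1/+",
    "alarm_service": "plant/+/alarm",
}
receivers = deliver(subscriptions, "plant/line1/temperature")
print(receivers)
```

A single publication reaches every subscriber whose filter matches, and the publisher needs no knowledge of who those subscribers are.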
Publication No. CN112565333A discloses a data transmission method based on Kafka Connect; it describes a working mode of Kafka Connect but does not describe a specific application scenario of Kafka Connect in industrial enterprises.
The method makes full use of the ability of the lightweight MQTT protocol to support millions of simultaneous device connections, introduces a Kafka cluster to compensate for the MQTT protocol's lack of load balancing, and exploits the high speed of sequential disk writes to meet the demands of high-concurrency scenarios, thereby greatly improving message transmission speed and supporting the storage and asynchronous processing of real-time data streams. However, the method is not implemented based on Kafka Connect.
Disclosure of Invention
To overcome the deficiencies of the prior art, the object of the invention is to provide a Kafka-based real-time ubiquitous interconnection method for industrial enterprises that, building on the Kafka technology ecosystem, fuses industrial Internet of things data with enterprise management information system data, establishes a layer-by-layer information reporting mechanism for the multi-level workshop, production line, factory and group organization typical of industrial enterprises, and achieves comprehensive cloud-edge-device collaboration.
To achieve the above object, the invention is realized by the following technical scheme:
The Kafka-based real-time ubiquitous interconnection method for industrial enterprises specifically comprises the following steps:
1) Industrial Internet of things data access: receiving industrial Internet of things messages based on the MQTT protocol, receiving the MQTT messages through the data integration tool Kafka Connect, and writing the messages into a Kafka queue;
2) Enterprise management information system data access: receiving database change information based on CDC technology, and writing the change information into a Kafka queue whenever the database changes;
3) Layer-by-layer information reporting: synchronizing the schema information of the lower-layer Kafka cluster into the upper-layer Kafka Schema Registry; synchronizing the data and offset information of the lower-layer Kafka cluster to the upper-layer Kafka cluster;
4) Data fusion: performing data fusion based on stream computing technology, thereby finally fusing the industrial Internet of things data with the enterprise management information system data.
The industrial Internet of things data access specifically comprises the following steps:
S11, deploying an MQTT receiving gateway at the edge side or in the cloud;
S12, deploying a Kafka cluster and a Kafka Connect cluster, and starting the Confluent Schema Registry service, at the edge side or in the cloud;
S13, deploying an edge computing service at the edge, which collects industrial Internet of things information through an industrial protocol such as OPC-UA, MODBUS or PROFIBUS, registers the message schema information in the Confluent Schema Registry, encodes the messages in Avro format, and uploads them to the MQTT gateway through the MQTT protocol;
S14, configuring a Kafka Connect task that reads the industrial Internet of things information from MQTT through an MQTT source connector and writes it into a Kafka queue.
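Step S14 can be sketched as the JSON payload that would be submitted to the Kafka Connect REST interface (`POST /connectors`). The patent does not name a particular MQTT source connector; the sketch assumes Confluent's MQTT source connector, and the property names should be checked against the documentation of the connector actually deployed. The host names, topic names and the helper `mqtt_source_connector` are hypothetical.

```python
import json


def mqtt_source_connector(name, mqtt_uri, mqtt_topics, kafka_topic, bootstrap):
    """Build a Kafka Connect connector definition for an MQTT source.

    Assumes the Confluent MQTT source connector; other connectors use
    different class and property names.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
            "tasks.max": "1",
            "mqtt.server.uri": mqtt_uri,
            "mqtt.topics": ",".join(mqtt_topics),
            "kafka.topic": kafka_topic,
            # Pass the MQTT payload bytes through unchanged.
            "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
            "confluent.topic.bootstrap.servers": bootstrap,
        },
    }


payload = mqtt_source_connector(
    name="plc-mqtt-source",                      # hypothetical connector name
    mqtt_uri="tcp://mqtt-gateway:1883",          # hypothetical MQTT gateway
    mqtt_topics=["plant/+/temperature", "plant/+/pressure"],
    kafka_topic="iiot.plc.readings",             # hypothetical Kafka topic
    bootstrap="kafka:9092",
)
body = json.dumps(payload, indent=2)
print(body)
```

The sketch only builds the request body; submitting it to a running Kafka Connect cluster is left out so that the example has no network dependency.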
The enterprise management information system data access specifically comprises the following steps:
S21, configuring the database to be monitored by Debezium and enabling the database binlog;
S22, configuring the Debezium Kafka Connect task;
S23, the Debezium Kafka Connect task monitors the upstream database server, captures all changes to the configured databases, encodes them in Avro format, and writes them into a Kafka queue.
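Steps S22 and S23 can be sketched as a connector definition for the Kafka Connect REST interface. Because the steps refer to the binlog, the sketch assumes a MySQL source and the property names of the Debezium 2.x MySQL connector together with Confluent's Avro converter; the patent specifies neither, and the host names, credentials, database name and the helper `debezium_mysql_connector` are hypothetical.

```python
import json


def debezium_mysql_connector(name, host, user, password, database,
                             bootstrap, registry_url):
    """Build a Debezium MySQL connector definition with Avro encoding.

    Assumes Debezium 2.x property names; older releases differ
    (for example `database.server.name` instead of `topic.prefix`).
    """
    avro = "io.confluent.connect.avro.AvroConverter"
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "tasks.max": "1",
            "database.hostname": host,
            "database.port": "3306",
            "database.user": user,
            "database.password": password,
            "database.server.id": "184054",      # must be unique among replicas
            "topic.prefix": name,
            "database.include.list": database,
            "schema.history.internal.kafka.bootstrap.servers": bootstrap,
            "schema.history.internal.kafka.topic": f"schema-history.{name}",
            # Encode change events in Avro and register their schemas.
            "key.converter": avro,
            "key.converter.schema.registry.url": registry_url,
            "value.converter": avro,
            "value.converter.schema.registry.url": registry_url,
        },
    }


cdc_payload = debezium_mysql_connector(
    name="mis-orders",                  # hypothetical connector name
    host="mis-db",                      # hypothetical database host
    user="debezium",
    password="change-me",               # placeholder, not a real secret
    database="orders",
    bootstrap="kafka:9092",
    registry_url="http://schema-registry:8081",
)
print(json.dumps(cdc_payload, indent=2))
```

With this naming, change events for a table would be written to topics prefixed with the value of `topic.prefix`, which keeps events from different management information systems apart.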
The layer-by-layer information reporting specifically comprises the following steps:
S31, implementing replication and merging from the lower-level Confluent Schema Registry to the upper-level Confluent Schema Registry;
S32, configuring Kafka MirrorMaker in the upper-layer Kafka cluster, so that the upper-layer Kafka MirrorMaker synchronizes the information of the lower-layer Kafka cluster in real time by pulling;
S33, if the lower-layer Kafka cluster uses an encoding such as Avro or Protobuf, the upper-layer Kafka parses the message content using the schema information synchronized in step S31.
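The patent does not describe how the replication and merging of step S31 is implemented. One possible reading is sketched below with a toy in-memory registry: schemas from several lower-level registries are copied into the upper level under a site-prefixed subject, identical schemas are stored only once, and a mapping from lower-level to upper-level schema IDs is returned. The class `SchemaRegistry`, the function `copy_merge` and the site names are hypothetical and are not the Confluent API.

```python
class SchemaRegistry:
    """Toy in-memory stand-in for a schema registry (not the Confluent API)."""

    def __init__(self):
        self._ids = {}       # schema text -> registry-wide schema id
        self.subjects = {}   # subject -> schema ids in version order

    def register(self, subject, schema):
        """Register a schema under a subject; identical schemas share an id."""
        schema_id = self._ids.setdefault(schema, len(self._ids) + 1)
        versions = self.subjects.setdefault(subject, [])
        if schema_id not in versions:
            versions.append(schema_id)
        return schema_id

    def schema_by_id(self, schema_id):
        for schema, known_id in self._ids.items():
            if known_id == schema_id:
                return schema
        raise KeyError(schema_id)


def copy_merge(lower, upper, site):
    """Copy every schema of `lower` into `upper`; return lower id -> upper id."""
    id_map = {}
    for subject, versions in lower.subjects.items():
        for lower_id in versions:
            schema = lower.schema_by_id(lower_id)
            id_map[lower_id] = upper.register(f"{site}.{subject}", schema)
    return id_map


TEMP = '{"type":"record","name":"Temp","fields":[{"name":"v","type":"double"}]}'
ORDER = '{"type":"record","name":"Order","fields":[{"name":"id","type":"long"}]}'

base_a, base_b, group = SchemaRegistry(), SchemaRegistry(), SchemaRegistry()
base_a.register("temp-value", TEMP)
base_b.register("order-value", ORDER)
base_b.register("temp-value", TEMP)

map_a = copy_merge(base_a, group, "base_a")
map_b = copy_merge(base_b, group, "base_b")
print(map_a, map_b, group.subjects)
```

The returned ID mappings matter if messages embed a schema ID, since the same schema may have different IDs at the lower and upper levels; whether the patented method remaps or preserves IDs is not stated.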
The data fusion specifically comprises the following steps:
S41, using a stream computing technology to combine the industrial Internet of things data streams and the enterprise management information system data streams according to the business scenario;
S42, persisting the combined data into various types of databases.
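The multi-stream combination of step S41 can be illustrated with a small stream-table join: management information system change events maintain the latest production order per line, and each industrial Internet of things reading is enriched with the order that was current when it arrived. This is a plain-Python sketch of the idea rather than a stream processing engine, and the field names (`ts`, `line`, `order_id`, `temp`) are hypothetical.

```python
import heapq


def fuse(iot_stream, mis_stream):
    """Enrich IoT readings with the latest MIS order for the same line.

    Both inputs must already be ordered by their `ts` field.
    """
    latest_order = {}
    merged = heapq.merge(
        # Priority 0 lets an MIS update apply before a reading with equal ts.
        ((e["ts"], 0, "mis", e) for e in mis_stream),
        ((e["ts"], 1, "iot", e) for e in iot_stream),
        key=lambda item: item[:2],
    )
    for _ts, _priority, kind, event in merged:
        if kind == "mis":
            latest_order[event["line"]] = event["order_id"]
        else:
            yield {**event, "order_id": latest_order.get(event["line"])}


mis_events = [
    {"ts": 1, "line": "L1", "order_id": "A"},
    {"ts": 5, "line": "L1", "order_id": "B"},
]
iot_readings = [
    {"ts": 0, "line": "L1", "temp": 10.0},
    {"ts": 2, "line": "L1", "temp": 11.0},
    {"ts": 5, "line": "L1", "temp": 12.0},
]
fused = list(fuse(iot_readings, mis_events))
for row in fused:
    print(row)
```

A reading that arrives before any order is known is emitted with `order_id` set to `None`; a production system would decide per business scenario whether to drop, buffer or flag such records.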
Compared with the prior art, the invention has the beneficial effects that:
1. The edge computing service provides industrial Internet of things data to Kafka through Kafka Connect;
2. The enterprise management information system provides management information system data to Kafka through Kafka Connect;
3. Layer-by-layer reporting of multi-level data is realized based on Kafka;
4. Schema information sharing is realized based on Schema Registry.
Drawings
FIG. 1 is a schematic diagram of the composition of a Kafka-based industrial enterprise real-time ubiquitous interconnection method.
Detailed Description
The present invention will be described in detail below with reference to the drawings of the specification, but it should be noted that the practice of the present invention is not limited to the following embodiments.
Referring to FIG. 1, the Kafka-based real-time ubiquitous interconnection method for industrial enterprises is realized on a software system comprising a Kafka cluster, a Kafka Connect cluster, a Schema Registry, an edge computing service and an enterprise management information system, in which Kafka is the core component: the edge computing service provides industrial Internet of things data to Kafka through Kafka Connect, the enterprise management information system provides management information system data to Kafka through Kafka Connect, layer-by-layer reporting of multi-level data is realized based on Kafka, schema information sharing is realized based on the Schema Registry, and real-time ubiquitous interconnection of the industrial enterprise is finally achieved;
The method specifically comprises the following steps:
1. The industrial Internet of things data access specifically comprises the following steps:
S11, deploying an MQTT gateway, such as EMQX, at the edge side or in the cloud;
S12, deploying a Kafka cluster and a Kafka Connect cluster, and starting the Schema Registry service, at the edge side or in the cloud;
S13, deploying an edge computing service at the edge, which collects industrial Internet of things information through an industrial protocol such as OPC-UA, MODBUS or PROFIBUS, registers the message schema information in the Schema Registry, encodes the messages in Avro format, and uploads them to the MQTT gateway through the MQTT protocol;
S14, configuring a Kafka Connect task that reads the industrial Internet of things information from MQTT through an MQTT source connector and writes it into a Kafka queue.
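For step S13, one way an edge service can tie an encoded message to its registered schema is the framing used by Confluent's serializers: a zero magic byte, a 4-byte big-endian schema ID, then the encoded body. The patent does not state that this framing is used, so the sketch below is an assumption; the schema ID and the payload bytes are placeholders standing in for a real Avro-encoded record.

```python
import struct

MAGIC_BYTE = 0
HEADER = struct.Struct(">bI")  # 1-byte magic + 4-byte big-endian schema id


def frame(schema_id, payload):
    """Prefix an encoded payload with the magic byte and schema id."""
    return HEADER.pack(MAGIC_BYTE, schema_id) + payload


def unframe(message):
    """Split a framed message into (schema_id, payload)."""
    if len(message) < HEADER.size:
        raise ValueError("message shorter than the 5-byte header")
    magic, schema_id = HEADER.unpack(message[:HEADER.size])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER.size:]


avro_body = b"\x02\x04"           # placeholder bytes, not a real Avro record
message = frame(42, avro_body)    # 42 is a hypothetical registered schema id
print(message.hex(), unframe(message))
```

A consumer reads the schema ID from the header, fetches the matching schema from the registry, and only then decodes the body, which is why the schema information has to be available wherever the messages are consumed.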
2. The enterprise management information system data access specifically comprises the following steps:
S21, configuring the database and enabling the database binlog;
S22, configuring the Debezium Kafka Connect task;
S23, the Debezium Kafka Connect task monitors the upstream database server, captures all changes to the configured databases, encodes them in Avro format, and writes them into a Kafka queue.
3. Reporting information layer by layer:
Industrial enterprises generally have a multi-level organization of workshops, production lines, factories and groups, and industrial Internet of things information and management information system information are generally managed hierarchically, so layer-by-layer information transmission must be solved in order to achieve information fusion across the whole enterprise;
The method specifically comprises the following steps:
S31, implementing replication and merging from the lower-level Schema Registry to the upper-level Schema Registry;
S32, configuring Kafka MirrorMaker in the upper-layer Kafka cluster, so that the upper-layer Kafka MirrorMaker synchronizes the information of the lower-layer Kafka cluster in real time by pulling;
S33, if Avro or Protobuf encoding is used in the lower-layer Kafka cluster, the upper-layer Kafka parses the message content using the schema information synchronized in step S31.
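Step S32 can be sketched by generating an `mm2.properties` file for a MirrorMaker 2 instance that runs next to the upper-layer cluster and pulls from each lower-layer cluster. The cluster aliases, host names and topic pattern are hypothetical, and the `sync.group.offsets.enabled` setting assumes Kafka 2.7 or later; the patent does not give a concrete configuration.

```python
def remote_topic(source_alias, topic):
    """Name that MirrorMaker 2's default replication policy gives a mirrored topic."""
    return f"{source_alias}.{topic}"


def mm2_properties(upper, lowers, topics=".*"):
    """Render mm2.properties text mirroring each lower cluster into `upper`.

    `upper` and each entry of `lowers` are (alias, bootstrap_servers) pairs.
    """
    upper_alias, _ = upper
    aliases = [alias for alias, _ in lowers] + [upper_alias]
    lines = [f"clusters = {', '.join(aliases)}"]
    for alias, servers in lowers + [upper]:
        lines.append(f"{alias}.bootstrap.servers = {servers}")
    for alias, _ in lowers:
        flow = f"{alias}->{upper_alias}"
        lines.append(f"{flow}.enabled = true")
        lines.append(f"{flow}.topics = {topics}")
        # Translate consumer-group offsets; assumes Kafka 2.7 or later.
        lines.append(f"{flow}.sync.group.offsets.enabled = true")
    return "\n".join(lines) + "\n"


config_text = mm2_properties(
    upper=("group", "group-kafka:9092"),          # hypothetical upper cluster
    lowers=[("base_a", "base-a-kafka:9092"),      # hypothetical lower clusters
            ("base_b", "base-b-kafka:9092")],
)
print(config_text)
print(remote_topic("base_a", "plc.readings"))
```

Only the lower-to-upper flows are enabled, which matches the one-way, pull-based reporting described in step S32; with the default replication policy the mirrored topics carry the source alias as a prefix, so data from different bases remains distinguishable at the upper level.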
4. The data fusion specifically comprises the following steps:
S41, using Kafka KSQL or Flink to perform stream computing;
S42, persisting the results computed in real time into various types of databases.
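The persistence of step S42 can be sketched with an idempotent write, so that results replayed after a restart do not create duplicate rows. SQLite is used only because it ships with Python; the patent leaves the target database open, and the table name `fused` and its columns are hypothetical.

```python
import sqlite3


def persist(conn, rows):
    """Upsert computed rows keyed by (ts, line) so replays are harmless."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fused ("
        " ts INTEGER, line TEXT, temp REAL, order_id TEXT,"
        " PRIMARY KEY (ts, line))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO fused (ts, line, temp, order_id)"
        " VALUES (:ts, :line, :temp, :order_id)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
results = [
    {"ts": 2, "line": "L1", "temp": 11.0, "order_id": "A"},
    {"ts": 5, "line": "L1", "temp": 12.0, "order_id": "B"},
]
persist(conn, results)
persist(conn, results)  # simulate a replay of the same results
count = conn.execute("SELECT COUNT(*) FROM fused").fetchone()[0]
print(count)
```

Keying the table on the event timestamp and line is one design choice among several; the point is that the sink tolerates the same result being delivered more than once.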
The following examples are given by way of illustration of detailed embodiments and specific procedures based on the technical scheme of the present invention, but the scope of the present invention is not limited to the following examples. The methods used in the examples described below are conventional methods unless otherwise specified.
[Example 1]
A unified group-level integration platform is built for a large industrial enterprise. The group has several production bases, and each production base has several production plants, so the platform must acquire both massive real-time industrial Internet of things data from the field and real-time change data from hundreds of existing enterprise management information systems, in order to support cross-business real-time query and decision analysis over the enterprise's global data. The specific scheme is as follows:
1. An edge computing service is deployed at each production plant; it collects information from on-site PLC devices and uploads it to the private cloud of the production base through the MQTT protocol.
2. An MQTT gateway and a Kafka cluster are deployed on the private cloud of the production base; they receive the PLC device information uploaded by the production plants and write it into the Kafka cluster of the production base's private cloud.
3. Debezium is deployed against the database of the base management information system; it collects real-time change information from that database and writes it into the Kafka cluster of the production base's private cloud.
4. A Kafka cluster is deployed at the group level; it synchronizes schema information from the production bases and pulls their data in real time via Kafka MM2.
5. Flink performs real-time computation over the data in the group-level Kafka cluster and writes the results into a database.
The edge computing service provides industrial Internet of things data to Kafka through Kafka Connect; the enterprise management information system provides management information system data to Kafka through Kafka Connect; layer-by-layer reporting of multi-level data is realized based on Kafka; schema information sharing is realized based on Schema Registry.

Claims (5)

1. A Kafka-based real-time ubiquitous interconnection method for industrial enterprises, characterized by comprising the following steps:
1) Industrial Internet of things data access: receiving industrial Internet of things messages based on the MQTT protocol, receiving the MQTT messages through the data integration tool Kafka Connect, and writing the messages into a Kafka queue;
2) Enterprise management information system data access: receiving database change information based on CDC technology, and writing the change information into a Kafka queue whenever the database changes;
3) Layer-by-layer information reporting: synchronizing the schema information of the lower-layer Kafka cluster into the upper-layer Kafka Schema Registry; synchronizing the data and offset information of the lower-layer Kafka cluster to the upper-layer Kafka cluster;
4) Data fusion: performing data fusion based on stream computing technology, thereby finally fusing the industrial Internet of things data with the enterprise management information system data.
2. The Kafka-based industrial enterprise real-time ubiquitous interconnection method according to claim 1, wherein the industrial Internet of things data access comprises the following steps:
S11, deploying an MQTT receiving gateway at the edge side or in the cloud;
S12, deploying a Kafka cluster and a Kafka Connect cluster, and starting the Confluent Schema Registry service, at the edge side or in the cloud;
S13, deploying an edge computing service at the edge, which collects industrial Internet of things information through an industrial protocol such as OPC-UA, MODBUS or PROFIBUS, registers the message schema information in the Confluent Schema Registry, encodes the messages in Avro format, and uploads them to the MQTT gateway through the MQTT protocol;
S14, configuring a Kafka Connect task that reads the industrial Internet of things information from MQTT through an MQTT source connector and writes it into a Kafka queue.
3. The Kafka-based industrial enterprise real-time ubiquitous interconnection method according to claim 1, wherein the enterprise management information system data access comprises the following steps:
S21, configuring the database to be monitored by Debezium and enabling the database binlog;
S22, configuring the Debezium Kafka Connect task;
S23, the Debezium Kafka Connect task monitors the upstream database server, captures all changes to the configured databases, encodes them in Avro format, and writes them into a Kafka queue.
4. The Kafka-based industrial enterprise real-time ubiquitous interconnection method according to claim 1, wherein the layer-by-layer information reporting specifically comprises the following steps:
S31, implementing replication and merging from the lower-level Confluent Schema Registry to the upper-level Confluent Schema Registry;
S32, configuring Kafka MirrorMaker in the upper-layer Kafka cluster, so that the upper-layer Kafka MirrorMaker synchronizes the information of the lower-layer Kafka cluster in real time by pulling;
S33, if the lower-layer Kafka cluster uses an encoding such as Avro or Protobuf, the upper-layer Kafka parses the message content using the schema information synchronized in step S31.
5. The Kafka-based industrial enterprise real-time ubiquitous interconnection method according to claim 1, wherein the data fusion comprises the following steps:
S41, using a stream computing technology to combine the industrial Internet of things data streams and the enterprise management information system data streams according to the business scenario;
S42, persisting the combined data into various types of databases.
CN202310294544.2A 2023-03-24 2023-03-24 Kafka-based industrial enterprise real-time ubiquitous interconnection method Pending CN116389475A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310294544.2A CN116389475A (en) 2023-03-24 2023-03-24 Kafka-based industrial enterprise real-time ubiquitous interconnection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310294544.2A CN116389475A (en) 2023-03-24 2023-03-24 Kafka-based industrial enterprise real-time ubiquitous interconnection method

Publications (1)

Publication Number Publication Date
CN116389475A true CN116389475A (en) 2023-07-04

Family

ID=86962685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310294544.2A Pending CN116389475A (en) 2023-03-24 2023-03-24 Kafka-based industrial enterprise real-time ubiquitous interconnection method

Country Status (1)

Country Link
CN (1) CN116389475A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117076508A (en) * 2023-10-18 2023-11-17 江苏数兑科技有限公司 Method for supporting batch data processing by stream data processing system
CN117076508B (en) * 2023-10-18 2023-12-29 江苏数兑科技有限公司 Method for supporting batch data processing by stream data processing system

Similar Documents

Publication Publication Date Title
CN109492040B (en) System suitable for processing mass short message data in data center
CN106126346B (en) A kind of large-scale distributed data collection system and method
CN112100265A (en) Multi-source data processing method and device for big data architecture and block chain
CN103927218A (en) Event dispatching method and system
CN111225069B (en) Distributed market data processing system and method
CN111885439B (en) Optical network integrated management and duty management system
CN116389475A (en) Kafka-based industrial enterprise real-time ubiquitous interconnection method
CN107181805B (en) A method of realizing that global orderly is recurred under micro services framework
CN110096545A (en) One kind being based on big data platform data processing domain architecting method
CN111092901A (en) Method for equipment access and data storage in industrial internet platform
CN112559634A (en) Big data management system based on computer cloud computing
CN114374701B (en) Transparent sharing device for sample model of multistage linkage artificial intelligent platform
CN111427869A (en) Log system based on block chain
CN110213156A (en) A kind of span centre heart group's instant communicating method and system
CN116431324A (en) Edge system based on Kafka high concurrency data acquisition and distribution
CN113485793B (en) Online elastic expansion method for multi-source heterogeneous data access channel based on container technology
CN115237989A (en) Mine data acquisition system
CN115712681A (en) Method and system for realizing real-time data integration based on Flink CDC
CN114385684A (en) BaaS platform data service publishing method and system
CN104333578A (en) Distributed data exchange system and method
US20180232406A1 (en) Big data database system
Selim et al. Distributed Hash Table Based Design of Soft System Buses
JP2013065259A (en) Data transfer system, transfer origin system, transfer destination system, and program
CN111193614A (en) Cross-regional server system and method for connecting different regional network environments in the world
CN113421131B (en) Intelligent marketing system based on big data content

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination