CN108154341A - Unified scheduling platform and working method based on data flow and workflow
- Publication number: CN108154341A
- Application number: CN201711370210.XA
- Authority: CN (China)
- Prior art keywords: data, workflow, data flow, engine, unified scheduling
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/103—Workflow collaboration or project management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
Abstract
The present invention provides a unified scheduling platform and working method based on data flow and workflow. It effectively combines the advantages of data flow and workflow, so that task scheduling is configuration-driven and data movement is parameterized. The platform adapts to the complex structure of multi-source data and completes the entire ETL process, while also managing the life cycle of multiple kinds of computing tasks. By combining data flow management with workflow scheduling, it covers data from inflow through computation to final result output, forming a complete data workflow management platform that handles both the direction control of the data flow and the cycle control of the workflow.
Description
Technical field
The present invention relates to the technical field of data flow management and workflow management, and in particular to a unified scheduling platform and working method based on data flow and workflow.
Background technology
Workflow engines have attracted growing attention in recent years. As tasks become more diverse and complex, an effective task life cycle management platform has become an important need for enterprises and individuals. Such platforms, however, often support only workflow management. One example is Oozie, a framework that can combine multiple Map/Reduce jobs into one logical unit of work. In more and more scenarios, an enterprise's task scheduling platform also needs to manage data flow. Although the data processing required is usually limited to simple operations such as downloading, decompressing, converting and landing (persisting) data, existing workflow platforms cannot perform these data management functions. Data flow management and workflow management are therefore completely separated, which wastes resources.
Workflow platforms of the prior art, such as Oozie, support the scheduling and running of workflows, but they are difficult to configure, inconvenient to use and have crude visualization. In addition, such workflow frameworks cannot be extended effectively. When a data flow needs to be managed, they cannot operate on the data effectively and must rely on the tasks themselves, which adds development cost.
Invention content
The unified scheduling platform based on data flow and workflow designed by the present invention includes the original components of NiFi (Apache NiFi is an easy-to-use, powerful and reliable component-based development platform for processing and distributing data) and also integrates 50 self-developed components. These include data encryption and decryption, data caching and cross-server data backup for the data flow, as well as Spark scheduling, Storm scheduling and scheduled query task scheduling for the workflow (Spark and Storm are distributed computing frameworks). The platform's components can be used through configuration, and any function can be developed in a configurable way. The platform can satisfy nearly all data conversion, processing and task scheduling needs, and it solves the technical problems that existing workflow platforms are inconvenient to use and difficult to extend.
The technical solution adopted by the present invention is as follows:
A unified scheduling platform based on data flow and workflow includes a data flow platform and a workflow platform. The data flow engine operates on data to form the data flow platform, and the workflow engine cooperates with the data flow platform to complete data access, forming the workflow platform.
Further, the data flow engine includes a data input component, a data cleansing component, a data output component and custom components. The data flow engine receives data: the data input component adapts to various data sources, the data cleansing component cleans the data, the data output component outputs the data, and custom components extend the functions of the data flow engine.
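The division of the data flow engine into input, cleansing, output and custom components can be illustrated with a minimal Python sketch. It is not the patented implementation and does not use the NiFi API; the names DataFlowEngine, drop_empty, strip_whitespace and tag_source are hypothetical, and records are assumed to be plain dictionaries.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional

Record = dict  # one unit of data moving through the flow (assumed shape)
Step = Callable[[Record], Optional[Record]]  # returns None to drop a record


@dataclass
class DataFlowEngine:
    """Hypothetical engine: input -> cleansing -> custom -> output."""
    read: Callable[[], Iterable[Record]]   # data input component
    cleaners: List[Step]                   # data cleansing components
    write: Callable[[Record], None]        # data output component
    custom: List[Step] = field(default_factory=list)  # custom extensions

    def run(self) -> int:
        """Push every record through the chain; return how many were output."""
        delivered = 0
        for record in self.read():
            for step in [*self.cleaners, *self.custom]:
                record = step(record)
                if record is None:      # a component rejected the record
                    break
            else:                       # no component rejected it
                self.write(record)
                delivered += 1
        return delivered


def drop_empty(record: Record) -> Optional[Record]:
    """Cleansing step: discard records whose value is missing or empty."""
    return record if record.get("value") else None


def strip_whitespace(record: Record) -> Record:
    """Cleansing step: trim surrounding whitespace from the value."""
    return {**record, "value": record["value"].strip()}


def tag_source(record: Record) -> Record:
    """Custom component: annotate each record with an assumed source label."""
    return {**record, "source": "demo"}


sink: List[Record] = []
engine = DataFlowEngine(
    read=lambda: [{"value": " a "}, {"value": ""}, {"value": "b"}],
    cleaners=[drop_empty, strip_whitespace],
    write=sink.append,
    custom=[tag_source],
)
delivered = engine.run()
```

Because every component shares the same call shape, a new function can be added to the chain without changing the engine, which is the kind of extension the custom components are meant to allow.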
Further, the data flow engine manages the flow and the quality of the data.
Further, the computing tasks of the data flow engine are encapsulated in the workflow engine.
Further, the workflow engine receives the data input from the data flow engine, calls and manages the computing frameworks, and the data fed back is then output through the data flow engine.
Further, the unified scheduling platform based on data flow and workflow supports the management of many kinds of data. Existing components can be flexibly configured to parse data in common protocols, and custom components can be written as needed to support multiple data formats and different protocol types.
Further, the unified scheduling platform based on data flow and workflow supports plug-in development, so functions can be extended flexibly.
Further, the unified scheduling platform based on data flow and workflow supports a wide range of computing task frameworks. Support for each kind of framework can be implemented in several ways, and both standalone and distributed computing frameworks can be scheduled through interfaces, scripts and similar means.
Further, the unified scheduling platform based on data flow and workflow integrates data flow and workflow from start to finish: it can complete the conversion and landing of data, and it can also call other computing frameworks so that data circulates within a standard workflow.
A working method of the unified scheduling platform based on data flow and workflow includes the following steps:
Step 1: deploy the data flow platform and the workflow platform, for example the NiFi workflow engine;
Step 2: develop and configure the data flow engine, including the input component, the data cleansing component, the data output component and custom components;
Step 3: develop and configure the workflow engine, including the task scheduling component, the abnormal task monitoring component and the task parameterization component;
Step 4: develop and write common task scheduling interfaces, mainly data write components for general-purpose big data platforms such as Spark and Storm;
Step 5: the unified scheduling platform based on data flow and workflow performs life cycle scheduling management of data processing. From the start of a task, through the landing of its data results, to the destruction of the task, everything can be managed by the platform.
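The life cycle named in Step 5 (start of a task, landing of its results, destruction of the task) can be modelled as a small state machine. The sketch below is only one possible reading: the state names, the FAILED state with its retry edge, and the ManagedTask class are assumptions added for illustration and do not come from the patent text.

```python
from enum import Enum
from typing import List


class State(Enum):
    CREATED = "created"
    RUNNING = "running"
    LANDED = "landed"        # results have been persisted
    FAILED = "failed"        # assumed state for abnormal tasks
    DESTROYED = "destroyed"


# Allowed transitions; anything not listed here is rejected.
ALLOWED = {
    State.CREATED: {State.RUNNING, State.DESTROYED},
    State.RUNNING: {State.LANDED, State.FAILED},
    State.LANDED: {State.DESTROYED},
    State.FAILED: {State.RUNNING, State.DESTROYED},  # retry or give up
    State.DESTROYED: set(),
}


class ManagedTask:
    """Hypothetical task record tracked by the scheduling platform."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.CREATED
        self.history: List[State] = [State.CREATED]

    def move_to(self, new_state: State) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.name}: cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)


task = ManagedTask("nightly-report")
for step in (State.RUNNING, State.LANDED, State.DESTROYED):
    task.move_to(step)
```

Rejecting unlisted transitions gives an abnormal task monitoring component a single place to detect tasks that skip a stage, for example a task destroyed while still running.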
Further, the data flow engine is a NiFi data flow engine and the workflow engine is a NiFi workflow engine.
Further, the unified scheduling platform based on data flow and workflow parses data through general-purpose components or custom components.
Further, the workflow engine supports the Spark computing framework, the Hadoop computing framework, the Storm computing framework and custom tasks.
The present invention supports both operations on data and the management of scheduled tasks. It is extensible and programmable, can cope with a wide variety of application scenarios, and has the following advantages:
1. It is developed on NiFi, so its visualization is clear and its functions are easier to extend.
2. It merges data flow management and workflow management, and turns general-purpose functions entirely into component-based development.
3. It improves the readability of the data flow and the task cycle, and functions are clearly separated and can be freely combined.
Description of the drawings
Fig. 1 is a schematic diagram of the working flow of the unified scheduling platform based on data flow and workflow according to the present invention.
Specific embodiment
The present invention applies to situations that combine data management and task management. On top of the original framework, it customizes components to support the management of data in multiple formats, the exchange of data over multiple protocols, the landing of many kinds of data and the scheduling of tasks on multiple platforms. It can also carry out task scheduling, data management and function release in a routine way, and all kinds of computing platforms can rely on the platform for life cycle management. The present invention is further described below with reference to the drawing and embodiments.
Fig. 1 is a schematic diagram of the working flow of the unified scheduling platform based on data flow and workflow according to the present invention. In the figure, the data entry consists of multiple types of data sources. Because data sources are diverse, data may come from many different databases, be transmitted over different protocols and exist in many forms. The data input component of the data flow engine therefore has to adapt to various data sources while the engine completes the reception, cleansing and output of data. Besides the existing basic data processing components, custom components can be developed to extend the data flow engine. The data flow engine manages only the data flow and the data quality. Further computing tasks are encapsulated in specific scheduled tasks of the workflow. The workflow engine dynamically receives the data input from the data flow engine and calls and manages the various computing frameworks, and the data fed back is output through the data flow engine again.
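The hand-off described above, where the data flow side handles only routing and quality, the computation lives inside a workflow task, and the result returns through the data flow side, can be sketched in a few lines. All function names are hypothetical, and Python's built-in sum stands in for a computing framework.

```python
from typing import Callable, Iterable, List


def data_flow_ingest(raw: Iterable[str]) -> List[int]:
    """Data flow side: receive and clean only, with no business computation."""
    trimmed = (item.strip() for item in raw)
    return [int(item) for item in trimmed if item.isdigit()]


def workflow_execute(records: List[int], framework: Callable[[List[int]], int]) -> dict:
    """Workflow side: the computation is encapsulated in the scheduled task."""
    return {"count": len(records), "result": framework(records)}


def data_flow_emit(result: dict, sink: List[dict]) -> None:
    """Data flow side again: output the data fed back by the workflow."""
    sink.append(result)


def round_trip(raw: Iterable[str], framework: Callable[[List[int]], int],
               sink: List[dict]) -> None:
    """Chain the three stages in the order the description gives them."""
    data_flow_emit(workflow_execute(data_flow_ingest(raw), framework), sink)


output: List[dict] = []
round_trip([" 4", "x", "6 "], sum, output)
```

The invalid entry "x" is removed before the workflow task runs, so the computing framework never has to handle data quality itself.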
In Fig. 1, the data flow engine operates on data as a service and thus becomes the data flow platform. At the same time, the workflow engine is combined with the various computing frameworks to perform periodic task management and cooperates with the data flow platform to complete data access, forming the final workflow platform. The combination of the two platforms is called the unified scheduling platform based on data flow and workflow.
Take a scenario in which multi-source data and multiple business systems coexist. To build the unified scheduling platform based on data flow and workflow, first deploy Apache NiFi together with the data input component, data cleansing component and data output component of the present invention. Custom components (such as an HTTP parsing component) can be deployed selectively. Then configure the data input and data output components, configure and arrange the cleansing components, and start the data flow platform. Once the data flow platform can manage data, configure the organization's own framework applications, such as Spark, on the workflow platform for life cycle scheduling and running.
When the data flow platform and the workflow platform are running, as soon as a data source produces data, the data flow begins to ingest, clean and route it into the computing frameworks managed by the workflow. Data can flow in directly and interactively, be inserted into a shared database table, or be stored as a shared file, among other ways. The workflow applies a different scheduling strategy depending on how the data flows in, calls the relevant applications and computes the result data. The computed data is persisted through the data flow platform, which completes the final landing of the data results.
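Choosing a scheduling strategy by inflow mode can be sketched as a lookup table of loaders. The three modes mirror the ones named above (direct, shared database table, shared file); everything else is an assumption made for the example, including the handoff dictionary layout, the table name inflow, the JSON-lines file format, and the use of an in-memory SQLite database in place of a shared database.

```python
import json
import sqlite3
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def load_direct(handoff: dict) -> List[int]:
    """Direct interactive inflow: records are passed in memory."""
    return list(handoff["records"])


def load_table(handoff: dict) -> List[int]:
    """Shared-table inflow: read rows from an assumed table named inflow."""
    rows = handoff["connection"].execute("SELECT value FROM inflow").fetchall()
    return [row[0] for row in rows]


def load_file(handoff: dict) -> List[int]:
    """Shared-file inflow: read one JSON value per line."""
    with open(handoff["path"], encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# One loading strategy per inflow mode.
STRATEGIES: Dict[str, Callable[[dict], List[int]]] = {
    "direct": load_direct,
    "table": load_table,
    "file": load_file,
}


def schedule(handoff: dict, compute: Callable[[List[int]], int]) -> int:
    """Pick the strategy for this inflow mode, then run the computation."""
    return compute(STRATEGIES[handoff["mode"]](handoff))


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE inflow (value INTEGER)")
connection.executemany("INSERT INTO inflow VALUES (?)", [(1,), (2,), (3,)])

with tempfile.TemporaryDirectory() as folder:
    shared_file = Path(folder) / "inflow.jsonl"
    shared_file.write_text("1\n2\n3\n", encoding="utf-8")
    results = [
        schedule({"mode": "direct", "records": [1, 2, 3]}, sum),
        schedule({"mode": "table", "connection": connection}, sum),
        schedule({"mode": "file", "path": shared_file}, sum),
    ]
connection.close()
```

All three modes feed the same computation, so supporting another inflow mode only requires one more loader in the table.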
When data sources are widespread and data results are varied, such data must be cleaned and parsed and then either landed or sent directly into the various business systems. Business systems often support only standardized data from a single source, so the platform can use ready-made data acquisition components, protocol parsing components and data write components to complete the cleansing of the data. The unified scheduling platform based on data flow and workflow can also manage the life cycle of the business systems, monitoring in real time their running state and the data flowing into and out of them. The write and outflow stages of data processing are shown clearly on the component platform. The unified scheduling platform based on data flow and workflow effectively shields the development work of each business system: a business system can access multi-source data without adapting to each kind of data. The platform can also manage all kinds of tasks, including large continuous tasks of the business systems and simple temporary batch tasks, and manage them in a unified and effective way.
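Monitoring the running state of a business system together with its data inflow and outflow can be reduced to a few counters. The sketch below is illustrative only: FlowMonitor, its method names and the system name billing are hypothetical, and a real platform would collect these figures from its components rather than from manual calls.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FlowMonitor:
    """Hypothetical monitor for business-system state and data volumes."""
    status: Dict[str, str] = field(default_factory=dict)
    inflow: Counter = field(default_factory=Counter)
    outflow: Counter = field(default_factory=Counter)

    def heartbeat(self, system: str, state: str) -> None:
        """Record the latest reported running state of a system."""
        self.status[system] = state

    def record_in(self, system: str, count: int = 1) -> None:
        """Count records written into a system."""
        self.inflow[system] += count

    def record_out(self, system: str, count: int = 1) -> None:
        """Count records flowing out of a system."""
        self.outflow[system] += count

    def snapshot(self, system: str) -> dict:
        """Summarize state and volumes; unseen systems report as unknown."""
        received = self.inflow[system]
        emitted = self.outflow[system]
        return {
            "state": self.status.get(system, "unknown"),
            "in": received,
            "out": emitted,
            "pending": received - emitted,
        }


monitor = FlowMonitor()
monitor.heartbeat("billing", "running")
monitor.record_in("billing", 5)
monitor.record_out("billing", 3)
billing = monitor.snapshot("billing")
```

The pending figure gives a simple signal that a system is receiving data faster than it emits results.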
The present invention further improves the usability of existing job scheduling platforms by combining workflow management with data flow management. Custom components can be written as plug-ins and drag-and-drop pages are supported, which reduces development and learning costs and meets the requirement that data be controllable and rules be configurable.
Although the present invention has been described above by way of preferred embodiments, they are not intended to limit it. Any person skilled in the art may, without departing from the spirit and scope of the present invention, use the methods and technical content disclosed above to make possible changes and modifications to the technical solution of the present invention. Therefore, any simple modification, equivalent change or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of its technical solution, falls within the protection scope of the technical solution of the present invention.
Claims (10)
1. A unified scheduling platform based on data flow and workflow, characterized in that it includes a data flow platform and a workflow platform, wherein a data flow engine operates on data to form the data flow platform, and a workflow engine cooperates with the data flow platform to complete data access, forming the workflow platform.
2. The unified scheduling platform based on data flow and workflow according to claim 1, characterized in that the data flow engine includes a data input component, a data cleansing component, a data output component and custom components; the data flow engine receives data, the data input component adapts to various data sources, the data cleansing component cleans the data, the data output component outputs the data, and the custom components extend the functions of the data flow engine.
3. The unified scheduling platform based on data flow and workflow according to claim 2, characterized in that the data flow engine manages the flow and the quality of the data.
4. The unified scheduling platform based on data flow and workflow according to claim 2, characterized in that the computing tasks of the data flow engine are encapsulated in the workflow engine.
5. The unified scheduling platform based on data flow and workflow according to claim 2, characterized in that the workflow engine receives the data input from the data flow engine, calls and manages computing frameworks, and then outputs the data fed back through the data flow engine.
6. The unified scheduling platform based on data flow and workflow according to claim 5, characterized in that the workflow engine calls and manages the computing frameworks through interfaces and scripts.
7. A working method of the unified scheduling platform based on data flow and workflow provided in any one of claims 1 to 6, characterized in that it includes the following steps:
Step 1: deploy the data flow platform and the workflow platform;
Step 2: develop and configure the data flow engine, including the input component, the data cleansing component, the data output component and custom components;
Step 3: develop and configure the workflow engine, including the task scheduling component, the abnormal task monitoring component and the task parameterization component;
Step 4: develop and write common task scheduling interfaces;
Step 5: the unified scheduling platform based on data flow and workflow performs life cycle scheduling management of data processing.
8. The working method of the unified scheduling platform based on data flow and workflow according to claim 7, characterized in that the data flow engine is a NiFi data flow engine and the workflow engine is a NiFi workflow engine.
9. The working method of the unified scheduling platform based on data flow and workflow according to claim 7, characterized in that the unified scheduling platform based on data flow and workflow parses data through general-purpose components or custom components.
10. The working method of the unified scheduling platform based on data flow and workflow according to claim 7, characterized in that the workflow engine supports the Spark computing framework, the Hadoop computing framework, the Storm computing framework and custom tasks.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711370210.XA CN108154341A (en) | 2017-12-18 | 2017-12-18 | Unified scheduling platform and working method based on data flow and workflow |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711370210.XA CN108154341A (en) | 2017-12-18 | 2017-12-18 | Unified scheduling platform and working method based on data flow and workflow |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108154341A true CN108154341A (en) | 2018-06-12 |
Family
ID=62467630
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711370210.XA Pending CN108154341A (en) | 2017-12-18 | 2017-12-18 | Unified scheduling platform and working method based on data flow and workflow |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108154341A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109344189A (en) * | 2018-09-19 | 2019-02-15 | 浪潮软件集团有限公司 | Big data calculation method and device based on NiFi |
CN110347741A (en) * | 2019-07-18 | 2019-10-18 | 普元信息技术股份有限公司 | The system and its control method of the outputting result quality of data are effectively promoted in big data treatment process |
CN111177247A (en) * | 2019-12-30 | 2020-05-19 | 腾讯科技(深圳)有限公司 | Data conversion method, device and storage medium |
CN112637356A (en) * | 2020-12-28 | 2021-04-09 | 国电电力发展股份有限公司 | Data synchronous transmission method, system, medium and terminal of remote data center |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101452450A (en) * | 2007-11-30 | 2009-06-10 | 上海市电力公司 | Multiple source data conversion service method and apparatus thereof |
CN106874461A (en) * | 2017-02-14 | 2017-06-20 | 北京慧正通软科技有限公司 | A kind of workflow engine supports multi-data source configuration security access system and method |
CN107392357A (en) * | 2017-06-30 | 2017-11-24 | 安徽四创电子股份有限公司 | A kind of public transport based on big data platform is precisely gone on a journey service system and method |
CN107451666A (en) * | 2017-07-15 | 2017-12-08 | 西安电子科技大学 | Breaker based on big data analysis assembles Tracing back of quality questions system and method |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109344189A (en) * | 2018-09-19 | 2019-02-15 | 浪潮软件集团有限公司 | Big data calculation method and device based on NiFi |
CN109344189B (en) * | 2018-09-19 | 2021-05-14 | 浪潮软件股份有限公司 | Big data calculation method and device based on NiFi |
CN110347741A (en) * | 2019-07-18 | 2019-10-18 | 普元信息技术股份有限公司 | The system and its control method of the outputting result quality of data are effectively promoted in big data treatment process |
CN110347741B (en) * | 2019-07-18 | 2023-05-05 | 普元信息技术股份有限公司 | System for effectively improving output result data quality in big data processing process and control method thereof |
CN111177247A (en) * | 2019-12-30 | 2020-05-19 | 腾讯科技(深圳)有限公司 | Data conversion method, device and storage medium |
CN111177247B (en) * | 2019-12-30 | 2023-10-20 | 腾讯科技(深圳)有限公司 | Data conversion method, device and storage medium |
CN112637356A (en) * | 2020-12-28 | 2021-04-09 | 国电电力发展股份有限公司 | Data synchronous transmission method, system, medium and terminal of remote data center |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108154341A (en) | United Dispatching platform and method of work based on data flow and workflow | |
CN110704518B (en) | Business data processing method and device based on Flink engine | |
US10185644B2 (en) | Service implementation based debugger for service oriented architecture projects | |
CN103559118A (en) | Security auditing method based on aspect oriented programming (AOP) and annotation information system | |
WO2018126964A1 (en) | Task execution method and apparatus and server | |
US20160182652A1 (en) | Systems and/or methods for cloud-based event-driven integration | |
AU2016322817B2 (en) | Application provisioning system for requesting configuration updates for application objects across data centers | |
CN104021452A (en) | Method for integrating various service systems at cloud computing server side | |
CN105893055B (en) | Flow engine hardware and software platform triggering method | |
JP2011118879A (en) | Location independent execution of user interface operations | |
CN112422638A (en) | Data real-time stream processing method, system, computer device and storage medium | |
CN114205230A (en) | Method, system, medium and electronic device for configuring cloud native network element | |
CN105718601A (en) | Dynamic business integrating model and application method thereof | |
CN103197927B (en) | A kind of method that realizes of Workflow and system thereof | |
CN109144512B (en) | Method and system for generating API | |
CN112686580B (en) | Workflow definition method and system capable of customizing flow | |
CN113467972A (en) | Communication interface construction method, communication interface construction device, computer equipment and storage medium | |
CN109614096B (en) | Method for converting use cases and activities in modeling process based on UML (unified modeling language) requirements | |
CN106559493B (en) | Service issuing method and service delivery system | |
CN114416314B (en) | Service arrangement method based on API gateway | |
JP2010049439A (en) | System construction method using software model and modeling device | |
CN107291455B (en) | Method and system for realizing transfer service based on factory mode | |
Miyamoto et al. | An approach for synthesizing intelligible state machine models from choreography using petri nets | |
CN103092620B (en) | A kind of Microsoft Exchange Server 2010 Web service integrated development method | |
JP2009099015A (en) | User interface integrated system and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180612 |