CN108154341A - Unified scheduling platform and working method based on data flow and workflow - Google Patents

Info

Publication number
CN108154341A
Authority
CN
China
Prior art keywords
data
workflow
data flow
engine
unified scheduling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711370210.XA
Other languages
Chinese (zh)
Inventor
曲洋
陈有为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianxun Position Network Co Ltd
Original Assignee
Qianxun Position Network Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianxun Position Network Co Ltd
Priority to CN201711370210.XA
Publication of CN108154341A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries

Abstract

The present invention provides a unified scheduling platform and working method based on data flow and workflow. It effectively combines the advantages of data flow and workflow, so that task scheduling is driven by configuration and data circulation is driven by parameters. The platform adapts to the complex structure of multi-source data, completes the entire ETL process for the data, and at the same time manages the life cycles of multiple kinds of computing tasks. By combining data flow management with workflow scheduling, it carries data from inflow through computation to final result output, forming a complete automated data pipeline platform that covers both the direction control of the data flow and the cycle control of the workflow.

Description

Unified scheduling platform and working method based on data flow and workflow
Technical field
The present invention relates to the technical field of data flow management and workflow management, and in particular to a unified scheduling platform and working method based on data flow and workflow.
Background art
Workflow engines have attracted growing attention in recent years. Because today's tasks are diverse and complex, an effective task life cycle management platform has become an important need for enterprises and individuals alike. Such platforms, however, usually support only workflow management. One example is OOZIE, a framework that can combine multiple Map/Reduce jobs into one logical unit of work. In a growing number of scenarios, an enterprise's task scheduling platform must also manage data flow. In typical scenarios the processing required is limited to simple operations such as downloading, decompressing, converting and landing (persisting) data, yet existing workflow platforms cannot perform even these data management functions. Data flow management and workflow management are therefore completely separated, which wastes resources.
Prior-art workflow platforms such as OOZIE support workflow scheduling and execution, but they are difficult to configure, inconvenient to use, and offer only coarse visualization. Such workflow frameworks also cannot be extended effectively: when a data flow needs to be managed, they cannot operate on the data themselves and must rely on the tasks to do so, which adds development cost.
Summary of the invention
The unified scheduling platform based on data flow and workflow designed in the present invention retains the original components of NiFi (Apache NiFi is an easy-to-use, powerful and reliable component-based platform for processing and distributing data) and additionally integrates 50 independently developed components. On the data flow side these include data encryption and decryption, data caching and cross-server data backup; on the workflow side they include Spark scheduling, Storm scheduling and the scheduling of timed query tasks (Spark and Storm are distributed computing frameworks). Platform components are used through configuration, and any function can be developed as a configurable component. The platform can satisfy nearly all data conversion and processing needs and task scheduling and execution functions, which solves the technical problem that existing workflow platforms are inconvenient to use and difficult to extend.
The technical solution adopted by the present invention is as follows:
A unified scheduling platform based on data flow and workflow comprises a data flow platform and a workflow platform. A data flow engine operates on data to form the data flow platform, and a workflow engine cooperates with the data flow platform to complete data access, forming the workflow platform.
Further, the data flow engine comprises a data input component, a data cleansing component, a data output component and custom components. The data flow engine receives data: the data input component adapts to the various data sources, the data cleansing component cleans the data, the data output component outputs the data, and the custom components extend the functions of the data flow engine.
Further, the data flow engine manages the flow of the data and the data quality.
Further, the computing tasks of the data flow engine are encapsulated in the workflow engine.
Further, the workflow engine receives the data input from the data flow engine and invokes and manages the computing frameworks; the data fed back are then output through the data flow engine.
Further, the unified scheduling platform based on data flow and workflow supports the management of many kinds of data. Existing components can be flexibly configured to parse data in common protocols, and custom components can be written as needed to support additional data formats and protocol types.
Further, the unified scheduling platform based on data flow and workflow supports plug-in development, so its functions can be extended flexibly.
Further, the unified scheduling platform based on data flow and workflow supports a wide range of computing task frameworks. Support for the various frameworks can be provided in several ways; for example, standalone and distributed computing frameworks can be scheduled through interfaces or scripts.
Further, the unified scheduling platform based on data flow and workflow integrates data flow and workflow from end to end: it can complete the conversion and landing of data, and it can also invoke other computing frameworks so that the data circulate through a standard workflow.
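The component structure described above (input, cleansing, output and custom components that can be plugged into a data flow engine) can be illustrated with a minimal Python sketch. It is not NiFi code and not the implementation of the invention; the names `DataFlowEngine`, `strip_blank_fields` and `uppercase_station`, and the use of plain dictionaries as records, are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Component = Callable[[Record], Record]


class DataFlowEngine:
    """Hypothetical engine: input -> registered components -> output."""

    def __init__(self, source: Iterable[Record], sink: List[Record]) -> None:
        self.source = source  # stands in for the data input component
        self.sink = sink      # stands in for the data output component
        self.components: List[Component] = []

    def register(self, component: Component) -> "DataFlowEngine":
        # Plug-in style extension: components run in registration order.
        self.components.append(component)
        return self

    def run(self) -> int:
        count = 0
        for record in self.source:
            for component in self.components:
                record = component(record)
            self.sink.append(record)
            count += 1
        return count


def strip_blank_fields(record: Record) -> Record:
    # Cleansing component: drop fields whose value is empty or whitespace.
    return {k: v for k, v in record.items() if v.strip()}


def uppercase_station(record: Record) -> Record:
    # Custom component: normalize a hypothetical "station" field.
    if "station" in record:
        record = {**record, "station": record["station"].upper()}
    return record


source = [{"station": "sh01", "note": " "}, {"station": "bj02", "note": "ok"}]
sink: List[Record] = []
engine = DataFlowEngine(source, sink)
engine.register(strip_blank_fields).register(uppercase_station)
processed = engine.run()
```

In this sketch a new function is added by registering one more callable, which mirrors the plug-in extension idea without touching the engine itself.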
A working method of the unified scheduling platform based on data flow and workflow comprises the following steps:
Step 1: deploy the data flow platform and the workflow platform, for example a NiFi workflow engine;
Step 2: develop and configure the data flow engine, including the data input component, the data cleansing component, the data output component and the custom components;
Step 3: develop and configure the workflow engine, including the task scheduling component, the abnormal task monitoring component and the task parameterization component;
Step 4: develop and write a general task scheduling interface, mainly by writing data ingestion components for general-purpose big data platforms such as Spark and Storm;
Step 5: the unified scheduling platform based on data flow and workflow performs life cycle scheduling management of the data processing. From the start of a task, through the landing of the data results, to the destruction of the task, everything can be managed by the unified scheduling platform based on data flow and workflow.
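The life cycle in Step 5 (task start, landing of results, task destruction) can be pictured as a small state machine. The sketch below is only one possible reading: the state names, the `ManagedTask` class and in particular the `FAILED` state are assumptions added for illustration and do not appear in the source.

```python
from enum import Enum


class TaskState(Enum):
    CREATED = "created"
    RUNNING = "running"
    RESULT_LANDED = "result_landed"
    FAILED = "failed"          # assumption: not named in the source text
    DESTROYED = "destroyed"


# Allowed transitions between life cycle states (hypothetical).
TRANSITIONS = {
    TaskState.CREATED: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.RESULT_LANDED, TaskState.FAILED},
    TaskState.RESULT_LANDED: {TaskState.DESTROYED},
    TaskState.FAILED: {TaskState.DESTROYED},
    TaskState.DESTROYED: set(),
}


class ManagedTask:
    """A task whose state changes are checked against TRANSITIONS."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = TaskState.CREATED
        self.history = [TaskState.CREATED]

    def advance(self, new_state: TaskState) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.name}: {self.state.value} -> {new_state.value} not allowed"
            )
        self.state = new_state
        self.history.append(new_state)


task = ManagedTask("daily-aggregation")
for step in (TaskState.RUNNING, TaskState.RESULT_LANDED, TaskState.DESTROYED):
    task.advance(step)
```

Keeping the allowed transitions in one table makes it easy for a monitoring component to reject or flag tasks that skip a stage.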
Further, the data flow engine is a NiFi data flow engine and the workflow engine is a NiFi workflow engine.
Further, the unified scheduling platform based on data flow and workflow parses data through general-purpose components or custom components.
Further, the workflow engine supports the Spark computing framework, the Hadoop computing framework, the Storm computing framework and custom tasks.
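As a rough illustration of script-based scheduling for the frameworks listed above, the following Python sketch builds (but does not execute) a launcher command per framework. The function `build_command`, the artifact names and the treatment of custom tasks as executable scripts are assumptions; the commands use only the basic forms of `spark-submit`, `hadoop jar` and `storm jar`, and a real deployment would typically need further options.

```python
import shlex
from typing import Dict, List


def build_command(framework: str, artifact: str, main_class: str) -> List[str]:
    """Return a launcher command for a job; nothing is executed here."""
    templates: Dict[str, List[str]] = {
        "spark": ["spark-submit", "--class", main_class, artifact],
        "hadoop": ["hadoop", "jar", artifact, main_class],
        "storm": ["storm", "jar", artifact, main_class],
    }
    if framework not in templates:
        # Assumption: a custom task is an executable script given by its path.
        return [artifact]
    return templates[framework]


# Hypothetical job definition used only for this example.
cmd = build_command("spark", "etl-job.jar", "com.example.EtlJob")
printable = shlex.join(cmd)
```

A scheduler could pass such a command list to a process runner, while interface-based scheduling would instead call the framework's own client API.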
The present invention supports both operations on data and the management of scheduled tasks. It is extensible and programmable, can cope with a wide variety of application scenarios, and has the following advantages:
1. It is developed on the basis of NiFi, so the graphical presentation is clear and functions are easier to extend.
2. It merges the management of data flow with the management of workflow and converts general-purpose functions entirely into component-based development.
3. It improves the readability of data flows and task cycles; functions are clearly separated and can be freely combined.
Brief description of the drawings
Fig. 1 is a schematic diagram of the working flow of the unified scheduling platform based on data flow and workflow according to the present invention.
Detailed description
The present invention applies to scenarios that combine data management and task management. On the basis of the original framework, it customizes components so as to support the management of data in multiple formats, the exchange of data over multiple protocols, the landing of diverse data, and the scheduling of tasks on multiple platforms. It can carry out task scheduling, data management and function release on a routine basis, and the platform can be relied on for life cycle management of all kinds of computing platforms. The invention is further described below with reference to the drawing and an embodiment.
Fig. 1 is a schematic diagram of the working flow of the unified scheduling platform based on data flow and workflow according to the present invention. The data entry in the figure is a set of data sources of multiple types. Because data sources are diverse, data may originate from many different databases, be transmitted over different protocols, and exist in many forms, so the data input component of the data flow engine must adapt to the various data sources while the engine completes the reception, cleansing and output of the data. Besides the existing basic data processing components, custom components can be developed to extend the functions of the data flow engine. The data flow engine manages only the flow and the quality of the data; further computing tasks are encapsulated in the specific scheduled tasks of the workflow. The workflow engine dynamically receives the data input from the data flow engine and invokes and manages the various computing frameworks, and the data fed back are output again through the data flow engine.
In Fig. 1 the data flow engine operates on data as a service and thereby becomes the data flow platform. Meanwhile, the workflow engine is combined with the various computing frameworks to perform periodic task management and cooperates with the data flow platform to complete data access, forming the final workflow platform. The combination of the two platforms is referred to as the unified scheduling platform based on data flow and workflow.
In a scenario where multi-source data and multiple business systems coexist, building the unified scheduling platform based on data flow and workflow first requires deploying Apache NiFi together with the data input component, the data cleansing component and the data output component of the present invention; custom components (for example an HTTP parsing component) can be deployed selectively. The data input and data output components are then configured, the cleansing components are configured and orchestrated, and the data flow platform is started. Once the data flow platform can manage the data, the organization's own framework applications, such as Spark applications, are configured onto the workflow platform for life cycle scheduling and execution.
When the data flow platform and the workflow platform are running, as soon as a data source produces data, the data flow platform begins to ingest and cleanse the data and to control where they flow, routing them into the computing frameworks managed by the workflow. The data may flow in directly and interactively, be inserted into a shared database table, or be stored as a shared file. The workflow adopts a different scheduling strategy according to how the data flow in, invokes the appropriate applications and computes the result data. The computed data are persisted through the data flow platform, which completes the final landing of the data results.
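The idea that the workflow picks a scheduling strategy according to how the data arrive can be sketched as a simple dispatch table. The three inflow modes come from the paragraph above; the concrete strategies attached to them (immediate trigger, periodic query, launch on new file) and all names in the code are assumptions for illustration only.

```python
from enum import Enum
from typing import Callable, Dict


class InflowMode(Enum):
    DIRECT = "direct"              # direct interactive inflow
    SHARED_TABLE = "shared_table"  # rows inserted into a shared database table
    SHARED_FILE = "shared_file"    # data stored as a shared file


def schedule_direct(job: str) -> str:
    # Assumed strategy: react to each incoming record.
    return f"trigger {job} immediately on each incoming record"


def schedule_table(job: str) -> str:
    # Assumed strategy: poll the shared table on a timer.
    return f"run {job} as a periodic query over the shared table"


def schedule_file(job: str) -> str:
    # Assumed strategy: start the job when a new file shows up.
    return f"launch {job} when a new shared file appears"


STRATEGIES: Dict[InflowMode, Callable[[str], str]] = {
    InflowMode.DIRECT: schedule_direct,
    InflowMode.SHARED_TABLE: schedule_table,
    InflowMode.SHARED_FILE: schedule_file,
}


def plan(job: str, mode: InflowMode) -> str:
    """Select the scheduling strategy that matches the inflow mode."""
    return STRATEGIES[mode](job)


decision = plan("spark-job", InflowMode.SHARED_FILE)
```

Because the mapping is data rather than branching logic, supporting a further inflow mode only requires one more entry in the table.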
When data sources are wide-ranging and data results are varied, the data must be cleansed and parsed before they are landed or flow directly into the various business systems. Business systems often support only single-source, standardized data, so the platform can use ready-made data acquisition components, protocol parsing components and data writing components to cleanse the data. The unified scheduling platform based on data flow and workflow can also manage the life cycle of the business systems, monitoring in real time their running status and the data flowing into and out of them. The writing and outflow of data during processing are displayed clearly on the component platform. The unified scheduling platform based on data flow and workflow effectively shields the development work of each business system: a business system can access multi-source data without adapting to each kind of data. The platform also manages every kind of task, from large, long-running business system tasks to simple, temporary batch tasks, and brings them under unified, effective management.
The present invention further improves the usability of existing job scheduling platforms. It combines workflow management with data flow management, allows custom components to be written as plug-ins, and supports drag-and-drop pages, which reduces development and learning costs and meets the requirements that data be controllable and rules be configurable.
Although the present invention has been disclosed above by way of preferred embodiments, these are not intended to limit it. Any person skilled in the art may, without departing from the spirit and scope of the present invention, use the methods and technical content disclosed above to make possible changes and modifications to the technical solution of the present invention. Therefore, any simple modification, equivalent change or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of its technical solution, falls within the scope of protection of the technical solution of the present invention.

Claims (10)

1. A unified scheduling platform based on data flow and workflow, characterized in that it comprises a data flow platform and a workflow platform, wherein a data flow engine operates on data to form the data flow platform, and a workflow engine cooperates with the data flow platform to complete data access, forming the workflow platform.
2. The unified scheduling platform based on data flow and workflow according to claim 1, characterized in that the data flow engine comprises a data input component, a data cleansing component, a data output component and custom components; the data flow engine receives data, the data input component adapts to the various data sources, the data cleansing component cleans the data, the data output component outputs the data, and the custom components extend the functions of the data flow engine.
3. The unified scheduling platform based on data flow and workflow according to claim 2, characterized in that the data flow engine manages the flow of the data and the data quality.
4. The unified scheduling platform based on data flow and workflow according to claim 2, characterized in that the computing tasks of the data flow engine are encapsulated in the workflow engine.
5. The unified scheduling platform based on data flow and workflow according to claim 2, characterized in that the workflow engine receives the data input from the data flow engine and invokes and manages the computing frameworks, and the data fed back are then output through the data flow engine.
6. The unified scheduling platform based on data flow and workflow according to claim 5, characterized in that the workflow engine invokes and manages the computing frameworks through interfaces and scripts.
7. A working method of the unified scheduling platform based on data flow and workflow according to any one of claims 1 to 6, characterized in that it comprises the following steps:
Step 1: deploy the data flow platform and the workflow platform;
Step 2: develop and configure the data flow engine, including the data input component, the data cleansing component, the data output component and the custom components;
Step 3: develop and configure the workflow engine, including the task scheduling component, the abnormal task monitoring component and the task parameterization component;
Step 4: develop and write a general task scheduling interface;
Step 5: the unified scheduling platform based on data flow and workflow performs life cycle scheduling management of the data processing.
8. The working method of the unified scheduling platform based on data flow and workflow according to claim 7, characterized in that the data flow engine is a NiFi data flow engine and the workflow engine is a NiFi workflow engine.
9. The working method of the unified scheduling platform based on data flow and workflow according to claim 7, characterized in that the unified scheduling platform based on data flow and workflow parses data through general-purpose components or custom components.
10. The working method of the unified scheduling platform based on data flow and workflow according to claim 7, characterized in that the workflow engine supports the Spark computing framework, the Hadoop computing framework, the Storm computing framework and custom tasks.
CN201711370210.XA 2017-12-18 2017-12-18 Unified scheduling platform and working method based on data flow and workflow Pending CN108154341A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711370210.XA CN108154341A (en) 2017-12-18 2017-12-18 Unified scheduling platform and working method based on data flow and workflow

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711370210.XA CN108154341A (en) 2017-12-18 2017-12-18 Unified scheduling platform and working method based on data flow and workflow

Publications (1)

Publication Number Publication Date
CN108154341A true CN108154341A (en) 2018-06-12

Family

ID=62467630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711370210.XA Pending CN108154341A (en) 2017-12-18 2017-12-18 Unified scheduling platform and working method based on data flow and workflow

Country Status (1)

Country Link
CN (1) CN108154341A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344189A (en) * 2018-09-19 2019-02-15 浪潮软件集团有限公司 Big data calculation method and device based on NiFi
CN110347741A (en) * 2019-07-18 2019-10-18 普元信息技术股份有限公司 The system and its control method of the outputting result quality of data are effectively promoted in big data treatment process
CN111177247A (en) * 2019-12-30 2020-05-19 腾讯科技(深圳)有限公司 Data conversion method, device and storage medium
CN112637356A (en) * 2020-12-28 2021-04-09 国电电力发展股份有限公司 Data synchronous transmission method, system, medium and terminal of remote data center

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101452450A (en) * 2007-11-30 2009-06-10 上海市电力公司 Multiple source data conversion service method and apparatus thereof
CN106874461A (en) * 2017-02-14 2017-06-20 北京慧正通软科技有限公司 A kind of workflow engine supports multi-data source configuration security access system and method
CN107392357A (en) * 2017-06-30 2017-11-24 安徽四创电子股份有限公司 A kind of public transport based on big data platform is precisely gone on a journey service system and method
CN107451666A (en) * 2017-07-15 2017-12-08 西安电子科技大学 Breaker based on big data analysis assembles Tracing back of quality questions system and method

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344189A (en) * 2018-09-19 2019-02-15 浪潮软件集团有限公司 Big data calculation method and device based on NiFi
CN109344189B (en) * 2018-09-19 2021-05-14 浪潮软件股份有限公司 Big data calculation method and device based on NiFi
CN110347741A (en) * 2019-07-18 2019-10-18 普元信息技术股份有限公司 The system and its control method of the outputting result quality of data are effectively promoted in big data treatment process
CN110347741B (en) * 2019-07-18 2023-05-05 普元信息技术股份有限公司 System for effectively improving output result data quality in big data processing process and control method thereof
CN111177247A (en) * 2019-12-30 2020-05-19 腾讯科技(深圳)有限公司 Data conversion method, device and storage medium
CN111177247B (en) * 2019-12-30 2023-10-20 腾讯科技(深圳)有限公司 Data conversion method, device and storage medium
CN112637356A (en) * 2020-12-28 2021-04-09 国电电力发展股份有限公司 Data synchronous transmission method, system, medium and terminal of remote data center

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180612