CN111352616A - Real-time calculation visualization development system and application method thereof - Google Patents

Info

Publication number
CN111352616A
CN111352616A CN202010104631.3A CN202010104631A CN111352616A CN 111352616 A CN111352616 A CN 111352616A CN 202010104631 A CN202010104631 A CN 202010104631A CN 111352616 A CN111352616 A CN 111352616A
Authority
CN
China
Prior art keywords
data
real
flow
task
development
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010104631.3A
Other languages
Chinese (zh)
Inventor
张寒
张毅
孙迁
王广邦
谢之虬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suning Cloud Computing Co Ltd
Original Assignee
Suning Cloud Computing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suning Cloud Computing Co Ltd filed Critical Suning Cloud Computing Co Ltd
Priority to CN202010104631.3A priority Critical patent/CN111352616A/en
Publication of CN111352616A publication Critical patent/CN111352616A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/34 Graphical or visual programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a real-time computing visualization development system and an application method thereof. Components are selected according to the data type of the business scenario; on the flow configuration interface, the components are connected by dragging to form a complete flow, thereby configuring the business flow visually; the configured flow information is assembled into Table & SQL API code executable by Flink, a compile-and-package command is invoked to generate a Flink task jar package, and the task jar package is submitted to the Flink cluster to run. Through visual development, the invention simplifies the coding process, supports scenarios that SQL development does not support, avoids black-box calls during development, and lowers the threshold for developing real-time tasks, so that operation and maintenance personnel can also develop real-time tasks and development efficiency is improved.

Description

Real-time calculation visualization development system and application method thereof
Technical Field
The invention relates to the field of Flink-based real-time computing task development, and in particular to a real-time computing visualization development system and an application method thereof.
Background
The online computing development platform (ocdp) is a stream-processing (Native Streaming) platform built on Flink, an open-source distributed data processing framework. In addition to low-latency stream processing, flexible operation state and stream windows, and an efficient fault-tolerance mechanism for streams and data, it provides two development modes, visual task development and SQL development, together with online debugging and online operation and maintenance management.
Flink-based development currently works as follows. Flink tasks are developed in two basic ways: by writing SQL, or by building a jar package. With SQL development, complex computing logic makes the development process correspondingly complex, and some scenarios are not supported by SQL at all. Unsupported scenarios can usually be handled with a jar package, but a called jar package is a black box during development: its specific contents are not visible at the time of each call.
Disclosure of Invention
The invention aims to provide a real-time calculation visualization development system and an application method thereof.
The technical solution for achieving this purpose is as follows. A real-time computing visualization development system comprises:
a flow configuration interface module, used for selecting components according to the data type of the business scenario;
and a general processing module, used for connecting the components by dragging to form a complete flow, and for performing parameter configuration and/or data processing within the components using system functions and/or user-defined functions, thereby completing the visual business flow configuration.
The general processing module comprises:
a flow configuration module, used for connecting the components by dragging to form a complete flow;
a UDFS module, used for performing parameter configuration and/or data processing within the components using system functions and/or custom functions.
Further, the general processing module also comprises a flow monitoring module, used for monitoring flow tasks and raising alarms.
The components are connected in the following order: the flow starts with an input source component, data processing components are connected in the middle, and the flow ends with an output source component.
Further, the data processing components include filter, conversion, routing, merge, group by, dual-stream join, dimension table association, JSON parsing and/or deduplication.
An application method of the real-time computing visualization development system comprises:
selecting components according to the data type of the business scenario;
on the flow configuration interface, connecting the components by dragging to form a complete flow, thereby configuring the business flow visually;
and assembling the configured flow information into Table & SQL API code executable by Flink, invoking a compile-and-package command to generate a Flink task jar package, and submitting the task jar package to the Flink cluster to run.
Further, the complete flow formed by connecting the components comprises three stages: input source, data processing, and output source.
Further, the input source supports access to data stored in an external storage system, in a file with a specified encoding format, or in a messaging system.
Further, the data processing routes, filters, groups, or deduplicates the data to obtain the data required by the target system.
Further, the output source loads the converted data into a target database or file.
Further, the method also supports task version rollback and comparison, used for task backtracking, comparison and/or debugging.
Further, the method also comprises single-node branch debugging, used for judging whether the business logic of a single data processing step has a problem; if so, the data is processed at the current step, and the debugging result and output are viewed on the page.
Further, the method also comprises a task test, used for performing a syntax check on the flow after the task configuration is finished; the running log of the system can be viewed during the test.
Compared with the prior art, the invention has the following notable advantages: (1) through visual development, the coding process is simplified, scenarios not supported by SQL development are supported, and black-box calls during development are avoided; (2) the user does not need to consider the code logic generated from SQL: through visual drag-and-drop and a console, JSON data is converted into a Flink task that is submitted and run, which lowers the threshold for real-time task development, allows operation and maintenance personnel to develop real-time tasks as well, and improves development efficiency; (3) based on a unified JSON data protocol, different access modes can be provided, giving high extensibility and covering more development scenarios.
Drawings
FIG. 1 is a flow chart of a real-time computation visualization development system of the present invention.
FIG. 2 is a thread diagram of a visualization drag process of the application method of the real-time computation visualization development system.
FIG. 3 is a flow configuration interface diagram.
FIG. 4 is a flow chart of the flow development of the present invention.
FIG. 5 is a flow configuration interface diagram.
FIG. 6 is a flow configuration interface diagram.
FIG. 7 is a diagram of a canvas console.
FIG. 8 is a diagram of data processing components.
FIG. 9 is a business flow diagram of the present invention.
Detailed Description
The invention is further described below with reference to the figures and examples.
With reference to FIG. 1, the real-time computing visualization development system of the present invention comprises:
a flow configuration interface module, used for selecting components according to the data type of the business scenario;
and a general processing module, used for connecting the components by dragging to form a complete flow, and for performing parameter configuration and/or data processing within the components using system functions and/or user-defined functions, thereby completing the visual business flow configuration.
The general processing module comprises:
a flow configuration module, used for connecting the components by dragging to form a complete flow;
and a UDFS module, used for performing parameter configuration and/or data processing within the components using system functions and/or custom functions.
The general processing module also comprises a flow monitoring module, used for monitoring flow tasks and raising alarms.
With reference to FIGS. 2, 3 and 9, the application method of the real-time computing visualization development system of the present invention comprises:
selecting components according to the data type of the business scenario;
on the flow configuration interface, connecting the components by dragging to form a complete flow, thereby configuring the business flow visually;
and assembling the configured flow information into Table & SQL API code executable by Flink, invoking a compile-and-package command to generate a Flink task jar package, and submitting the task jar package to the Flink cluster to run.
The components are connected in the following order: the flow starts with an input source component, data processing components are connected in the middle, and the flow ends with an output source component.
The data processing components include filter, conversion, routing, merge, group by, dual-stream join, dimension table association, JSON parsing and/or deduplication.
With reference to FIG. 4, the complete flow formed by connecting the components comprises three stages: input source, data processing, and output source.
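The ordering rule for the three stages can be checked mechanically. The following is a minimal sketch, assuming a simplified linear flow represented as an ordered list of (name, kind) pairs; the function `validate_flow` and the kind labels `source`, `process`, and `sink` are hypothetical names chosen for illustration and are not taken from the platform. Flows with routing or dual-stream joins would form a graph rather than a list, which this sketch does not cover.

```python
from typing import List, Tuple

# Hypothetical stage labels; the text only names the three stages.
SOURCE, PROCESS, SINK = "source", "process", "sink"


def validate_flow(nodes: List[Tuple[str, str]]) -> List[str]:
    """Return a list of problems found in a linear flow (empty if valid)."""
    if len(nodes) < 2:
        return ["a flow needs at least an input source and an output source"]
    problems = []
    if nodes[0][1] != SOURCE:
        problems.append(f"first component '{nodes[0][0]}' is not an input source")
    if nodes[-1][1] != SINK:
        problems.append(f"last component '{nodes[-1][0]}' is not an output source")
    # Everything between the two ends must be a data processing component.
    for name, kind in nodes[1:-1]:
        if kind != PROCESS:
            problems.append(f"middle component '{name}' is not a data processing component")
    return problems


good = [("kafka_in", SOURCE), ("filter", PROCESS), ("kafka_out", SINK)]
bad = [("filter", PROCESS), ("kafka_in", SOURCE), ("kafka_out", SINK)]
print(validate_flow(good))
print(validate_flow(bad))
```

The first call reports no problems; the second reports that the flow does not start with an input source and that an input source appears in the middle.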
The input source supports access to data stored in an external storage system, in a file with a specified encoding format, or in a messaging system.
The data processing routes, filters, groups, or deduplicates the data to obtain the data required by the target system.
The output source loads the converted data into a target database or file.
The method further supports task version rollback and comparison, used for task backtracking, comparison and/or debugging.
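Version comparison and rollback can be illustrated with a short sketch. It assumes, purely for illustration, that each saved version of a flow is a JSON-serializable dictionary kept in a list; the function `diff_versions` and the sample flow contents are hypothetical, and the text does not describe how the platform actually stores or compares versions.

```python
import difflib
import json


def diff_versions(old: dict, new: dict) -> list:
    """Return unified-diff lines between two flow definitions."""
    old_text = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_text = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(old_text, new_text, "v1", "v2", lineterm=""))


# Two hypothetical saved versions of the same flow.
v1 = {"nodes": [{"id": "filter", "condition": "amount > 100"}]}
v2 = {"nodes": [{"id": "filter", "condition": "amount > 500"}]}
history = [v1, v2]

for line in diff_versions(history[0], history[1]):
    print(line)

# Rollback in this sketch is simply re-selecting an earlier stored version.
current = history[0]
```

Sorting keys before diffing keeps the comparison stable when only the key order differs between two saved versions.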
The method also comprises single-node branch debugging, used for judging whether the business logic of a single data processing step has a problem; if so, the data is processed at the current step, and the debugging result and output are viewed on the page.
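One way to picture single-node debugging is to run sample records through the flow and stop after a chosen node so its output can be inspected. The sketch below assumes each processing step is a plain function over a list of records; `debug_up_to`, the step names, and the sample data are hypothetical and only illustrate the idea, not the platform's debugger.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Step = Tuple[str, Callable[[List[Record]], List[Record]]]


def debug_up_to(steps: List[Step], stop_at: str, sample: Iterable[Record]) -> List[Record]:
    """Run sample records through the steps, stopping after the named node."""
    data = list(sample)
    for name, fn in steps:
        data = fn(data)
        if name == stop_at:
            return data
    raise KeyError(f"no node named {stop_at!r} in the flow")


# Hypothetical two-step flow: keep large amounts, then add a derived field.
steps: List[Step] = [
    ("filter", lambda rows: [r for r in rows if r["amount"] > 100]),
    ("convert", lambda rows: [{**r, "amount_cents": r["amount"] * 100} for r in rows]),
]
sample = [{"id": 1, "amount": 50}, {"id": 2, "amount": 200}]

print(debug_up_to(steps, "filter", sample))  # [{'id': 2, 'amount': 200}]
```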
The method also comprises a task test, used for performing a syntax check on the flow after the task configuration is finished; the running log of the system can be viewed during the test.
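The text does not say what the check performed by the task test covers. As a minimal sketch under that caveat, the example below performs a purely structural check: every component must carry its required parameters and every connection must refer to an existing component. The function `check_flow`, the table `REQUIRED_PARAMS`, and the flow layout are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical required parameters per component type.
REQUIRED_PARAMS = {"source": ["topic"], "filter": ["condition"], "sink": ["topic"]}


def check_flow(flow: Dict) -> List[str]:
    """Return configuration errors for a flow; an empty list means it passed."""
    errors = []
    ids = {node["id"] for node in flow["nodes"]}
    for node in flow["nodes"]:
        for param in REQUIRED_PARAMS.get(node["type"], []):
            if param not in node.get("params", {}):
                errors.append(f"{node['id']}: missing parameter '{param}'")
    for src, dst in flow["edges"]:
        for end in (src, dst):
            if end not in ids:
                errors.append(f"edge refers to unknown node '{end}'")
    return errors


flow = {
    "nodes": [
        {"id": "in", "type": "source", "params": {"topic": "orders"}},
        {"id": "f1", "type": "filter", "params": {}},
    ],
    "edges": [("in", "f1"), ("f1", "out")],
}
print(check_flow(flow))
```

Here the check reports the filter's missing condition and the connection to a component that was never defined.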
The platform is built on top of the Flink Table & SQL API and provides dynamic generation of API code.
The user configures the business flow visually on the interface; the system assembles the configured information into Table & SQL API code executable by Flink and invokes a Maven compile-and-package command to generate a Flink task jar package.
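The assembly step can be sketched as turning a flow description into a SQL statement and preparing a Maven packaging command. The shape of the generated code is not disclosed in the text, so everything here is an assumption: the flow dictionary layout, the helper names `assemble_sql` and `package_command`, and the choice of `mvn clean package` as the packaging command. The sketch also does not validate or escape identifiers, which a real generator would need to do.

```python
from typing import Dict, List


def assemble_sql(flow: Dict) -> str:
    """Assemble one INSERT ... SELECT statement from a simple flow description."""
    fields = ", ".join(flow["fields"])
    sql = f"INSERT INTO {flow['sink']} SELECT {fields} FROM {flow['source']}"
    if flow.get("filter"):
        sql += f" WHERE {flow['filter']}"
    return sql


def package_command(project_dir: str) -> List[str]:
    """Build (but do not run) a Maven packaging command for a generated project."""
    return ["mvn", "-f", f"{project_dir}/pom.xml", "clean", "package"]


# Hypothetical flow: read orders, keep large ones, write to another table.
flow = {
    "source": "orders",
    "sink": "big_orders",
    "fields": ["order_id", "amount"],
    "filter": "amount > 100",
}
print(assemble_sql(flow))
print(package_command("/tmp/generated-job"))
```

Returning the command as a list rather than running it keeps the sketch free of side effects; a caller could pass it to `subprocess.run` once a real project directory exists.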
The task runs on the Flink cluster.
A visual flow configuration interface is provided: the user configures a flow on the interface and publishes it to the Flink cluster as a Flink task, and can also browse system functions and user-defined functions on the interface.
The user can configure Source and Sink.
The configured information is stored in MySQL; the flow definition is converted into JSON format before being stored.
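Storing a flow definition as JSON can be illustrated as follows. This is a sketch only: the standard-library `sqlite3` module stands in for MySQL so the example is self-contained, and the table name, column names, and the layout of `flow_definition` are assumptions rather than the platform's actual schema.

```python
import json
import sqlite3

# Hypothetical flow definition with three connected components.
flow_definition = {
    "name": "order_filter",
    "nodes": [
        {"id": "n1", "type": "source", "params": {"topic": "orders"}},
        {"id": "n2", "type": "filter", "params": {"condition": "amount > 100"}},
        {"id": "n3", "type": "sink", "params": {"topic": "big_orders"}},
    ],
    "edges": [["n1", "n2"], ["n2", "n3"]],
}

conn = sqlite3.connect(":memory:")  # in-memory stand-in for MySQL
conn.execute("CREATE TABLE flow (name TEXT PRIMARY KEY, definition TEXT)")
conn.execute(
    "INSERT INTO flow (name, definition) VALUES (?, ?)",
    (flow_definition["name"], json.dumps(flow_definition)),
)

# Reading the row back and parsing the JSON restores the same structure.
row = conn.execute(
    "SELECT definition FROM flow WHERE name = ?", ("order_filter",)
).fetchone()
restored = json.loads(row[0])
print(restored == flow_definition)  # True
```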
After entering the ocdp platform, the user enters the flow development page. Flow development is used to visually configure the processing logic of a real-time data stream: the data stream passes through a series of processing steps to produce the data the user requires. A flow generally consists of three stages: input source, data processing, and output source.
Input source: data stored in an external storage system, such as a database system (e.g., MySQL, HBase), a file with a specified encoding format (e.g., CSV, Apache Parquet, Avro, ORC), or a messaging system (e.g., Apache Kafka, RabbitMQ). Currently only Kafka data and CSV files can be accessed. An input source node is typically located at the front of the data stream and reads a particular type of data set; it has only output ports and no input ports.
Data processing: the data is routed, filtered, grouped, deduplicated, and so on, to obtain the data required by the target system. A data processing node is typically located in the middle of the data stream and transforms the data set it reads; it has both input ports and output ports.
Output source: used for loading the converted data into a target database or file. The generic interface supports various file formats (e.g., CSV, Apache Parquet, Apache Avro), storage systems (e.g., JDBC, Apache HBase, Apache Cassandra, Elasticsearch), and messaging systems (e.g., Apache Kafka, RabbitMQ). The first stage supports output to Kafka only. An output source node is typically located at the end of the data stream.
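The port rules for the three node classes lend themselves to a simple connection check. In the sketch below, the rule that an output source node has only input ports is an inference from its position at the end of the data stream; the text states the port rules explicitly only for input source and data processing nodes. The names `PORTS` and `check_edges` are hypothetical.

```python
from typing import Dict, List, Tuple

# Port rules: sources only emit, processing nodes do both, sinks only receive.
# The sink rule is inferred from the text rather than stated in it.
PORTS = {
    "source": {"input": False, "output": True},
    "process": {"input": True, "output": True},
    "sink": {"input": True, "output": False},
}


def check_edges(kinds: Dict[str, str], edges: List[Tuple[str, str]]) -> List[str]:
    """Return a message for every connection that violates the port rules."""
    errors = []
    for src, dst in edges:
        if not PORTS[kinds[src]]["output"]:
            errors.append(f"{src} has no output port")
        if not PORTS[kinds[dst]]["input"]:
            errors.append(f"{dst} has no input port")
    return errors


kinds = {"in": "source", "dedup": "process", "out": "sink"}
print(check_edges(kinds, [("in", "dedup"), ("dedup", "out")]))
print(check_edges(kinds, [("out", "in")]))
```

The first call accepts a well-formed chain; the second rejects a connection drawn from an output source back into an input source.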
In the flow configuration interface, the user can create, save, delete, test, and publish a flow.
As shown in FIGS. 5 and 6, the flow list of the flow package is on the left, the visual canvas is in the middle, and the input, data processing, and output components, together with custom function addition, are on the right.
With reference to FIGS. 7 and 8, suitable components are selected according to the business scenario and connected by dragging to form a complete flow (starting with an input source component, with processing components in the middle, and ending with an output source component). Along the way, parameter configuration and data processing must be completed within each component, for which system functions and custom functions can be used.
The data processing components are located in the middle of the flow and apply business processing to the data to obtain the data required by the target system.
The visual development function currently provides the following processing components: filter, conversion, routing, merge, group by, dual-stream join, dimension table association, JSON parsing, and deduplication.
Of these, dimension table association, JSON parsing, deduplication, and dual-stream join cannot be supported by SQL development.
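To make two of the processing components concrete, the following sketch implements filtering and deduplication over an iterable of records in plain Python. It illustrates the semantics only and is not the platform's Flink implementation; the function names and the record layout are assumptions. Note that the deduplication here keeps every key it has seen in memory, whereas a streaming job would need bounded or expiring state.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]


def filter_records(rows: Iterable[Record], keep: Callable[[Record], bool]) -> Iterator[Record]:
    """Pass through only the records for which keep(record) is true."""
    return (r for r in rows if keep(r))


def deduplicate(rows: Iterable[Record], key: str) -> Iterator[Record]:
    """Emit the first record seen for each key value, dropping later repeats."""
    seen = set()
    for r in rows:
        if r[key] not in seen:
            seen.add(r[key])
            yield r


rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 200},
    {"id": 2, "amount": 200},
]
big = filter_records(rows, lambda r: r["amount"] > 100)
unique = list(deduplicate(big, "id"))
print(unique)  # [{'id': 2, 'amount': 200}]
```

Because both functions are lazy, they can be chained in the same source, processing, output order as the components on the canvas.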
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A real-time computing visualization development system, comprising:
a flow configuration interface module, used for selecting components according to the data type of the business scenario;
and a general processing module, used for connecting the components to form a complete flow, and for performing parameter configuration and/or data processing within the components using system functions and/or custom functions, thereby completing the visual business flow configuration.
2. The real-time computing visualization development system of claim 1, characterized in that: the general processing module comprises a flow configuration module and a UDFS module.
3. The real-time computing visualization development system of claim 1, characterized in that: the general processing module also comprises a flow monitoring module, used for monitoring flow tasks and raising alarms.
4. The real-time computing visualization development system according to claim 1, wherein the components are connected in the following order: starting with an input source component, with data processing components connected in the middle, and ending with an output source component.
5. The real-time computing visualization development system of claim 4, characterized in that: the data processing components comprise filter, conversion, routing, merge, group by, dual-stream join, dimension table association, JSON parsing and/or deduplication.
6. An application method of a real-time computing visualization development system, the method comprising:
selecting components according to the data type of the business scenario;
on the flow configuration interface, connecting the components to form a complete flow, thereby configuring the business flow visually;
and assembling the configured flow information into code executable by Flink, invoking a compile-and-package command to generate a Flink task jar package, and submitting the task jar package to the Flink cluster to run.
7. The method of application according to claim 6, characterized in that: the complete flow formed by connecting the components comprises three stages: input source, data processing, and output source;
the input source supports access to data stored in an external storage system, in a file with a specified encoding format, or in a messaging system;
the data processing routes, filters, groups, or deduplicates the data to obtain the data required by the target system;
and the output source loads the converted data into a target database or file.
8. The method of application according to claim 6, characterized in that: the method also includes supporting task version rollback and comparison, for task backtracking, comparison and/or error correction.
9. The method of application according to claim 6, characterized in that: the method also includes single-node branch debugging, used for judging whether the business logic of a single data processing step has a problem; if so, the data is processed at the current step, and the debugging result and output are viewed on the page.
10. The method of application according to claim 6, characterized in that: the method also comprises a task test, used for performing a syntax check on the flow after the task configuration is finished; the running log of the system can be viewed during the test.
CN202010104631.3A 2020-02-20 2020-02-20 Real-time calculation visualization development system and application method thereof Pending CN111352616A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010104631.3A CN111352616A (en) 2020-02-20 2020-02-20 Real-time calculation visualization development system and application method thereof

Publications (1)

Publication Number Publication Date
CN111352616A (en) 2020-06-30

Family

ID=71195724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010104631.3A Pending CN111352616A (en) 2020-02-20 2020-02-20 Real-time calculation visualization development system and application method thereof

Country Status (1)

Country Link
CN (1) CN111352616A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200630