CN114564480A - Data processing method and device based on Flink platform, electronic equipment and storage medium - Google Patents

Data processing method and device based on Flink platform, electronic equipment and storage medium

Info

Publication number
CN114564480A
Authority
CN
China
Prior art keywords
data
data processing
layer
flink
flink platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210198267.0A
Other languages
Chinese (zh)
Inventor
叶盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Original Assignee
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianxin Technology Group Co Ltd, Secworld Information Technology Beijing Co Ltd
Priority to CN202210198267.0A
Publication of CN114564480A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/244 Grouping and aggregation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/2448 Query languages for particular applications; for extensibility, e.g. user defined types
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24556 Aggregation; Duplicate elimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44521 Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
    • G06F9/44526 Plug-ins; Add-ons
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Stored Programmes (AREA)

Abstract

The embodiments of the present application provide a data processing method and device based on a Flink platform, an electronic device and a storage medium. The method comprises the following steps: presetting multiple data layers according to a data structure of the Flink platform, wherein the multiple data layers comprise interfaces based on different data types; calling corresponding data processing operators through the interfaces of different data types of the multiple data layers; generating a data processing task according to the data processing operators; and sending the data processing task to the Flink platform for execution. By implementing the embodiments, multi-level flexible configuration can be realized on the Flink platform, and data processing tasks are easy to construct.

Description

Data processing method and device based on Flink platform, electronic equipment and storage medium
Technical Field
The application relates to the technical field of big data, in particular to a data processing method and device based on a Flink platform, electronic equipment and a storage medium.
Background
The Flink big data platform can use data processing operators to process data. However, current big data processing engines that wrap Flink retain only a single extension mode, so the advantages of Flink cannot be fully exploited.
Disclosure of Invention
An object of the embodiment of the present application is to provide a data processing method and apparatus based on a Flink platform, an electronic device, and a storage medium, which can be expanded in different layers, develop operators in different layers, and improve data processing capability of the Flink platform.
In a first aspect, an embodiment of the present application provides a data processing method based on a Flink platform, including:
presetting a plurality of data layers according to a data structure of a Flink platform, wherein the plurality of data layers comprise interfaces based on different data types;
calling corresponding data processing operators through interfaces of different data types of the multiple data layers;
generating a data processing task according to the data processing operator;
and sending the data processing task to the Flink platform for operation.
In the implementation process, the data structure of the Flink platform is analyzed and multiple data layers are arranged accordingly. The multiple data layers comprise interfaces based on different data types, and the data processing operators implement the interfaces of the different data layers, so that the data processing operators become pluggable. Based on the embodiment, multi-level flexible configuration can be realized on Flink, and data processing tasks are easy to construct.
Further, the multi-layered data layer includes:
an original data layer, a TableAPI layer, a FlinkSQL layer and a data set layer;
wherein, the data type of the interface of the original data layer is a basic data structure;
the data type of the interface of the tableAPI layer is a data structure based on a two-dimensional table;
the data type of the interface of the FlinkSQL layer is a data structure based on a database table, and the FlinkSQL layer is configured with an execution function based on a structured query language (SQL) query variable;
the data type of the interface of the data set layer is a data structure based on a Flink data set.
In the implementation process, the data type of the interface of the original data layer is a basic data structure, so the corresponding data processing operators can process data of multiple basic data types. The data type of the interface of the TableAPI layer is a data structure based on a two-dimensional table, so the data processing operators of that interface can process data organized as two-dimensional tables. The FlinkSQL layer is configured with an execution function based on a structured query language (SQL) query variable, and a user can configure SQL statements directly, so the data processing operators of the FlinkSQL layer interface have structured query capability. The data type of the interface of the data set layer is a data structure based on a Flink data set, and the data processing operators of that interface can implement an API based on the data set. Based on the above embodiment, the data processing operators corresponding to the different data layers can satisfy a variety of data processing requirements.
Further, the step of generating a data processing task according to the data processing operator includes:
determining a connection sequence between the called data processing operators, and generating a data processing task by taking the data processing operators as nodes according to the connection sequence so that the Flink platform performs operation according to the connection sequence of the data processing operators in the data processing task.
In the implementation process, the data processing tasks are generated according to the connection sequence, and the Flink platform performs operation according to the connection sequence of the data processing operators in the data processing tasks. Based on the above embodiments, customization of data processing tasks can be achieved.
Further, after the step of generating the data processing task by using the data processing operator as a node according to the connection order, the method further includes:
adding a first plug-in calling module between the data processing operator corresponding to the original data layer and the data processing operator corresponding to the TableAPI layer according to the connection direction of the data processing operator corresponding to the original data layer and the data processing operator corresponding to the TableAPI layer, so that the first plug-in calling module calls a data conversion plug-in of the Flink platform to convert the output data of the data processing operator corresponding to the original data layer or the output data of the data processing operator corresponding to the TableAPI layer; or,
and adding a second plug-in calling module between the data processing operator corresponding to the original data layer and the data processing operator corresponding to the FlinkSQL layer according to the connection direction of the data processing operator corresponding to the original data layer and the data processing operator corresponding to the FlinkSQL layer, so as to call the data conversion plug-in of the Flink platform through the second plug-in calling module to convert the output data of the data processing operator corresponding to the original data layer or the output data of the data processing operator corresponding to the FlinkSQL layer.
In the implementation process, because the data structures of the original data layer, the TableAPI layer and the FlinkSQL layer are different, a calling module needs to be added between the original data layer and the TableAPI layer and the FlinkSQL layer, and the calling module realizes data type compatibility among different data layers by calling a data conversion plug-in of the Flink platform.
Further, before the step of generating a data processing task according to the data processing operator, the method further includes: receiving configuration parameters, and configuring the data processing operator according to the configuration parameters.
In the implementation process, the data processing operator can be configured by receiving the configuration parameters, so that the data processing capacity of the data processing operator is improved.
Further, before the step of generating a data processing task according to the data processing operator, the method further includes:
and packaging the data operator corresponding to the original data layer by using a wrapper so that the Flink platform can call the packaged data processing operator, wherein the packaged data processing operator has at least one of synchronous, asynchronous and aggregation functions.
In the implementation process, because the data processing operator corresponding to the original data layer is based on the basic data structure, in order to enable the data processing operator to be called by the data layer of the Flink platform, the data processing operator corresponding to the original data layer needs to meet the interface calling requirement of the Flink platform, and the data processing operator can be packaged based on a wrapper, so that the data processing operator corresponding to the packaged original data layer meets the interface calling requirement of the Flink platform.
Further, the interface of the original data layer implements a consumer interface in the message queue interface group.
In the implementation process, because the Flink platform has its own collection and calling functions, the interface of the original data layer implements a consumer interface in the message queue interface group. This avoids the Flink platform calling the data processing operator corresponding to the original data layer directly, so that the operator is decoupled from the Flink platform.
In a second aspect, an embodiment of the present application provides a data processing apparatus based on a Flink platform, including:
the data layer setting module is used for presetting a plurality of layers of data layers according to the data structure of the Flink platform, wherein the plurality of layers of data layers comprise interfaces based on different data types;
the data processing operator acquisition module is used for calling corresponding data processing operators through interfaces of different data types of the multilayer data layer;
the data task generating module is used for generating the data processing task from the data processing operator;
and the data task sending module is used for sending the data processing task to the Flink platform for operation.
In a third aspect, an electronic device provided in an embodiment of the present application includes: a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to any one of the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium having instructions stored thereon, which, when executed on a computer, cause the computer to perform the method according to any one of the first aspect.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the above-described techniques.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic flowchart of a data processing method based on a Flink platform according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a data processing apparatus based on a Flink platform according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Example 1
Referring to fig. 1, an embodiment of the present application provides a data processing method based on a Flink platform, including:
S1: presetting multiple data layers according to a data structure of a Flink platform, wherein the multiple data layers comprise interfaces based on different data types;
S2: calling corresponding data processing operators through interfaces of different data types of the multiple data layers;
In an object-oriented language, the data processing operators inherit the interfaces of different data types according to the standards corresponding to those interfaces, so that the corresponding data processing operators can be called through the interfaces of different data types of the multiple data layers.
In a possible implementation, a plug-in framework is generated in advance whose interface standard is the same as the standard corresponding to the interfaces of different data types of the multiple data layers. Multiple data processing operators can be called simultaneously through this framework, which improves the data processing capacity.
S3: generating a data processing task according to the data processing operator;
S4: sending the data processing task to the Flink platform for operation.
In the above embodiment, each data layer may include interfaces of multiple data types, and the division into multiple data layers is based on the processing time and speed of the Flink platform for data of different data structures and on the data characteristics of common big data.
Illustratively, in an object-oriented language, an interface is one kind of class definition, and objects that perform different functions can be distinguished by their class. The data processing operators in the above implementation are objects of classes that implement the interfaces corresponding to the data layers.
In the implementation process, the data structure of the Flink platform is analyzed and multiple data layers are arranged accordingly. The multiple data layers comprise interfaces based on different data types, and the data processing operators implement the interfaces of the different data layers, so that the data processing operators become pluggable. Based on the embodiment, multi-level flexible configuration can be realized on the Flink platform, and data processing tasks are easy to construct.
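The four steps S1 to S4 can be pictured with a minimal, self-contained Java sketch. Everything in it is an assumption made for illustration: `DataLayer`, `LayerOperator`, `ProcessingTask` and `StubPlatform` are hypothetical names, not Flink APIs, and the "platform" is a stub that simply runs the operators in order instead of submitting a Flink job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout; none of these types are Flink APIs.
enum DataLayer { ORIGINAL, TABLE_API, FLINK_SQL, DATA_SET }

// S1: every operator declares which preset layer interface it implements.
interface LayerOperator extends Function<String, String> {
    DataLayer layer();
}

// S3: a task is the ordered list of operators that were called in S2.
final class ProcessingTask {
    final List<LayerOperator> operators = new ArrayList<>();

    ProcessingTask add(LayerOperator op) {
        operators.add(op);
        return this;
    }
}

// S4: stand-in for the platform; a real system would submit a Flink job here.
final class StubPlatform {
    static String run(ProcessingTask task, String input) {
        String value = input;
        for (LayerOperator op : task.operators) {
            value = op.apply(value);
        }
        return value;
    }
}

final class FourStepDemo {
    // S2: obtain an operator through the interface of a given layer.
    static LayerOperator operator(DataLayer layer, Function<String, String> body) {
        return new LayerOperator() {
            public DataLayer layer() { return layer; }
            public String apply(String in) { return body.apply(in); }
        };
    }

    static String demo() {
        ProcessingTask task = new ProcessingTask()
            .add(operator(DataLayer.ORIGINAL, String::trim))
            .add(operator(DataLayer.FLINK_SQL, String::toUpperCase));
        return StubPlatform.run(task, "  alert  ");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints ALERT
    }
}
```

Because the task only depends on the `LayerOperator` contract, a new operator can be added without touching the task builder or the platform stub, which is the pluggability the paragraph above describes.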
In one possible embodiment, further, the multi-layer data layer includes:
an original data layer, a TableAPI layer, a FlinkSQL layer and a data set layer;
wherein, the data type of the interface of the original data layer is a basic data structure;
the data type of the interface of the tableAPI layer is a data structure based on a two-dimensional table;
the data type of the interface of the FlinkSQL layer is a data structure based on a database table, and the FlinkSQL layer is configured with an execution function based on a structured query language (SQL) query variable;
the data type of the interface of the data set layer is a data structure based on a Flink data set.
Illustratively, the data types of the interface of the original data layer include standard data structures such as map, filter, flatmap, aggregate and join. The TableAPI layer is similar to the FlinkSQL layer in that data can be operated on in the form of a two-dimensional table. The data type of the interface of the data set layer is a data structure based on a Flink platform data set, which is the representation of a data stream on the Flink platform.
It should be noted that the data structure of the data set layer and the data structures of the other data layers can be converted into each other.
In the implementation process, the data type of the interface of the original data layer is a basic data structure, so the corresponding data processing operators can process data of multiple basic data types. The data type of the interface of the TableAPI layer is a data structure based on a two-dimensional table, so the data processing operators of that interface can process data organized as two-dimensional tables. The FlinkSQL layer is configured with an execution function based on a structured query language (SQL) query variable, and a user can configure SQL statements directly, so the data processing operators of the FlinkSQL layer interface have structured query capability. The data type of the interface of the data set layer is a data structure based on a Flink data set, and the data processing operators of that interface can implement an API based on the data set. Based on the above embodiment, the data processing operators corresponding to the different data layers can satisfy a variety of data processing requirements.
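As a rough illustration of how the four layers differ in the data type their interfaces expose, the sketch below declares one hypothetical Java interface per layer. The type names (`SimpleTable`, `OriginalOperator`, `TableOperator`, `SqlOperator`, `DataSetOperator`), the field names and the SQL text are all assumptions; the patent does not disclose concrete signatures, and none of these are Flink classes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Two-dimensional table: column names (the schema) plus rows of values.
final class SimpleTable {
    final List<String> columns;
    final List<List<Object>> rows;

    SimpleTable(List<String> columns, List<List<Object>> rows) {
        this.columns = columns;
        this.rows = rows;
    }
}

// Original data layer: basic, schema-free records.
interface OriginalOperator { Map<String, Object> process(Map<String, Object> record); }

// TableAPI layer: operates on a two-dimensional table.
interface TableOperator { SimpleTable process(SimpleTable table); }

// FlinkSQL layer: the user only configures an SQL statement.
interface SqlOperator { String sql(); }

// Data set layer: operates on a whole collection of records.
interface DataSetOperator { List<Map<String, Object>> process(List<Map<String, Object>> dataSet); }

final class LayerInterfacesDemo {
    static Map<String, Object> record(String ip, int port) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("ip", ip);
        r.put("port", port);
        return r;
    }

    // Original-layer operator: adds a field to a copy of one record.
    static final OriginalOperator TAG = r -> {
        Map<String, Object> out = new LinkedHashMap<>(r);
        out.put("tagged", true);
        return out;
    };

    // TableAPI-layer operator: keeps only the first column.
    static final TableOperator FIRST_COLUMN = t -> new SimpleTable(
        t.columns.subList(0, 1),
        t.rows.stream().map(row -> row.subList(0, 1)).collect(Collectors.toList()));

    // FlinkSQL-layer operator: only carries the configured statement.
    static final SqlOperator COUNT_BY_IP = () -> "SELECT ip, COUNT(*) FROM events GROUP BY ip";

    // Data-set-layer operator: filters the whole data set.
    static final DataSetOperator HIGH_PORTS = ds -> ds.stream()
        .filter(r -> (Integer) r.get("port") > 1024)
        .collect(Collectors.toList());

    public static void main(String[] args) {
        SimpleTable table = new SimpleTable(
            Arrays.asList("ip", "port"),
            Arrays.asList(Arrays.<Object>asList("10.0.0.1", 80)));
        System.out.println(TAG.process(record("10.0.0.1", 80)));
        System.out.println(FIRST_COLUMN.process(table).columns);
        System.out.println(COUNT_BY_IP.sql());
        System.out.println(HIGH_PORTS.process(Arrays.asList(record("a", 80), record("b", 8080))));
    }
}
```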
In one possible embodiment, the step of generating a data processing task from a data processing operator comprises:
determining the connection sequence between the called data processing operators, and generating a data processing task by taking the data processing operators as nodes according to the connection sequence so that the Flink platform performs operation according to the connection sequence of the data processing operators in the data processing task.
In the implementation process, the data processing tasks are generated according to the connection sequence, and the Flink platform performs operation according to the connection sequence of the data processing operators in the data processing tasks. Based on the above embodiments, customization of data processing tasks can be achieved.
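One way to picture "operators as nodes, connected in order" is a small directed graph whose execution order is derived from the connections. The sketch below is an assumption about how such a task could be modeled: `TaskGraph` and the operator names are hypothetical, and the ordering uses Kahn's topological sort, a standard technique that the patent does not name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical task model: operators are nodes, connections are directed edges.
final class TaskGraph {
    private final Map<String, List<String>> downstream = new LinkedHashMap<>();
    private final Map<String, Integer> incoming = new LinkedHashMap<>();

    TaskGraph node(String operatorName) {
        downstream.putIfAbsent(operatorName, new ArrayList<>());
        incoming.putIfAbsent(operatorName, 0);
        return this;
    }

    // Records that the output of `from` feeds the input of `to`.
    TaskGraph connect(String from, String to) {
        node(from).node(to);
        downstream.get(from).add(to);
        incoming.merge(to, 1, Integer::sum);
        return this;
    }

    // Kahn's algorithm: emit a node once all of its upstream nodes have been emitted.
    List<String> executionOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>(incoming);
        Deque<String> ready = new ArrayDeque<>();
        remaining.forEach((name, count) -> { if (count == 0) ready.add(name); });
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String current = ready.poll();
            order.add(current);
            for (String next : downstream.get(current)) {
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != downstream.size()) {
            throw new IllegalStateException("connections contain a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        TaskGraph task = new TaskGraph()
            .connect("parse", "filter")
            .connect("filter", "aggregate")
            .connect("aggregate", "sink");
        System.out.println(task.executionOrder());
    }
}
```

Deriving the order from the connections, rather than from the order in which operators were declared, means a user can list the connections in any sequence and still obtain the same task.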
In a possible implementation, after the step of generating the data processing task by taking the data processing operators as nodes according to the connection order, the method further includes:
adding a first plug-in calling module between the data processing operator corresponding to the original data layer and the data processing operator corresponding to the TableAPI layer according to the connection direction of the data processing operator corresponding to the original data layer and the data processing operator corresponding to the TableAPI layer, so that a data conversion plug-in of the Flink platform is called through the first plug-in calling module to convert the output data of the data processing operator corresponding to the original data layer or the output data of the data processing operator corresponding to the TableAPI layer; or,
and adding a second plug-in calling module between the data processing operator corresponding to the original data layer and the data processing operator corresponding to the FlinkSQL layer according to the connection direction of the data processing operator corresponding to the original data layer and the data processing operator corresponding to the FlinkSQL layer, so that the data conversion plug-in of the Flink platform is called through the second plug-in calling module to convert the output data of the data processing operator corresponding to the original data layer or the output data of the data processing operator corresponding to the FlinkSQL layer.
In the above embodiment, different modules are set according to the execution sequence between different nodes in the data processing task, and different data conversion plug-ins of the Flink platform are called to convert the format of the output data stream of the node.
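The idea of choosing a conversion module from the connection direction can be sketched as a pass over the ordered operators that inserts a converter wherever two adjacent operators belong to layers with different data structures. This is a simplified illustration under stated assumptions: `ConverterInsertion`, the two-value `Layer` enum (where `TABLE` stands for either the TableAPI or the FlinkSQL layer) and the converter labels are hypothetical, and the converters are represented only by their names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ConverterInsertion {
    // TABLE stands for operators of the TableAPI or FlinkSQL layer.
    enum Layer { ORIGINAL, TABLE }

    static final class Node {
        final String name;
        final Layer layer;

        Node(String name, Layer layer) {
            this.name = name;
            this.layer = layer;
        }
    }

    // Walks the chain in connection order and inserts a converter between
    // neighbours whose layers differ; the direction decides which one.
    static List<String> withConverters(List<Node> chain) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < chain.size(); i++) {
            Node current = chain.get(i);
            out.add(current.name);
            if (i + 1 < chain.size()) {
                Layer next = chain.get(i + 1).layer;
                if (current.layer == Layer.TABLE && next == Layer.ORIGINAL) {
                    out.add("table->keyValue");
                } else if (current.layer == Layer.ORIGINAL && next == Layer.TABLE) {
                    out.add("keyValue->table");
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Node> chain = Arrays.asList(
            new Node("parse", Layer.ORIGINAL),
            new Node("sql", Layer.TABLE),
            new Node("enrich", Layer.ORIGINAL));
        System.out.println(withConverters(chain));
    }
}
```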
Illustratively, the data types of the interfaces in the TableAPI layer and the FlinkSQL layer are both based on a two-dimensional table data structure, from which the computing platform can obtain the column names of the table. If the operator connected to a data processing operator corresponding to the TableAPI layer or the FlinkSQL layer is a data processing operator based on the original data layer, a first calling module needs to be arranged between the two operators, because the data type of the interface of the original data layer is in a schema-free format (e.g., JSON). The first calling module calls a conversion plug-in of the Flink platform to convert the output data of the data processing operator corresponding to the TableAPI layer or the FlinkSQL layer into key-value pairs of column name and value.
The data structure in the data processing operator corresponding to the original data layer may not define a Schema. In theory, the computing platform could obtain the column names of the table from the data processed by the data processing operators corresponding to the data set layer and the original data layer, but this is too inefficient and fields with null values cannot be identified. Therefore, when a data processing operator corresponding to the original data layer is followed by a data processing operator corresponding to the TableAPI layer or the FlinkSQL layer, a second calling module is added after the original data layer operator. This module calls the data conversion plug-in to convert the key-value pair data into a two-dimensional table with a Schema. After conversion, operators of different layers can be used in combination, which makes the data processing flow flexibly configurable.
In the above embodiment, the Schema is a data structure, and defines field names and types included in the data.
In the implementation process, because the data structures of the original data layer, the TableAPI layer and the FlinkSQL layer are different, a calling module needs to be added between the original data layer and the TableAPI layer and between the original data layer and the FlinkSQL layer, and the calling module realizes data type compatibility among different data layers by calling a data conversion plug-in of the Flink platform.
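The two conversion directions can be illustrated in plain Java. The sketch below is not the Flink data conversion plug-in; `SchemaConversion` and its methods are hypothetical stand-ins that only show why a declared Schema is needed: a key-value record that lacks a field (or carries a null) still yields a cell in the right column when the conversion is driven by the schema rather than by the keys present in the data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SchemaConversion {
    // Table row -> key-value pairs of column name + value.
    static Map<String, Object> rowToKeyValue(List<String> columns, List<Object> row) {
        Map<String, Object> keyValue = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            keyValue.put(columns.get(i), row.get(i));
        }
        return keyValue;
    }

    // Key-value records -> table rows. The declared schema drives the conversion,
    // so a missing or null field still occupies its column as null.
    static List<List<Object>> keyValueToRows(List<String> schema, List<Map<String, Object>> records) {
        List<List<Object>> rows = new ArrayList<>();
        for (Map<String, Object> record : records) {
            List<Object> row = new ArrayList<>();
            for (String column : schema) {
                row.add(record.get(column));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("ip", "port");
        Map<String, Object> keyValue = rowToKeyValue(schema, Arrays.<Object>asList("10.0.0.1", 80));
        System.out.println(keyValue);

        Map<String, Object> incomplete = new LinkedHashMap<>();
        incomplete.put("ip", "10.0.0.2"); // no "port" key at all
        System.out.println(keyValueToRows(schema, Arrays.asList(keyValue, incomplete)));
    }
}
```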
In a possible implementation, before the step of generating the data processing task according to the data processing operator, the method further includes: receiving the configuration parameters, and configuring the data processing operator according to the configuration parameters.
Illustratively, the corresponding configuration parameters of the FlinkSQL layer include sql statements, field mappings, time fields, and the like.
It will be appreciated that the data processing operator contains functions associated with the configuration parameters. In the implementation process, flexible configuration of the data processing operator can be realized by receiving the configuration parameters.
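A configurable FlinkSQL-layer operator might receive its parameters as a simple string map. The sketch below assumes hypothetical parameter keys (`sql`, `fieldMapping`, `timeField`) and a hypothetical `source=target` list format for the field mapping; the patent names the kinds of parameters but not their concrete representation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical configuration holder for a FlinkSQL-layer operator.
final class SqlOperatorConfig {
    final String sql;
    final Map<String, String> fieldMapping;
    final String timeField; // may be null when no time field is configured

    private SqlOperatorConfig(String sql, Map<String, String> fieldMapping, String timeField) {
        this.sql = sql;
        this.fieldMapping = fieldMapping;
        this.timeField = timeField;
    }

    // Parses the received parameters; the key names and the
    // "source=target,source=target" mapping format are assumptions.
    static SqlOperatorConfig fromParameters(Map<String, String> params) {
        String sql = params.get("sql");
        if (sql == null || sql.trim().isEmpty()) {
            throw new IllegalArgumentException("parameter 'sql' is required");
        }
        Map<String, String> mapping = new LinkedHashMap<>();
        String raw = params.getOrDefault("fieldMapping", "");
        for (String pair : raw.split(",")) {
            String[] parts = pair.split("=", 2);
            if (parts.length == 2) {
                mapping.put(parts[0].trim(), parts[1].trim());
            }
        }
        return new SqlOperatorConfig(sql.trim(), mapping, params.get("timeField"));
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("sql", "SELECT ip, COUNT(*) FROM events GROUP BY ip");
        params.put("fieldMapping", "src_ip=ip, dst_ip=peer");
        params.put("timeField", "ts");
        SqlOperatorConfig config = fromParameters(params);
        System.out.println(config.sql + " " + config.fieldMapping + " " + config.timeField);
    }
}
```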
In a possible implementation, before the step of generating the data processing task according to the data processing operator, the method further includes:
wrapping data operators corresponding to the original data layer with a wrapper to enable the Flink platform to invoke the wrapped data processing operators, wherein the wrapped data processing operators have at least one of synchronous, asynchronous, and aggregate functionality.
In the implementation process, because the data processing operator corresponding to the original data layer is based on the basic data structure, it must meet the interface call requirement of the Flink platform before the Flink platform can call it. The data processing operator can therefore be wrapped with the wrapper, so that the wrapped operator corresponding to the original data layer meets the interface call requirement of the Flink platform. After wrapping, the data processing operator has at least one of synchronous, asynchronous and aggregation functions, and can thus further satisfy call requests from Flink.
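The wrapper idea can be sketched without Flink: a plain function is wrapped once and then exposed through a synchronous call, an asynchronous call and an aggregating call. `OperatorWrapper` and its method names are hypothetical; in a real system the wrapper would presumably adapt the operator to the platform's own function interfaces, which this sketch does not attempt to reproduce.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical wrapper around a basic operator written as a plain function.
final class OperatorWrapper<I, O> {
    private final Function<I, O> operator;

    OperatorWrapper(Function<I, O> operator) {
        this.operator = operator;
    }

    // Synchronous call: the caller blocks until the result is available.
    O callSync(I input) {
        return operator.apply(input);
    }

    // Asynchronous call: the operator runs on the common fork-join pool.
    CompletableFuture<O> callAsync(I input) {
        return CompletableFuture.supplyAsync(() -> operator.apply(input));
    }

    // Aggregating call: applies the operator to each input and folds the results.
    O callAggregate(List<I> inputs, O identity, BinaryOperator<O> combine) {
        O result = identity;
        for (I input : inputs) {
            result = combine.apply(result, operator.apply(input));
        }
        return result;
    }

    public static void main(String[] args) {
        OperatorWrapper<String, Integer> length = new OperatorWrapper<>(String::length);
        System.out.println(length.callSync("flink"));
        System.out.println(length.callAsync("ab").join());
        System.out.println(length.callAggregate(Arrays.asList("a", "bb", "ccc"), 0, Integer::sum));
    }
}
```

The operator itself stays a plain function with no knowledge of how it is invoked, so the same implementation can be reused in all three calling styles.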
Further, the interface of the original data layer implements the consumer interface in the message queue interface group.
In the implementation process, because the Flink platform has its own collection and calling functions, the interface of the original data layer implements a consumer interface in the message queue interface group. This avoids the Flink platform calling the data processing operator corresponding to the original data layer directly, so that the operator is decoupled from the Flink platform.
Illustratively, if the data operator is written in the java language, the java Consumer interface is implemented.
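A minimal sketch of such a decoupled operator follows. It assumes that "the Java Consumer interface" refers to `java.util.function.Consumer`, which the source does not state explicitly; the class name `FieldFilterOperatorSketch` and the record fields are hypothetical.

```java
import java.util.Map;
import java.util.function.Consumer;

/**
 * Hypothetical original-data-layer operator. It depends only on
 * java.util.function.Consumer, not on any Flink class.
 */
class FieldFilterOperatorSketch implements Consumer<Map<String, Object>> {
    private final String requiredField;
    private final Consumer<Map<String, Object>> downstream;

    FieldFilterOperatorSketch(String requiredField, Consumer<Map<String, Object>> downstream) {
        this.requiredField = requiredField;
        this.downstream = downstream;
    }

    /** Forwards a row downstream only if it contains the required field. */
    @Override
    public void accept(Map<String, Object> row) {
        if (row.containsKey(requiredField)) {
            downstream.accept(row);
        }
    }
}
```

Because the operator sees only a `Consumer`, it can be exercised in a unit test with a list collector as the downstream, without starting a Flink job.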
Example 2
Referring to fig. 2, an embodiment of the present application provides a data processing apparatus based on a Flink platform, including:
the data layer setting module 1 is used for presetting a plurality of data layers according to a data structure of the Flink platform, wherein the plurality of data layers comprise interfaces based on different data types;
the data processing operator acquisition module 2 is used for calling corresponding data processing operators through the interfaces of different data types of the plurality of data layers;
the data task generating module 3 is used for generating a data processing task according to the data processing operators;
and the data task sending module 4 is used for sending the data processing task to the Flink platform for operation.
In the implementation process, the data structure of the Flink platform is analyzed and a plurality of data layers are set according to it. The data layers comprise data interfaces based on different data types, and data processing operators are implemented against the interfaces of the different data layers, so that the operators are pluggable. On this basis, data processing tasks that are flexibly configurable at multiple levels and easy to construct can be built on the Flink platform.
In one possible embodiment, the plurality of data layers includes:
an original data layer, a tableAPI layer, a FlinkSQL layer and a data set layer;
wherein the data type of the interface of the original data layer is a basic data structure;
the data type of the interface of the tableAPI layer is a data structure based on a two-dimensional table;
the data type of the interface of the FlinkSQL layer is a data structure based on a database table, and the FlinkSQL layer is configured with an execution function based on Structured Query Language (SQL) variables;
the data type of the interface of the data set layer is a data structure based on a Flink data set.
In a possible implementation manner, the data task generating module 3 is further configured to determine a connection order between the called data processing operators, where the connection order includes the connection sequence and the connection direction of the data processing operators; and to generate a data processing task by taking the data processing operators as nodes according to the connection order, so that the Flink platform runs the task according to the connection direction of the data processing operators in the data processing task.
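One way to picture a task built from operators as nodes and directed connections is a small directed graph whose nodes are ordered before submission. The sketch below uses Kahn's topological sort to derive a run order; this algorithm is our choice for illustration, as the source does not say how the connection order is resolved. The class `TaskGraphSketch` and the operator names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical task graph: operators are nodes, connections are directed edges. */
class TaskGraphSketch {
    private final Map<String, List<String>> edges = new LinkedHashMap<>();
    private final Map<String, Integer> inDegree = new LinkedHashMap<>();

    void addOperator(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        inDegree.putIfAbsent(name, 0);
    }

    /** Adds a directed connection: the output of 'from' feeds 'to'. */
    void connect(String from, String to) {
        addOperator(from);
        addOperator(to);
        edges.get(from).add(to);
        inDegree.merge(to, 1, Integer::sum);
    }

    /** Returns the operators in an order that respects every connection direction. */
    List<String> executionOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>(inDegree);
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            for (String next : edges.get(node)) {
                // A node becomes ready once all of its upstream operators are placed.
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != edges.size()) {
            throw new IllegalStateException("Operator connections contain a cycle");
        }
        return order;
    }
}
```

For example, connecting `source` to `filter` and `filter` to `sink` yields the order `source`, `filter`, `sink`, while a cyclic connection is rejected before anything is sent to the platform.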
In one possible embodiment, the apparatus further comprises a setting module. The setting module is used for adding a first plug-in calling module between the data processing operator corresponding to the original data layer and the data processing operator corresponding to the tableAPI layer, according to the connection direction between these two operators, so that the first plug-in calling module calls a data conversion plug-in of the Flink platform to convert the output data of the data processing operator corresponding to the original data layer or the output data of the data processing operator corresponding to the tableAPI layer; or for adding a second plug-in calling module between the data processing operator corresponding to the original data layer and the data processing operator corresponding to the FlinkSQL layer, according to the connection direction between these two operators, so that the second plug-in calling module calls the data conversion plug-in of the Flink platform to convert the output data of the data processing operator corresponding to the original data layer or the output data of the data processing operator corresponding to the FlinkSQL layer.
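The insertion of a conversion step between operators of different layers can be sketched as follows. The types `Layer` and `Node` and the `convert:` placeholder naming are hypothetical. In Flink itself, conversions between a `DataStream` and a `Table` are typically done with `StreamTableEnvironment.fromDataStream` and `toDataStream`; whether the "data conversion plug-in" in the source corresponds to those calls is not stated, so this sketch only marks where a conversion would be placed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: marks where a conversion step is needed in an operator chain. */
class ConversionInsertionSketch {
    enum Layer { ORIGINAL, TABLE_API, FLINK_SQL, DATA_SET }

    static final class Node {
        final String name;
        final Layer layer;

        Node(String name, Layer layer) {
            this.name = name;
            this.layer = layer;
        }
    }

    /** True for the two table-shaped layers named in the text. */
    static boolean isTabular(Layer layer) {
        return layer == Layer.TABLE_API || layer == Layer.FLINK_SQL;
    }

    /** A conversion is needed between the original layer and a tabular layer, in either direction. */
    static boolean needsConversion(Layer from, Layer to) {
        return (from == Layer.ORIGINAL && isTabular(to))
                || (isTabular(from) && to == Layer.ORIGINAL);
    }

    /** Returns the chain's step names with a placeholder conversion step inserted where needed. */
    static List<String> withConversions(List<Node> chain) {
        List<String> steps = new ArrayList<>();
        for (int i = 0; i < chain.size(); i++) {
            Node current = chain.get(i);
            steps.add(current.name);
            if (i + 1 < chain.size()) {
                Node next = chain.get(i + 1);
                if (needsConversion(current.layer, next.layer)) {
                    // The direction of the connection decides which side's output is converted.
                    steps.add("convert:" + current.layer + "->" + next.layer);
                }
            }
        }
        return steps;
    }
}
```

The direction matters because the conversion applies to the output of the upstream operator: an original-layer operator feeding a FlinkSQL operator needs its output converted into a table, and the reverse connection needs the table converted back.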
In a possible implementation manner, the apparatus further includes a configuration module, configured to receive the configuration parameter, and configure the data processing operator according to the configuration parameter.
In one possible implementation, the apparatus further includes an adaptation module for wrapping a corresponding data processing operator of the original data layer using a wrapper to enable the Flink platform to invoke the wrapped data processing operator, wherein the wrapped data processing operator has at least one of a synchronous function, an asynchronous function, and an aggregation function.
In one possible implementation, the interface of the original data layer implements the consumer interface in the message queue interface group.
Example 3
Referring to fig. 3, fig. 3 is a block diagram of the electronic device according to an embodiment of the present application. The electronic device may comprise a processor 31, a communication interface 32, a memory 33 and at least one communication bus 34. The communication bus 34 is used for direct connection communication among these components. In the embodiment of the present application, the communication interface 32 of the electronic device is used for signaling or data communication with other node devices. The processor 31 may be an integrated circuit chip having signal processing capabilities.
The processor 31 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor 31 may be any conventional processor or the like.
The memory 33 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 33 stores computer-readable instructions which, when executed by the processor 31, cause the electronic device to perform the steps of the above method embodiments.
Optionally, the electronic device may further include a memory controller and an input/output unit.
The memory 33, the memory controller, the processor 31, the peripheral interface, and the input/output unit are electrically connected to each other directly or indirectly to implement data transmission or interaction. For example, these components may be electrically connected to each other via one or more communication buses 34. The processor 31 is adapted to execute executable modules stored in the memory 33, such as software functional modules or computer programs comprised by the electronic device.
The input/output unit allows a user to create a task and to set an optional start time period or a preset execution time for it, thereby enabling interaction between the user and the server. The input/output unit may be, but is not limited to, a mouse, a keyboard, and the like.
It will be appreciated that the configuration shown in fig. 3 is merely illustrative and that the electronic device may include more or fewer components than shown in fig. 3 or may have a different configuration than shown in fig. 3. The components shown in fig. 3 may be implemented in hardware, software, or a combination thereof.
An embodiment of the present application further provides a computer-readable storage medium storing instructions which, when run on a computer or executed by a processor, implement the method of the method embodiments; details are not repeated here.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the part of it that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

Claims (10)

1. A data processing method based on a Flink platform, characterized by comprising the following steps:
presetting a plurality of data layers according to a data structure of a Flink platform, wherein the plurality of data layers comprise interfaces based on different data types;
calling corresponding data processing operators through the interfaces of different data types of the plurality of data layers;
generating a data processing task according to the data processing operator;
and sending the data processing task to the Flink platform for operation.
2. The Flink platform-based data processing method of claim 1, wherein the plurality of data layers comprise:
an original data layer, a tableAPI layer, a FlinkSQL layer and a data set layer;
wherein, the data type of the interface of the original data layer is a basic data structure;
the data type of the interface of the tableAPI layer is a data structure based on a two-dimensional table;
the data type of the interface of the FlinkSQL layer is a data structure based on a database table, and the FlinkSQL layer is configured with an execution function based on Structured Query Language (SQL) variables;
the data type of the interface of the data set layer is a data structure based on a Flink data set.
3. The Flink platform-based data processing method according to claim 2, wherein said step of generating data processing tasks according to said data processing operators comprises:
determining a connection sequence between the called data processing operators, and generating a data processing task by taking the data processing operators as nodes according to the connection sequence so that the Flink platform performs operation according to the connection sequence of the data processing operators in the data processing task.
4. The Flink platform-based data processing method according to claim 3, wherein after the step of generating a data processing task by taking the data processing operators as nodes according to the connection order, the method further comprises:
adding a first plug-in calling module between the data processing operator corresponding to the original data layer and the data processing operator corresponding to the tableAPI layer according to the connection direction of the data processing operator corresponding to the original data layer and the data processing operator corresponding to the tableAPI layer, so that a data conversion plug-in of the Flink platform is called through the first plug-in calling module to convert the output data of the data processing operator corresponding to the original data layer or the output data of the data processing operator corresponding to the tableAPI layer; or
adding a second plug-in calling module between the data processing operator corresponding to the original data layer and the data processing operator corresponding to the FlinkSQL layer according to the connection direction of the data processing operator corresponding to the original data layer and the data processing operator corresponding to the FlinkSQL layer, so as to call the data conversion plug-in of the Flink platform through the second plug-in calling module to convert the output data of the data processing operator corresponding to the original data layer or the output data of the data processing operator corresponding to the FlinkSQL layer.
5. The Flink platform-based data processing method of claim 1, further comprising, prior to the step of generating data processing tasks according to the data processing operators: receiving configuration parameters, and configuring the data processing operator according to the configuration parameters.
6. The Flink platform-based data processing method of claim 2, further comprising, prior to the step of generating data processing tasks according to the data processing operators:
wrapping the data processing operators corresponding to the original data layer by using a wrapper so that the Flink platform can call the wrapped data processing operators, wherein the wrapped data processing operators have at least one of synchronous, asynchronous and aggregation functions.
7. The Flink platform-based data processing method of claim 2, wherein the interface of the original data layer implements a consumer interface in a message queue interface group.
8. A data processing device based on a Flink platform, comprising:
the data layer setting module is used for presetting a plurality of data layers according to the data structure of the Flink platform, wherein the plurality of data layers comprise interfaces based on different data types;
the data processing operator acquisition module is used for calling corresponding data processing operators through the interfaces of different data types of the plurality of data layers;
the data task generating module is used for generating a data processing task according to the data processing operators;
and the data task sending module is used for sending the data processing task to the Flink platform for operation.
9. An electronic device, comprising: memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the Flink platform based data processing method according to any of the claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that it has stored thereon instructions which, when run on a computer, cause the computer to execute the Flink platform based data processing method according to any one of claims 1 to 7.
CN202210198267.0A 2022-03-01 2022-03-01 Data processing method and device based on Flink platform, electronic equipment and storage medium Pending CN114564480A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210198267.0A CN114564480A (en) 2022-03-01 2022-03-01 Data processing method and device based on Flink platform, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210198267.0A CN114564480A (en) 2022-03-01 2022-03-01 Data processing method and device based on Flink platform, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114564480A true CN114564480A (en) 2022-05-31

Family

ID=81716683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210198267.0A Pending CN114564480A (en) 2022-03-01 2022-03-01 Data processing method and device based on Flink platform, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114564480A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115131139A (en) * 2022-09-02 2022-09-30 创新奇智(南京)科技有限公司 Method, device and medium for obtaining target result based on structural data

Similar Documents

Publication Publication Date Title
CN108170578B (en) Log collection method and device
CN111443901B (en) Java reflection-based service expansion method and device
CN103455405A (en) Method and system for preventing button from being clicked repeatedly and method and system for unlocking button
JP2011118879A (en) Location independent execution of user interface operations
CN109446648B (en) Simulation service establishing method and device
CN111818175B (en) Enterprise service bus configuration file generation method, device, equipment and storage medium
US20100100894A1 (en) System and Method for Asynchronously Invoking Dynamic Proxy Interface Using Supplemental Interfaces
CN100511140C (en) Method for script language calling multiple output parameter interface by component software system
CN114564480A (en) Data processing method and device based on Flink platform, electronic equipment and storage medium
CN109062906B (en) Translation method and device for program language resources
CN103530724A (en) Manufacturing capacity servitization method based on workflow model
CN117032668A (en) Processing method, device, system and platform of visual rule engine
Byun et al. Efficient and privacy-enhanced object traceability based on unified and linked EPCIS events
CN112015831A (en) Method, device and equipment for operating relational database based on C language
CN116303622A (en) Database query method, device, equipment and storage medium
CN111177269A (en) Block chain data storage and acquisition method and device based on structuralization
CN114879940A (en) Method and system for constructing distributed digital native service platform and electronic equipment
CN110297843A (en) Data query method and system, terminal for B/S system
CN114281500A (en) Data interface integrated configuration method convenient for butt joint with external system
US11100077B2 (en) Event table management using type-dependent portions
CN116186022A (en) Form processing method, form processing device, distributed form system and computer storage medium
Löhr et al. Evolving the data management backbone: Binary osa-cbm and code generation for osa-eai
CN113986754A (en) Interface testing method and device, electronic equipment and storage medium
CN117435177B (en) Application program interface construction method, system, equipment and storage medium
US9405512B2 (en) Rejuvenation of legacy code into resources-oriented architectures

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination