CN114816583B - Flink-based data automatic processing method and device and electronic equipment - Google Patents

Publication number
CN114816583B
Authority
CN
China
Prior art keywords
task
module
flink
web
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210608800.6A
Other languages
Chinese (zh)
Other versions
CN114816583A (en)
Inventor
赵向军
李凡平
王堃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ISSA Technology Co Ltd
Original Assignee
ISSA Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ISSA Technology Co Ltd
Priority to CN202210608800.6A
Publication of CN114816583A
Application granted
Publication of CN114816583B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G06F9/4451 User profiles; Roaming

Abstract

The invention provides a Flink-based automatic data processing method and device and electronic equipment, and relates to the technical field of data processing. The method is applied to a webpage loading terminal that comprises a web module, a database module and a job processing module; the web module exchanges data with the database module and also sends job instructions to the job processing module. The method comprises: in response to an execution instruction directed at the web module, retrieving from the database module the Flink task configuration information corresponding to the execution instruction, the execution instruction including a data processing task; generating a task configuration file based on the Flink task configuration information; and starting, by the job processing module, a Flink job corresponding to the task configuration file so as to complete the data processing task. The method addresses the technical problem that the remaining jobs are blocked after a Flink task is interrupted, and achieves the technical effect of optimizing data processing.

Description

Flink-based data automatic processing method and device and electronic equipment
Technical Field
The invention relates to the technical field of data processing, and in particular to a Flink-based automatic data processing method and device and electronic equipment.
Background
Current automatic data processing schemes based on the stream processing framework Flink generally work as follows: the information of a data processing task that the user configures on a web page is stored by a database module; when the user clicks on the web page to execute the task, the web module reads the stored information (data source connection information, data processing logic, connection information of the target library and the like) from the database module and sends it to a message queue module; the job processing module, which continuously listens to the message queue module, reads the information and carries out the job, thereby completing the automatic processing of the data.
In practical applications the number of data processing tasks keeps growing, but in this conventional mode an abnormality in one task job, or some other cause, can easily interrupt the whole job processing module, which blocks the normal processing of the other task jobs. That is, the existing data processing technology has the technical problem that the remaining jobs are blocked after a Flink task is interrupted.
Disclosure of Invention
Accordingly, an object of the present invention is to provide a Flink-based automatic data processing method and device and electronic equipment, so as to alleviate the above-mentioned problem in the prior art.
In order to achieve the above object, the technical scheme adopted by the embodiment of the invention is as follows:
In a first aspect, an embodiment of the present invention provides a Flink-based automatic data processing method, applied to a webpage loading terminal, where the webpage loading terminal includes a web module, a database module and a job processing module; the web module is used for data interaction with the database module; the web module is also used for sending job instructions to the job processing module. The method comprises the following steps: in response to an execution instruction directed at the web module, retrieving from the database module the Flink task configuration information corresponding to the execution instruction, the execution instruction including a data processing task; generating a task configuration file based on the Flink task configuration information; and starting, by the job processing module, a Flink job corresponding to the task configuration file so as to complete the data processing task.
In some possible embodiments, the method further comprises: constructing a data processing task on the preloaded webpage by utilizing the web module; and sending the configuration information of the data processing task to the database module and storing the configuration information.
In some possible embodiments, the execution instruction includes: data input, data cleansing and data output.
In some possible embodiments, the Flink task configuration information includes: source component information and sink component information.
In some possible embodiments, the method further comprises: sending the task configuration file to the Flink server through a web interface.
In some possible embodiments, the web interface is provided by the web module and is requested by the job processing module.
In a second aspect, an embodiment of the present invention provides a Flink-based automatic data processing device, including:
a webpage loading terminal, used for responding to an execution instruction directed at the web module and retrieving from the database module the Flink task configuration information corresponding to the execution instruction, the execution instruction including a data processing task; the webpage loading terminal is further used for generating a task configuration file based on the Flink task configuration information; the webpage loading terminal is further used for starting, by the job processing module, the Flink job corresponding to the task configuration file so as to complete the data processing task.
In some possible embodiments, the webpage loading terminal is further configured to: construct a data processing task on the preloaded webpage by using the web module; and send the configuration information of the data processing task to the database module for storage.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, and a processor, where the memory stores a computer program executable on the processor, and the processor implements the steps of the method according to any one of the first aspects when the processor executes the computer program.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium storing machine-executable instructions which, when invoked and executed by a processor, cause the processor to perform the method of any one of the first aspects.
The invention provides a Flink-based automatic data processing method and device and electronic equipment, and relates to the technical field of data processing. The method is applied to a webpage loading terminal that comprises a web module, a database module and a job processing module; the web module exchanges data with the database module and also sends job instructions to the job processing module. The method comprises: in response to an execution instruction directed at the web module, retrieving from the database module the Flink task configuration information corresponding to the execution instruction, the execution instruction including a data processing task; generating a task configuration file based on the Flink task configuration information; and starting, by the job processing module, a Flink job corresponding to the task configuration file so as to complete the data processing task. The method addresses the technical problem that the remaining jobs are blocked after a Flink task is interrupted, and achieves the technical effect of optimizing data processing.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a conventional automated data processing system according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a webpage loading terminal according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a Flink-based automatic data processing method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Some embodiments of the present invention are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
Referring to the schematic structural diagram of an existing automatic data processing system shown in FIG. 1, current automatic data processing schemes based on the stream processing framework Flink generally work as follows: the information of a data processing task that the user configures on a web page is stored by the database module 110; when the user clicks on the web page to execute the task, the web module 120 reads the stored information (data source connection information, data processing logic, connection information of the target library and the like) from the database module 110 and sends it to the message queue module 130; the job processing module 140, which continuously listens to the message queue module 130, reads the information and carries out the job, thereby completing the automatic processing of the data. In practical applications the number of data processing tasks keeps growing, and in this conventional mode an abnormality in one task job, or some other cause, can easily interrupt the whole job processing module 140, which blocks the normal processing of the other task jobs. That is, the existing data processing technology has the technical problem that the remaining jobs are blocked after a Flink task is interrupted.
Based on the above, the embodiment of the invention provides a Flink-based automatic data processing method and device and electronic equipment.
To facilitate understanding of this embodiment, the Flink-based automatic data processing method disclosed in the embodiment of the present invention is first described in detail. The method is applied to a webpage loading terminal. Referring to the schematic structural diagram of the webpage loading terminal shown in FIG. 2, the webpage loading terminal includes: a web module 210, a database module 220, and a job processing module 230; the web module 210 is used for data interaction with the database module 220; the web module 210 is also used to send job instructions to the job processing module 230.
Referring to FIG. 3, a schematic flow chart of a Flink-based automatic data processing method is shown; the method may be performed by an electronic device and mainly comprises the following steps S101 to S103:
S101: in response to an execution instruction directed at the web module, retrieving from the database module the Flink task configuration information corresponding to the execution instruction; the execution instruction includes a data processing task;
S102: generating a task configuration file based on the Flink task configuration information;
S103: starting, by the job processing module, the Flink job corresponding to the task configuration file so as to complete the data processing task.
The execution instruction may include: data input, data cleansing and data output.
The Flink task configuration information may include: source component information and sink component information. As a specific example, the embodiment of the application writes the information of the source component and the sink component of the Flink task into a yaml configuration file and specifies that configuration file when the Flink task is started; the task can then read, process and store the data according to the configuration items of the source and sink components.
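As an illustrative, non-limiting sketch of such a configuration file, the following Python snippet renders source and sink component information as YAML text. The key names (`source`, `sink`, `type`, `url`, `table`), the connection strings and the helper `to_yaml` are assumptions made for illustration only; the embodiment does not specify the schema of the yaml file.

```python
import json


def to_yaml(value, indent=0):
    """Render nested dicts of scalars as simple YAML text (hypothetical helper)."""
    pad = " " * indent
    lines = []
    for key, item in value.items():
        if isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(item, indent + 2))
        else:
            # json.dumps yields a double-quoted, escaped scalar that YAML accepts.
            lines.append(f"{pad}{key}: {json.dumps(item)}")
    return "\n".join(lines)


# Assumed shape of the configuration information held for one task.
task_config = {
    "source": {"type": "jdbc", "url": "jdbc:mysql://source-host:3306/raw", "table": "events"},
    "sink": {"type": "jdbc", "url": "jdbc:mysql://target-host:3306/clean", "table": "events_clean"},
}

yaml_text = to_yaml(task_config)
print(yaml_text)
```

A real deployment would more likely use an established YAML library; the hand-written emitter above only keeps the sketch free of third-party dependencies.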
In one embodiment, the method further comprises:
(1) Constructing a data processing task on the preloaded web page by utilizing a web module;
(2) And sending the configuration information of the data processing task to a database module and storing the configuration information.
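The storing and later retrieval of task configuration information can be sketched as follows. This is a minimal illustration that uses an in-memory SQLite database as a stand-in for the database module; the table name `task_config`, the column names, the task identifier and the JSON layout are all assumptions and are not taken from the embodiment.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the (assumed) task configuration store and create its table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_config ("
        "task_id TEXT PRIMARY KEY, config_json TEXT NOT NULL)"
    )
    return conn


def save_task(conn, task_id, config):
    """Store the configuration of a task built on the web page."""
    conn.execute(
        "INSERT OR REPLACE INTO task_config (task_id, config_json) VALUES (?, ?)",
        (task_id, json.dumps(config)),
    )
    conn.commit()


def load_task(conn, task_id):
    """Fetch the configuration again when the task is executed."""
    row = conn.execute(
        "SELECT config_json FROM task_config WHERE task_id = ?", (task_id,)
    ).fetchone()
    if row is None:
        raise KeyError(task_id)
    return json.loads(row[0])


store = open_store()
save_task(store, "task-001", {
    "source": {"table": "orders"},
    "cleaning": ["drop_null_rows"],
    "sink": {"table": "orders_clean"},
})
print(load_task(store, "task-001"))
```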
In one embodiment, the method may further include: sending the task configuration file to the Flink server through the web interface, where the web interface is provided by the web module and requested by the job processing module. The embodiment of the application provides a Flink-based automatic data processing method that generates the task configuration file required by the Flink job according to the configuration information of the task in the database module and provides a web interface service for the file content, so that data processing tasks correspond to Flink jobs one to one. This solves the prior-art problem that an abnormality in one job interrupts the whole Flink task and blocks the remaining jobs. In addition, the webpage loading terminal structure provided by the embodiment of the application removes the message queue module between the web module and the job processing module of the conventional architecture, which simplifies the existing architecture and remedies this design defect of the prior art.
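The difference between the conventional mode and the one-to-one mode can be illustrated with a toy model. The sketch below is not Flink code: plain Python callables stand in for jobs, and the function names are hypothetical. It only shows why a failure in one job stops the later jobs when a single long-lived worker processes them in turn, and why it does not when each task has its own job.

```python
def run_shared_worker(jobs):
    """Conventional mode: one worker drains all jobs; an uncaught error stops it."""
    finished = []
    try:
        for name, job in jobs:
            job()
            finished.append(name)
    except Exception:
        pass  # the worker has stopped, so the remaining jobs never run
    return finished


def run_one_job_per_task(jobs):
    """One-to-one mode: each task has its own job, so a failure stays local."""
    finished = []
    for name, job in jobs:
        try:
            job()
            finished.append(name)
        except Exception:
            continue  # only this task is affected
    return finished


def ok():
    return None


def broken():
    raise RuntimeError("bad record")


jobs = [("task-a", ok), ("task-b", broken), ("task-c", ok)]
print(run_shared_worker(jobs))     # ['task-a']
print(run_one_job_per_task(jobs))  # ['task-a', 'task-c']
```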
It should be noted that the one-to-one correspondence between data processing tasks and Flink jobs means that starting a configured task on the page causes the back end to start one Flink job that implements the data processing function of that task. The specific flow is as follows: the web page starts a task; the back-end code calls Flink's built-in interface for uploading a jar package (uploading the Flink program package to the Flink service); the back-end code calls Flink's built-in interface for running a job (which relies on the jar package uploaded in the previous step, specifies the yaml configuration file and starts the Flink job); and the Flink job runs to complete the task corresponding to the web page.
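The two back-end calls can be sketched against the Flink REST API, which exposes `POST /jars/upload` (a multipart upload with the form field `jarfile`) and `POST /jars/:jarid/run`. The snippet below only builds the run request and parses an upload response; it does not contact a server. The host name, the jar name, the configuration path and in particular the `--config` program argument are assumptions, since the embodiment does not state how the job receives the path of its yaml file.

```python
import json
import posixpath
import urllib.request

FLINK_REST = "http://flink-host:8081"  # assumed address of the Flink REST endpoint


def jar_id_from_upload_response(body):
    """The upload response reports the stored file path; the jar id is its last segment."""
    return posixpath.basename(json.loads(body)["filename"])


def build_run_request(jar_id, config_path, entry_class=None):
    """Build the POST /jars/:jarid/run request that starts one job for one task."""
    payload = {"programArgs": f"--config {config_path}"}  # argument name is hypothetical
    if entry_class:
        payload["entryClass"] = entry_class
    return urllib.request.Request(
        f"{FLINK_REST}/jars/{jar_id}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example upload response body (values are made up for illustration).
upload_body = '{"filename": "/tmp/flink-web-upload/abc123_etl-job.jar", "status": "success"}'
jar_id = jar_id_from_upload_response(upload_body)
request = build_run_request(jar_id, "/opt/flink/conf/task-001.yaml")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit the job; that step is left out so the sketch can be run without a Flink cluster.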
As a specific example, an embodiment of the present application provides a Flink-based automatic data processing method, where the method is applied to a webpage loading terminal, and the webpage loading terminal includes: a web module, a database module and a job processing module; the web module is used for data interaction with the database module; the web module is also used for sending job instructions to the job processing module. The method comprises the following data processing flow:
(1) On a web page in the web module, the user drags in components of the three categories "input", "cleaning" and "output", selects information such as the "data source", "cleaning rule" and "target library", and submits the selection to construct a complete data processing task;
(2) After the designed components have been submitted to form a task, the configuration information of the current task is stored in the database module;
(3) The user clicks to execute a certain task on a web page in the web module, which starts the processing job of the corresponding task;
(4) The back-end code of the web module pulls the configuration information of the corresponding task from the database module and generates a file with the suffix ".yaml" (the Flink task reads this file and completes the data processing based on the information in it);
(5) After the back-end code of the web module has generated the ".yaml" file, it starts a Flink task of the job processing module by calling the Flink REST API (the call parameters specify, for the current Flink task, the path of a configuration file with the ".yaml" suffix);
(6) Once called and started, the Flink task of the job processing module requests the "interface for acquiring file content" provided by the web module to obtain the content of the ".yaml" file generated in step (4) (stored on the web server), and writes that content into the ".yaml" file specified for the current Flink task in step (5) (stored on the Flink server);
(7) After the new task configuration information (the content of the ".yaml" file of step (4)) has been written into the configuration file specified for the current task (the ".yaml" file of step (5)), the Flink task can, according to this information, connect to the "data source" of step (1) and pull the data, apply the "cleaning rule" to the data, and finally connect to the "target library" and store the data;
(8) The web module periodically monitors the running state of the Flink task through the Flink REST API and feeds it back to the web page.
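The periodic monitoring of step (8) can be sketched as a polling loop. The Flink REST API returns job details, including a `state` field, from `GET /jobs/:jobid`; the polling interval, the stop condition and the function names below are assumptions made for illustration. The demo at the end feeds the loop canned states instead of contacting a server.

```python
import io
import json
import time
import urllib.request

# Job states after which no further change is expected.
TERMINAL_STATES = {"FINISHED", "FAILED", "CANCELED"}


def fetch_job_state(base_url, job_id, opener=urllib.request.urlopen):
    """Read the current state of one job from GET /jobs/:jobid."""
    with opener(f"{base_url}/jobs/{job_id}") as response:
        return json.loads(response.read())["state"]


def poll_until_done(get_state, interval_seconds=5.0, max_polls=100, sleep=time.sleep):
    """Poll the state until it is terminal; return every state that was observed."""
    seen = []
    for _ in range(max_polls):
        state = get_state()
        seen.append(state)
        if state in TERMINAL_STATES:
            break
        sleep(interval_seconds)
    return seen


# Demo with canned responses in place of a running Flink cluster.
fake_states = iter(["CREATED", "RUNNING", "RUNNING", "FINISHED"])
history = poll_until_done(lambda: next(fake_states), sleep=lambda _: None)
print(history)

single = fetch_job_state(
    "http://flink-host:8081", "job-1",
    opener=lambda url: io.BytesIO(b'{"state": "RUNNING"}'),
)
print(single)
```

In the web module, each observed state would be pushed to the web page rather than collected in a list.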
It should be noted that a Flink task can only load a configuration file located on the Flink server, and the Flink server and the web server are not the same machine. The configuration file generated on the web server therefore has to be read through a web interface and written to the Flink server. The web interface is the "interface for acquiring file content" provided by the web module in step (6); the configuration file is the ".yaml" file of step (4).
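The transfer described above can be sketched from the point of view of the started job: fetch the file content over HTTP from the web module, then write it to the local path that was specified for the task. The URL, the file paths and the function names are hypothetical, and the demo replaces the real HTTP call with an in-memory response so that it runs without a web server.

```python
import io
import os
import tempfile
import urllib.request


def fetch_config_text(url, opener=urllib.request.urlopen):
    """Request the file content from the web module's file content interface."""
    with opener(url) as response:
        return response.read().decode("utf-8")


def write_local_config(text, path):
    """Write the fetched content to the configuration path on the job's own server."""
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def fake_web_module(url):
    """Stand-in for the web module; returns canned yaml content."""
    return io.BytesIO(b'source:\n  type: "jdbc"\nsink:\n  type: "jdbc"\n')


local_path = os.path.join(tempfile.mkdtemp(), "conf", "task-001.yaml")
content = fetch_config_text(
    "http://web-host/api/tasks/task-001/config",  # assumed URL of the interface
    opener=fake_web_module,
)
write_local_config(content, local_path)
print(local_path)
```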
The embodiment of the invention also provides a Flink-based automatic data processing device, which comprises: a webpage loading terminal, used for responding to an execution instruction directed at the web module and retrieving from the database module the Flink task configuration information corresponding to the execution instruction; the execution instruction includes a data processing task;
the webpage loading terminal is also used for generating a task configuration file based on the Flink task configuration information; the webpage loading terminal is also used for starting, by the job processing module, the Flink job corresponding to the task configuration file so as to complete the data processing task.
In one embodiment, the webpage loading terminal may further be used for: constructing a data processing task on the preloaded web page by using the web module; and sending the configuration information of the data processing task to the database module for storage.
The invention provides a Flink-based automatic data processing method and device and electronic equipment, wherein the method is applied to a webpage loading terminal that comprises a web module, a database module and a job processing module; the web module is used for data interaction with the database module; the web module is also used for sending job instructions to the job processing module. The method comprises: in response to an execution instruction directed at the web module, retrieving from the database module the Flink task configuration information corresponding to the execution instruction, the execution instruction including a data processing task; generating a task configuration file based on the Flink task configuration information; and starting, by the job processing module, a Flink job corresponding to the task configuration file so as to complete the data processing task. The method addresses the technical problem that the remaining jobs are blocked after a Flink task is interrupted, and achieves the technical effect of optimizing data processing.
The Flink-based automatic data processing device provided by the embodiment of the application may be specific hardware on a piece of equipment, or software or firmware installed on the equipment. The device provided in the embodiments of the present application has the same implementation principle and technical effects as the foregoing method embodiments; for brevity, where the device embodiment does not mention a matter, reference may be made to the corresponding content of the foregoing method embodiments. It will be clear to those skilled in the art that, for convenience and brevity, the specific operation of the system, device and units described above may refer to the corresponding process in the above method embodiment, which is not described in detail here. The Flink-based automatic data processing device provided by the embodiment of the application has the same technical features as the Flink-based automatic data processing method provided by the above embodiment, so that it can solve the same technical problems and achieve the same technical effects.
The embodiment of the application also provides electronic equipment, which specifically comprises a processor and a storage device; the storage means has stored thereon a computer program which, when executed by the processor, performs the method of any of the embodiments described above.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application, where the electronic device 400 includes: a processor 40, a memory 41, a bus 42 and a communication interface 43, the processor 40, the communication interface 43 and the memory 41 being connected by the bus 42; the processor 40 is arranged to execute executable modules, such as computer programs, stored in the memory 41.
The memory 41 may include a high-speed random access memory (RAM, Random Access Memory), and may further include a non-volatile memory, such as at least one magnetic disk memory. The communication connection between the system network element and at least one other network element is achieved via at least one communication interface 43 (which may be wired or wireless), which may use the internet, a wide area network, a local network, a metropolitan area network, etc.
The bus 42 may be an ISA bus, a PCI bus, an EISA bus, or the like. The buses may be classified as address buses, data buses, control buses, etc. For ease of illustration, only one bi-directional arrow is shown in FIG. 4, but this does not mean that there is only one bus or one type of bus.
The memory 41 is configured to store a program, and the processor 40 executes the program after receiving an execution instruction; the method disclosed in any of the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 40.
The processor 40 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 40 or by instructions in the form of software. The processor 40 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP for short), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), a field-programmable gate array (Field-Programmable Gate Array, FPGA for short) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 41; the processor 40 reads the information in the memory 41 and, in combination with its hardware, performs the steps of the method described above.
Corresponding to the above method, the embodiments of the present application also provide a computer readable storage medium storing machine executable instructions, which when invoked and executed by a processor, cause the processor to execute the steps of the above method.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments provided in the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disk, or other various media capable of storing program code.
It should be noted that: like reference numerals and letters in the various figures refer to like items and, thus, once an item is defined in one figure, no further definition or explanation of that in the subsequent figure is necessary, and furthermore, the terms "first," "second," "third," etc. are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (10)

1. A Flink-based automatic data processing method, characterized in that the method is applied to a webpage loading terminal, wherein the webpage loading terminal comprises: a web module, a database module and a job processing module; the web module is used for data interaction with the database module; the web module is also used for sending job instructions to the job processing module;
the method comprises the following steps:
in response to an execution instruction directed at the web module, retrieving from the database module the Flink task configuration information corresponding to the execution instruction; the execution instruction comprises a data processing task;
generating a task configuration file based on the Flink task configuration information;
starting a Flink job corresponding to the task configuration file by using the job processing module so as to complete the data processing task;
the method comprises the following data processing flow:
(1) On a web page of the "web module", the user drags in components of the three categories "input", "cleaning" and "output", selects information such as the "data source", "cleaning rule" and "target library", and submits to construct a complete data processing task;
(2) After the designed components are submitted to form a task, the configuration information of the current task is stored in the "database module";
(3) The user clicks to execute a task on a web page of the "web module" to start the processing job of the corresponding task;
(4) The back-end code of the "web module" pulls the configuration information of the corresponding task from the "database module" and generates a file with the suffix ".yaml" (the Flink task reads this file and completes data processing based on the information in it);
(5) After the back-end code of the "web module" generates the ".yaml" file, it starts a Flink task of the "job processing module" by calling the Flink REST API (the call parameters specify a configuration file path with the ".yaml" suffix for the current Flink task);
(6) After being called and started, the Flink task of the "job processing module" requests the "interface for acquiring file content" provided by the "web module" to obtain the content of the ".yaml" file generated in step (4) (stored on the web server), and writes that content into the ".yaml" file designated for the current Flink task in step (5) (stored on the Flink server);
(7) After the new task configuration information (the content of the ".yaml" file of step (4)) has been written into the configuration file designated for the current task (the ".yaml" file of step (5)), the Flink task uses this information to connect to the "data source" of step (1) and pull data, to execute the "cleaning rule" on the data, and finally to connect to the "target library" and store the data;
(8) The "web module" periodically monitors the running state of the Flink task through the Flink REST API and feeds it back to the web page.
2. The method as recited in claim 1, further comprising:
constructing a data processing task on the preloaded webpage by utilizing the web module;
and sending the configuration information of the data processing task to the database module and storing the configuration information.
3. The method of claim 1, wherein the execution instruction comprises: data input, data cleansing and data output.
4. The method of claim 1, wherein the Flink task configuration information comprises: source component information and sink component information.
5. The method as recited in claim 1, further comprising:
and sending the task configuration file to the Flink server through a web interface.
6. The method of claim 5, wherein the web interface is provided by the web module in response to a request from the job processing module.
7. A Flink-based automatic data processing device, comprising:
the webpage loading terminal is used for, in response to an execution instruction directed at the web module, retrieving from the database module the Flink task configuration information corresponding to the execution instruction; the execution instruction comprises a data processing task;
the webpage loading terminal is further used for generating a task configuration file based on the Flink task configuration information;
the webpage loading terminal is further used for starting, by means of a job processing module, the Flink job corresponding to the task configuration file so as to complete the data processing task;
the device comprises the following data processing flow:
(1) On a web page of the "web module", the user drags in components of the three categories "input", "cleaning" and "output", selects information such as the "data source", "cleaning rule" and "target library", and submits to construct a complete data processing task;
(2) After the designed components are submitted to form a task, the configuration information of the current task is stored in the "database module";
(3) The user clicks to execute a task on a web page of the "web module" to start the processing job of the corresponding task;
(4) The back-end code of the "web module" pulls the configuration information of the corresponding task from the "database module" and generates a file with the suffix ".yaml" (the Flink task reads this file and completes data processing based on the information in it);
(5) After the back-end code of the "web module" generates the ".yaml" file, it starts a Flink task of the "job processing module" by calling the Flink REST API (the call parameters specify a configuration file path with the ".yaml" suffix for the current Flink task);
(6) After being called and started, the Flink task of the "job processing module" requests the "interface for acquiring file content" provided by the "web module" to obtain the content of the ".yaml" file generated in step (4) (stored on the web server), and writes that content into the ".yaml" file designated for the current Flink task in step (5) (stored on the Flink server);
(7) After the new task configuration information (the content of the ".yaml" file of step (4)) has been written into the configuration file designated for the current task (the ".yaml" file of step (5)), the Flink task uses this information to connect to the "data source" of step (1) and pull data, to execute the "cleaning rule" on the data, and finally to connect to the "target library" and store the data;
(8) The "web module" periodically monitors the running state of the Flink task through the Flink REST API and feeds it back to the web page.
8. The apparatus of claim 7, wherein the web page loading terminal is further configured to: constructing a data processing task on the preloaded webpage by utilizing the web module; and sending the configuration information of the data processing task to the database module and storing the configuration information.
9. An electronic device comprising a memory, a processor, the memory having stored therein a computer program executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method of any of the preceding claims 1 to 6.
10. A computer readable storage medium storing machine executable instructions which, when invoked and executed by a processor, cause the processor to perform the method of any one of claims 1 to 6.
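As a reading aid only, steps (4), (5) and (8) of claim 1 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation described in the patent: the helper names (`to_yaml`, `write_task_config`, `build_run_request`, `summarize_job`), the file name pattern `task_<id>.yaml`, the `--config` program argument and the example configuration keys are all hypothetical. The only Flink-specific details relied on are the REST endpoint `POST /jars/:jarid/run` with its `programArgs` field and the `state` field returned by `GET /jobs/:jobid`. The request is built but never sent, so the sketch runs without a Flink cluster.

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory
from urllib.request import Request


def to_yaml(config: dict, indent: int = 0) -> str:
    """Emit nested dicts of scalars as YAML (hypothetical helper).

    Scalars are written with json.dumps; JSON scalars are valid YAML.
    """
    pad = " " * indent
    lines = []
    for key, value in config.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 2))
        else:
            lines.append(f"{pad}{key}: {json.dumps(value)}")
    return "\n".join(lines)


def write_task_config(task_id: int, config: dict, directory: Path) -> Path:
    """Step (4): turn stored task configuration into a .yaml file."""
    path = directory / f"task_{task_id}.yaml"  # file name pattern is assumed
    path.write_text(to_yaml(config) + "\n", encoding="utf-8")
    return path


def build_run_request(base_url: str, jar_id: str, config_path: Path) -> Request:
    """Step (5): prepare the REST call that starts the Flink job.

    The config path travels in programArgs; "--config" is an assumed name.
    """
    body = json.dumps({"programArgs": f"--config {config_path}"}).encode("utf-8")
    return Request(
        f"{base_url}/jars/{jar_id}/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_job(job: dict) -> str:
    """Step (8): map the JSON of GET /jobs/:jobid to a label for the web page."""
    state = job.get("state", "UNKNOWN")
    if state == "FINISHED":
        return "succeeded"
    if state in {"FAILED", "CANCELED"}:
        return "failed"
    return "in progress"


if __name__ == "__main__":
    # Example configuration with made-up keys, standing in for a database row.
    example = {"source": {"type": "kafka"}, "sink": {"type": "jdbc"}}
    with TemporaryDirectory() as tmp:
        cfg = write_task_config(1, example, Path(tmp))
        req = build_run_request("http://localhost:8081", "example.jar", cfg)
        print(cfg.read_text(encoding="utf-8"))
        print(req.get_method(), req.full_url)
    print(summarize_job({"state": "RUNNING"}))
```

In a deployment along the lines of the claims, the web module's back end would send the prepared request (for example with `urllib.request.urlopen`) and call the status endpoint on a timer; those network calls are left out here so that the sketch stays self-contained.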

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210608800.6A CN114816583B (en) 2022-05-31 2022-05-31 Flink-based data automatic processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210608800.6A CN114816583B (en) 2022-05-31 2022-05-31 Flink-based data automatic processing method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN114816583A CN114816583A (en) 2022-07-29
CN114816583B true CN114816583B (en) 2024-03-19

Family

ID=82518842

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210608800.6A Active CN114816583B (en) 2022-05-31 2022-05-31 Flink-based data automatic processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114816583B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115756586B (en) * 2022-11-25 2024-01-19 中电金信软件有限公司 Method and device for executing Flink job, computer equipment and storage medium

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111061715A (en) * 2019-12-16 2020-04-24 北京邮电大学 Web and Kafka-based distributed data integration system and method
CN111930700A (en) * 2020-07-13 2020-11-13 车智互联(北京)科技有限公司 Distributed log processing method, server, system and computing equipment
CN112130976A (en) * 2020-09-21 2020-12-25 厦门南讯股份有限公司 REST-based multi-engine big data task management method
KR102201651B1 (en) * 2020-02-04 2021-01-11 강원대학교산학협력단 Probability-based data stream partitioning method considering task locality and downstream status
CN112286905A (en) * 2020-10-15 2021-01-29 北京沃东天骏信息技术有限公司 Data migration method and device, storage medium and electronic equipment
CN112328458A (en) * 2020-11-27 2021-02-05 杭州安恒信息技术股份有限公司 Data processing method and device based on flink data engine
CN112558995A (en) * 2020-12-24 2021-03-26 恩亿科(北京)数据科技有限公司 Flink integration method and system based on TBDS Hadoop
CN112765166A (en) * 2021-01-06 2021-05-07 深圳市欢太科技有限公司 Data processing method, device and computer readable storage medium
US11010191B1 (en) * 2020-07-02 2021-05-18 Ryan L. Hornbeck Platform-independent interface for generating virtualized multi-service hardware systems and infrastructure
CN112835924A (en) * 2021-02-04 2021-05-25 北京高途云集教育科技有限公司 Real-time computing task processing method, device, equipment and storage medium
CN113010512A (en) * 2021-02-24 2021-06-22 上海中通吉网络技术有限公司 Real-time data processing method, platform and equipment based on Flink
CN113656157A (en) * 2021-08-10 2021-11-16 北京锐安科技有限公司 Distributed task scheduling method and device, storage medium and electronic equipment
US11275726B1 (en) * 2020-12-06 2022-03-15 Kamu Data Inc. Distributed data processing method with complete provenance and reproducibility

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11681667B2 (en) * 2017-07-30 2023-06-20 International Business Machines Corporation Persisting distributed data sets into eventually consistent storage systems
US10348578B2 (en) * 2017-10-18 2019-07-09 Proov Systems Ltd. Software proof-of-concept platform, including simulation of production behavior and/or data
US11250139B2 (en) * 2020-04-27 2022-02-15 Oracle International Corporation Greybox fuzzing for web applications
US20220004369A1 (en) * 2020-07-01 2022-01-06 Johnson Controls Tyco IP Holdings LLP Rule builder and simulator tool for creating complex event processing rules

Non-Patent Citations (21)

* Cited by examiner, † Cited by third party
Title
A Performance Analysis of Fault Recovery in Stream Processing Frameworks;Giselle van Dongen;IEEE Access;20210628;第9卷;93745 - 93763 *
Apache Flink in current research;Tilmann Rabl;Information Technology;20160624;第58卷(第4期);157-165 *
BigBench Workload Executed by using Apache Flink;Sonia Bergamaschi等;Procedia Manufacturing;第11卷;695-702 *
Characterization of Big Data Stream Processing Pipeline: A Case Study using Flink and Kafka;M. Haseeb Javed;the Fourth IEEE/ACM international Conference on Big Data Computing;1-10 *
Fabian Hueske.Stream Processing With Apache Flink.Stream Processing With Apache Flink,2020,1-220. *
Michael Armbrust 等.Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark.SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data.2018,601-613. *
NAMB: A Quick and Flexible Stream Processing Application Prototype Generator;Alessio Pagliari 等;2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID);61-70 *
Sonia Bergamaschi.BigBench Workload Executed by using Apache Flink.Procedia Manufacturing.2017,第11卷695-702. *
SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink;Oscar Ceballos;Applied Science;第11卷(第15期);1-24 *
State management in Apache Flink: consistent stateful distributed stream processing;Paris Carbone等;Proceedings of the VLDB Endowment;第10卷(第12期);1718–1729 *
Technical Review Of Apache Flink For Big Data;G. Paul Davidson等;International Journal of Aquatic Science;第12卷(第2期);3340-3346 *
一种高效的Flink与MongoDB连接中间件的研究与实现;胡程,叶枫;计算机工程与应用(第23期);64-69 *
基于Elasticsearch的实时大数据统计分析平台的研究与设计;吉喆;CNKI优秀硕士学位论文全文库;20200115(第1期);1-63 *
基于Flink的电商实时计算平台的设计与实现;谢缙;CNKI优秀硕士学位论文(第3期);1-104 *
基于Storm平台的数据恢复节能策略;蒲勇霖等;计算机研究与发展;第58卷(第3期);479-496 *
基于流计算Flink框架的资源调度方法研究;魏碧晴;CNKI优秀硕士学位论文(第1期);1-63 *
大数据与OLAP系统;杜小勇 等;大数据;20150531;第1卷(第1期);55-67 *
大数据流式计算框架Heron环境下的流分类任务调度策略;张译天等;计算机应用;第39卷(第4期);1106-1116 *
大数据流式计算框架Storm的任务迁移策略;鲁亮;计算机研究与发展;20180131;第55卷(第1期);71-92 *
面向云服务的日志处理系统关键技术研发;周超;CNKI优秀硕士学位论文(第5期);1-100 *
面向大数据云平台的资源管理系统;李程;CNKI优秀硕士学位论文全文库;20190515(第5期);1-67 *

Also Published As

Publication number Publication date
CN114816583A (en) 2022-07-29

Similar Documents

Publication Publication Date Title
US8868984B2 (en) Relevant alert delivery in a distributed processing system with event listeners and alert listeners
US8825852B2 (en) Relevant alert delivery in a distributed processing system
CN110119306B (en) Method, device and equipment for balancing automatic scheduling of jobs and storage medium
US9535754B1 (en) Dynamic provisioning of computing resources
CN110753084B (en) Uplink data reading method, cache server and computer readable storage medium
CN114816583B (en) Flink-based data automatic processing method and device and electronic equipment
CN103763346A (en) Distributed resource scheduling method and device
CN110109741B (en) Method and device for managing circular tasks, electronic equipment and storage medium
CN110609755A (en) Message processing method, device, equipment and medium for cross-block chain node
CN112181522A (en) Data processing method and device and electronic equipment
CN113064676A (en) Method for remote component sharing mechanism during front-end operation based on JS entrance
US9678793B2 (en) Resource-based job scheduling
CN116089040A (en) Service flow scheduling method and device, electronic equipment and storage medium
CN116016653A (en) Information pushing method and device of blockchain, electronic equipment and storage medium
CN114625502A (en) Word-throwing task processing method and device, storage medium and electronic equipment
CN108062224A (en) Data read-write method, device and computing device based on file handle
CN114285903A (en) Request processing method, device and system and electronic equipment
CN113934566A (en) Exception handling method and device and electronic equipment
US8495033B2 (en) Data processing
EP3495960A1 (en) Program, apparatus, and method for communicating data between parallel processor cores
CN113157405A (en) Method and device for retrying breakpoint of business process
CN111475140A (en) App componentization method based on event-driven architecture and event-driven architecture
CN111444057A (en) Page performance data acquisition method and device and computing equipment
CN112799797A (en) Task management method and device
CN112965698B (en) Automatic driving software architecture platform, construction method and construction device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant