CN113867600A - Development method and device for processing streaming data and computer equipment - Google Patents


Publication number
CN113867600A
Authority
CN
China
Prior art keywords
target
streaming data
component
data processing
processing logic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110983088.3A
Other languages
Chinese (zh)
Inventor
唐莹
秦文劭
史志龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Development Bank Co Ltd
Original Assignee
Shanghai Pudong Development Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Development Bank Co Ltd
Priority to CN202110983088.3A
Publication of CN113867600A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Stored Programmes (AREA)

Abstract

The application relates to a development method, a development device, computer equipment, and a storage medium for processing streaming data. The method comprises: displaying a configuration interface of streaming data processing logic on a terminal; in response to a selection operation on the identifier of a target process component, determining the target process component and displaying it in the assembly area; and obtaining the streaming data processing logic according to the target process component. The method addresses the diversity of computing engines, the complexity of data interaction, and the high technical threshold of streaming data processing. It manages data service configurations and flows in a unified way, supports providing customized data service components for external development, redefines the paradigm for the streaming job functions output by the resulting streaming data processing logic, provides a standardized and reusable set of stream processing components, and offers a one-stop service platform for developing and releasing real-time streaming data processing components.

Description

Development method and device for processing streaming data and computer equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a development method and apparatus for processing streaming data, and a computer device.
Background
In the field of network security, a data sequence that is generated by various service systems, arrives rapidly, is produced continuously, and can grow without bound is called streaming data. A service system that processes streaming data is called a stream processing system.
In the related art, a stream processing system relies on a large and varied set of computing tools, integrates different data streams in the order dictated by the business logic, and finally produces a report or a data product. Data is usually passed between systems as batch files or through online calls. The two access systems involved in a data exchange each complete the development of the data exchange job by filling in the application materials of the various exchange platforms, and the job is then deployed to a test environment manually. The access system must package the development environment on a packaging platform and then notify the exchange platform to carry out joint debugging of the data. The data jobs on these platforms depend on one another, and each platform has its own development and test specification, so developing and testing a complex data job consumes a large amount of resources.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a development method, an apparatus, and a computer device for processing streaming data, which can reduce resource consumption.
A development method of processing streaming data, the method comprising:
displaying a configuration interface of the streaming data processing logic, wherein the configuration interface comprises a selection area of the process components and an assembly area of the process components, and the selection area comprises an identifier of at least one process component;
in response to a selection operation for the identification of the target process component, determining the target process component and displaying the target process component in the assembly area;
and obtaining the streaming data processing logic according to the target process component.
In one embodiment, the flow components include a plurality of input components, a plurality of processing components, and a plurality of output components; the determining the target process component in response to the selection operation directed to the identification of the target process component comprises:
determining a target input component in response to a selection operation for the identification of the target input component;
determining a target processing component in response to a selection operation directed to the identification of the target processing component;
in response to a selection operation directed to the identification of the target output component, the target output component is determined.
In one embodiment, the obtaining streaming data processing logic according to the target process component includes:
obtaining the streaming data processing logic according to the target input component, the target processing component, the target output component, and the connection order among the target process components.
In one embodiment, after the step of obtaining streaming data processing logic according to the target flow component, the method further comprises:
in response to a user's execution operation on the streaming data processing logic, acquiring streaming data and processing the streaming data to obtain a streaming data processing result;
and transmitting the streaming data processing result in the form of an asynchronous message.
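The text does not specify a messaging mechanism, so the following is only a minimal sketch of the idea: results are handed to a consumer through an in-process `asyncio.Queue` rather than returned synchronously. The names `run_logic`, `consume`, and `outbox`, and the use of `None` as an end-of-stream marker, are assumptions made for illustration.

```python
import asyncio


async def run_logic(source, transform, outbox: asyncio.Queue) -> None:
    """Process each record and publish the result as a message (hypothetical helper)."""
    for record in source:
        await outbox.put(transform(record))
    await outbox.put(None)  # assumed end-of-stream marker


async def consume(outbox: asyncio.Queue) -> list:
    """Receive messages until the end-of-stream marker arrives."""
    received = []
    while (message := await outbox.get()) is not None:
        received.append(message)
    return received


async def main() -> list:
    outbox = asyncio.Queue(maxsize=2)  # small buffer, so the producer may have to wait
    producer = asyncio.create_task(run_logic([1, 2, 3], lambda x: x * 10, outbox))
    received = await consume(outbox)
    await producer
    return received


results = asyncio.run(main())
```

In a deployment, a message broker would typically take the place of the in-process queue; the point here is only that the producer does not wait for the consumer to finish handling a result before processing the next record.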
In one embodiment, the configuration interface further comprises a running information display area;
the method further comprises the following steps:
in response to the user's selection operation on the target process component, determining the running information of the target process component, and displaying the running information in the running information display area.
In one embodiment, the operation information includes one or more of name information of the process component, input data volume information, output data volume information, system throughput information, transmission delay information, processing delay information, and data throughput information of a plurality of target time periods.
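As an illustration of how such running information might be held for display, the sketch below groups the listed items in one record. The class name `RunningInfo`, the field names, the units, and the period labels are all assumptions; the text only names the kinds of information.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RunningInfo:
    """Hypothetical container for the running information of one process component."""
    component_name: str
    input_records: int = 0
    output_records: int = 0
    transmission_delay_ms: float = 0.0
    processing_delay_ms: float = 0.0
    # throughput (records per second) for each target time period, keyed by a label
    throughput_by_period: Dict[str, float] = field(default_factory=dict)

    def record_period(self, label: str, records: int, seconds: float) -> None:
        """Store the throughput observed over one target time period."""
        if seconds <= 0:
            raise ValueError("period length must be positive")
        self.throughput_by_period[label] = records / seconds


info = RunningInfo("kafka_input", input_records=1200, output_records=1180)
info.record_period("last_1_min", records=1200, seconds=60)
```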
In one embodiment, the configuration interface further comprises a process management area for displaying the generated identification of the streaming data processing logic;
after the step of obtaining the streaming data processing logic according to the target process component, the method further includes:
displaying the target streaming data processing logic in the flow management area in response to a selection operation directed to the identification of the target streaming data processing logic.
A development device that processes streaming data, the device comprising:
the display module is used for displaying a configuration interface of the streaming data processing logic, the configuration interface comprises a selection area of the flow components and an assembly area of the flow components, and the selection area comprises an identifier of at least one flow component;
a determination module, configured to determine a target process component in response to a selection operation for an identifier of the target process component, and display the target process component in the assembly area;
and a streaming data processing logic development module, configured to obtain the streaming data processing logic according to the target process component.
In one embodiment, the flow components include a plurality of input components, a plurality of processing components, and a plurality of output components; the determining module includes:
a first determination unit configured to determine a target input component in response to a selection operation for an identification of the target input component;
a second determination unit configured to determine a target processing component in response to a selection operation for the identification of the target processing component;
a third determination unit configured to determine the target output component in response to a selection operation for the identification of the target output component.
In one embodiment, the streaming data processing logic development module is specifically configured to obtain the streaming data processing logic according to the target input component, the target processing component, the target output component, and a connection sequence between the target process components.
In one embodiment, the development device for processing streaming data further includes:
the first response module is used for responding to the execution operation of the streaming data processing logic of the user, acquiring streaming data, and processing the streaming data to obtain a streaming data processing result;
and the transmission module is used for transmitting the streaming data processing result in the form of asynchronous messages.
In one embodiment, the configuration interface further comprises a running information display area;
the development device for processing streaming data further comprises: and the second response module is used for responding to the selection operation of the target process assembly by the user, determining the running information of the target process assembly and displaying the running information in the running information display area.
In one embodiment, the configuration interface further comprises a process management area for displaying the generated identification of the streaming data processing logic;
after the step of obtaining the streaming data processing logic according to the target process component, the development device for processing streaming data further includes: a third response module for displaying the target streaming data processing logic in the process management area in response to a selection operation directed to the identification of the target streaming data processing logic.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
displaying a configuration interface of the streaming data processing logic, wherein the configuration interface comprises a selection area of the process components and an assembly area of the process components, and the selection area comprises an identifier of at least one process component;
in response to a selection operation for the identification of the target process component, determining the target process component and displaying the target process component in the assembly area;
and obtaining the streaming data processing logic according to the target process component.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
displaying a configuration interface of the streaming data processing logic, wherein the configuration interface comprises a selection area of the process components and an assembly area of the process components, and the selection area comprises an identifier of at least one process component;
in response to a selection operation for the identification of the target process component, determining the target process component and displaying the target process component in the assembly area;
and obtaining the streaming data processing logic according to the target process component.
With the above development method, development device, computer equipment, and storage medium for processing streaming data, a configuration interface of the streaming data processing logic is displayed on the terminal; in response to a selection operation on the identifier of a target process component, the terminal determines the target process component and displays it in the assembly area; and the streaming data processing logic is obtained according to the target process component. The method addresses the diversity of computing engines, the complexity of data interaction, and the high technical threshold of streaming data processing. It manages data service configurations and flows in a unified way, supports providing customized data service components for external development, redefines the paradigm for the streaming job functions output by the resulting streaming data processing logic, provides a standardized and reusable set of stream processing components, and offers a one-stop service platform for developing and releasing real-time streaming data processing components.
Drawings
FIG. 1 is a flow diagram of a development method to process streaming data in one embodiment;
FIG. 2 is a flow diagram illustrating the steps of determining the target process component in one embodiment;
FIG. 3 is a schematic diagram of a configuration interface in a development method for processing streaming data in one embodiment;
FIG. 4 is a diagram illustrating an assembly area in a development method configuration interface for processing streaming data, in accordance with an embodiment;
FIG. 5 is a flow diagram illustrating the steps in one embodiment for transmitting the results of a streaming data process;
FIG. 6 is a diagram that illustrates display of operational information for a target process component in one embodiment;
FIG. 7 is a schematic diagram of a process management interface of a development method for processing streaming data in one embodiment;
FIG. 8 is a block diagram showing the construction of a development apparatus for processing streaming data according to one embodiment;
FIG. 9 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In the related art, enterprise-level real-time data processing currently takes two forms. First, native stream processing: every incoming record is processed as soon as it arrives, one record after another. This approach models real-time data more faithfully, and because each record is handled on arrival, latency is low. Stateful operations are also easier to implement, because every record is considered individually. However, a typical native stream processing system pays a high cost to achieve low latency together with fault tolerance. Load balancing also needs attention: for example, if the data is partitioned by key and one key in a partition is resource intensive, that partition can easily become the bottleneck of the job.
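The partitioning bottleneck can be made concrete with a small sketch. It is not taken from any particular engine: the CRC32-based `partition_for` function, the four partitions, and the synthetic records are assumptions chosen to show record-at-a-time stateful processing and the load skew caused by one hot key.

```python
import zlib
from collections import Counter, defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Stable hash partitioning: the same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def process_stream(records, num_partitions: int = 4):
    """Handle records one at a time, keeping a running count per key (stateful)."""
    state = defaultdict(int)  # per-key state, updated as each record arrives
    load = Counter()          # number of records handled by each partition
    for key, _value in records:
        state[key] += 1
        load[partition_for(key, num_partitions)] += 1
    return dict(state), load


# 90 records share the key "hot"; the other 10 records have distinct keys.
records = [("hot", i) for i in range(90)] + [(f"k{i}", i) for i in range(10)]
state, load = process_stream(records)
hot_partition = partition_for("hot", 4)
```

Whichever partition `"hot"` hashes to receives at least 90 of the 100 records, so adding partitions does not relieve it; this is the skew the paragraph above describes.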
Second, micro-batch processing: incoming data is divided into short batches according to a predefined time interval (typically a few seconds), and the batches flow through the stream processing system. Because this approach decomposes the stream computation into a series of small batch jobs, it inevitably reduces the expressiveness of the system and introduces some latency and a larger memory footprint. Operations such as state management or joins become harder to implement, because a micro-batch system must operate on a whole batch and finish processing it before any further operation can run. Fault tolerance and load balancing, however, are simple to implement: the system sends each batch to a different worker node and falls back to another copy if some data goes wrong. A micro-batch system can easily be built on top of a native stream processing system, that is, bounded streams can be processed on top of unbounded stream processing.
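A minimal sketch of the batching step, assuming events arrive as `(timestamp_in_seconds, value)` pairs already sorted by timestamp. The function name `micro_batches` and the two-second interval are illustrative choices, not part of any engine's API.

```python
from itertools import groupby


def micro_batches(events, interval_s: float):
    """Group timestamp-sorted (timestamp, value) events into fixed-length intervals."""
    for bucket, group in groupby(events, key=lambda e: int(e[0] // interval_s)):
        yield bucket, [value for _ts, value in group]


events = [(0.2, "a"), (1.7, "b"), (2.1, "c"), (2.9, "d"), (6.0, "e")]
batches = list(micro_batches(events, interval_s=2.0))
# batches == [(0, ["a", "b"]), (1, ["c", "d"]), (3, ["e"])]
```

Each batch is only complete once its interval has elapsed, which is where the added latency comes from: record `"a"` cannot be emitted downstream until the interval that also contains `"b"` has closed.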
The technology stack used by the streaming processing system mainly includes:
Representative native stream processing technology: Apache Flink, a native stream processing system that provides a high-level API. Flink also provides a batch interface (the DataSet API) for batch processing, as Spark does, but the foundations of the two systems are quite different: Flink treats batch processing as a special case of stream processing. In Flink, all data is treated as a stream, which is a good abstraction because it is closer to the real world.
Representative micro-batch technology stack: Apache Spark, including Spark SQL, MLlib, and Spark Streaming. The Spark runtime is built on batch processing, so Spark Streaming, which was added later, also relies on batch processing and implements micro-batching. The receiver divides the incoming data stream into short batches and processes these micro-batches in a manner similar to ordinary Spark jobs. Spark Streaming provides a high-level declarative API with Scala, Java, and Python support.
In summary, there are many stream processing technologies, each with demanding requirements on its usage scenario, so developing real-time data flows has a high entry threshold and a high learning cost for developers.
Existing stream processing systems involve a large and varied set of computing components, integrate different data streams in the order dictated by the business logic, and finally produce a report or a data product. In the prior art, data is usually passed between systems as batch files or through online calls (which involve a variety of message formats). The two access systems involved in a data exchange each complete the development of the data exchange job by filling in the application materials of the various exchange platforms, and the job is then deployed to a test environment manually. The access system must package the development environment on a packaging platform and then notify the exchange platform to carry out joint debugging of the data. The data jobs on these platforms depend on one another, and each platform has its own development and test specification, so developing and testing a complex data job consumes substantial human resources and incurs a high communication cost.
Given the diversity of computing engines, the complexity of data interaction, and the high technical requirements, a problem that must be faced in the big data era is how to manage hot streaming data and stream processing jobs comprehensively, and how to ensure data quality in streaming jobs, through a unified and convenient development and test workflow.
The purpose of the disclosed embodiments is to connect multiple heterogeneous real-time computing engines through a visual platform that integrates the development, testing, release, and monitoring of streaming jobs, and to manage the development, test, and release parameters and environments of streaming data jobs in one place through a unified development, test, and release process. This makes development and operation easier for developers and reduces their learning cost.
In one embodiment, as shown in FIG. 1, a development method for processing streaming data is provided. This embodiment is described with the method applied to a terminal. It is to be understood that the method may also be applied to a server, or to a system comprising a terminal and a server and implemented through interaction between the two. The terminal may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device, and the server may be implemented as an independent server or as a cluster of servers. In this embodiment, the method includes the following steps:
Step 101, displaying a configuration interface of the streaming data processing logic.
The configuration interface comprises a selection area for process components and an assembly area for process components, and the selection area comprises an identifier of at least one process component.
Specifically, the terminal provides a display control for the configuration interface of the streaming data processing logic. In an actual application scenario, the user triggers this display control, and in response the terminal displays the configuration interface, which includes the selection area and the assembly area. The selection area displays the identifiers of at least one type of process component, and the assembly area displays the process components selected by the developer.
Optionally, the terminal may also display a configuration interface of the streaming data processing logic without a triggering operation by the user.
Step 102, in response to a selection operation on the identifier of a target process component, determining the target process component and displaying the target process component in the assembly area.
Specifically, the terminal displays the identifiers of a plurality of process components in the selection area, together with a display control corresponding to each identifier. In response to the developer's selection operation on the display control corresponding to the identifier of any process component, the terminal displays the selected target process component in the assembly area.
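The interaction in steps 101 and 102 can be sketched as a small state model with the rendering left out. The class `ConfigurationInterface`, its `select` method, and the component identifiers below are hypothetical names used only to show that a selection both determines the target component and places it in the assembly area.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConfigurationInterface:
    """Hypothetical state behind the configuration interface (no rendering)."""
    # identifier -> component type, as listed in the selection area
    selection_area: Dict[str, str]
    # identifiers of the components picked so far, in selection order
    assembly_area: List[str] = field(default_factory=list)

    def select(self, identifier: str) -> str:
        """Handle a selection: determine the component and show it in the assembly area."""
        if identifier not in self.selection_area:
            raise KeyError(f"unknown process component: {identifier}")
        self.assembly_area.append(identifier)
        return self.selection_area[identifier]


ui = ConfigurationInterface(
    selection_area={
        "kafka_input": "input",
        "dynamic_filter": "processing",
        "es_output": "output",
    }
)
selected_types = [ui.select(name) for name in ("kafka_input", "dynamic_filter", "es_output")]
```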
Step 103, obtaining the streaming data processing logic according to the target process component.
Specifically, the target process component may include at least one process component of at least one type; that is, it may include a plurality of process components of different types. The terminal can therefore combine the target process components in the assembly order configured by the developer to obtain the target streaming data processing logic, which is used to process streaming data in real time.
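One way to picture the combination step is function composition over an ordered list of components. The sketch below is a simplification under stated assumptions: exactly one input component, any number of processing components, and one output component, each wrapping a plain Python callable. `ProcessComponent` and `build_logic` are hypothetical names, not the platform's interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ProcessComponent:
    name: str
    kind: str      # "input", "processing", or "output"
    fn: Callable   # input: () -> iterable; processing: iterable -> iterable; output: iterable -> None


def build_logic(components: List[ProcessComponent]) -> Callable[[], None]:
    """Chain components in their assembly order into one runnable callable."""
    kinds = [c.kind for c in components]
    valid = (
        len(kinds) >= 2
        and kinds[0] == "input"
        and kinds[-1] == "output"
        and all(k == "processing" for k in kinds[1:-1])
    )
    if not valid:
        raise ValueError("expected input -> processing* -> output")
    source, *stages, sink = components

    def run() -> None:
        stream = source.fn()
        for stage in stages:
            stream = stage.fn(stream)  # each stage lazily wraps the previous one
        sink.fn(stream)

    return run


collected = []
logic = build_logic([
    ProcessComponent("numbers", "input", lambda: iter(range(6))),
    ProcessComponent("keep_even", "processing", lambda s: (x for x in s if x % 2 == 0)),
    ProcessComponent("square", "processing", lambda s: (x * x for x in s)),
    ProcessComponent("collect", "output", lambda s: collected.extend(s)),
])
logic()  # collected is now [0, 4, 16]
```

Because the stages are generators, records flow through the chain one at a time rather than being materialized between stages, which is closer in spirit to stream processing than building intermediate lists.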
In the above development method for processing streaming data, a configuration interface of the streaming data processing logic is displayed on the terminal; in response to a selection operation on the identifier of a target process component, the terminal determines the target process component and displays it in the assembly area; and the streaming data processing logic is obtained according to the target process component. The method addresses the diversity of computing engines, the complexity of data interaction, and the high technical threshold of streaming data processing. It manages data service configurations and flows in a unified way, supports providing customized data service components for external development, redefines the paradigm for the streaming job functions output by the resulting streaming data processing logic, provides a standardized and reusable set of stream processing components, and offers a one-stop service platform for developing and releasing real-time streaming data processing components.
In one embodiment, because processing streaming data requires the cooperation of process components of different types, the process components include a plurality of input components, a plurality of processing components, and a plurality of output components. Accordingly, as shown in FIG. 2, step 102, "determining the target process component in response to the selection operation on the identifier of the target process component", includes:
Step 202, determining a target input component in response to a selection operation on the identifier of the target input component.
Specifically, the terminal may display the identifier of each input component in the selection area, together with a display control corresponding to each identifier. When the user triggers the display control for the identifier of the target input component, the terminal responds by displaying the selected target input component in the assembly area. The user's trigger operation on that display control is the selection operation on the identifier of the target input component.
Optionally, the target input component may include at least one of a file capture input component, an ES capture input component, a Kafka capture input component, an FTP file capture input component, a database capture input component, and a socket MQ capture input component.
Step 204, determining a target processing component in response to a selection operation on the identifier of the target processing component.
Specifically, the terminal may display the identifier of each processing component in the selection area, together with a display control corresponding to each identifier. When the user triggers the display control for the identifier of the target processing component, the terminal responds by displaying the selected target processing component in the assembly area. The user's trigger operation on that display control is the selection operation on the identifier of the target processing component.
Optionally, the target processing component may comprise a plurality of processing components, so step 204 may be executed several times according to the user's selection operations, allowing the terminal to obtain a plurality of processing components. The connection order of the processing components may be determined by the order in which the user selects them, or by the user's drag operations on the processing components in the assembly area. That is, the terminal determines the connection order between the target process components in response to the user's drag operation on each process component in the assembly area.
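The two ways of fixing the connection order can be sketched as list operations: selection appends to the order, and a drag moves one component to a new position. The helper `move_component` and the component names are hypothetical, and the sketch assumes each component name appears at most once in the assembly area.

```python
from typing import List


def move_component(order: List[str], name: str, new_index: int) -> List[str]:
    """Return a new connection order with `name` moved to `new_index` (a drag)."""
    if name not in order:
        raise ValueError(f"{name!r} is not in the assembly area")
    remaining = [c for c in order if c != name]
    new_index = max(0, min(new_index, len(remaining)))  # clamp to a valid position
    return remaining[:new_index] + [name] + remaining[new_index:]


# Default order: the sequence in which the user selected the processing components.
selection_order: List[str] = []
for picked in ["conversion", "dynamic_filter", "aggregation"]:
    selection_order.append(picked)

# A drag that moves "aggregation" to the front overrides the default order.
dragged_order = move_component(selection_order, "aggregation", 0)
```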
Optionally, the target processing component includes at least one of a dual-stream Join processing component, a dynamic mapping processing component, a global aggregation processing component, a conversion processing component, a 1->N processing component, a data sorting processing component, an aggregation calculation processing component, a dynamic filtering processing component, and an index lookup processing component.
In response to the selection operation for the identification of the target output component, a target output component is determined, step 206.
In particular, the terminal may display an identification of each output component within a selection area of the flow component. The terminal can also comprise a display control corresponding to the identifier of each output component. And the user triggers the display control of the identification of the target output assembly, and the terminal responds to the triggering operation and can display the target output control selected by the user in the assembly area of the flow assembly. The triggering operation of the user on the display control of the identifier of the target output assembly is the selection operation of the identifier of the target output.
Optionally, the target output component may include a plurality of output components. Step 206 can therefore be performed multiple times according to the user's selection operations, so that the terminal obtains a plurality of output components. The connection order of the flow components can be determined from the order in which the user selected them, or from the user's drag operations on the flow components in the assembly area. That is, the terminal determines the connection order between the target flow components in response to the user's drag operation on each flow component in the assembly area.
Optionally, the target output component may include at least one of an ES output component, an HDFS output component, a database output component, an ftp output component, a file output component, a Kafka output component, and a Redis asynchronous output component.
Optionally, as shown in fig. 3, the configuration interface includes a selection area and an assembly area for flow components. The selection area displays the identifiers of several types of flow components: the input component type, the processing component type, and the output component type. The operation interface labelled "FES personalized customization (leveling)" is the assembly area of the flow components.
It should be noted that the embodiments of the present invention do not limit the execution order of step 202, step 204, and step 206; those skilled in the art may determine the order of these steps according to the actual application scenario.
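The selection steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ComponentType` enum, the `SELECTION_AREA` dictionary, and the `select_component` helper are hypothetical names, and only the component identifiers are taken from the description. It shows that the three kinds of component can be selected in any order and that each kind can be selected more than once.

```python
from enum import Enum


class ComponentType(Enum):
    INPUT = "input"
    PROCESSING = "processing"
    OUTPUT = "output"


# Identifiers come from the component lists in the description; the
# dictionary layout is a hypothetical stand-in for the selection area.
SELECTION_AREA = {
    ComponentType.INPUT: ["Kafka collection", "file collection"],
    ComponentType.PROCESSING: ["conversion", "dynamic filtering", "1->N"],
    ComponentType.OUTPUT: ["Kafka output", "HDFS output"],
}


def select_component(assembly_area, component_type, identifier):
    """Add the chosen component to the assembly area (hypothetical helper)."""
    if identifier not in SELECTION_AREA[component_type]:
        raise KeyError(f"unknown {component_type.value} component: {identifier}")
    assembly_area.setdefault(component_type, []).append(identifier)
    return assembly_area


# The three selection steps may run in any order, and each may repeat.
area = {}
select_component(area, ComponentType.PROCESSING, "conversion")
select_component(area, ComponentType.OUTPUT, "Kafka output")
select_component(area, ComponentType.INPUT, "Kafka collection")
select_component(area, ComponentType.PROCESSING, "dynamic filtering")
```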
In this embodiment, the configuration and flow of each data service are managed uniformly, so developers are provided with customizable data service components, that is, a standardized and reusable set of stream processing components. This simplifies development and testing and lowers their entry threshold, and the fixed procedure for selecting target flow components helps ensure the quality of development and testing.
In one embodiment, the specific process of step 103, "obtaining the streaming data processing logic according to the target flow component", includes:
obtaining the streaming data processing logic according to the target input component, the target processing component, the target output component, and the connection order among the target flow components.
Specifically, the connection order between the target flow components may be determined as follows: the terminal responds to the user's selection operation on each target flow component and determines the connection order according to the time sequence of the selection operations. The terminal can also determine the connection order according to the user's drag operations on the target flow components.
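Deriving the connection order from the time sequence of selection operations can be sketched as below. The `Selection` record and the `connection_order` function are hypothetical names used only for illustration; the description does not specify how selection times are recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    component: str      # identifier of the selected flow component
    selected_at: float  # time of the selection operation, in seconds


def connection_order(selections):
    """Return (upstream, downstream) pairs ordered by selection time."""
    ordered = sorted(selections, key=lambda s: s.selected_at)
    names = [s.component for s in ordered]
    # Each component connects to the one selected immediately after it.
    return list(zip(names, names[1:]))


selections = [
    Selection("Kafka output", 3.0),
    Selection("Kafka collection input", 1.0),
    Selection("conversion", 2.0),
]
order = connection_order(selections)
```

A drag operation in the assembly area would simply replace this derived list with the links the user draws.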
Optionally, the terminal determines the connection order between the target flow components according to the user's drag operations on them, and then connects the target flow components in that order to obtain the streaming data processing logic. For example, as shown in fig. 4, the target input component may be a Kafka collection input component; the target processing components may be conversion processing components, a dynamic filtering processing component, and a 1->N processing component; and the target output components may be Kafka output components. The first set of streaming data processing sub-logic may be: Kafka collection input component (1) → first conversion processing component (2) → dynamic filtering processing component (3) → first Kafka output component (4). The second set may be: Kafka collection input component (1) → first conversion processing component (2) → dynamic filtering processing component (3) → 1->N processing component (5) → second conversion processing component (6) → second Kafka output component (7).
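The fig. 4 example can be read as a small directed graph in which the two sets of sub-logic are the two paths from the input component to an output component. The sketch below encodes the numbered components as an adjacency map and enumerates those paths; the `EDGES` layout and the `sub_logics` function are illustrative assumptions, not part of the disclosure.

```python
# Component numbers follow the example: 1 = Kafka collection input,
# 2 = first conversion, 3 = dynamic filtering, 4 = first Kafka output,
# 5 = 1->N, 6 = second conversion, 7 = second Kafka output.
EDGES = {
    1: [2],
    2: [3],
    3: [4, 5],  # the flow branches after dynamic filtering
    5: [6],
    6: [7],
}


def sub_logics(edges, source):
    """Enumerate every path from the source to a component with no successor."""
    paths = []

    def walk(node, path):
        path = path + [node]
        successors = edges.get(node, [])
        if not successors:
            paths.append(path)  # reached an output component
            return
        for nxt in successors:
            walk(nxt, path)

    walk(source, [])
    return paths


paths = sub_logics(EDGES, 1)
```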
In this embodiment, developers are provided with many types of flow components suited to developing large-scale, enterprise-level streaming jobs. This increases the flexibility and extensibility of developing streaming data processing logic, avoids the repeated development and learning cost of source-code programming, and makes full use of the dynamically extensible design of the platform component library. Developers get a visual interface for developing streaming jobs, together with a rich set of real-time computation, real-time collection, and real-time transmission operator components, and the data processing logic involved in a flow is presented visually. This lowers the entry threshold for junior streaming-job developers, significantly improves development efficiency, and improves development quality.
In one embodiment, as shown in fig. 5, after step 103, "obtaining the streaming data processing logic according to the target flow component", the development method for processing streaming data further includes:
Step 301: in response to the user's execution operation on the streaming data processing logic, acquire streaming data and process it to obtain a streaming data processing result.
Specifically, after the target streaming data processing logic is obtained, the user may trigger an execution operation on it. In response, the terminal acquires streaming data through the target input component selected by the user, processes the acquired streaming data through the one or more target processing components selected by the user, and outputs the processing result through the one or more target output components selected by the user.
Step 302: transmit the streaming data processing result in the form of an asynchronous message.
Specifically, the terminal may transmit the processing result of the streaming data to other terminals in the form of an asynchronous message.
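Steps 301 and 302 can be sketched with Python's standard `asyncio` queue standing in for the asynchronous message channel. This is only an illustration under stated assumptions: the record format, the two processing functions, and the use of `None` as an end-of-stream marker are all hypothetical, and a real deployment would use a message queue rather than an in-process queue.

```python
import asyncio


async def run_logic(records, processors, outbox):
    """Apply each processing component in order, then send the result."""
    for record in records:
        for process in processors:
            record = process(record)
            if record is None:  # a filter dropped the record
                break
        if record is not None:
            await outbox.put(record)  # asynchronous hand-off, no reply awaited
    await outbox.put(None)  # hypothetical end-of-stream marker


async def consume(outbox):
    """Receiving side: collect messages until the end-of-stream marker."""
    received = []
    while True:
        message = await outbox.get()
        if message is None:
            return received
        received.append(message)


async def main():
    outbox = asyncio.Queue()
    processors = [
        lambda r: {**r, "amount": float(r["amount"])},  # conversion
        lambda r: r if r["amount"] >= 100 else None,    # dynamic filtering
    ]
    records = [{"id": 1, "amount": "250"}, {"id": 2, "amount": "40"}]
    producer = asyncio.create_task(run_logic(records, processors, outbox))
    results = await consume(outbox)
    await producer
    return results


results = asyncio.run(main())
```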
Optionally, the method described in this disclosure may be applied to a target platform, which may be a basic platform for enterprise-level asynchronous message transmission and real-time stream processing. When the target streaming data processing logic is executed, the platform may obtain multiple types of high-value hot data from streaming data producers (such as front-end systems, middle processing systems, and back-end service systems) and process that hot data. Accordingly, the target platform can transmit the results, in the form of asynchronous messages, to streaming data consumers (such as a data middle platform, a service middle platform, or a background system). During transmission, the target platform may perform various operations on the streaming data, such as persisting it to storage, forwarding and routing, information filtering, aggregation and grouping, information completion, and information conversion.
In one embodiment, as shown in FIG. 6, the configuration interface further includes a run information display area; correspondingly, the development method for processing streaming data further comprises the following steps:
in response to the user's selection operation on a target flow component, determining the running information of the target flow component and displaying the running information in the running information display area.
The operation information comprises one or more of name information of the process components, input data volume information, output data volume information, system throughput information, transmission delay information, processing delay information and data processing volume information of a plurality of target time periods.
Specifically, the running information display area may be located to one side of the target flow component. While the streaming data processing logic is executing, the user may select the identifier of any target flow component in it. In response to this selection, the terminal obtains the running information of the corresponding target flow component. The terminal may display the running information in the running information display area in the form of a dialog box, or display it at the bottom of the interface.
Optionally, the data processing volume information for the plurality of target time periods may include the data processing volume in the last minute, the last ten minutes, the last hour, the last day, and the like. In response to the user's selection of a target time period, the terminal may display the data processing volume information for that period in the running information display area.
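One way to compute the data processing volume for several target time periods is to keep the processing timestamps sorted and count how many fall inside each window. The sketch below does this with the standard `bisect` module; the window labels follow the text, while the function name, the timestamp representation (seconds), and the choice of a half-open window `(now - length, now]` are assumptions.

```python
from bisect import bisect_right

# Window lengths in seconds for the target time periods named in the text.
WINDOWS = {
    "last minute": 60,
    "last ten minutes": 600,
    "last hour": 3600,
    "last day": 86400,
}


def processed_counts(event_times, now):
    """Count records processed in each window (now - length, now]."""
    times = sorted(event_times)
    end = bisect_right(times, now)
    counts = {}
    for label, seconds in WINDOWS.items():
        start = bisect_right(times, now - seconds)
        counts[label] = end - start
    return counts


now = 100_000
events = [99_990, 99_700, 98_000, 50_000, 10_000]
counts = processed_counts(events, now)
```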
In one embodiment, as shown in FIG. 7, the configuration interface further includes a flow management area for displaying an identification of the generated streaming data processing logic; accordingly, after "obtaining the streaming data processing logic according to the target flow component" in step 103, the development method for processing streaming data further includes: in response to a selection operation directed to the identification of the target streaming data processing logic, the target streaming data processing logic is displayed in the flow management area.
Specifically, after the terminal determines a plurality of target flow components in response to the selection operation of the user on each flow component and generates the target streaming data processing logic, the generated target streaming data processing logic may be stored in a preset database.
Optionally, the user may enter a custom name for the target streaming data processing logic. The terminal responds to this input operation by acquiring the text of the custom name, using it as the name of the target streaming data processing logic, and displaying that name in the flow management area. That is, the identifier of each target streaming data processing logic is its name.
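Storing generated logic under a user-defined name and listing the names for the flow management area can be sketched as follows. The description only says "a preset database"; using an in-memory SQLite database, the table layout, and the JSON encoding of the connection list are all assumptions made for this illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logic (name TEXT PRIMARY KEY, definition TEXT NOT NULL)")


def save_logic(conn, name, edges):
    """Store the logic under the custom name entered by the user."""
    conn.execute(
        "INSERT OR REPLACE INTO logic (name, definition) VALUES (?, ?)",
        (name, json.dumps(edges)),
    )
    conn.commit()


def list_names(conn):
    """Names shown as identifiers in the flow management area."""
    return [row[0] for row in conn.execute("SELECT name FROM logic ORDER BY name")]


def load_logic(conn, name):
    """Fetch the logic the user selected, or None if the name is unknown."""
    row = conn.execute(
        "SELECT definition FROM logic WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else json.loads(row[0])


save_logic(conn, "order-filter", [["Kafka collection", "conversion"],
                                  ["conversion", "Kafka output"]])
```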
It should be noted that the target platform includes a configuration service layer, which includes an element management module. This layer mainly provides configuration services for everything related to stream processing, such as service management configuration, component library management, resource configuration, the data dictionary, and routing/filtering/mapping templates. It provides platform users with configuration functions for hosts, aerospike, zookeeper, kafka, redis, rockmq, hbase, hdfs, socket, external http interfaces, ftp, elastic search, databases, and calculation/decision/processing engines, covering servers, message queues, databases, and calculation engines, and so meets the development requirements of a variety of stream processing scenarios. The platform has built-in stream processing operator components, including input components (file collection, ES collection, Kafka collection, ftp file collection, database collection, socket MQ collection, and the like), processing components (dual-flow Join, dynamic mapping, global aggregation, conversion, 1->N, data sorting, aggregation calculation, dynamic filtering, index lookup, and the like), and output components (ES output, HDFS output, database output, ftp output, file output, Kafka output, Redis asynchronous output, and the like), and it supports user-defined logic processing components. The platform also supports configuration of resources such as data/field definitions, third-party class libraries, methods, and templates, which users can explicitly configure and use during flow development.
The target platform also includes a processing service layer, which includes a flow design module. This layer is mainly oriented to developing streaming jobs and calculating real-time indicators. The platform processes streaming data in real time, implements a high-speed conversion task chain based on a directed acyclic graph, has built-in conversion tasks (including filtering, mapping, T bypass out, T bypass in, and copying), and supports user-defined conversion logic. It also includes a microprobe (source) for collecting multi-source data without embedded tracking points, and supports sending the data source definition, the data processing logic, the job, and other content to a data processing engine for execution.
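A conversion task chain based on a directed acyclic graph can be sketched by running each task only after all of its upstream tasks, using the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later). The task names, the batch-of-integers input, and the `run_dag` function are hypothetical; the platform's actual engine is not described at this level of detail.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(tasks, upstream, source_records):
    """Run each conversion task after all of its upstream tasks."""
    outputs = {}
    for node in TopologicalSorter(upstream).static_order():
        parents = upstream.get(node, ())
        if parents:
            # Merge the outputs of every upstream task, in a stable order.
            incoming = [r for p in sorted(parents) for r in outputs[p]]
        else:
            incoming = list(source_records)
        outputs[node] = tasks[node](incoming)
    return outputs


# Hypothetical tasks: filtering, mapping, and copying, as named in the text.
tasks = {
    "source": lambda rs: rs,
    "filter": lambda rs: [r for r in rs if r % 2 == 0],
    "map": lambda rs: [r * 10 for r in rs],
    "copy": lambda rs: list(rs),
}
# Each task maps to the set of tasks that feed it.
upstream = {"filter": {"source"}, "map": {"filter"}, "copy": {"filter"}}

outputs = run_dag(tasks, upstream, [1, 2, 3, 4])
```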
The target platform also includes a management service layer, which includes a monitoring management module and a transaction consistency management module. The monitoring management module supports monitoring the connection status of resource environments, including aerospike, zookeeper, kafka, redis, rockmq, hbase, hdfs, socket, external http interface, ftp, and elastic search cluster environments. It supports monitoring of abnormal resource operation and can promptly notify the platform side or the message producer of anomalies, including log check anomalies, consumer lag detection anomalies, connectivity check failures, and the like. It also supports dynamic monitoring of information about current and historical flow runs, including the current flow running state, flow running information, TPS, delay time, time-sliced data processing volume, input/output data volume, and the like. The transaction consistency management module lets the platform administrator query reconciliation results for systems designed with transactions.
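Consumer lag detection, one of the anomaly checks listed above, is commonly computed as the gap between the latest offset written to a partition and the offset the consumer has committed. The sketch below assumes that definition and a fixed alert threshold; the function name, the dictionaries of offsets, and the threshold are hypothetical, and no real message queue client is called.

```python
def lagging_partitions(end_offsets, committed_offsets, threshold):
    """Return {partition: lag} for partitions whose lag exceeds the threshold."""
    alerts = {}
    for partition, end in end_offsets.items():
        # A partition with no committed offset is treated as unread from 0.
        lag = end - committed_offsets.get(partition, 0)
        if lag > threshold:
            alerts[partition] = lag
    return alerts


end_offsets = {0: 100, 1: 500, 2: 50}
committed_offsets = {0: 95, 1: 200}
alerts = lagging_partitions(end_offsets, committed_offsets, threshold=20)
```

The resulting map is what a monitoring module could forward to the platform side or the message producer.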
The three-layer architecture design encapsulates various computation engine APIs and stream processing operators, supports resource allocation and service management allocation of different computation engine services, reduces complexity of an access mode caused by diversity of the computation engines, reduces repeated work in an operation development process, and provides a flexible, simple and visual development mode for developers of stream operation.
In terms of resource management, resources are managed with Flink on YARN on a CDH big data platform. The resource requirements of different streaming jobs are abstracted into independent JobTasks, which are distributed to different containers for real-time computation according to each job's parallelism. This keeps resource services stable, abstracts over different calculation engines, and makes it easier to isolate resource services. In an actual production environment, different streaming data senders produce different data volumes, and for a sender with a large data volume the stream processing platform needs stronger receiving and carrying capacity. When a topic is created on a traditional message channel, which server each partition lands on is not known in advance and is decided entirely by the message channel's allocation algorithm. In this design, when a topic is created, each partition can be assigned to a specified server, so that server performance can be used optimally. This is better suited to processing massive streaming data, achieves resource isolation at the message channel layer, and means that consumption of streaming data is not affected even if some servers fail.
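Assigning partitions to specified servers at topic creation can be pictured as building an explicit partition-to-server map instead of leaving placement to the message channel. The sketch below is purely illustrative: the round-robin policy, the server names, and both function names are assumptions, and it does not call any message queue API.

```python
def assign_partitions(num_partitions, servers):
    """Pin each partition to a chosen server, round-robin over the list."""
    if not servers:
        raise ValueError("at least one server is required")
    return {p: servers[p % len(servers)] for p in range(num_partitions)}


def surviving_partitions(assignment, failed_servers):
    """Partitions that stay available when some servers are abnormal."""
    return sorted(p for p, s in assignment.items() if s not in failed_servers)


# A high-volume sender gets its topic pinned to three dedicated servers.
assignment = assign_partitions(6, ["server-a", "server-b", "server-c"])
available = surviving_partitions(assignment, {"server-b"})
```

Because the map is explicit, topics for different senders can be pinned to disjoint server lists, which is one way to obtain the isolation at the message channel layer described above.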
In terms of fault tolerance, Flink's fault tolerance mechanism is based on distributed snapshots, which save the state of the stream processing job. (Flink's checkpoints and snapshots are not distinguished here, because they are in effect two names for the same thing.) With this fault-tolerant design, the correctness of the transmitted data can still be ensured if a flow execution fails or an unexpected outage occurs during complex data interaction, which safeguards the quality of the calculation results.
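The idea of snapshot-based recovery can be illustrated with a toy single-process job: the job periodically saves its read position together with its state, and after a failure it rolls back to the last snapshot and replays from there. This is a simplified sketch, not Flink's distributed algorithm; the `CountingJob` class, the checkpoint interval, and the simulated failure are all hypothetical.

```python
import copy


class CountingJob:
    """Toy streaming job: counts keys and remembers its read offset."""

    def __init__(self):
        self.offset = 0
        self.counts = {}

    def step(self, records):
        key = records[self.offset]
        self.counts[key] = self.counts.get(key, 0) + 1
        self.offset += 1

    def snapshot(self):
        # Offset and state are saved together so they stay consistent.
        return {"offset": self.offset, "counts": copy.deepcopy(self.counts)}

    def restore(self, snap):
        self.offset = snap["offset"]
        self.counts = copy.deepcopy(snap["counts"])


def run_with_checkpoints(job, records, every, fail_at=None):
    last = job.snapshot()
    while job.offset < len(records):
        if fail_at is not None and job.offset == fail_at:
            job.restore(last)  # simulated crash: roll back to last snapshot
            fail_at = None     # the failure happens only once
            continue
        job.step(records)
        if job.offset % every == 0:
            last = job.snapshot()
    return job.counts


records = ["a", "b", "a", "c", "a"]
result = run_with_checkpoints(CountingJob(), records, every=2, fail_at=3)
```

Even though the record at offset 2 is processed twice, the restored state discards its first effect, so the final counts match a failure-free run.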
It should be understood that although the steps in the flow charts of figs. 1-5 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in figs. 1-5 may include multiple sub-steps or stages, which are not necessarily performed at the same time or in sequence, but may be performed at different times, or in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 8, there is provided a development apparatus for processing streaming data, including: a display module 401, a determination module 402, and a streaming data processing logic development module 403, wherein:
The display module 401 is configured to display a configuration interface of the streaming data processing logic, where the configuration interface includes a selection area of the flow components and an assembly area of the flow components, and the selection area includes an identifier of at least one flow component.
A determination module 402 for determining a target process component in response to a selection operation directed to the identification of the target process component and displaying the target process component in the assembly area.
A streaming data processing logic development module 403, configured to obtain the streaming data processing logic according to the target flow component.
In one embodiment, the flow components include a plurality of input components, a plurality of processing components, and a plurality of output components; the determining module includes:
a first determination unit configured to determine a target input component in response to a selection operation for an identification of the target input component;
a second determination unit configured to determine a target processing component in response to a selection operation for the identification of the target processing component;
a third determination unit configured to determine the target output component in response to a selection operation for the identification of the target output component.
In one embodiment, the streaming data processing logic development module is specifically configured to obtain the streaming data processing logic according to the target input component, the target processing component, the target output component, and a connection sequence between the target process components.
In one embodiment, the development device for processing streaming data further includes:
a first response module, configured to acquire streaming data in response to the user's execution operation on the streaming data processing logic, and to process the streaming data to obtain a streaming data processing result;
a transmission module, configured to transmit the streaming data processing result in the form of an asynchronous message.
In one embodiment, the configuration interface further comprises a running information display area;
the development device for processing streaming data further comprises: a second response module, configured to determine the running information of a target flow component in response to the user's selection operation on that target flow component, and to display the running information in the running information display area.
In one embodiment, the configuration interface further comprises a process management area for displaying the generated identification of the streaming data processing logic;
the development device for processing streaming data further includes: a third response module configured to, after the streaming data processing logic is obtained according to the target flow component, display the target streaming data processing logic in the flow management area in response to a selection operation on the identifier of the target streaming data processing logic.
For specific limitations of the development device for processing streaming data, reference may be made to the above limitations of the development method for processing streaming data, which are not repeated here. Each module in the development device described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call them and execute the corresponding operations.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 9. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store relevant data for generating the streaming data processing logic. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a development method for processing streaming data.
Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of part of the structure related to the disclosed solution and does not limit the computer devices to which the disclosed solution applies; a particular computer device may include more or fewer components than shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
displaying a configuration interface of the streaming data processing logic, wherein the configuration interface comprises a selection area of the process components and an assembly area of the process components, and the selection area comprises an identifier of at least one process component;
in response to a selection operation for the identification of the target process component, determining the target process component and displaying the target process component in the assembly area;
and obtaining the streaming data processing logic according to the target flow component.
In one embodiment, the processor, when executing the computer program, further performs the steps of: the flow components comprise a plurality of input components, a plurality of processing components and a plurality of output components; the determining the target process component in response to the selection operation directed to the identification of the target process component comprises:
determining a target input component in response to a selection operation for the identification of the target input component;
determining a target processing component in response to a selection operation directed to the identification of the target processing component;
in response to a selection operation directed to the identification of the target output component, the target output component is determined.
In one embodiment, the processor, when executing the computer program, further performs the steps of: the obtaining of the streaming data processing logic according to the target process component includes:
and obtaining the streaming data processing logic according to the target input component, the target processing component, the target output component, and the connection order among the target flow components.
In one embodiment, the processor, when executing the computer program, further performs the steps of: after the step of obtaining streaming data processing logic according to the target flow component, the method further comprises:
responding to the execution operation of the streaming data processing logic of the user, acquiring streaming data, and processing the streaming data to obtain a streaming data processing result;
and transmitting the streaming data processing result in the form of an asynchronous message.
In one embodiment, the processor, when executing the computer program, further performs the steps of: the configuration interface further comprises an operation information display area;
the method further comprises the following steps:
in response to the user's selection operation on a target flow component, determining the running information of the target flow component and displaying the running information in the running information display area.
In one embodiment, the processor, when executing the computer program, further performs the steps of: the operation information includes one or more of name information of the process component, input data volume information, output data volume information, system throughput information, transmission delay information, processing delay information, and data throughput information of a plurality of target time periods.
In one embodiment, the processor, when executing the computer program, further performs the steps of: the configuration interface further comprises a process management area, and the process management area is used for displaying the generated identification of the streaming data processing logic;
after the step of obtaining the streaming data processing logic according to the target process component, the method further includes:
displaying the target streaming data processing logic in the flow management area in response to a selection operation directed to the identification of the target streaming data processing logic.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
displaying a configuration interface of the streaming data processing logic, wherein the configuration interface comprises a selection area of the process components and an assembly area of the process components, and the selection area comprises an identifier of at least one process component;
in response to a selection operation for the identification of the target process component, determining the target process component and displaying the target process component in the assembly area;
and obtaining the streaming data processing logic according to the target flow component.
In one embodiment, the computer program when executed by the processor further performs the steps of: the flow components comprise a plurality of input components, a plurality of processing components and a plurality of output components; the determining the target process component in response to the selection operation directed to the identification of the target process component comprises:
determining a target input component in response to a selection operation for the identification of the target input component;
determining a target processing component in response to a selection operation directed to the identification of the target processing component;
in response to a selection operation directed to the identification of the target output component, the target output component is determined.
In one embodiment, the computer program when executed by the processor further performs the steps of: the obtaining of the streaming data processing logic according to the target process component includes:
and obtaining the streaming data processing logic according to the target input component, the target processing component, the target output component, and the connection order among the target flow components.
In one embodiment, the computer program when executed by the processor further performs the steps of: after the step of obtaining streaming data processing logic according to the target flow component, the method further comprises:
responding to the execution operation of the streaming data processing logic of the user, acquiring streaming data, and processing the streaming data to obtain a streaming data processing result;
and transmitting the streaming data processing result in the form of an asynchronous message.
In one embodiment, the computer program when executed by the processor further performs the steps of: the configuration interface further comprises an operation information display area;
the method further comprises the following steps:
in response to the user's selection operation on a target flow component, determining the running information of the target flow component and displaying the running information in the running information display area.
In one embodiment, the computer program when executed by the processor further performs the steps of: the operation information includes one or more of name information of the process component, input data volume information, output data volume information, system throughput information, transmission delay information, processing delay information, and data throughput information of a plurality of target time periods.
In one embodiment, the computer program when executed by the processor further performs the steps of: the configuration interface further comprises a process management area, and the process management area is used for displaying the generated identification of the streaming data processing logic;
after the step of obtaining the streaming data processing logic according to the target process component, the method further includes:
displaying the target streaming data processing logic in the flow management area in response to a selection operation directed to the identification of the target streaming data processing logic.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical storage, or the like. Volatile Memory can include Random Access Memory (RAM) or external cache Memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. A development method for processing streaming data, the method comprising:
displaying a configuration interface for streaming data processing logic, wherein the configuration interface comprises a selection area for process components and an assembly area for process components, and the selection area comprises an identifier of at least one process component;
in response to a selection operation on the identifier of a target process component, determining the target process component and displaying the target process component in the assembly area; and
obtaining the streaming data processing logic according to the target process component.
2. The method of claim 1, wherein the process components comprise a plurality of input components, a plurality of processing components, and a plurality of output components; and the determining the target process component in response to the selection operation on the identifier of the target process component comprises:
determining a target input component in response to a selection operation on the identifier of the target input component;
determining a target processing component in response to a selection operation on the identifier of the target processing component; and
determining a target output component in response to a selection operation on the identifier of the target output component.
3. The method of claim 2, wherein the obtaining the streaming data processing logic according to the target process component comprises:
obtaining the streaming data processing logic according to the target input component, the target processing component, the target output component, and the connection order among the target process components.
4. The method of claim 1, wherein after the step of obtaining the streaming data processing logic according to the target process component, the method further comprises:
in response to an operation by the user to execute the streaming data processing logic, acquiring streaming data and processing the streaming data to obtain a streaming data processing result; and
transmitting the streaming data processing result in the form of an asynchronous message.
5. The method of any of claims 1 to 4, wherein the configuration interface further comprises a running information display area;
the method further comprises:
in response to a selection operation by the user on the target process component, determining the running information of the target process component, and displaying the running information in the running information display area.
6. The method of claim 5, wherein the running information comprises one or more of: name information of the process component, input data volume information, output data volume information, system throughput information, transmission delay information, processing delay information, and data throughput information for a plurality of target time periods.
7. The method of any of claims 1 to 4, wherein the configuration interface further comprises a process management area for displaying identifiers of the streaming data processing logic that has been generated;
after the step of obtaining the streaming data processing logic according to the target process component, the method further comprises:
in response to a selection operation on the identifier of target streaming data processing logic, displaying the target streaming data processing logic in the process management area.
8. A development device for processing streaming data, the device comprising:
a display module, configured to display a configuration interface for streaming data processing logic, wherein the configuration interface comprises a selection area for process components and an assembly area for process components, and the selection area comprises an identifier of at least one process component;
a determination module, configured to determine a target process component in response to a selection operation on the identifier of the target process component, and to display the target process component in the assembly area; and
a streaming data processing logic development module, configured to obtain the streaming data processing logic according to the target process component.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
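
The flow recited in claims 1 to 4 (selecting input, processing, and output components, deriving the logic from their connection order, then executing it and sending results as asynchronous messages) can be sketched roughly as below. This is a minimal sketch under stated assumptions rather than the claimed implementation: `build_logic` and the sample components are hypothetical names, components are modeled as plain Python callables, and an in-process `queue.Queue` merely stands in for whatever asynchronous messaging system a real deployment would use.

```python
import queue
from typing import Any, Callable, Iterable, Iterator, List

InputComponent = Callable[[], Iterable[Any]]
ProcessingComponent = Callable[[Iterator[Any]], Iterator[Any]]
OutputComponent = Callable[[Any], None]


def build_logic(source: InputComponent,
                processors: List[ProcessingComponent],
                sink: OutputComponent) -> Callable[[], int]:
    """Chain the selected components in connection order into runnable logic."""
    def run() -> int:
        stream = iter(source())
        for proc in processors:  # list order models the connection order
            stream = proc(stream)
        sent = 0
        for result in stream:
            sink(result)
            sent += 1
        return sent  # number of results handed to the output component
    return run


# Hypothetical components a user might pick from the selection area.
def numbers_source() -> Iterable[int]:
    return range(1, 7)


def keep_even(stream: Iterator[int]) -> Iterator[int]:
    return (x for x in stream if x % 2 == 0)


def square(stream: Iterator[int]) -> Iterator[int]:
    return (x * x for x in stream)


outbox: "queue.Queue[dict]" = queue.Queue()  # stand-in for a message broker


def async_sink(result: int) -> None:
    # Enqueue the result and return at once, without waiting for a consumer.
    outbox.put_nowait({"result": result})


logic = build_logic(numbers_source, [keep_even, square], async_sink)
sent = logic()
```

Swapping the order of `keep_even` and `square` in the list yields a different logic from the same components, which is why claim 3 treats the connection order as an input alongside the components themselves.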
CN202110983088.3A 2021-08-25 2021-08-25 Development method and device for processing streaming data and computer equipment Pending CN113867600A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110983088.3A CN113867600A (en) 2021-08-25 2021-08-25 Development method and device for processing streaming data and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110983088.3A CN113867600A (en) 2021-08-25 2021-08-25 Development method and device for processing streaming data and computer equipment

Publications (1)

Publication Number Publication Date
CN113867600A (en) 2021-12-31

Family

ID=78988404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110983088.3A Pending CN113867600A (en) 2021-08-25 2021-08-25 Development method and device for processing streaming data and computer equipment

Country Status (1)

Country Link
CN (1) CN113867600A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114398137A (en) * 2022-01-18 2022-04-26 江苏中天互联科技有限公司 Data processing flow deployment method and device and server
CN114936245A (en) * 2022-04-28 2022-08-23 北京远舢智能科技有限公司 Method and device for integrating and processing multi-source heterogeneous data
CN116069202A (en) * 2023-03-09 2023-05-05 苏州傲林科技有限公司 Operating condition radar chart processing method and device
CN117743398A (en) * 2023-12-15 2024-03-22 智人开源(北京)科技有限公司 Real-time decision and data processing method based on rules in stream database

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110764753A (en) * 2019-09-18 2020-02-07 亚信创新技术(南京)有限公司 Business logic code generation method, device, equipment and storage medium
CN111552470A (en) * 2019-12-31 2020-08-18 远景智能国际私人投资有限公司 Data analysis task creation method and device in Internet of things and storage medium
CN113010306A (en) * 2021-02-24 2021-06-22 金蝶软件(中国)有限公司 Service data processing method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109933522B (en) Test method, test system and storage medium for automatic case
Barika et al. Orchestrating big data analysis workflows in the cloud: research challenges, survey, and future directions
CN109828831B (en) Artificial intelligence cloud platform
CN113867600A (en) Development method and device for processing streaming data and computer equipment
CN108804618B (en) Database configuration method, device, computer equipment and storage medium
CN112910945A (en) Request link tracking method and service request processing method
CN112035228A (en) Resource scheduling method and device
CN110750458A (en) Big data platform testing method and device, readable storage medium and electronic equipment
CN111831191A (en) Workflow configuration method and device, computer equipment and storage medium
CN112313627B (en) Mapping mechanism of event to serverless function workflow instance
CN113037891B (en) Access method and device for stateful application in edge computing system and electronic equipment
CN102752770B (en) Method and device for polling service system
US11269691B2 (en) Load distribution for integration scenarios
US11231967B2 (en) Dynamically allocating and managing cloud workers
CN110781180A (en) Data screening method and data screening device
CN115392501A (en) Data acquisition method and device, electronic equipment and storage medium
CN113419818B (en) Basic component deployment method, device, server and storage medium
US11184251B2 (en) Data center cartography bootstrapping from process table data
CN117389655A (en) Task execution method, device, equipment and storage medium in cloud native environment
CN114564249B (en) Recommendation scheduling engine, recommendation scheduling method and computer readable storage medium
CN115237399A (en) Method for collecting data, storage medium, processor and engineering vehicle
US11531674B2 (en) System and method for supporting rollback of changes made to target systems via an integration platform
US20220030079A1 (en) Methods and systems for recording user operations on a cloud management platform
CN112564979A (en) Execution method and device for construction task, computer equipment and storage medium
CN112256384A (en) Service set processing method and device based on container technology and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination