CN112667683B - Stream computing system, electronic device thereof, and storage medium - Google Patents

Stream computing system, electronic device thereof, and storage medium

Info

Publication number
CN112667683B
Authority
CN
China
Prior art keywords
stream
stream computing
data
task
data source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011559972.6A
Other languages
Chinese (zh)
Other versions
CN112667683A (en)
Inventor
蒋英明
万书武
张观成
赵楚旋
林琪琛
刘微明
覃芳
曹晓能
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011559972.6A priority Critical patent/CN112667683B/en
Publication of CN112667683A publication Critical patent/CN112667683A/en
Application granted granted Critical
Publication of CN112667683B publication Critical patent/CN112667683B/en
Legal status: Active

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application is applicable to the technical field of data processing and provides a stream computing system, comprising: a data acquisition layer for configuring a data source of a stream computing task and acquiring, in real time, the stream data resources required for processing the stream computing task from the data source; a data bus layer for creating a task topic corresponding to the stream computing task and caching, according to the task topic, the stream data resources acquired in real time from the data source; a resource management layer for scheduling and managing the real-time stream data resources of the stream computing task according to the task topic; a computing engine layer for developing a corresponding stream computing mode for the stream computing task according to its configured data source, executing the stream computing task according to that mode, and outputting a stream computing result; and a storage and interface layer for storing the stream computing result and providing an output interface for it. The system addresses the high development threshold and the high development, operation, and maintenance costs that existing stream computing systems face when realizing enterprise-level productized applications.

Description

Stream computing system, electronic device thereof, and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a stream computing system, an electronic device thereof, and a storage medium.
Background
Real-time stream computing refers to performing computation on stream data in real time: massive data is acquired from different data sources in real time, and valuable information is obtained through real-time analysis and processing, so that perception, analysis, judgment, and decision-making are realized in a "during-the-event" or even "before-the-event" manner. Stream computing rests on the basic idea that the value of data decreases over time; therefore, in order to process stream data in a timely manner, it is desirable to provide a low-latency, scalable, and highly reliable stream computing system.
Existing stream computing engines mainly comprise the commercial-grade InfoSphere Streams and StreamBase and the open-source Twitter Storm, Spark Streaming, and Flink, of which the Spark Streaming and Flink engines are the most widely applied. However, Spark Streaming and Flink are open-source stream computing frameworks, and when enterprise-level productized applications are realized, existing stream computing systems are real-time stream computing applications with a single function, a specific application scenario, and a specific application mode, developed by senior developers using open-source frameworks such as Spark Streaming and Flink in combination with the product requirements of the enterprise. The development threshold of such real-time stream computing applications is high, and the development, operation, and maintenance costs are high.
Disclosure of Invention
In view of this, the embodiments of the present application provide a stream computing system, and an electronic device and a storage medium thereof, which aim to solve at least one of the problems of existing stream computing systems, such as the high difficulty of application development and the high operation and maintenance costs.
A first aspect of an embodiment of the present application provides a stream computing system, including:
a data acquisition layer, configured to configure a data source of a stream computing task and to acquire, in real time, the stream data resources required for processing the stream computing task from the data source;
a data bus layer, configured to create a task topic corresponding to the stream computing task and to cache, according to the task topic, the stream data resources acquired in real time from the data source;
a resource management layer, configured to schedule and manage the real-time stream data resources of the stream computing task according to the task topic;
a computing engine layer, configured to develop a corresponding stream computing mode for the stream computing task according to the data source configured for the task, to execute the stream computing task according to the stream computing mode, and to output a stream computing result; and
a storage and interface layer, configured to store the stream computing result and to provide an output interface for the stream computing result.
With reference to the first aspect, in a first possible implementation manner of the first aspect, a data source configuration unit is provided in the data acquisition layer, where the data source configuration unit is configured to select, according to a stream computing task execution request of a user, a data source for the stream computing task from pre-accessed configurable data sources, where the pre-accessed configurable data sources include at least one of: a Mysql data source, a Postgresql data source, an Oracle data source, an SQLserver data source, a log data source, a message bus MQ data source, an external kafka data source, and a restful API data source.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, a development mode configuration unit is provided in the compute engine layer, and the development mode configuration unit is configured to configure a development mode for the stream computing task according to a data source configured by the stream computing task, where the development mode configurable by the development mode configuration unit includes one or more of an sql development mode, a jar development mode, and a canvas development mode.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the resource management layer uses a kubernetes resource management manner and is compatible with a yarn resource management manner when scheduling and managing real-time stream data resources, where when the resource management layer identifies that a user requesting to execute a stream computing task is a traditional hadoop user, the resource management layer provides a compatible yarn resource management manner for the traditional hadoop user.
With reference to the first aspect, in a fourth possible implementation manner of the first aspect, the stream computing system further includes a service management module and an operation system module, wherein:
the service management module is configured to provide, in the stream computing system, at least one of the following services: multi-tenant resource management, unified user rights management compatible with the kubernetes and hadoop ecologies, health status monitoring management of stream computing tasks, operation index monitoring management of stream computing tasks, and data center monitoring management;
the operation system module is configured to provide, in the stream computing system, at least one of the following mechanisms: a document- and video-based user guidance mechanism, a user behavior auditing mechanism, an automatic system capacity expansion mechanism, a system abnormality early-warning mechanism, and a system abnormality recovery mechanism.
With reference to the fourth possible implementation manner of the first aspect, in a fifth possible implementation manner of the first aspect, when providing unified user rights management compatible with the kubernetes and hadoop ecologies in the stream computing system, the service management module is further configured to detect whether a user has the right to operate the real-time data bus layer, to limit the flow of the data resources a user reads and writes with the task topic as the object, and to detect whether the stream computing task carries a preset hadoop operation right when it is scheduled using the kubernetes resource management manner.
With reference to the fourth possible implementation manner of the first aspect, in a sixth possible implementation manner of the first aspect, when providing the health status monitoring management service for stream computing tasks in the stream computing system, the service management module is further configured to monitor whether the stream computing task is alive and whether it is under backpressure, where the stream computing task is determined not to be alive if its processing procedure is monitored to have been interrupted or failed, and the stream computing task is determined to be under backpressure if, during its stream data processing, the stream data receiving speed is monitored to be greater than the stream data processing speed.
With reference to the first aspect, in a seventh possible implementation manner of the first aspect, the stream computing system further includes an operation system module, configured to provide, in the stream computing system, at least one of: a document- and video-based user guidance mechanism, a user behavior auditing mechanism, an automatic system capacity expansion mechanism, a system abnormality early-warning mechanism, and a system abnormality recovery mechanism.
A second aspect of the embodiments of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the functions of the system provided in the first aspect when executing the computer program.
A third aspect of the embodiments of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the functions of the system provided by the first aspect.
The stream computing system, the electronic device, and the storage medium provided by the embodiments of the present application have the following beneficial effects:
the method comprises the steps of establishing a one-to-one mapping relation between task topics and data sources configured by a stream computing task through a data acquisition layer, a data bus layer, a resource management layer, a computing engine layer and a storage and interface layer which are arranged by the stream computing system, realizing resource isolation management for the stream computing task for the data sources, realizing flow quota configuration of the data sources in the stream computing task according to the task topics based on resource isolation, realizing flow monitoring, automatic capacity expansion and other resource allocation management within a flow quota limit range, and realizing development and stream computing processing of a stream computing mode according to the selection of the type of the data sources by the computing engine layer, thereby realizing adaptation to different stream computing application development and meeting the demands of different skill user groups, reducing the development threshold of the stream computing application, and simultaneously reducing the development and operation cost.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required for the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a basic functional architecture of a streaming computing system according to a first embodiment of the present application;
FIG. 2 is a schematic diagram of a flow computing system according to another embodiment of the present application;
FIG. 3 is a schematic architecture diagram of a streaming computing system according to a third embodiment of the present application;
fig. 4 is a block diagram of an electronic device according to a fourth embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
Referring to fig. 1, fig. 1 is a schematic diagram of a basic functional architecture of a stream computing system according to a first embodiment of the present application. The details are as follows: the stream computing system 100 in this embodiment may include a data acquisition layer 110, a data bus layer 120, a resource management layer 130, a compute engine layer 140, and a storage and interface layer 150. Wherein:
the data acquisition layer 110 is configured to configure a data source of a stream computing task and acquire stream data resources required for processing the stream computing task from the data source in real time. In this embodiment, the data acquisition layer 110 includes a plurality of configurable data sources that are accessed in advance by the stream computing system 100, so as to provide the data acquisition layer 110 with automatic acquisition of the data sources. In this embodiment, the data acquisition layer may be provided with a data source configuration unit, where the data source configuration unit is configured to select, according to a flow calculation task execution request of a user, a data source configured as the flow calculation task from configurable data sources that are accessed in advance. Specifically, the data sources that the stream computing system 100 pre-accesses include, but are not limited to, mysql data source, postgresql data source, oracle data source, SQLserver data source, log data source, message bus MQ data source, external kafka data source, and restful API data source. The stream computing system triggers the data source configuration unit in the data acquisition layer 110 to select one data source configured as a stream computing task from the eight data sources according to the stream computing task execution request of the user, thereby realizing real-time acquisition of stream data resources required by the stream computing task from the configured data sources when the stream computing task is executed.
The data bus layer 120 is configured to create a task topic corresponding to the stream computing task and to cache, according to the task topic, the stream data resources collected in real time from the data source. In this embodiment, the stream computing system employs a cluster deployment based on Apache Kafka (a distributed publish-subscribe messaging system). By creating a task topic corresponding to the stream computing task on the data bus layer 120, i.e., creating a topic in Kafka, the data bus layer 120 can cache the stream data resources collected in real time from the data source according to the task topic. On this basis, creating the task topic establishes a one-to-one mapping relationship between the task topic and the data source configured for the stream computing task, so that the stream computing task can perform data-source-oriented resource isolation management.
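The one-to-one topic-to-source mapping can be sketched as follows. This is a hypothetical illustration, not the patent's implementation: a real data bus layer would create the topic through Kafka's admin API rather than an in-memory dictionary.

```python
# Hypothetical sketch of the data bus layer's bookkeeping: one task
# topic per stream computing task, mapped one-to-one to the task's
# configured data source, which is the basis for resource isolation.

class DataBusLayer:
    def __init__(self):
        self._topic_to_source = {}

    def create_task_topic(self, task_id: str, data_source: str) -> str:
        """Create the task topic (in Kafka, a topic per task) and record
        its one-to-one mapping to the configured data source."""
        topic = f"task-{task_id}"
        if topic in self._topic_to_source:
            raise ValueError(f"topic already exists: {topic}")
        self._topic_to_source[topic] = data_source
        return topic

    def source_for(self, topic: str) -> str:
        return self._topic_to_source[topic]
```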
The resource management layer 130 is configured to schedule and manage the real-time stream data resources of the stream computing task according to the task topic. In this embodiment, the resource management layer 130 uniformly manages and schedules cluster resources; specifically, it distributes stream computing tasks to the working nodes of the stream computing system using a scheduling algorithm, and performs resource allocation management and scheduling for the stream computing tasks by monitoring the usage of stream data resources on each working node. The resource allocation management further includes, but is not limited to, configuring the flow quota of a data source in the stream computing task according to the task topic, flow monitoring, and automatic capacity expansion within the flow quota limit.
The computing engine layer 140 is configured to develop a corresponding stream computing mode for the stream computing task according to the data source configured for the task, to execute the stream computing task according to that mode, and to output a stream computing result. In this embodiment, the computing engine layer 140 supports the basic frameworks of Apache Flink and Apache Spark Streaming. The eight data sources configured in the data acquisition layer 110 can be broadly divided into three types: the DB data type, the log file type, and the restapi type. Correspondingly, the computing engine layer 140 provides three stream computing development modes: an sql development mode (developing stream computing tasks in sql), a jar development mode (deploying stream computing tasks from an original jar), and a canvas development mode (automatically building stream computing tasks by canvas drag-and-drop). A stream computing development mode is selected based on the type of the data source configured for the stream computing task, the corresponding stream computing mode is developed, and the stream computing task is then executed according to that mode to output a stream computing result. In this way, the stream computing system can select the development mode appropriate to the data source type, adapting to the development of different stream computing applications and meeting the needs of user groups with different skills, which lowers the development threshold of stream computing applications while reducing development, operation, and maintenance costs.
The storage and interface layer 150 is configured to store the stream computing results and to provide an output interface for them. In this embodiment, the stream computing system is provided with a storage unit in the storage and interface layer 150; after the computing engine layer 140 executes the stream computing task and obtains a stream computing result, the result is stored in the storage unit. The stream computing system is further provided with a message subscription interface in the storage and interface layer 150 for delivering the stream computing results to users. In this embodiment, the message subscription interface is implemented as SDK API interfaces in the java and python languages and supports third-party system integration. Through the SDK API interface, the background services of the stream computing system are encapsulated, and a user can use the stream computing system directly through interface authentication without attending to the underlying background service logic; the operation is simple and the usage threshold is low.
Through the data acquisition layer, data bus layer, resource management layer, computing engine layer, and storage and interface layer, the stream computing system in this embodiment establishes, in the data bus layer, a one-to-one mapping relationship between task topics and the data sources configured for stream computing tasks, realizing data-source-oriented resource isolation management for the stream computing tasks. On the basis of this isolation, the flow quota of a data source in a stream computing task is configured according to the task topic, and resource allocation management such as flow monitoring and automatic capacity expansion within the flow quota limit is performed. The computing engine layer selects and develops the stream computing mode according to the type of the data source and carries out the stream computing processing, thereby adapting to the development of different stream computing applications and meeting the needs of user groups with different skills, lowering the development threshold of stream computing applications while reducing development, operation, and maintenance costs.
In this embodiment of the present application, the resource management layer 130 is also compatible with both the kubernetes and yarn resource management modes. The kubernetes resource management mode is used when scheduling and managing real-time stream data resources, but when the resource management layer identifies that the user requesting execution of a stream computing task is a traditional hadoop user, a compatible yarn resource management mode is provided for that user. kubernetes is a container cluster management system that realizes resource management by deploying containers: containers are isolated from one another, each has its own file system, processes in different containers cannot affect each other, and computing resources can be distinguished. kubernetes also provides functions such as automatic deployment, automatic scaling, and maintenance. Yarn (Yet Another Resource Negotiator) is the hadoop cluster resource manager, a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications. In this embodiment, the yarn stream computing resource management mode is provided mainly to be compatible with the usage habits of traditional hadoop users. Based on the task topics created in the data bus layer and the mapping relationship between task topics and the data sources configured for stream computing tasks, combined with the containerized resource management of kubernetes, data-source-oriented resource isolation management can be performed well for stream computing tasks, and platform-level resource isolation built on Kafka is realized, so that rights management can be configured per product item when enterprise-level productized applications are realized.
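The dual-mode scheduling decision described above can be sketched as below. This is purely illustrative; the rule for identifying a "traditional hadoop user" is a hypothetical stand-in, since the patent does not specify how such users are recognized.

```python
# Hypothetical sketch: the resource management layer defaults to the
# kubernetes resource management mode, but provides the compatible yarn
# mode when the requesting user is identified as a traditional hadoop
# user. The `is_hadoop_user` flag is an assumed identification signal.

def select_resource_manager(user: dict) -> str:
    """Return the resource management mode for a task execution request."""
    if user.get("is_hadoop_user"):
        return "yarn"        # compatible mode for traditional hadoop users
    return "kubernetes"      # default containerized scheduling
```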
In this embodiment, rights management may be configured per product item, and system project membership may serve as the authentication basis for determining whether a user has the right to operate the real-time data bus, which is simpler than the conventional rights management based on the Kafka client. This embodiment also provides unified management of kubernetes and hadoop data rights.
In this embodiment, a development mode configuration unit may further be provided in the computing engine layer 140 to configure a development mode for the stream computing task according to the data source configured for the task, where the configurable development modes include an sql development mode, a jar development mode, and a canvas development mode. The configurable data sources in the system's data acquisition layer are classified into three data source types: the DB data type, the log file type, and the restapi type. The development mode configuration unit identifies the data source type to which the task's configured data source belongs and then configures the corresponding development mode for the task. For example, for a data source of the DB data type, the development mode configuration unit configures the sql development mode, and the computing engine layer develops the corresponding stream computing mode for the stream computing task in sql; for a data source of the log file type, it configures the jar development mode, and the computing engine layer develops the corresponding stream computing mode using the jar mode; for a data source of the restapi type, it configures the canvas development mode, and the computing engine layer develops the corresponding stream computing mode on the canvas.
In this way, the computing engine layer develops a corresponding stream computing mode for the stream computing task according to the data source configured for the task.
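The two-step selection above can be sketched as a pair of lookups. The DB→sql, log file→jar, restapi→canvas mapping is stated in the text; the grouping of the eight individual sources into the three categories is partly an assumption of this sketch, as the patent only names the DB examples explicitly.

```python
# Hypothetical sketch of the development mode configuration unit.
# Step 1: classify the configured data source into one of three types
# (the mq/kafka grouping below is an assumption, not from the patent).
SOURCE_TYPE = {
    "mysql": "db", "postgresql": "db", "oracle": "db", "sqlserver": "db",
    "log": "log_file", "mq": "log_file", "kafka": "log_file",
    "restful_api": "restapi",
}

# Step 2: map the data source type to its development mode, exactly as
# described: DB -> sql, log file -> jar, restapi -> canvas.
MODE_FOR_TYPE = {"db": "sql", "log_file": "jar", "restapi": "canvas"}

def configure_development_mode(data_source: str) -> str:
    """Identify the data source type, then pick the matching mode."""
    return MODE_FOR_TYPE[SOURCE_TYPE[data_source]]
```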
In another embodiment of the present application, the services of the stream computing system adopt a front-end/back-end separation design in their technical architecture; please refer to fig. 2, which is a schematic diagram of the technical architecture of a stream computing system according to another embodiment of the present application. As shown in FIG. 2, the stream computing system 200 uses a front-end web UI interface that is responsible for interacting directly with the user. The operation entrance is the background service of the stream computing system and acts as the scheduler between the web UI interface and the background engines. The operation entrance comprises GBD-RTC, GBD-BUS, and GBD-RSC. GBD-RTC is located in the data acquisition layer and is responsible for starting the collectors in that layer to acquire real-time data. The data collection layer may include the various collectors configured by the stream computing system 200, such as a mysql collector (mysql-collector), postgresql collector (PG-collector), oracle collector (oracle-collector), SQLserver collector (SQLserver-collector), log collector (log-collector), message bus MQ collector (MQ-collector), external kafka collector (kafka-collector), and restful API collector (restful-collector). GBD-BUS is located in the data bus layer and is responsible for creating task topics there. GBD-RSC is located in the computing engine layer and is responsible for task distribution, supporting the two basic stream computing frameworks of Flink and Spark Streaming.
In this embodiment, the stream computing system 200 builds a kubernetes ecology at the resource management layer, so that it supports the kubernetes resource management mode, and builds a Hadoop ecology at the storage and interface layer, so that it is compatible with the yarn resource management mode while supporting the kubernetes mode. An SDK API interface is configured in the storage and interface layer to deliver the stream computing results computed by the system, enabling the stream computing system 200 to support third-party system integration. In this embodiment, one or more data centers may be deployed in the stream computing system 200, each constructed with both a kubernetes ecology and a Hadoop ecology, so that the system supports the two resource management modes of kubernetes and yarn.
In an embodiment of the present application, please refer to fig. 3, which is a schematic architecture diagram of a stream computing system according to a third embodiment of the present application. As shown in fig. 3, on the basis of the technical architecture of the stream computing system services, the front-end web UI interface of the stream computing system 300 includes a service management module 310 and an operation system module 320. The service management module 310 is configured to provide, in the stream computing system, at least one of the following services: multi-tenant resource management, unified user rights management compatible with the kubernetes and hadoop ecologies, health status monitoring management of stream computing tasks, operation index monitoring management of stream computing tasks, and data center monitoring management. The operation system module is configured to provide, in the stream computing system, at least one of the following mechanisms: a document- and video-based user guidance mechanism, a user behavior auditing mechanism, an automatic system capacity expansion mechanism, a system abnormality early-warning mechanism, and a system abnormality recovery mechanism.
In this embodiment, when providing the multi-tenant resource management service, the service management module 310 realizes multi-tenant resource management of the stream computing system by building tenant isolation at the kubernetes and hadoop resource layers and isolating the data resources required by each tenant's stream computing tasks, so that data resources are managed along the tenant dimension.
In this embodiment, when providing the unified user rights management service compatible with the kubernetes and hadoop ecologies, the service management module 310 uses system project membership as the authentication basis, detects whether a user has the right to operate the real-time data bus layer, limits the flow of the data resources a user reads and writes with the task topic as the object, and presets the corresponding hadoop operation rights in the stream computing system; unified user rights management compatible with both ecologies is then realized by detecting whether the preset hadoop operation right is held when a stream computing task is scheduled in the kubernetes resource management mode.
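A minimal sketch of the two checks just described follows. It is hypothetical throughout: the field names and data shapes are assumptions, and a real system would back these checks with its authentication service rather than in-memory values.

```python
# Hypothetical sketch of the unified rights checks: project membership
# is the authentication basis for operating the real-time data bus, and
# a preset hadoop operation right is verified before a task is
# scheduled under the kubernetes resource management mode.

def can_operate_data_bus(user: str, project_members: set) -> bool:
    """System project membership serves as the authentication basis."""
    return user in project_members

def can_schedule_on_kubernetes(task: dict) -> bool:
    """The task must carry the preset hadoop operation right."""
    return bool(task.get("hadoop_operation_right"))
```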
In this embodiment, when providing the health-status monitoring and management service for stream computing tasks, the service management module 310 monitors the liveness and the backpressure condition of the stream computing task. Liveness monitoring checks whether the processing of the stream computing task has been interrupted or has failed. Backpressure monitoring checks, while the stream computing system processes the stream data of the task, whether the stream data receiving rate exceeds the stream data processing rate; if it does, backpressure has occurred. When an interruption or failure of processing, or backpressure during stream data processing, is detected, the stream computing task is shown as being in an abnormal health state. Furthermore, an alarm strategy via telephone, short message, or other chat tools can be provided to notify the relevant personnel of the abnormal health state of the stream computing task, so that they can perform abnormality recovery operations in time.
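The liveness and backpressure rules above reduce to two comparisons. A minimal sketch, assuming a task is represented as a plain dict with hypothetical keys:

```python
def health_status(task: dict) -> dict:
    """Evaluate task health from hypothetical fields:
    'state' (e.g. RUNNING/INTERRUPTED/FAILED),
    'receive_rate' and 'process_rate' (stream records per second)."""
    alive = task["state"] not in ("INTERRUPTED", "FAILED")
    # Backpressure: stream data arrives faster than it is processed.
    backpressured = task["receive_rate"] > task["process_rate"]
    return {
        "alive": alive,
        "backpressured": backpressured,
        "healthy": alive and not backpressured,  # triggers the alarm strategy when False
    }
```
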
In this embodiment, when providing the operation-index monitoring and management service for stream computing tasks, the service management module 310 monitors operation index parameters during task execution, such as the number of bytes flowing in per second, the number of bytes flowing out per second, and the number of records flowing out per second, and adjusts these parameters to enforce data stream flow quotas and to trigger automatic capacity expansion.
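One plausible way to turn such per-second indices into a quota-based capacity decision is sketched below; the utilization thresholds and the formula are illustrative assumptions, not taken from the embodiment:

```python
def scale_decision(records_in_per_s: float, quota_per_worker: float,
                   workers: int, high: float = 0.8, low: float = 0.3) -> int:
    """Suggest a worker count so that per-worker load stays inside the
    [low, high] utilization band of the per-worker quota."""
    load = records_in_per_s / (quota_per_worker * workers)
    if load > high:
        return workers + 1          # automatic capacity expansion
    if load < low and workers > 1:
        return workers - 1          # release surplus capacity
    return workers                  # within quota band: keep as-is
```
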
In this embodiment, a stream computing task involves long data links and strict latency requirements, generally at the millisecond level, which are difficult to meet when processing crosses data centers. Therefore, in this embodiment, when providing the data center monitoring and management service, the service management module 310 mainly monitors stream computing tasks to prevent their stream data processing from crossing data centers. When the stream computing system spans multiple data centers, data synchronization is achieved through data disaster recovery or a dual-active strategy. Under the dual-active strategy, when one data center fails and can no longer execute the stream computing task, the stream computing system switches services automatically to ensure that the stream computing task continues to be processed normally.
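The automatic service switch of the dual-active strategy can be sketched as a small router; the class is hypothetical and only illustrates the switch-on-failure behavior described above:

```python
class DualActiveRouter:
    """Route stream computing tasks to a primary data center; when it fails,
    switch automatically to the standby so processing continues."""
    def __init__(self, primary: str, standby: str):
        self.primary, self.standby = primary, standby
        self._healthy = {primary: True, standby: True}

    def mark_down(self, center: str):
        # Called when monitoring detects that a data center has failed.
        self._healthy[center] = False

    def route(self) -> str:
        if self._healthy[self.primary]:
            return self.primary
        if self._healthy[self.standby]:
            return self.standby     # automatic service switch (dual-active)
        raise RuntimeError("no healthy data center available")
```
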
In combination with the above embodiments, from the perspective of enterprise products, the stream computing system provided by the present application is visual and configurable, and covers full-process management from development, testing, and deployment to operation and maintenance, thereby reducing users' trial-and-error cost in data development.
Referring to fig. 4, fig. 4 is a block diagram of an electronic device according to a fourth embodiment of the present application. As shown in fig. 4, the electronic device 4 of this embodiment includes: a processor 41, a memory 42, and a computer program 43 stored in the memory 42 and executable on the processor 41. When executing the computer program 43, the processor 41 implements the functions of the respective execution layers, units, or modules of the stream computing system in the above embodiments.
For example, the computer program 43 may be partitioned into one or more execution layers, which are stored in the memory 42 and executed by the processor 41 to complete the present application. The one or more execution layers may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program 43 in the electronic device 4. For example, the computer program 43 may be split into:
the data acquisition layer is used for configuring a data source of a stream computing task and acquiring stream data resources required by processing the stream computing task from the data source in real time;
the data bus layer is used for creating a task theme corresponding to the stream computing task and caching stream data resources acquired from the data source in real time according to the task theme;
the resource management layer is used for scheduling and managing real-time stream data resources of the stream computing task according to the task theme;
the computing engine layer is used for developing a corresponding stream computing mode for the stream computing task according to a data source configured by the stream computing task, executing the stream computing task according to the stream computing mode and outputting a stream computing result;
and the storage and interface layer is used for storing the stream calculation result and providing an output interface for the stream calculation result.
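The cooperation of these layers can be illustrated with a toy end-to-end flow. The function below is a sketch under strong simplifications — an in-memory dict standing in for the data bus layer's topic cache, a Python callable for the compute engine, and a list for the storage layer — not the disclosed implementation:

```python
def run_stream_task(source_events, topic_cache: dict, compute, sink: list):
    """Toy pass through the five layers:
    acquire events -> cache them under the task topic -> compute -> store."""
    topic = "task-topic"                     # data bus layer: topic per task
    topic_cache.setdefault(topic, []).extend(source_events)   # cache stream data
    results = [compute(event) for event in topic_cache[topic]]  # compute engine layer
    sink.extend(results)                     # storage & interface layer
    return results                           # output interface for the results
```
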
The electronic device may include, but is not limited to, the processor 41 and the memory 42. It will be appreciated by those skilled in the art that fig. 4 is merely an example of the electronic device 4 and does not limit it; the electronic device 4 may include more or fewer components than shown, combine certain components, or use different components. For example, the electronic device may further include an input-output device, a network access device, a bus, and the like.
The processor 41 may be a central processing unit (Central Processing Unit, CPU), or another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 42 may be an internal storage unit of the electronic device 4, such as a hard disk or memory of the electronic device 4. The memory 42 may also be an external storage device of the electronic device 4, such as a plug-in hard disk, a SmartMedia Card (SMC), a Secure Digital (SD) card, or a Flash Card equipped on the electronic device 4. Further, the memory 42 may include both an internal storage unit and an external storage device of the electronic device 4. The memory 42 is used to store the computer program as well as other programs and data required by the electronic device, and may also be used to temporarily store data that has been output or is to be output.
It should be noted that, because the content of information interaction and execution process between the execution layers is based on the same concept as the embodiment of the system of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and details are not repeated here.
Embodiments of the present application also provide a computer readable storage medium storing a computer program that, when executed by a processor, may implement the functions of the various system embodiments described above. In this embodiment, the computer-readable storage medium may be nonvolatile or may be volatile.
Embodiments of the present application provide a computer program product that, when run on a mobile terminal, enables the mobile terminal to perform the functions of the various system embodiments described above. Wherein:
in one embodiment, the computer program product, when run on a mobile terminal, causes the mobile terminal to perform the following functions:
configuring a data source of a stream computing task, and acquiring stream data resources required by processing the stream computing task from the data source in real time;
creating a task theme corresponding to the stream computing task, and caching stream data resources acquired from the data source in real time according to the task theme;
scheduling and managing real-time stream data resources of the stream computing task according to the task theme;
developing a corresponding stream computing mode for the stream computing task according to a data source configured by the stream computing task, executing the stream computing task according to the stream computing mode and outputting a stream computing result;
storing the stream computation results and providing an output interface for the stream computation results.
In one embodiment, the computer program product, when run on a mobile terminal, causes the mobile terminal to perform the following functions:
selecting, from pre-accessed configurable data sources, one data source to be configured for the stream computing task according to a user's stream computing task execution request, wherein the pre-accessed configurable data sources comprise at least one of the following: mysql data source, postgresql data source, oracle data source, SQLserver data source, log data source, message bus MQ data source, external kafka data source, and restful API data source.
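A minimal sketch of selecting and validating one of the pre-accessed configurable data sources; the key set and the lower-case normalization are illustrative assumptions:

```python
# The configurable source types enumerated in the embodiment, as lookup keys.
CONFIGURABLE_SOURCES = {
    "mysql", "postgresql", "oracle", "sqlserver",
    "log", "mq", "kafka", "restful",
}

def configure_data_source(requested: str) -> str:
    """Bind a data source to a stream computing task only if it is one of
    the pre-accessed configurable sources."""
    key = requested.lower()
    if key not in CONFIGURABLE_SOURCES:
        raise ValueError(f"unsupported data source: {requested!r}")
    return key
```
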
In one embodiment, the computer program product, when run on a mobile terminal, causes the mobile terminal to perform the following functions:
and configuring a development mode for the stream computing task according to the data source configured for the stream computing task, wherein the configurable development modes comprise one or more of an sql development mode, a jar development mode, and a canvas development mode.
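The source-type-to-development-mode mapping below is purely illustrative — the embodiment states that the mode is configured according to the data source, but does not fix a concrete mapping:

```python
def pick_development_mode(source_type: str) -> str:
    """Hypothetical policy choosing among the three development modes."""
    relational = {"mysql", "postgresql", "oracle", "sqlserver"}
    if source_type in relational:
        return "sql"      # table-like sources suit SQL-style development
    if source_type in {"kafka", "mq", "log"}:
        return "jar"      # custom parsing logic packaged as a jar
    return "canvas"       # otherwise compose the job visually on a canvas
```
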
In one embodiment, the computer program product, when run on a mobile terminal, causes the mobile terminal to perform the following functions:
and when the resource management layer identifies that the user requesting to execute the stream calculation task is a traditional hadoop user, providing a compatible yarn resource management mode for the traditional hadoop user.
In one embodiment, the computer program product, when run on a mobile terminal, causes the mobile terminal to perform the following functions:
and providing at least one service of multi-tenant resource management, unified user authority management compatible with kubernetes ecology and hadoop ecology, health state monitoring management of stream computing tasks, operation index monitoring management of stream computing tasks and data center monitoring management in the stream computing system.
In one embodiment, the computer program product, when run on a mobile terminal, causes the mobile terminal to perform the following functions:
and detecting whether a user has the authority to operate the real-time data bus layer or not, and detecting whether the user has the preset hadoop operation authority or not when the flow calculation task is scheduled by adopting the kubernetes resource management mode.
In one embodiment, the computer program product, when run on a mobile terminal, causes the mobile terminal to perform the following functions:
monitoring whether the stream computing task is alive and whether it is backpressured, wherein if the processing of the stream computing task has been interrupted or has failed, the stream computing task is determined not to be alive; and if, during the stream data processing of the stream computing task, the stream data receiving rate is greater than the stream data processing rate, backpressure of the stream computing task is determined.
In one embodiment, the computer program product, when run on a mobile terminal, causes the mobile terminal to perform the following functions:
at least one of a user guidance mechanism, a user behavior auditing mechanism, a system automatic capacity expansion mechanism, a system abnormality early warning mechanism and a system abnormality recovery mechanism for documents and video data is provided in the stream computing system.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of each execution layer is illustrated, and in practical application, the above-described function allocation may be performed by different functional units or modules according to needs, i.e. the internal structure of the system is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
If the integrated modules/units are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer readable storage medium. With this understanding, the present application may implement all or part of the functions of the above embodiments by means of a computer program, which may be stored on a computer readable storage medium and which, when executed by a processor, implements the functions of the above system embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content included in the computer readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, the computer readable medium does not include electrical carrier signals and telecommunications signals.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (10)

1. A stream computing system, comprising:
the data acquisition layer is used for configuring a data source of a stream computing task and acquiring stream data resources required by processing the stream computing task from the data source in real time;
the data bus layer is used for creating a task theme corresponding to the stream computing task, caching stream data resources acquired from the data sources in real time according to the task theme, and establishing a one-to-one mapping relation between the task theme and the data sources configured by the stream computing task so as to realize the resource isolation management of the stream computing task facing the data sources;
the resource management layer is used for scheduling and managing real-time stream data resources of the stream computing task according to the task theme;
the computing engine layer is used for developing a corresponding stream computing mode for the stream computing task according to a data source configured by the stream computing task, executing the stream computing task according to the stream computing mode and outputting a stream computing result, wherein the corresponding development mode is selected according to the type of the data source to develop the stream computing mode and perform stream computing processing;
and the storage and interface layer is used for storing the stream calculation result and providing an output interface for the stream calculation result.
2. The stream computing system according to claim 1, wherein a data source configuration unit is provided in the data acquisition layer, and the data source configuration unit is configured to select one of the pre-accessed configurable data sources to be configured for the stream computing task according to a stream computing task execution request of a user, wherein the pre-accessed configurable data sources include at least one of: mysql data source, postgresql data source, oracle data source, SQLserver data source, log data source, message bus MQ data source, external kafka data source, and restful API data source.
3. The stream computing system according to claim 2, wherein a development mode configuration unit is disposed in the computing engine layer, and the development mode configuration unit is configured to configure a development mode for the stream computing task according to a data source configured by the stream computing task, wherein the development mode configurable by the development mode configuration unit includes one or more of an sql development mode, a jar development mode, and a canvas development mode.
4. The stream computing system of claim 1, wherein the resource management layer uses kubernetes resource management mode and is compatible with the yarn resource management mode when scheduling and managing real-time stream data resources, wherein when the resource management layer identifies that a user requesting to perform a stream computing task is a traditional hadoop user, the resource management layer provides the traditional hadoop user with the compatible yarn resource management mode.
5. The stream computing system of claim 1, wherein the stream computing system further comprises: the service management module is used for providing at least one service of multi-tenant resource management, user authority unified management compatible with kubernetes ecology and hadoop ecology, health state monitoring management of a stream computing task, operation index monitoring management of the stream computing task and data center monitoring management in the stream computing system.
6. The stream computing system of claim 5, wherein the service management module is further configured to detect whether a user has a right to operate a real-time data bus layer and detect whether the user has a preset hadoop operation right when scheduling the stream computing task by using the kubernetes resource management method when providing a user right unified management service compatible with kubernetes ecology and hadoop ecology in the stream computing system.
7. The stream computing system of claim 5, wherein the service management module, when providing the health status monitoring management service for a stream computing task in the stream computing system, is further configured to monitor whether the stream computing task is alive and whether it is backpressured, wherein if the processing of the stream computing task has been interrupted or has failed, the stream computing task is determined not to be alive; and if, during the stream data processing of the stream computing task, the stream data receiving rate is greater than the stream data processing rate, backpressure of the stream computing task is determined.
8. The stream computing system of claim 1, wherein the stream computing system further comprises: and the operation system module is used for providing at least one mechanism of a user guiding mechanism, a user behavior auditing mechanism, a system automatic capacity expanding mechanism, a system abnormality early warning mechanism and a system abnormality recovery mechanism for documents and video data in the stream computing system.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the functions of the system according to any one of claims 1 to 8 when executing the computer program.
10. A computer readable storage medium storing a computer program, which when executed by a processor performs the functions of the system of any one of claims 1 to 8.
CN202011559972.6A 2020-12-25 2020-12-25 Stream computing system, electronic device thereof, and storage medium Active CN112667683B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011559972.6A CN112667683B (en) 2020-12-25 2020-12-25 Stream computing system, electronic device thereof, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011559972.6A CN112667683B (en) 2020-12-25 2020-12-25 Stream computing system, electronic device thereof, and storage medium

Publications (2)

Publication Number Publication Date
CN112667683A CN112667683A (en) 2021-04-16
CN112667683B true CN112667683B (en) 2023-05-26

Family

ID=75408882

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011559972.6A Active CN112667683B (en) 2020-12-25 2020-12-25 Stream computing system, electronic device thereof, and storage medium

Country Status (1)

Country Link
CN (1) CN112667683B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239081A (en) * 2021-05-21 2021-08-10 瀚云科技有限公司 Streaming data calculation method
CN115904722A (en) * 2022-12-14 2023-04-04 上海汇付支付有限公司 Big data real-time processing platform

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110245158A (en) * 2019-06-10 2019-09-17 上海理想信息产业(集团)有限公司 A kind of multi-source heterogeneous generating date system and method based on Flink stream calculation technology
US10817334B1 (en) * 2017-03-14 2020-10-27 Twitter, Inc. Real-time analysis of data streaming objects for distributed stream processing

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10572276B2 (en) * 2016-09-12 2020-02-25 International Business Machines Corporation Window management based on a set of computing resources in a stream computing environment
US20180165306A1 (en) * 2016-12-09 2018-06-14 International Business Machines Corporation Executing Queries Referencing Data Stored in a Unified Data Layer


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Big Data Stream Computing: Key Technologies and System Instances (大数据流式计算:关键技术及系统实例); Sun Dawei (孙大为); Journal of Software (软件学报); vol. 25, no. 4; pp. 839-862 *

Also Published As

Publication number Publication date
CN112667683A (en) 2021-04-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant