CN114969139A - Big data operation and maintenance management method, system, device and storage medium - Google Patents


Info

Publication number
CN114969139A
Authority
CN
China
Prior art keywords
data
big data
maintenance management
operation request
storing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111401635.9A
Other languages
Chinese (zh)
Inventor
郄彬
黄秋林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou City Construction College
Original Assignee
Guangzhou City Construction College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou City Construction College filed Critical Guangzhou City Construction College
Priority to CN202111401635.9A priority Critical patent/CN114969139A/en
Publication of CN114969139A publication Critical patent/CN114969139A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468Fuzzy queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471Distributed queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26Visual data mining; Browsing structured data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/283Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/547Messaging middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Automation & Control Theory (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a big data operation and maintenance management method, system, device and storage medium. The method comprises: generating an operation request in response to a visualization operation on a front-end page; storing the operation request in message middleware; acquiring the operation request from the message middleware and generating an operation command according to the operation request; starting a plurality of processes in a big data processing cluster according to the operation command and determining an operation result; and displaying the operation result on the front-end page. A corresponding operation command is thus generated from the user's visualization operation, different processes in the big data processing cluster are started by that command, and the operation result is fed back to the front-end page for the user's reference, so that the user can obtain a visualized big data result through simple and intuitive operations.

Description

Big data operation and maintenance management method, system, device and storage medium
Technical Field
The present application relates to the field of big data technologies, and in particular, to a big data operation and maintenance management method, system, apparatus, and storage medium.
Background
With the development of big data technology, people have begun to rely on big data to analyze all kinds of industries, in order to summarize the overall situation over a past period and to predict developments over a future period. However, because big data is characterized by large data volumes and complex data sources, how to analyze and operate on such massive data effectively and intuitively is an urgent problem in the big data era.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art. Therefore, the application provides a big data operation and maintenance management method, a system, a device and a storage medium.
In a first aspect, an embodiment of the present application provides a big data operation and maintenance management method, where the method includes: responding to the visualization operation of the front-end page, and generating an operation request; storing the operation request into message middleware; acquiring the operation request from the message middleware, and generating an operation command according to the operation request; starting a plurality of processes in the big data processing cluster according to the operation command, and determining an operation result; and displaying the operation result on the front-end page.
Optionally, acquiring the operation request from the message middleware and generating an operation command according to the operation request includes: acquiring the operation request from the message middleware at scheduled times according to a preset timing task, and generating the operation command according to the operation request.
Optionally, the method further comprises: determining a task text according to the timing task; performing fuzzy search on the task text; when error information occurs in the fuzzy search process, intercepting associated information in the task text according to the error information, wherein the associated information comprises the error information; and sending the associated information to a specified developer mailbox.
Optionally, the process in the big data processing cluster includes: monitoring and collecting the data source in real time to collect target data; storing the target data into a distributed storage cluster; acquiring target data from the distributed storage cluster, and performing data cleaning on the target data; storing the target data subjected to data cleaning into a data warehouse; and storing the target data in the data warehouse into a database.
Optionally, the method further comprises: when one main node in the distributed storage cluster crashes, a complete standby node is selected as a new main node through an election mechanism.
Optionally, the database includes a high-speed data subunit and a basic data subunit, and the storing the target data in the data warehouse into the database includes: storing a portion of said target data in said high speed data subunit and storing the remainder of said target data in a base data subunit.
In a second aspect, an embodiment of the present application provides a big data operation and maintenance management system, where the system includes: a visualization management module, used for generating an operation request in response to a visualization operation on the front-end page, and for displaying the operation result on the front-end page; message middleware, used for storing the operation request; a command interaction module, used for acquiring the operation request from the message middleware and generating an operation command according to the operation request; and a big data command processing module, used for starting a plurality of processes in the big data processing cluster according to the operation command and determining the operation result.
In a third aspect, an embodiment of the present application provides an apparatus, including: at least one processor; at least one memory for storing at least one program; when the at least one program is executed by the at least one processor, the at least one processor is enabled to implement the big data operation and maintenance management method according to the first aspect.
In a fourth aspect, the present application provides a computer storage medium, in which a processor-executable program is stored, where the processor-executable program is used to implement the big data operation and maintenance management method according to the first aspect when executed by the processor.
The beneficial effects of the embodiment of the application are as follows: an operation request is generated in response to a visualization operation on the front-end page; the operation request is stored in message middleware; the operation request is acquired from the message middleware, and an operation command is generated according to it; a plurality of processes in the big data processing cluster are started according to the operation command, and an operation result is determined; and the operation result is displayed on the front-end page. A corresponding operation command is thus generated from the user's visualization operation, different processes in the big data processing cluster are started by that command, and the operation result is fed back to the front-end page for the user's reference, so that the user can obtain a visualized big data result through simple and intuitive operations.
Drawings
The accompanying drawings are included to provide a further understanding of the claimed subject matter and are incorporated in and constitute a part of this specification, illustrate embodiments of the subject matter and together with the description serve to explain the principles of the subject matter and not to limit the subject matter.
FIG. 1 is a schematic diagram of a big data operation and maintenance management system provided in an embodiment of the present application;
FIG. 2 is a flowchart illustrating steps of a big data operation and maintenance management method according to an embodiment of the present application;
FIG. 3 is a step diagram of a processing flow in a big data processing cluster according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an apparatus provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that although functional block divisions are provided in the system drawings and logical orders are shown in the flowcharts, in some cases, the steps shown and described may be performed in different orders than the block divisions in the systems or in the flowcharts. The terms first, second and the like in the description and in the claims, and the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
With the development of big data technology, people have begun to rely on big data to analyze all kinds of industries, in order to summarize the overall situation over a past period and to predict developments over a future period. However, because big data is characterized by large data volumes and complex data sources, how to analyze and operate on such massive data effectively and intuitively is an urgent problem in the big data era.
Based on this, the embodiment of the present application provides a big data operation and maintenance management method, system, device and storage medium. The method includes the steps of: generating an operation request in response to a visualization operation on a front-end page; storing the operation request in message middleware; acquiring the operation request from the message middleware and generating an operation command according to the operation request; starting a plurality of processes in the big data processing cluster according to the operation command and determining an operation result; and displaying the operation result on the front-end page. A corresponding operation command is thus generated from the user's visualization operation, different processes in the big data processing cluster are started by that command, and the operation result is fed back to the front-end page for the user's reference, so that the user can obtain a visualized big data result through simple and intuitive operations.
The embodiments of the present application will be further explained with reference to the drawings.
First, a big data operation and maintenance management system provided in the embodiment of the present application is explained.
Referring to fig. 1, fig. 1 is a schematic diagram of a big data operation and maintenance management system provided in an embodiment of the present application, where the big data operation and maintenance management system 100 includes: visualization management module 110, message middleware 120, command interaction module 130, and big data command processing module 140. The visualization management module is used for generating an operation request in response to a visualization operation on the front-end page, and for displaying the operation result on the front-end page; the message middleware is used for storing the operation request; the command interaction module is used for acquiring the operation request from the message middleware and generating an operation command according to the operation request; and the big data command processing module is used for starting a plurality of processes in the big data processing cluster according to the operation command and determining the operation result.
In some embodiments, the big data command processing module includes a data acquisition unit, a distributed storage unit, a data cleaning unit, an analysis and calculation unit, a data warehouse, a data transmission unit, and a database. The data acquisition unit is used for monitoring a data source in real time and collecting target data from it; the distributed storage unit comprises a distributed storage cluster, in which the collected target data is stored; the data cleaning unit and the analysis and calculation unit work together to clean the target data acquired from the distributed storage cluster; the cleaned target data is stored in the data warehouse; and the data transmission unit is used for storing the target data in the data warehouse into the database.
In other embodiments, the distributed storage unit includes a distributed storage cluster and a scheduling subunit, where the distributed storage cluster is used to store large volumes of data, and the scheduling subunit is used to schedule the nodes of the distributed storage cluster using ZooKeeper or a similar technology. In the embodiment of the application, the distributed storage cluster and the scheduling subunit are each containerized using Docker container technology. The two containers are isolated from each other by a sandbox mechanism and share no interface, so no configuration files are needed, the distributed storage unit can be migrated across platforms, and both components become easier to deploy and lighter in footprint.
In other embodiments, the database includes a high-speed data subunit and a basic data subunit. The high-speed data subunit is a data store that supports high-speed reading and writing, such as Redis, and the basic data subunit is a conventional database, such as MySQL.
Next, a big data operation and maintenance management method provided in the embodiment of the present application is described with reference to the big data operation and maintenance management system shown in fig. 1.
Referring to fig. 2, fig. 2 is a flowchart illustrating steps of a big data operation and maintenance management method according to an embodiment of the present application, where the method is applied to the big data operation and maintenance management system shown in fig. 1, and the method includes, but is not limited to, steps S200 to S240:
S200, responding to the visualization operation of the front-end page, and generating an operation request;
Specifically, in the embodiment of the application, the visualization management module displays a visual operation interface on the front-end web page, where the user can perform visualization operations. In response to a visualization operation on the front-end page, the visualization management module generates an operation request. Operation requests can correspond to different big data processes; after the corresponding process has been executed, its operation result is obtained, which makes the big data operation and maintenance management process visual.
S210, storing the operation request into a message middleware;
Specifically, the message middleware uses technologies such as Kafka and Redis to cache the operation requests generated by the visualization management module. Taking Redis as an example, the operation requests are stored in a Redis cluster; because Redis is highly available and data in it is not easily lost, the safety of the stored operation requests is further guaranteed.
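The buffering role of the message middleware in step S210 can be sketched as follows. This is a minimal illustration only: the InMemoryMiddleware class, its method names, and the request fields are hypothetical, and an in-memory queue stands in for the Kafka or Redis cluster named above so that the sketch runs without external services.

```python
import json
from collections import deque


class InMemoryMiddleware:
    """Hypothetical stand-in for the message middleware (Kafka or Redis in the text)."""

    def __init__(self):
        self._queue = deque()

    def store(self, request: dict) -> None:
        # Serialize the request, as a real broker would hold bytes or strings.
        self._queue.append(json.dumps(request))

    def fetch(self):
        # Return the oldest pending request, or None when nothing is queued.
        if not self._queue:
            return None
        return json.loads(self._queue.popleft())


middleware = InMemoryMiddleware()
# The request fields below are assumptions made for illustration.
middleware.store({"action": "collect", "source": "web_logs"})
middleware.store({"action": "clean", "source": "web_logs"})
first = middleware.fetch()
```

Requests leave the buffer in the order they were stored, which decouples the front-end page from the speed at which commands are generated downstream.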
S220, acquiring an operation request from the message middleware, and generating an operation command according to the operation request;
Specifically, the command interaction module acquires the operation request from the message middleware and generates an operation command according to it. Using shell scripts, the command interaction module first acquires the operation request from the message middleware in real time and then parses the request to generate the operation command. In the embodiment of the application, the operation command is used to start a plurality of processes of the big data cluster.
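Turning an operation request into an operation command, as in step S220, might look like the sketch below. The embodiment uses shell scripts for this step; Python is used here only for illustration, and the action names, script names, and request fields are all hypothetical.

```python
import shlex

# Hypothetical mapping from request actions to launcher scripts.
COMMAND_TEMPLATES = {
    "collect": ["start_collection.sh"],
    "clean": ["run_cleaning.sh"],
    "export": ["export_to_db.sh"],
}


def build_command(request: dict) -> str:
    """Translate an operation request into a shell command line."""
    action = request.get("action")
    if action not in COMMAND_TEMPLATES:
        raise ValueError(f"unsupported action: {action!r}")
    args = COMMAND_TEMPLATES[action] + [str(request.get("source", ""))]
    # shlex.quote protects against shell metacharacters in user-supplied values.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_command({"action": "collect", "source": "web logs"})
```

Rejecting unknown actions and quoting every argument keeps a malformed or malicious request from reaching the cluster as an arbitrary command.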
In some embodiments, the command interaction module may also use crontab to preset a timing task; according to the preset timing task, the command interaction module acquires the operation request from the message middleware at the scheduled time and generates an operation command according to it. For example, if the user chooses to capture data source information at 21:00, the command interaction module acquires the corresponding operation request from the message middleware at 21:00 and generates an operation command for capturing the data source information.
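The 21:00 example corresponds to a daily crontab entry, whose fields are minute, hour, day of month, month, and day of week, followed by the command. The sketch below builds such a line; the helper name and the script path are hypothetical.

```python
def crontab_entry(hour: int, command: str, minute: int = 0) -> str:
    """Build a daily crontab line: minute hour day-of-month month day-of-week command."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    return f"{minute} {hour} * * * {command}"


# /opt/ops/fetch_requests.sh is a hypothetical script path used only for illustration.
entry = crontab_entry(21, "/opt/ops/fetch_requests.sh")
print(entry)  # prints: 0 21 * * * /opt/ops/fetch_requests.sh
```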
In other embodiments, to guard against task failures, the command interaction module also provides an error reporting procedure. Specifically, the command interaction module redirects the output of the preset timing task to obtain a task text and performs a fuzzy search on it. When the fuzzy search finds error information in the task text, for example a line containing the word "error", the associated information is extracted from the task text and sent to a specified developer mailbox. The associated information consists of the error information together with the 20 lines before and after it. With this associated information, a developer can more easily identify the specific error that occurred in a given big data process and repair it in a targeted way.
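The error reporting procedure can be sketched as follows, assuming the task text is a plain log and that the fuzzy search is a case-insensitive substring match on "error". The function names, the subject line, and the mailbox address are hypothetical, and the message is only built here, not sent.

```python
from email.message import EmailMessage
from typing import List


def extract_error_context(task_text: str, keyword: str = "error", context: int = 20) -> List[str]:
    """Return one excerpt per matching line: the line plus `context` lines on each side."""
    lines = task_text.splitlines()
    excerpts = []
    for index, line in enumerate(lines):
        if keyword in line.lower():
            start = max(0, index - context)
            end = min(len(lines), index + context + 1)
            excerpts.append("\n".join(lines[start:end]))
    return excerpts


def build_report(excerpt: str, recipient: str) -> EmailMessage:
    """Wrap an excerpt in an email message addressed to the developer mailbox."""
    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = "Timing task error"  # hypothetical subject line
    message.set_content(excerpt)
    return message


log_lines = [f"line {n}" for n in range(50)]
log_lines[25] = "ERROR: task failed"
excerpts = extract_error_context("\n".join(log_lines))
report = build_report(excerpts[0], "dev@example.com")  # hypothetical mailbox
```

Near the start or end of the task text the window is clipped, so an excerpt may hold fewer than 41 lines.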
S230, starting a plurality of processes in the big data processing cluster according to the operation command, and determining an operation result;
Specifically, according to the operation command generated in step S220, a plurality of processes in the big data processing cluster are started, and the operation result of each process is generated.
In conjunction with big data command processing module 140 in FIG. 1, the flow in the big data processing cluster may be mainly illustrated by the method steps of FIG. 3.
Referring to fig. 3, fig. 3 is a step diagram of a processing flow in a big data processing cluster according to an embodiment of the present application, where the method is applied to the big data command processing module 140 in fig. 1, and the method includes, but is not limited to, steps S300-S340:
S300, monitoring and collecting a data source in real time to collect target data;
Specifically, the data acquisition unit uses Flume to monitor the data source in real time and collect target data from it. Flume is a distributed system for collecting, aggregating and transmitting massive amounts of data. Flume supports customizing various data senders in a logging system for collecting data; at the same time, Flume can perform simple processing on the data and write it to various (customizable) data receivers. As a data acquisition technology, Flume offers good scalability and high reliability: it can be deployed as a distributed cluster, which makes it scalable, and when a node fails, data can be passed to other nodes without loss, which makes it reliable. In the data acquisition unit of the embodiment of the application, data is collected from the data source in real time in streaming mode to obtain the target data.
S310, storing the target data into a distributed storage cluster;
Specifically, the distributed storage cluster in the distributed storage unit is mainly used for storing large data files. Deploying the storage as a distributed cluster greatly improves the safety and reliability of the data: if one machine crashes, the data is not lost and remains accessible. The distributed storage cluster is therefore used for distributed storage of the data collected by the data acquisition unit.
In some embodiments, the distributed storage unit further includes a scheduling subunit for scheduling the distributed storage cluster. The scheduling subunit uses ZooKeeper: when a master node in the distributed storage cluster crashes, the scheduling subunit selects an intact standby node as the new master node through an election mechanism, which avoids data loss and loss of access to the data.
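The failover idea can be illustrated with the simplified sketch below. It is not ZooKeeper's actual election protocol: the Node fields and the rule "pick the live standby with the most recent transaction id" are assumptions chosen only to show how an intact standby might be preferred.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    alive: bool
    last_txid: int  # hypothetical measure of how complete the node's copy is


def elect_new_master(standbys):
    """Pick the live standby with the most complete data, or None if none is alive."""
    candidates = [node for node in standbys if node.alive]
    if not candidates:
        return None
    # Prefer the most up-to-date copy; break ties by name so the result is deterministic.
    return max(candidates, key=lambda node: (node.last_txid, node.name))


standbys = [Node("a", True, 10), Node("b", False, 99), Node("c", True, 42)]
new_master = elect_new_master(standbys)
```

Node "b" holds the newest data but is down, so it is excluded before the comparison.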
S320, acquiring target data from the distributed storage cluster, and performing data cleaning on the target data;
Specifically, the embodiment of the application provides a data cleaning unit and an analysis and calculation unit, which together clean the target data. The data cleaning unit uses Spark and Scala, and mainly relies on Spark to clean the target data. Spark is fast, easy to use, and general-purpose. Compared with Hadoop MapReduce, Spark's in-memory computation can be more than 100 times faster: Spark implements an efficient DAG execution engine and can process data streams efficiently in memory. Moreover, Spark provides APIs for Java, Python and Scala and supports more than 80 high-level operators, so users can quickly build different applications. In addition, Spark can perform batch processing and interactive queries through Spark SQL, real-time stream processing through Spark Streaming, machine learning through Spark MLlib, and graph computation through Spark GraphX. These different types of processing can be combined seamlessly in the same application, which shows how general-purpose Spark is. As a unified solution, Spark therefore offers a considerable performance advantage.
In addition, the analysis and calculation unit mainly adopts the MapReduce computing framework. First, as a cluster-based high-performance parallel computing platform, MapReduce allows a distributed, parallel computing cluster of tens, hundreds, or thousands of nodes to be built from commodity servers. Second, as a software framework for running parallel computations, MapReduce automatically parallelizes computing tasks: it partitions the data and the tasks, distributes and executes the tasks on the cluster nodes, and collects the results. Many complex low-level details of parallel computing, such as distributed data storage, data communication and fault tolerance, are handed over to the system, which greatly reduces the burden on software developers. Finally, as a parallel programming model and method, MapReduce borrows design ideas from the functional programming language Lisp to provide a simple parallel programming approach: basic parallel computing tasks are expressed with two functions, Map and Reduce, and abstract operations and a parallel programming interface are provided, so that programming and processing of large-scale data can be done simply and conveniently.
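The Map and Reduce programming model can be illustrated with a single-process word count. This sketch runs locally and only mimics the three phases (map, shuffle, reduce); it does not use Hadoop, and the function names and sample records are illustrative.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in the record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = run_job(["error warn error", "warn info"])
```

In a real cluster the framework runs many map and reduce calls in parallel on different nodes and performs the shuffle over the network; the developer still writes only the two functions.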
The target data is cleaned through the cooperation of the data cleaning unit and the analysis and calculation unit.
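The text does not specify the cleaning rules, so the sketch below assumes three common ones: trimming whitespace, dropping records with missing required fields, and removing duplicates by identifier. The field names are hypothetical, and plain Python is used in place of Spark so the example runs on its own.

```python
REQUIRED_FIELDS = ("id", "value")  # hypothetical schema


def clean_records(records):
    """Trim strings, drop incomplete records, and keep the first record per id."""
    seen_ids = set()
    cleaned = []
    for record in records:
        normalized = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        # Drop records that lack a required field or carry an empty value.
        if any(normalized.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        if normalized["id"] in seen_ids:
            continue
        seen_ids.add(normalized["id"])
        cleaned.append(normalized)
    return cleaned


raw = [
    {"id": 1, "value": " a "},
    {"id": 1, "value": "a"},   # duplicate id
    {"id": 2, "value": "  "},  # empty after trimming
    {"id": 3, "value": None},  # missing value
    {"id": 4, "value": "b"},
]
cleaned = clean_records(raw)
```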
S330, storing the target data subjected to data cleaning into a data warehouse;
Specifically, the target data that has been cleaned is stored in a data warehouse built on Hive. Hive is a data warehouse analysis system built on Hadoop. It is scalable (machines can be added dynamically to the Hadoop cluster), extensible, fault tolerant, and loosely coupled to input formats. Hive provides a rich SQL query interface for analyzing data stored in the Hadoop distributed file system: it can map structured data files to database tables and offers full SQL query functionality. SQL statements are converted into MapReduce jobs for execution, and the user queries and analyzes the required content through SQL, referred to as Hive SQL. Users unfamiliar with MapReduce can thus conveniently query, summarize and analyze data using SQL, while MapReduce developers can plug in their own mappers and reducers to let Hive perform more complex data analysis. Hive SQL differs slightly from the SQL of a relational database, but it supports most statements, such as DDL and DML, as well as common aggregation functions, join queries and conditional queries. Hive also provides a set of tools for data extraction, transformation and loading, used to store, query and analyze large-scale data sets held in Hadoop. It supports UDFs (User-Defined Functions), UDAFs (User-Defined Aggregate Functions) and UDTFs (User-Defined Table-Generating Functions), and also allows custom map and reduce functions, which gives data operations good flexibility and extensibility. Hive is suited to batch jobs over large amounts of immutable data.
S340, storing the target data in the data warehouse into a database;
Specifically, the embodiment of the application stores the target data in the data warehouse into the database through the data transmission unit. The data transmission unit uses Sqoop, which transfers data between Hive and a conventional database (such as MySQL). Sqoop can import data from a relational database into Hadoop's HDFS and can also export data from HDFS into a relational database. Sqoop also provides connectors for some NoSQL databases. Like other ETL tools, Sqoop uses a metadata model to determine data types and ensures type-safe data handling when data is transferred from a data source to Hadoop. Sqoop is designed specifically for bulk transfer of big data: it can split the data set and create map tasks to process each split.
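Splitting a data set so that each part can be handled by its own map task can be sketched as a division of a numeric key range into contiguous chunks. This is a simplified illustration, not Sqoop's implementation; the function name and the assumption of an integer key are hypothetical.

```python
def split_range(low: int, high: int, num_splits: int):
    """Split the inclusive key range [low, high] into at most num_splits contiguous chunks."""
    total = high - low + 1
    if total <= 0 or num_splits <= 0:
        return []
    num_splits = min(num_splits, total)
    base, extra = divmod(total, num_splits)
    chunks = []
    start = low
    for index in range(num_splits):
        # The first `extra` chunks take one additional key so sizes differ by at most one.
        size = base + (1 if index < extra else 0)
        chunks.append((start, start + size - 1))
        start += size
    return chunks


chunks = split_range(1, 10, 4)
print(chunks)  # prints: [(1, 3), (4, 6), (7, 8), (9, 10)]
```

Each chunk could then be transferred by a separate task in parallel, which is what makes bulk transfer of large tables practical.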
In this step, the target data in the Hive data warehouse is stored into the database through Sqoop. The database in this embodiment of the application comprises a high-speed data subunit and a basic data subunit, wherein the high-speed data subunit uses a Redis cluster and the basic data subunit uses a MySQL database. A Redis cluster has the following advantages:
1) Fast response
Redis responds very quickly and can perform approximately 110,000 write operations or 81,000 read operations per second, which is far faster than a conventional database.
2) Support for 6 data types
The data types supported by Redis are strings, hashes, lists, sets, sorted sets, and HyperLogLog (cardinality estimation). For example, a string can store the basic data types of Java, a hash can store objects, a list can store List objects, and so on. The type of the stored data can therefore be chosen easily according to the needs of the application, which is convenient for development.
Although Redis has only 6 data types, this brings two major benefits: on the one hand, the need to store a variety of data structures is met; on the other hand, fewer data types mean fewer rules and less judgment and logic, so reading and writing are faster.
3) Atomic operations
All Redis operations are atomic, which ensures that when two clients access the Redis server at the same time, each obtains the updated (most recent) value. Under high concurrency, Redis transactions can be considered for handling services that require locking.
4) Multi-purpose tool
Redis can be used in many scenarios that involve transient data, such as caching, message queues (Redis supports the publish/subscribe messaging mode), Web application sessions, website page hit counts, etc.
Based on the above advantages, part of the target data is stored in the high-speed data subunit, and the rest of the target data is stored in the basic data subunit. The basic data subunit adopts MySQL database technology and provides basic storage for the target data. When the data transmission unit transmits a large amount of data to the database, the two subunits share the workload, each storing different target data.
Through steps S300 to S340, the embodiment of the present application provides a plurality of big data flows executable in the big data command processing module. This completes the description of step S230; step S240 is described next.
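The division of target data between the two subunits can be sketched as below. To keep the example self-contained, an in-memory dictionary stands in for the Redis cluster and an in-memory SQLite database stands in for MySQL; neither is the technology named in the embodiment. The embodiment does not state how the data is divided, so the routing rule used here (a "hot" flag on each record) is purely an assumption, as are the class name TieredStore and the table name target_data.

```python
import json
import sqlite3


class TieredStore:
    """Routes each record to a fast tier or a basic tier (hypothetical sketch)."""

    def __init__(self):
        self.fast = {}  # stand-in for the high-speed data subunit (Redis cluster)
        self.base = sqlite3.connect(":memory:")  # stand-in for the basic data subunit (MySQL)
        self.base.execute(
            "CREATE TABLE target_data (id TEXT PRIMARY KEY, payload TEXT)"
        )

    def save(self, record):
        """Assumed rule: records flagged as hot go to the fast tier, the rest to the basic tier."""
        payload = json.dumps(record)
        if record.get("hot"):
            self.fast[record["id"]] = payload
        else:
            self.base.execute(
                "INSERT OR REPLACE INTO target_data (id, payload) VALUES (?, ?)",
                (record["id"], payload),
            )
            self.base.commit()

    def load(self, record_id):
        """Look in the fast tier first, then fall back to the basic tier."""
        cached = self.fast.get(record_id)
        if cached is not None:
            return json.loads(cached)
        row = self.base.execute(
            "SELECT payload FROM target_data WHERE id = ?", (record_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


store = TieredStore()
store.save({"id": "a", "value": 1, "hot": True})
store.save({"id": "b", "value": 2})
```

Reading through load hides the split from the caller, so the front end does not need to know which subunit holds a given record.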
S240, displaying the operation result on a front-end page;
Specifically, as described above, the big data command processing module includes a plurality of big data processing flows, and different flows can be started according to different operation commands. For example, when the URL of a data source is entered in the visualization management module, an operation request for acquiring data is generated. The operation request is first stored in the message middleware and then read by the command interaction module, which generates an operation command for acquiring data. The operation command instructs the data acquisition unit to locate the data source according to its URL and perform data acquisition to obtain the target data. After the corresponding data acquisition flow has been started according to the operation command, the operation result is displayed on the front-end page, for example: "data collection for the given data source has been completed". Similarly, each big data flow shown in fig. 3 may be started by an operation command generated in real time by the command interaction module, and the operation result of each flow is displayed on the front-end page, so that the user can check the real-time progress of the big data flows in time and better perform operation and maintenance management on the big data. After all the big data flows are completed, the data can be displayed on the front-end page in the form of charts and the like, so that the user can understand the relevant big data information more intuitively and clearly.
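The path from a front-end action to a displayed result can be sketched in a few lines. This is a simplified illustration only: a standard-library queue stands in for the message middleware, no real data collection is performed, and the function names, dictionary fields, and example URL are hypothetical.

```python
import queue


def submit_request(middleware, source_url):
    """Visualization management module: turn a front-end action into an
    operation request and store it in the message middleware."""
    middleware.put({"action": "collect", "source_url": source_url})


def poll_commands(middleware):
    """Command interaction module: read pending operation requests and
    generate one operation command per request."""
    commands = []
    while True:
        try:
            request = middleware.get_nowait()
        except queue.Empty:
            return commands
        commands.append({"unit": "data_acquisition", "request": request})


def execute(command):
    """Big data command processing module: start the flow and return the
    operation result (a real system would collect data here)."""
    url = command["request"]["source_url"]
    return f"data collection for {url} has been completed"


middleware = queue.Queue()  # stand-in for the message middleware
submit_request(middleware, "http://example.com/source")
results = [execute(command) for command in poll_commands(middleware)]
```

The strings in results correspond to the operation results that the visualization management module would show on the front-end page.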
It can be understood that the data displayed in the form of charts and the like in the visualization management module can be exposed through an API and provided to the user's website for integration. For example, after the weather data for the next 15 days has been made into a chart, the chart can be quickly placed on a weather website for display, so that the user's website can visualize big data information quickly and conveniently.
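As a rough illustration of exposing chart data through an API, the following sketch serializes a chart into a JSON payload that a user website could fetch and render. The payload layout, the function name chart_to_api_payload, and the temperature values are assumptions made for this example; the embodiment does not specify an API format.

```python
import json


def chart_to_api_payload(title, labels, values):
    """Serialize chart data into a JSON string (hypothetical payload layout)."""
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    series = [{"label": label, "value": value} for label, value in zip(labels, values)]
    return json.dumps({"title": title, "series": series}, ensure_ascii=False)


# Illustrative 15-day forecast; the temperatures are made-up sample values.
days = [f"day {n}" for n in range(1, 16)]
temperatures = [20 + (n % 5) for n in range(1, 16)]
payload = chart_to_api_payload("15-day forecast", days, temperatures)
```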
Through steps S200 to S240, on the basis of the big data operation and maintenance management system provided in the embodiment of the present application, the embodiment of the present application provides a big data operation and maintenance management method, in which the visualization management module generates an operation request in response to a visualization operation on a front-end page; the operation request is stored into message middleware; the command interaction module acquires the operation request from the message middleware and generates an operation command according to the operation request; the big data command processing module starts a plurality of processes in the big data processing cluster according to the operation command and determines an operation result; and the visualization management module displays the operation result on the front-end page. In this way, a corresponding operation command is generated from the user's visualization operation, the operation command starts different processes in the big data processing cluster, and the operation result is fed back to the front-end page for the user's reference, so that the user can obtain a visualized big data result through simple and intuitive operations. In addition, the application provides an error reporting process for the command interaction module, which helps developers troubleshoot and maintain the big data flows. Furthermore, each module of the big data operation and maintenance management system provided by the application is highly portable and broadly applicable.
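The error reporting process mentioned above (searching the task text, intercepting the information associated with an error, and sending it to a developer mailbox) can be sketched as follows. This is a sketch under stated assumptions: a case-insensitive keyword match stands in for the fuzzy search, the keyword list and the amount of surrounding context are arbitrary choices, and the mail is only constructed, not sent. The function names and the mailbox address are hypothetical.

```python
import re
from email.message import EmailMessage

# Assumed error keywords; the embodiment does not list the search terms.
ERROR_PATTERN = re.compile(r"error|exception|failed", re.IGNORECASE)


def extract_error_context(task_text, context_lines=1):
    """Return one snippet per matching line, including neighbouring lines,
    so that the snippet contains the error information itself."""
    lines = task_text.splitlines()
    snippets = []
    for index, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            start = max(0, index - context_lines)
            end = min(len(lines), index + context_lines + 1)
            snippets.append("\n".join(lines[start:end]))
    return snippets


def build_error_mail(snippets, developer_mailbox):
    """Build (but do not send) a mail carrying the associated information."""
    message = EmailMessage()
    message["To"] = developer_mailbox
    message["Subject"] = f"Timed task reported {len(snippets)} error(s)"
    message.set_content("\n\n---\n\n".join(snippets))
    return message


task_text = "start\nstep 1 ok\nERROR: hive query failed\ncleanup\ndone"
snippets = extract_error_context(task_text)
mail = build_error_mail(snippets, "dev@example.com")
```

Actually delivering the message would require an SMTP server, for example through the standard smtplib module; that step is omitted here because no server details are given.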
Referring to fig. 4, fig. 4 is a schematic diagram of an apparatus 400 provided in an embodiment of the present application, where the apparatus 400 includes at least one processor 410 and at least one memory 420 for storing at least one program; in fig. 4, one processor and one memory are taken as an example.
The processor and the memory may be connected by a bus or by other means; connection by a bus is taken as an example in fig. 4.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Another embodiment of the present application also provides an apparatus that may be used to perform the big data operation and maintenance management method of any of the above embodiments, e.g., to perform the method steps of fig. 2 described above.
The above-described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may also be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
The embodiment of the application also discloses a computer storage medium in which a processor-executable program is stored, and the processor-executable program, when executed by the processor, is used to implement the big data operation and maintenance management method.
One of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embody computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media, as is well known to those skilled in the art.
While the preferred embodiments of the present invention have been described, the present invention is not limited to the above embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and such equivalent modifications or substitutions are included in the scope of the present invention defined by the claims.

Claims (9)

1. A big data operation and maintenance management method is characterized by comprising the following steps:
responding to the visualization operation of the front-end page, and generating an operation request;
storing the operation request into message middleware;
acquiring the operation request from the message middleware, and generating an operation command according to the operation request;
starting a plurality of processes in the big data processing cluster according to the operation command, and determining an operation result;
and displaying the operation result on the front-end page.
2. The big data operation and maintenance management method according to claim 1, wherein the obtaining the operation request from the message middleware and generating an operation command according to the operation request comprises:
and acquiring the operation request from the message middleware at regular time according to a preset timing task, and generating an operation command according to the operation request.
3. The big data operation and maintenance management method according to claim 2, further comprising:
determining a task text according to the timing task;
performing fuzzy search on the task text;
when error information occurs in the fuzzy search process, intercepting associated information in the task text according to the error information, wherein the associated information comprises the error information;
and sending the associated information to a specified developer mailbox.
4. The big data operation and maintenance management method according to claim 1, wherein the process in the big data processing cluster comprises:
monitoring and collecting the data source in real time to collect target data;
storing the target data into a distributed storage cluster;
acquiring target data from the distributed storage cluster, and performing data cleaning on the target data;
storing the target data subjected to data cleaning into a data warehouse;
and storing the target data in the data warehouse into a database.
5. The big data operation and maintenance management method according to claim 4, further comprising:
when one main node in the distributed storage cluster crashes, a complete standby node is selected as a new main node through an election mechanism.
6. The big data operation and maintenance management method according to claim 4, wherein the database comprises a high-speed data subunit and a basic data subunit, and the storing the target data in the data warehouse into the database comprises:
storing a portion of said target data in said high-speed data subunit and storing the remainder of said target data in said basic data subunit.
7. A big data operation and maintenance management system, which is characterized in that the system comprises:
the visualization management module is used for responding to the visualization operation of the front-end page and generating an operation request, and for displaying the operation result on the front-end page;
the message middleware is used for storing the operation request;
the command interaction module is used for acquiring the operation request from the message middleware and generating an operation command according to the operation request;
and the big data command processing module is used for starting a plurality of processes in the big data processing cluster according to the operation command and determining the operation result.
8. An apparatus, comprising:
at least one processor;
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor is enabled to implement the big data operation and maintenance management method as claimed in any one of claims 1 to 6.
9. A computer storage medium having stored therein a processor-executable program, wherein the processor-executable program, when executed by the processor, is configured to implement the big data operation and maintenance management method according to any one of claims 1 to 6.
CN202111401635.9A 2021-11-24 2021-11-24 Big data operation and maintenance management method, system, device and storage medium Pending CN114969139A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111401635.9A CN114969139A (en) 2021-11-24 2021-11-24 Big data operation and maintenance management method, system, device and storage medium

Publications (1)

Publication Number Publication Date
CN114969139A true CN114969139A (en) 2022-08-30

Family

ID=82974605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111401635.9A Pending CN114969139A (en) 2021-11-24 2021-11-24 Big data operation and maintenance management method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN114969139A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination