CN109635022B - Visual ElasticSearch data acquisition method and device

Publication number
CN109635022B
Authority
CN
China
Prior art keywords
plug
configuration file
scheduling
task
data acquisition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811290888.1A
Other languages
Chinese (zh)
Other versions
CN109635022A (en)
Inventor
杨耀
王纯斌
钟武
李森林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Sefon Software Co Ltd
Original Assignee
Chengdu Sefon Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-10-31
Filing date
2018-10-31
Publication date
2021-04-13
Application filed by Chengdu Sefon Software Co Ltd
Priority to CN201811290888.1A
Publication of CN109635022A
Application granted
Publication of CN109635022B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/34 Graphical or visual programming

Abstract

The invention discloses a visual ElasticSearch data acquisition method and device. The method comprises the following steps: creating a visual component comprising an input plug-in, an output plug-in and a scheduling plug-in; creating a task by associating the input plug-in, the output plug-in and the scheduling plug-in; configuring the parameters of the input plug-in, the output plug-in and the scheduling plug-in respectively to obtain an input plug-in configuration file, an output plug-in configuration file and a scheduling plug-in configuration file; configuring the running node and the task strategy of the task; loading the task strategy and acquiring node information of a target operation node; and sending the task to the target operation node according to that node information, so that the target operation node acquires and parses the three configuration files and executes data acquisition. The invention simplifies the configuration process, supports concurrent multi-task, multi-node acquisition, improves the efficiency of data acquisition and reduces the cost of use.

Description

Visual ElasticSearch data acquisition method and device
Technical Field
The invention relates to the technical field of data acquisition, and in particular to a visual ElasticSearch data acquisition method and device.
Background
ElasticSearch is a search server based on Lucene. It provides a distributed, multi-user full-text search engine with a RESTful web interface, is developed in Java, and is released as open source under the Apache license terms. It is a popular enterprise-level search engine designed for cloud computing that offers real-time search and is stable, reliable, fast, and easy to install and use.
Efficient searching with ElasticSearch has a precondition: the data must first be collected into ElasticSearch. At present, data acquisition is usually done with a third-party plug-in that requires a command line or complex configuration, so the learning cost and barrier to use are high. Acquisition can also only be executed as a single task, so it is slow and inefficient.
Disclosure of Invention
To solve these problems, the invention provides a visual ElasticSearch data acquisition method and device in which the input and output of structured data are defined and a scheduling task is created through graphical operations. This realizes ElasticSearch-based data acquisition, simplifies the acquisition process and improves the acquisition rate.
To achieve this purpose, the invention adopts the following technical scheme:
Specifically, a visual ElasticSearch data acquisition method is applied to a user terminal in communication connection with a node server, and the method comprises the following steps:
creating a visualization component comprising an input plug-in, an output plug-in and a scheduling plug-in;
creating a task by associating the input plug-in, the output plug-in and the scheduling plug-in;
respectively carrying out parameter configuration on the input plug-in, the output plug-in and the scheduling plug-in to obtain an input plug-in configuration file, an output plug-in configuration file and a scheduling plug-in configuration file;
configuring the running node and the task strategy of the task;
loading the task strategy and acquiring node information of a target operation node;
and sending the task to the target operation node according to the node information of the target operation node, so that the target operation node acquires and parses the input plug-in configuration file, the output plug-in configuration file and the scheduling plug-in configuration file, and executes data acquisition.
Further, the input plug-in configuration file comprises data source information and a query script, wherein the data source information comprises the IP and port information of a data source database.
Further, the output plug-in configuration file comprises data target information, the data target is an ElasticSearch server, and the data target information comprises an IP, a port, an index name and a type name of the ElasticSearch server.
Further, the configuration file of the scheduling plug-in includes a scheduling type, a scheduling time and associated input and output.
Further, configuring the running node of the task includes configuring a name, an IP, and port information of the running node server.
Further, configuring the task strategy of the task includes configuring an execution mode, a target running node server, a task log level, and a scheduling task, where the scheduling task associates the task with the target running node server.
Further, the input plug-in configuration file and the output plug-in configuration file are both saved as ktr files, and the scheduling plug-in configuration file is saved as a kjb file.
Further, the target operation node executes data acquisition as follows: the target operation node parses the scheduling plug-in configuration file to obtain the scheduling type, scheduling time and associated input and output information of the task; obtains the input plug-in configuration file and the output plug-in configuration file according to the associated input and output information; parses those two files to obtain the data source information and the data target information; and executes data acquisition from the data source into the data target according to the scheduling type and the scheduling time.
Specifically, a visual ElasticSearch data acquisition device comprises a designer and a manager. The designer is used for creating tasks through the visual component; the manager is used for configuring operation nodes, allocating target operation nodes to the tasks, and sending the tasks to the target operation nodes for execution.
Compared with the prior art, the invention has the beneficial effects that:
The invention defines the input and output of structured data and creates scheduling tasks through graphical operations. Acquisition of structured data for different business requirements can be started with only simple configuration, which improves the usability of data acquisition. Multi-task, multi-node concurrent acquisition is supported, which improves the efficiency of data acquisition.
Drawings
FIG. 1 is a flow chart of a visualized ElasticSearch data acquisition method of the present invention;
FIG. 2 is a flow chart of a data collection process according to embodiment 1 of the present invention;
FIG. 3 is a block diagram of a visual ElasticSearch data acquisition device of the present invention.
Description of reference numerals: 101-designer, 102-manager.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Example 1
As shown in fig. 1, a visual ElasticSearch data acquisition method is applied to a user terminal in communication connection with a node server, and the method includes the following steps.
A visual component is created, comprising an input plug-in, an output plug-in and a scheduling plug-in. Data acquisition is fully visual: the user can start acquisition with only simple configuration, without abstract, hard-to-remember commands or complex, tedious configuration operations. This improves the user experience and lowers the learning cost and barrier to use. The acquisition process is intelligently controlled by a scheduling center without manual intervention, and acquisition details can be monitored in real time.
A task is created by associating an input plug-in, an output plug-in and a scheduling plug-in. Each plug-in is independent, and a complete task chain consists of an input plug-in, an output plug-in and a scheduling plug-in.
An input plug-in is configured. The configuration information comprises an input plug-in name, data source information and a query script, and the configured input plug-in configuration file is saved as a ktr file. The data source information includes the IP, port and similar information of the data source database, and the query script is a script for querying data from that database. For example, if Oracle is the data source database and data is to be read from Oracle, the data source information is filled in with the IP, port and similar information of the Oracle instance, and the query script is an SQL statement that queries data from Oracle.
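The fields of the input plug-in configuration can be sketched as a small data structure. This is only an illustration: the class name, field names, sample values and the JSON serialization are assumptions made here, and the actual ktr file format is not described in this document.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InputPluginConfig:
    """Hypothetical in-memory form of an input plug-in configuration."""
    name: str          # input plug-in name
    host: str          # IP of the data source database
    port: int          # port of the data source database
    query_script: str  # script (e.g. SQL) that reads data from the source

    def validate(self) -> None:
        # Reject obviously incomplete configurations before saving them.
        if not self.name:
            raise ValueError("input plug-in name is required")
        if not 0 < self.port < 65536:
            raise ValueError(f"port out of range: {self.port}")
        if not self.query_script.strip():
            raise ValueError("query script must not be empty")


def to_json(cfg: InputPluginConfig) -> str:
    """Serialize a validated configuration (JSON is an assumption, not ktr)."""
    cfg.validate()
    return json.dumps(asdict(cfg), sort_keys=True)


# Illustrative values only; 1521 is the conventional Oracle listener port.
example_input = InputPluginConfig(
    name="orders_input",
    host="192.0.2.10",
    port=1521,
    query_script="SELECT id, amount FROM orders",
)
```

Validating before serialization keeps incomplete configurations from reaching a running node, where they would be harder to diagnose.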
An output plug-in is configured. The configuration information comprises an output plug-in name and data target information, where the data target is the target database into which the data is to be extracted, and the configured output plug-in configuration file is saved as a ktr file. In this embodiment, the data target is an ElasticSearch server, and the data target information includes the IP, port, index name and type name of the ElasticSearch server.
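As an illustration of how the data target information could be used, the sketch below builds a request body for the ElasticSearch `_bulk` endpoint from an output configuration. The document does not say which write mechanism the invention uses, so the choice of `_bulk`, the class and the function names are assumptions. The `_type` field applies only to older ElasticSearch versions that still support mapping types.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Mapping


@dataclass(frozen=True)
class OutputPluginConfig:
    """Hypothetical in-memory form of an output plug-in configuration."""
    name: str
    host: str        # IP of the ElasticSearch server
    port: int        # port of the ElasticSearch server
    index_name: str
    type_name: str


def bulk_url(cfg: OutputPluginConfig) -> str:
    """URL of the bulk endpoint on the configured server."""
    return f"http://{cfg.host}:{cfg.port}/_bulk"


def build_bulk_body(cfg: OutputPluginConfig, rows: Iterable[Mapping]) -> str:
    """Return newline-delimited JSON: one action line, then one document line."""
    lines = []
    for row in rows:
        action = {"index": {"_index": cfg.index_name, "_type": cfg.type_name}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(dict(row)))
    # The bulk format requires the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""
```

The body would be sent with the `Content-Type: application/x-ndjson` header; the HTTP call itself is left out so the sketch stays free of network access.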
A scheduling plug-in is configured. The configuration information comprises whether acquisition is repeated, the scheduling type, the scheduling time, and the associated input and output, and the configured scheduling plug-in configuration file is saved as a kjb file. The scheduling types comprise immediate execution, execution after a specified delay, execution at a certain time of day, at a certain time of the week, and at a certain time of the month. The scheduling time is the specific time that must be specified once the scheduling type is configured. The associated input and output are the input plug-in and output plug-in that the current scheduling plug-in is to call.
The operation node is configured, including the name, IP and port information of the operation node server.
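The five scheduling types can be illustrated by a function that computes the next run time. This is a minimal sketch under assumptions: the enum names, the parameter names and the restriction of the monthly day to 1..28 are choices made here, not details from the document.

```python
from datetime import datetime, time, timedelta
from enum import Enum


class ScheduleType(Enum):
    IMMEDIATE = "immediate"
    AFTER_DELAY = "after_delay"
    DAILY = "daily"
    WEEKLY = "weekly"
    MONTHLY = "monthly"


def next_run(stype: ScheduleType, now: datetime, *,
             delay: timedelta = timedelta(0), at: time = time(0, 0),
             weekday: int = 0, day: int = 1) -> datetime:
    """Return the next execution time strictly after `now` (or `now` itself
    for immediate execution). `weekday` uses Monday=0 as in datetime."""
    if stype is ScheduleType.IMMEDIATE:
        return now
    if stype is ScheduleType.AFTER_DELAY:
        return now + delay
    candidate = datetime.combine(now.date(), at)
    if stype is ScheduleType.DAILY:
        return candidate if candidate > now else candidate + timedelta(days=1)
    if stype is ScheduleType.WEEKLY:
        candidate += timedelta(days=(weekday - now.weekday()) % 7)
        return candidate if candidate > now else candidate + timedelta(days=7)
    if stype is ScheduleType.MONTHLY:
        # Days 29..31 do not exist in every month, so this sketch rejects them.
        if not 1 <= day <= 28:
            raise ValueError("day must be between 1 and 28 in this sketch")
        candidate = candidate.replace(day=day)
        if candidate > now:
            return candidate
        if now.month == 12:
            return candidate.replace(year=now.year + 1, month=1)
        return candidate.replace(month=now.month + 1)
    raise ValueError(f"unknown schedule type: {stype}")
```

When repeated acquisition is enabled, a scheduler could call `next_run` again after each execution, passing the completion time as `now`.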
A task strategy is configured, that is, target operation nodes are allocated to the task. The configuration information comprises an execution mode, a target operation node server, a task log level and a scheduling task, where the scheduling task associates the task with the allocated target operation node server. Node clustering is also adopted: tasks can be executed on multiple nodes, multiple tasks can run simultaneously, and the nodes support horizontal scaling. This can effectively improve data acquisition efficiency, and when a performance bottleneck occurs it can be addressed by adding nodes.
The task strategy is loaded and the node information of the target operation node is acquired. The task is then sent to the target operation node according to that node information, so that the target operation node acquires and parses the input plug-in configuration file, the output plug-in configuration file and the scheduling plug-in configuration file, and executes data acquisition.
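Multi-task, multi-node dispatch can be sketched with a thread pool that sends each task to its allocated node. The transport is not specified in this document, so the `send` callable, the class names and the default values for execution mode and log level are placeholders assumed for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Node:
    """Operation node server: name, IP and port."""
    name: str
    host: str
    port: int


@dataclass(frozen=True)
class TaskPolicy:
    """Hypothetical task strategy record; default values are placeholders."""
    task_name: str
    target_node: str               # name of the allocated target node
    execution_mode: str = "remote"
    log_level: str = "basic"


def dispatch_all(policies: List[TaskPolicy], nodes: Dict[str, Node],
                 send: Callable[[TaskPolicy, Node], str]) -> Dict[str, str]:
    """Send every task to its target node concurrently and collect results."""
    def dispatch(policy: TaskPolicy) -> str:
        node = nodes[policy.target_node]  # KeyError if the node is unknown
        return send(policy, node)

    with ThreadPoolExecutor(max_workers=max(1, len(policies))) as pool:
        results = list(pool.map(dispatch, policies))
    return {p.task_name: r for p, r in zip(policies, results)}
```

Because each task carries its own target node, adding a node only requires a new entry in `nodes` and policies that point at it, which mirrors the horizontal scaling described above.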
As shown in fig. 2, the target operation node executes data acquisition as follows: it parses the scheduling plug-in configuration file to obtain the scheduling type, scheduling time and associated input and output information of the task; obtains the input plug-in configuration file and the output plug-in configuration file according to the associated input and output information; parses those two files to obtain the data source information and data target information; and then acquires data from the data source into the data target according to the scheduling type and scheduling time.
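The resolution order on the target operation node (scheduling file first, then the associated input and output files, then the transfer) can be sketched as follows. Everything concrete here is assumed for illustration: the files are modelled as JSON strings in a dictionary rather than real ktr and kjb files, the `input` and `output` keys are invented, and waiting for the scheduled time is omitted.

```python
import json
from typing import Callable, Dict, Iterable, List


def load(store: Dict[str, str], filename: str) -> dict:
    """Parse one configuration file from a hypothetical file store."""
    return json.loads(store[filename])


def run_task(store: Dict[str, str], job_file: str,
             read_rows: Callable[[dict], Iterable[dict]],
             write_rows: Callable[[dict, List[dict]], None]) -> int:
    """Resolve the task chain from the scheduling file and move the data.

    Returns the number of rows transferred."""
    job = load(store, job_file)           # scheduling plug-in configuration
    source = load(store, job["input"])    # associated input configuration
    target = load(store, job["output"])   # associated output configuration
    rows = list(read_rows(source))        # query the data source
    write_rows(target, rows)              # write into the data target
    return len(rows)
```

Passing `read_rows` and `write_rows` as callables keeps the sketch independent of any particular database driver or ElasticSearch client.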
Example 2
As shown in fig. 3, a visual ElasticSearch data acquisition device includes a designer and a manager. The designer is used for creating tasks through the visual component; a complete task chain is composed of three plug-ins: an input plug-in, an output plug-in and a scheduling plug-in. The visual operation proceeds as follows: select a plug-in on the designer page, hold down the left mouse button and drag the plug-in into the editing area; double-click the plug-in to edit its detailed configuration information; then point the mouse at one plug-in, hold down the left button and, without releasing it, drag to another plug-in to connect and associate the two. The manager is used for allocating target operation nodes to the tasks and sending the created tasks to the target operation nodes, which perform data acquisition according to the configuration information of the input, output and scheduling plug-ins.
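The task chain that the designer assembles by drag-and-drop can be modelled as plug-ins plus the links drawn between them. The sketch below is an assumed model for illustration only; in particular, the completeness rule (exactly one plug-in of each kind, each taking part in at least one link) is an interpretation of "complete task chain", not a rule stated in the document.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set

REQUIRED_KINDS = {"input", "output", "scheduling"}


@dataclass
class TaskChain:
    """Hypothetical model of the designer's editing area."""
    plugins: Dict[str, str] = field(default_factory=dict)      # name -> kind
    links: Set[FrozenSet[str]] = field(default_factory=set)    # undirected links

    def add_plugin(self, name: str, kind: str) -> None:
        # Corresponds to dragging a plug-in into the editing area.
        if kind not in REQUIRED_KINDS:
            raise ValueError(f"unknown plug-in kind: {kind}")
        self.plugins[name] = kind

    def connect(self, a: str, b: str) -> None:
        # Corresponds to dragging from one plug-in to another.
        if a not in self.plugins or b not in self.plugins:
            raise KeyError("both plug-ins must be in the editing area")
        if a == b:
            raise ValueError("a plug-in cannot be linked to itself")
        self.links.add(frozenset((a, b)))

    def is_complete(self) -> bool:
        # Exactly one plug-in of each kind, and none left unlinked.
        if sorted(self.plugins.values()) != sorted(REQUIRED_KINDS):
            return False
        linked = set().union(*self.links) if self.links else set()
        return linked == set(self.plugins)
```

A designer could call `is_complete()` before handing the task to the manager, so that half-assembled chains are never dispatched.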
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (9)

1. A visual ElasticSearch data acquisition method is applied to a user terminal in communication connection with a node server, and is characterized by comprising the following steps:
creating a visualization component comprising an input plug-in, an output plug-in and a scheduling plug-in;
creating a task by associating the input plug-in, the output plug-in and the scheduling plug-in;
respectively carrying out parameter configuration on the input plug-in, the output plug-in and the scheduling plug-in to obtain an input plug-in configuration file, an output plug-in configuration file and a scheduling plug-in configuration file;
configuring the running node and the task strategy of the task;
loading the task strategy and acquiring node information of a target operation node;
and sending the task to the target operation node according to the node information of the target operation node so that the target operation node can acquire and analyze the input plug-in configuration file, the output plug-in configuration file and the scheduling plug-in configuration file, and executing data acquisition.
2. The visual ElasticSearch data acquisition method of claim 1, wherein the input plug-in configuration file comprises data source information and a query script, and the data source information comprises IP and port information of a data source database.
3. The visual ElasticSearch data acquisition method of claim 2, wherein the output plug-in configuration file comprises data target information, the data target is an ElasticSearch server, and the data target information comprises an IP, a port, an index name and a type name of the ElasticSearch server.
4. The visual ElasticSearch data acquisition method of claim 1, wherein the scheduling plug-in configuration file comprises a scheduling type, a scheduling time and associated input and output.
5. The visual ElasticSearch data acquisition method of claim 1, wherein configuring the running node of the task comprises configuring name, IP and port information of a running node server.
6. The visual ElasticSearch data acquisition method of claim 1, wherein the task policy for configuring the task comprises configuring an execution mode, a target running node server, a task log level and a scheduling task, wherein the scheduling task is to associate the task with the target running node server.
7. The visual ElasticSearch data acquisition method of claim 1, wherein the input plug-in configuration file and the output plug-in configuration file are both saved as ktr files, and the scheduling plug-in configuration file is saved as a kjb file.
8. The visual ElasticSearch data acquisition method according to claim 3, wherein the specific steps of the target operation node executing data acquisition are as follows: the target operation node parses the scheduling plug-in configuration file, acquires the scheduling type, scheduling time and associated input and output information of the task, acquires the input plug-in configuration file and the output plug-in configuration file according to the acquired associated input and output information, acquires the data source information and the data target information by parsing the input plug-in configuration file and the output plug-in configuration file, and executes data acquisition from the data source to the data target according to the scheduling type and the scheduling time.
9. A visual ElasticSearch data acquisition device, which is applied to the visual ElasticSearch data acquisition method according to any one of claims 1 to 8, wherein the device comprises a designer and a manager, the designer is used for creating tasks through a visual component, and the manager is used for configuring operation nodes, allocating target operation nodes to the tasks, and sending the tasks to the target operation nodes for execution.
CN201811290888.1A 2018-10-31 2018-10-31 Visual ElasticSearch data acquisition method and device Active CN109635022B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811290888.1A CN109635022B (en) 2018-10-31 2018-10-31 Visual ElasticSearch data acquisition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811290888.1A CN109635022B (en) 2018-10-31 2018-10-31 Visual ElasticSearch data acquisition method and device

Publications (2)

Publication Number Publication Date
CN109635022A (en) 2019-04-16
CN109635022B (en) 2021-04-13

Family

ID=66066926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811290888.1A Active CN109635022B (en) 2018-10-31 2018-10-31 Visual ElasticSearch data acquisition method and device

Country Status (1)

Country Link
CN (1) CN109635022B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113542796B (en) * 2020-04-22 2023-08-08 腾讯科技(深圳)有限公司 Video evaluation method, device, computer equipment and storage medium
CN111949389B (en) * 2020-08-11 2022-02-18 曙光信息产业(北京)有限公司 Slurm-based information acquisition method and device, server and computer-readable storage medium
CN114661685B (en) * 2022-03-25 2023-01-10 机科发展科技股份有限公司 Method and apparatus for generating log record component, log recording method, and medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150170114A1 (en) * 2013-12-18 2015-06-18 Apriva, Llc System and method for acquiring and integrating multi-source information for advanced analystics and visualization
CN104283967B (en) * 2014-10-23 2018-07-13 武汉华大优能信息有限公司 A kind of third party's data service system based on internet of things data acquisition
CN104486445B (en) * 2014-12-30 2017-03-22 北京天云融创软件技术有限公司 Distributed extendable resource monitoring system based on cloud platform
CN104847924B (en) * 2015-03-27 2017-04-12 浙江工业大学 High-speed rotating valve and flow parameter real-time detecting device for visualization observation
CN106354723B (en) * 2015-07-15 2019-06-04 北京中电普华信息技术有限公司 A kind of on-line data acquisition system
CN107104818B (en) * 2017-03-16 2020-07-14 柳州达迪通信技术股份有限公司 Implementation method of ODF port eagle eye view
CN107247721A (en) * 2017-04-24 2017-10-13 江苏曙光信息技术有限公司 Visualize collecting method
CN107391686A (en) * 2017-07-24 2017-11-24 威创软件南京有限公司 A kind of visual configuration data collecting system implementation method
CN108197237B (en) * 2017-12-29 2020-03-24 北京恒泰实达科技股份有限公司 Visual data acquisition and display system

Also Published As

Publication number Publication date
CN109635022A (en) 2019-04-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant