CN110532261B - Method and device for visually monitoring Hive data warehouse - Google Patents

Method and device for visually monitoring Hive data warehouse

Info

Publication number
CN110532261B
CN110532261B CN201910672433.4A CN201910672433A CN110532261B CN 110532261 B CN110532261 B CN 110532261B CN 201910672433 A CN201910672433 A CN 201910672433A CN 110532261 B CN110532261 B CN 110532261B
Authority
CN
China
Prior art keywords
information
data warehouse
buffer
hive
analyzing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910672433.4A
Other languages
Chinese (zh)
Other versions
CN110532261A (en)
Inventor
和思扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN201910672433.4A
Publication of CN110532261A
Application granted
Publication of CN110532261B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a method and a device for visually monitoring a Hive data warehouse. The method comprises the following steps: storing, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information; when a routine task is submitted, parsing the stored information with a Structured Query Language (SQL) parser; after parsing the information of each table, obtaining the relationships among the information in the tables, and merging the information and relationships of each table to obtain merged information for each dimension of each table; and reading the merged information for each dimension of each table for web page display. The embodiment makes it possible to untangle complicated dependencies among database tables and to optimize and adjust cluster tasks, so that managers can observe every dimension of the data warehouse, which makes monitoring more convenient and reduces management cost.

Description

Method and device for visually monitoring Hive data warehouse
Technical Field
The invention relates to Hive data warehouse technology, and in particular to a method and a device for visually monitoring a Hive data warehouse.
Background
Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. Hive is a data warehouse tool built on Hadoop: it maps structured data files to database tables, provides a simple Structured Query Language (SQL) query capability, and converts SQL statements into tasks for the MapReduce computing framework, which run on the resource manager YARN. SQL is a database query and programming language used to access data and to query, update, and manage relational database systems; .sql is also the file extension of database script files.
In the prior art, when an enterprise-level Hive data warehouse is monitored, the intricate dependencies among database tables cannot be untangled, and cluster tasks cannot be optimized and adjusted. As a result, managers cannot easily observe all dimensions of the data warehouse, management and maintenance become more complex, and the cost of sorting out and controlling business data tables is high.
Disclosure of Invention
To solve this technical problem, embodiments of the present invention provide a method and an apparatus for visually monitoring a Hive data warehouse, which can untangle the intricate dependencies among database tables and optimize and adjust cluster tasks, so that managers can observe every dimension of the data warehouse, monitoring becomes more convenient, and management cost is reduced.
In order to achieve the object of the present invention, in one aspect, an embodiment of the present invention provides a method for visually monitoring a Hive data warehouse, including:
storing, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information;
when a routine task is submitted, parsing the stored information with a Structured Query Language (SQL) parser;
after parsing the information of each table, obtaining the relationships among the information in the tables, and merging the information and relationships of each table to obtain merged information for each dimension of each table;
and reading the merged information for each dimension of each table for web page display.
Further, the method comprises:
in the Hive data warehouse, periodically refreshing the stored information of each partition of each table of the data warehouse as first-type information, and storing the first-type information through the buffer;
each time a Hive script is submitted, parsing the SQL statements in the first-type information with an SQL parser, resolving the data source table and the corresponding target table of each SQL segment, and storing the dependency information between the data source table and the target table in the buffer as second-type information;
and converting the SQL statements into MapReduce computing-framework tasks that run on the resource manager YARN, capturing specific information about the tasks as third-type information, and storing the third-type information in the buffer.
Further, the method comprises:
the buffer merges the stored first-type through third-type information to form merged information for each table.
Further, the method comprises:
the merged information of each table is the detailed data of that table, including:
partition size, output time, resource consumption, and upstream and downstream relationships.
Optionally, the method further comprises:
reading the final merged result of the buffer in order to provide specific settings at the front end.
Further, the method further comprises: the buffer performs a global scan of the Hive data warehouse or a scan of specified databases according to its configuration, caches the information of each table in the database in the form of a local file, and refreshes it periodically at a time interval specified by the user, wherein the refresh frequency can be set on a web interface.
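The local-file caching and timed refresh described in the paragraph above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the class name `TableInfoBuffer`, the JSON file format, the `scan` callback (standing in for the global or specified-database scan), and the injectable `clock` are all assumptions introduced here.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, Optional


class TableInfoBuffer:
    """Caches per-table information in a local JSON file and refreshes it on a timer."""

    def __init__(self, cache_path: Path, refresh_seconds: float,
                 scan: Callable[[], Dict[str, dict]],
                 clock: Callable[[], float] = time.time) -> None:
        self.cache_path = cache_path
        self.refresh_seconds = refresh_seconds  # assumed to come from the web interface
        self.scan = scan    # hypothetical scanner of the warehouse metadata
        self.clock = clock  # injectable so the refresh logic can be tested
        self._last_refresh: Optional[float] = None

    def is_stale(self) -> bool:
        """True if the cache was never written or the refresh interval has elapsed."""
        if self._last_refresh is None or not self.cache_path.exists():
            return True
        return self.clock() - self._last_refresh >= self.refresh_seconds

    def refresh(self) -> None:
        """Rescan the warehouse and overwrite the local cache file."""
        tables = self.scan()
        self.cache_path.write_text(json.dumps(tables, indent=2), encoding="utf-8")
        self._last_refresh = self.clock()

    def read(self) -> Dict[str, dict]:
        """Return cached table information, refreshing first if it is stale."""
        if self.is_stale():
            self.refresh()
        return json.loads(self.cache_path.read_text(encoding="utf-8"))
```

In this sketch the refresh is lazy (triggered on read); a background timer would match the "timed refreshing" wording more literally and is an equally plausible design.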
In another aspect, an embodiment of the present invention further provides a device for visually monitoring a Hive data warehouse, including:
a storage module, configured to store, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information;
an analysis module, configured to parse the stored information with a Structured Query Language (SQL) parser when a routine task is submitted;
a merging module, configured to obtain, after the information of each table has been parsed, the relationships among the information in the tables, and to merge the information and relationships of each table to obtain merged information for each dimension of each table;
and a display module, configured to read the merged information for each dimension of each table for web page display.
Further, the storage module is configured to:
in the Hive data warehouse, periodically refresh the stored information of each partition of each table of the data warehouse as first-type information, and store the first-type information through the buffer;
each time a Hive script is submitted, parse the SQL statements in the first-type information with an SQL parser, resolve the data source table and the corresponding target table of each SQL segment, and store the dependency information between the data source table and the target table in the buffer as second-type information;
and convert the SQL statements into MapReduce computing-framework tasks that run on the resource manager YARN, capture specific information about the tasks as third-type information, and store the third-type information in the buffer.
Further, the merging module is configured to:
merge, in the buffer, the stored first-type through third-type information to form merged information for each table.
Further, the apparatus is further configured to:
read the final merged result of the buffer in order to provide specific settings at the front end.
The embodiment of the invention stores, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information; when a routine task is submitted, parses the stored information with a Structured Query Language (SQL) parser; after parsing the information of each table, obtains the relationships among the information in the tables and merges the information and relationships of each table to obtain merged information for each dimension of each table; and reads the merged information for each dimension of each table for web page display. The embodiment makes it possible to untangle complicated dependencies among database tables and to optimize and adjust cluster tasks, so that managers can observe every dimension of the data warehouse, which makes monitoring more convenient and reduces management cost.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings provide a further understanding of the invention and are incorporated in and constitute a part of this specification; together with the embodiments, they serve to explain the principles of the invention and do not limit it.
FIG. 1 is a flow chart of a method for visually monitoring a Hive data warehouse according to an embodiment of the invention;
FIG. 2 is a schematic diagram illustrating an implementation of a method for visually monitoring a Hive data warehouse according to an embodiment of the present invention;
FIG. 3 is a structural diagram of an apparatus for visually monitoring a Hive data warehouse according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
The steps illustrated in the flow charts of the figures may be executed in a computer system, for example as a set of computer-executable instructions. Also, although a logical order is shown in the flow charts, in some cases the steps shown or described may be performed in an order different from the one given here.
FIG. 1 is a flowchart of a method for visually monitoring a Hive data warehouse according to an embodiment of the present invention. As shown in FIG. 1, the method comprises the following steps:
Step 101: storing, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information;
The embodiment of the invention provides a visual monitoring method for Hive data warehouses, applied mainly to the visual monitoring of enterprise-level Hive data warehouses. The monitoring information is accurate to the table level and includes each partition, the total size of the data files in each partition, the output time, the memory usage, the upstream and downstream lineage graph of the table, and so on. This important information is displayed through a web interface, which gives data warehouse managers an intuitive way to monitor the routine runs of each table and makes it easier to untangle the complicated business dependencies among tables.
Step 102: when a routine task is submitted, parsing the stored information with a Structured Query Language (SQL) parser;
Step 103: after parsing the information of each table, obtaining the relationships among the information in the tables, and merging the information and relationships of each table to obtain merged information for each dimension of each table;
Step 104: reading the merged information for each dimension of each table for web page display.
Further, the method comprises:
in the Hive data warehouse, periodically refreshing the stored information of each partition of each table of the data warehouse as first-type information, and storing the first-type information through the buffer;
each time a Hive script is submitted, parsing the SQL statements in the first-type information with an SQL parser, resolving the data source table and the corresponding target table of each SQL segment, and storing the dependency information between the data source table and the target table in the buffer as second-type information;
and converting the SQL statements into MapReduce computing-framework tasks that run on the resource manager YARN, capturing specific information about the tasks as third-type information, and storing the third-type information in the buffer.
Further, the method comprises:
the buffer merges the stored first-type through third-type information to form merged information for each table.
FIG. 2 is a schematic diagram illustrating an implementation of a method for visually monitoring a Hive data warehouse according to an embodiment of the present invention. As shown in FIG. 2, the technical solution of the embodiment is implemented as follows:
The embodiment of the invention comprises a buffer, an SQL parser, and a web platform.
Specifically, a buffer is introduced outside the Hive data warehouse, and it periodically refreshes the stored information of each partition of each table of the data warehouse.
An SQL parser is introduced; each time a Hive script is submitted, it parses the SQL statements, resolves the data source tables and the output target table of each SQL segment, and stores the dependency information in the buffer.
The parsed SQL task is submitted to YARN for MapReduce computation; information such as the task's start time, end time, and memory usage is captured and stored in the buffer.
Here, YARN is a resource manager responsible for managing and scheduling cluster resources; it can allocate all cluster resources such as CPU, memory, file system, and disk. MapReduce is a computing framework that runs on YARN.
The buffer merges the information obtained in the above steps to form detailed data for each table, including partition size, output time, resource consumption, upstream and downstream relationships, and so on.
The final merged result of the buffer is read for front-end web display, and related settings are provided at the front end.
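The merging of the three kinds of information into one record per table can be sketched as follows. The text does not specify a data layout, so the `TableRecord` fields, the `merge_info` function, and the shapes of its three inputs are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TableRecord:
    """Merged per-table view held by the buffer (field names are illustrative)."""
    partition_sizes: Dict[str, int] = field(default_factory=dict)  # first-type info
    upstream: List[str] = field(default_factory=list)              # second-type info
    downstream: List[str] = field(default_factory=list)            # second-type info
    runs: List[dict] = field(default_factory=list)                 # third-type info


def merge_info(partitions: Dict[str, Dict[str, int]],
               dependencies: List[Tuple[str, str]],
               task_runs: Dict[str, List[dict]]) -> Dict[str, TableRecord]:
    """Combine partition sizes, (source, target) edges, and task runs by table name."""
    merged: Dict[str, TableRecord] = {}

    def record(table: str) -> TableRecord:
        return merged.setdefault(table, TableRecord())

    for table, sizes in partitions.items():
        record(table).partition_sizes.update(sizes)
    for source, target in dependencies:
        # Each edge contributes an upstream entry on the target
        # and a downstream entry on the source; duplicates are skipped.
        if source not in record(target).upstream:
            record(target).upstream.append(source)
        if target not in record(source).downstream:
            record(source).downstream.append(target)
    for table, runs in task_runs.items():
        record(table).runs.extend(runs)
    return merged
```

Keying everything by table name keeps the front end's job simple: one lookup returns the partition sizes, lineage, and run history for a table.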
The method for memory allocation for a single Hive task is implemented by introducing a preprocessor and a MapReduce configurator after the task is submitted and before MapReduce is executed; a schematic diagram is shown in FIG. 2. The specific implementation process is as follows:
a1, a buffer is introduced outside the Hive data warehouse. According to its configuration, the buffer performs a global scan of the Hive data warehouse or a scan of specified databases, caches information such as the table names of all tables in the database, the partition names, the total size of the data files of each partition, and the creation time in the form of a local file, and refreshes it periodically at a time interval specified by the user; the refresh frequency can be set on a web interface;
a2, an SQL parser is introduced. Each time a Hive script is submitted, the parser scans the entire script and parses the SQL statements, resolves the data source tables and the generated destination table of each SQL segment from INSERT/FROM/JOIN/UNION statements and the like, and stores the dependency information in the buffer;
a3, the parsed SQL task is submitted to YARN for MapReduce computation; information such as the task's start time, end time, and peak memory usage is captured and stored in the buffer;
a4, the buffer merges the table and partition information obtained in step a1, the upstream and downstream dependency information of each table obtained in step a2, and the task execution information from step a3 to form, for each table, the partition data size, the upstream data sources and downstream data destinations, the time and memory consumed when the data is generated in each routine run, and so on;
a5, the final merged result of the buffer is read, and the front-end display is implemented with Java Web or similar technology. The displayed content includes the tables contained in each Hive database; the size of the partition data contained in each table; the direct upstream data source tables of each table and the downstream tables into which its data is imported; and the time and resource consumption of each routine run. The front end provides a settings interface in which the user can specify the refresh frequency of the buffer, the caching of Hive data warehouse information from the most recent XX days, the retention of cached information in the buffer for XX days, and so on.
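Step a2 above can be illustrated with a deliberately simplified sketch. The patent does not disclose how its SQL parser works; the regular expressions and the `extract_dependencies` function below are assumptions made for illustration, and they ignore comments, CTE names, multi-insert statements, INSERT OVERWRITE DIRECTORY, and expressions such as `extract(year from col)`, all of which a real SQL parser would have to handle.

```python
import re
from typing import List, Set, Tuple

# Target of INSERT INTO / INSERT OVERWRITE [TABLE] <name>
_TARGET = re.compile(
    r"\binsert\s+(?:into|overwrite)\s+(?:table\s+)?([\w.]+)", re.IGNORECASE)
# Source tables named directly after FROM or JOIN (subqueries in parentheses are skipped)
_SOURCE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_dependencies(script: str) -> List[Tuple[str, str]]:
    """Return sorted (source_table, target_table) pairs for each INSERT statement."""
    edges: Set[Tuple[str, str]] = set()
    for statement in script.split(";"):
        target = _TARGET.search(statement)
        if not target:
            continue  # statements that write no table add no dependency
        for source in _SOURCE.findall(statement):
            if source.lower() != target.group(1).lower():
                edges.add((source, target.group(1)))
    return sorted(edges)
```

Because each UNION branch carries its own FROM clause, the branches' source tables are picked up without treating UNION specially. The resulting pairs are the kind of dependency information that would be stored in the buffer.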
By visually monitoring the Hive data warehouse, the embodiment of the invention makes it easier for warehouse managers to untangle complex table dependencies; the memory consumption and output time of each table's routine runs are tracked more accurately, which makes it easier to optimize and adjust cluster tasks. In addition, the data sizes of all routine partitions of a given table over a period of time are displayed, so that the user can compare them side by side and locate anomalies more easily. For example, if the partition file generated by one routine run is far smaller than those at the adjacent time points before and after it, the problem can be traced back through that run. Managers can therefore observe every dimension of the data warehouse more easily, management and maintenance become much more convenient, and the cost of sorting out and controlling business data tables is reduced.
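The partition-size comparison in the example above is described as something the user does by eye; an automated version might look like the following sketch. The function name `find_undersized_partitions` and the `ratio` threshold of 0.5 are assumptions, not values from the text.

```python
from typing import List, Sequence, Tuple


def find_undersized_partitions(sizes: Sequence[Tuple[str, int]],
                               ratio: float = 0.5) -> List[str]:
    """Flag partitions smaller than `ratio` times the mean of their two neighbours.

    `sizes` is a time-ordered sequence of (partition_name, size_in_bytes). The first
    and last entries are skipped because they lack a neighbour on one side.
    """
    flagged = []
    for i in range(1, len(sizes) - 1):
        name, size = sizes[i]
        neighbour_mean = (sizes[i - 1][1] + sizes[i + 1][1]) / 2
        if neighbour_mean > 0 and size < ratio * neighbour_mean:
            flagged.append(name)
    return flagged
```

Comparing against immediate neighbours rather than a global average keeps the check tolerant of tables whose partitions grow steadily over time.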
The embodiment of the invention introduces the buffer and the SQL parser to store and display important information about each table of the Hive data warehouse, including the table's partition sizes, lineage, routine output status, and so on. The information is presented to data warehouse managers through a visual interface, so that every table of the Hive data warehouse is monitored from a macroscopic perspective and maintenance becomes more convenient.
The visual monitoring method for the Hive data warehouse is as follows: important information about each table and partition of the Hive data warehouse and related information about routine tasks are stored through the buffer; when a routine task is submitted, the upstream and downstream lineage of each table is resolved by the SQL parser and stored through the buffer; the information is merged; and the information for each dimension of each table is used for web page display.
FIG. 3 is a structural diagram of an apparatus for visually monitoring a Hive data warehouse according to an embodiment of the present invention. As shown in FIG. 3, an apparatus for visually monitoring a Hive data warehouse according to another aspect of the embodiment of the present invention includes:
a storage module 301, configured to store, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information;
an analysis module 302, configured to parse the stored information with a Structured Query Language (SQL) parser when a routine task is submitted;
a merging module 303, configured to obtain, after the information of each table has been parsed, the relationships among the information in the tables, and to merge the information and relationships of each table to obtain merged information for each dimension of each table;
a display module 304, configured to read the merged information for each dimension of each table for web page display.
Further, the storage module 301 is configured to:
in the Hive data warehouse, periodically refresh the stored information of each partition of each table of the data warehouse as first-type information, and store the first-type information through the buffer;
each time a Hive script is submitted, parse the SQL statements in the first-type information with an SQL parser, resolve the data source table and the corresponding target table of each SQL segment, and store the dependency information between the data source table and the target table in the buffer as second-type information;
and convert the SQL statements into MapReduce computing-framework tasks that run on the resource manager YARN, capture specific information about the tasks as third-type information, and store the third-type information in the buffer.
Further, the merging module 303 is configured to:
merge, in the buffer, the stored first-type through third-type information to form merged information for each table.
Further, the apparatus is further configured to:
read the final merged result of the buffer in order to provide specific settings at the front end.
The apparatus implements the buffer and the SQL parser to parse and store the important information of each table of the data warehouse, mainly through the following steps:
introducing a buffer outside the Hive data warehouse, which periodically refreshes the stored information of each partition of each table of the data warehouse;
introducing an SQL parser, which parses the SQL statements each time a Hive script is submitted, resolves the data source tables and the output target table of each SQL segment, and stores the dependency information in the buffer;
submitting the parsed SQL task to YARN for MapReduce computation, capturing information such as the task's start time, end time, and memory usage, and storing the information in the buffer;
merging, in the buffer, the acquired information to form detailed data for each table, including partition size, output time, resource consumption, upstream and downstream relationships, and so on;
reading the final merged result of the buffer for front-end web display, and providing related settings at the front end.
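The task-capture step in the list above (start time, end time, memory usage) can be sketched as a small reduction over one YARN application report. The input keys `id`, `startedTime`, `finishedTime`, and `memorySeconds` are modeled on the application object of the YARN ResourceManager REST API and should be treated as an assumption about the data source; the function name and output keys are hypothetical. Note that an average derived from megabyte-seconds is only a stand-in, since it is not the peak memory usage the text mentions.

```python
from typing import Dict


def task_info_from_yarn_app(app: Dict) -> Dict:
    """Reduce one YARN application report to the task fields the buffer would store."""
    started_ms = app["startedTime"]    # epoch milliseconds
    finished_ms = app["finishedTime"]  # epoch milliseconds
    elapsed_s = max(finished_ms - started_ms, 0) / 1000
    memory_mb_seconds = app.get("memorySeconds", 0)
    return {
        "app_id": app["id"],
        "start_ms": started_ms,
        "end_ms": finished_ms,
        "elapsed_s": elapsed_s,
        # Average allocated memory, used here as a proxy; it is NOT peak usage.
        "avg_memory_mb": memory_mb_seconds / elapsed_s if elapsed_s else 0.0,
    }
```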
The embodiment of the invention stores, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information; when a routine task is submitted, parses the stored information with a Structured Query Language (SQL) parser; after parsing the information of each table, obtains the relationships among the information in the tables and merges the information and relationships of each table to obtain merged information for each dimension of each table; and reads the merged information for each dimension of each table for web page display. The embodiment makes it possible to untangle complicated dependencies among database tables and to optimize and adjust cluster tasks, so that managers can observe every dimension of the data warehouse, which makes monitoring more convenient and reduces management cost.
Although the embodiments of the present invention have been described above, the above description is only for the convenience of understanding the present invention, and is not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for visually monitoring a Hive data warehouse, characterized by comprising the following steps:
storing, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information;
when a routine task is submitted, parsing the stored information with a Structured Query Language (SQL) parser;
after parsing the information of each table, obtaining the relationships among the information in the tables, and merging the information and relationships of each table to obtain merged information for each dimension of each table;
and reading the merged information for each dimension of each table for web page display.
2. The method for visually monitoring a Hive data warehouse according to claim 1, wherein the method comprises:
in the Hive data warehouse, periodically refreshing the stored information of each partition of each table of the data warehouse as first-type information, and storing the first-type information through the buffer;
each time a Hive script is submitted, parsing the SQL statements in the first-type information with an SQL parser, resolving the data source table and the corresponding target table of each SQL segment, and storing the dependency information between the data source table and the target table in the buffer as second-type information;
and converting the SQL statements into MapReduce computing-framework tasks that run on the resource manager YARN, capturing specific information about the tasks as third-type information, and storing the third-type information in the buffer.
3. The method for visually monitoring a Hive data warehouse according to claim 2, wherein the method comprises:
the buffer merges the stored first-type through third-type information to form merged information for each table.
4. The method for visually monitoring a Hive data warehouse according to claim 3, wherein:
the merged information of each table is the detailed data of that table, including:
partition size, output time, resource consumption, and upstream and downstream relationships.
5. The method of visually monitoring a Hive data warehouse according to claim 1, further comprising:
reading the final merged result of the buffer in order to provide specific settings at the front end.
6. The method of visually monitoring a Hive data warehouse according to claim 1, further comprising: the buffer performs a global scan of the Hive data warehouse or a scan of specified databases according to its configuration, caches the information of each table in the database in the form of a local file, and refreshes it periodically at a time interval specified by the user, wherein the refresh frequency can be set on a web interface.
7. An apparatus for visually monitoring a Hive data warehouse, comprising:
a storage module, configured to store, through a buffer, specific information about each table and partition of the Hive data warehouse together with routine task information;
an analysis module, configured to parse the stored information with a Structured Query Language (SQL) parser when a routine task is submitted;
a merging module, configured to obtain, after the information of each table has been parsed, the relationships among the information in the tables, and to merge the information and relationships of each table to obtain merged information for each dimension of each table;
and a display module, configured to read the merged information for each dimension of each table for web page display.
8. The apparatus for visually monitoring a Hive data warehouse according to claim 7, wherein the storage module is configured to:
in the Hive data warehouse, periodically refresh the stored information of each partition of each table of the data warehouse as first-type information, and store the first-type information through the buffer;
each time a Hive script is submitted, parse the SQL statements in the first-type information with an SQL parser, resolve the data source table and the corresponding target table of each SQL segment, and store the dependency information between the data source table and the target table in the buffer as second-type information;
and convert the SQL statements into MapReduce computing-framework tasks that run on the resource manager YARN, capture specific information about the tasks as third-type information, and store the third-type information in the buffer.
9. The apparatus for visually monitoring a Hive data warehouse according to claim 8, wherein the merging module is configured to:
cause the buffer to combine the stored first-type, second-type and third-type information to form merged information for each table.
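The merging of the three types of information in claim 9 amounts to grouping everything by table. The following is a minimal sketch under assumed record shapes: partition info as a mapping from table to partition list, dependency info as `{"sources": [...], "targets": [...]}` records, and task info as dictionaries carrying a `"table"` key. None of these shapes or names come from the patent.

```python
from collections import defaultdict
from typing import Dict, List


def merge_table_info(partitions: Dict[str, List[str]],
                     lineage: List[Dict[str, List[str]]],
                     tasks: List[dict]) -> Dict[str, dict]:
    """Combine first-, second- and third-type information into one record per table."""
    merged: Dict[str, dict] = defaultdict(
        lambda: {"partitions": [], "upstream": [], "downstream": [], "tasks": []}
    )
    # First-type information: partitions of each table.
    for table, parts in partitions.items():
        merged[table]["partitions"] = list(parts)
    # Second-type information: source -> target dependencies, recorded on both ends.
    for edge in lineage:
        for target in edge["targets"]:
            merged[target]["upstream"].extend(edge["sources"])
        for source in edge["sources"]:
            merged[source]["downstream"].extend(edge["targets"])
    # Third-type information: details of the tasks that produced each table.
    for task in tasks:
        merged[task["table"]]["tasks"].append(task)
    return dict(merged)
```

A display layer could then read one record per table and render its partitions, its upstream and downstream tables, and its task history on a single page.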
10. The apparatus for visually monitoring a Hive data warehouse according to claim 9, wherein the apparatus is further configured to:
read the final merged result from the buffer so that it can be displayed at the front end according to specific settings.
CN201910672433.4A 2019-07-24 2019-07-24 Method and device for visually monitoring Hive data warehouse Active CN110532261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910672433.4A CN110532261B (en) 2019-07-24 2019-07-24 Method and device for visually monitoring Hive data warehouse

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910672433.4A CN110532261B (en) 2019-07-24 2019-07-24 Method and device for visually monitoring Hive data warehouse

Publications (2)

Publication Number Publication Date
CN110532261A CN110532261A (en) 2019-12-03
CN110532261B CN110532261B (en) 2022-09-20

Family

ID=68660867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910672433.4A Active CN110532261B (en) 2019-07-24 2019-07-24 Method and device for visually monitoring Hive data warehouse

Country Status (1)

Country Link
CN (1) CN110532261B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026568B (en) * 2019-12-04 2023-09-29 深圳前海环融联易信息科技服务有限公司 Data and task relation construction method and device, computer equipment and storage medium
CN111158990A (en) * 2019-12-31 2020-05-15 重庆富民银行股份有限公司 Data warehouse intelligent scheduling task batch running system and method
CN114328568A (en) * 2022-01-20 2022-04-12 重庆长安汽车股份有限公司 Hive job management method and system based on web application and readable storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
US20110302151A1 (en) * 2010-06-04 2011-12-08 Yale University Query Execution Systems and Methods
CN109739893A (en) * 2018-12-28 2019-05-10 上海连尚网络科技有限公司 A kind of metadata management method, equipment and computer-readable medium

Non-Patent Citations (2)

Title
"Optimization of Multiple Queries for Big Data with Apache Hadoop/Hive"; Varun Garg et al.; 2015 International Conference on Computational Intelligence and Communication Networks (CICN); 20151231; full text *
"Hive-based online analytical processing of big data" (基于Hive的大数据在线分析处理); Chen Yaowang (陈耀旺) et al.; Computer Era (计算机时代); 20180131; full text *

Also Published As

Publication number Publication date
CN110532261A (en) 2019-12-03

Similar Documents

Publication Publication Date Title
CN110532261B (en) Method and device for visually monitoring Hive data warehouse
US11481253B2 (en) Managing the processing of streamed data in a data streaming application using query information from a relational database
US11630830B2 (en) Background format optimization for enhanced queries in a distributed computing cluster
US10853368B2 (en) Distinct value estimation for query planning
CN111125178B (en) Data query method, device, terminal, presto query engine and storage medium
CN103646073A (en) Condition query optimizing method based on HBase table
CN109840298B (en) Multi-information-source acquisition method and system for large-scale network data
CN109684079B (en) Display data processing method and device and electronic equipment
CN112286957A (en) API application method and system of BI system based on structured query language
CN108768790A (en) Distributed search cluster monitoring method and device, computing device, storage medium
CN108519908A (en) A kind of task dynamic management approach and device
Belo et al. Restructuring dynamically analytical dashboards based on usage profiles
CN112231344B (en) Real-time stream data query method and device
CN117149849A (en) Method and device for processing multiple query requests and electronic equipment
CN110990227A (en) Numerical pool application characteristic performance acquisition and monitoring system and operation method thereof
DE112011106057T5 (en) Energy-efficient query optimization
CN108073643B (en) Task processing method and device
US9239867B2 (en) System and method for fast identification of variable roles during initial data exploration
US8521721B2 (en) Custom operators for a parallel query engine
CN113760671A (en) Online task diagnosis method and device and electronic equipment
Schad et al. Elephant, do not forget everything! efficient processing of growing datasets
US20100257152A1 (en) Enhanced identification of relevant database indices
CN118210847A (en) Data analysis method based on combination of ETL data warehouse and graph database
CN115185995A (en) Enterprise operation rate evaluation method, system, equipment and medium
CN115469862A (en) Page display method and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant