CN110909072A - Data table establishing method, device and equipment - Google Patents

Publication number
CN110909072A
Authority
CN
China
Prior art keywords
data table
database
performance parameters
determining
information
Legal status
Granted
Application number
CN201811090063.5A
Other languages
Chinese (zh)
Other versions
CN110909072B (en)
Inventor
周祥
王烨
赵永春
温绍锦
李瑞萍
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd
Priority to CN201811090063.5A
Publication of CN110909072A
Application granted
Publication of CN110909072B
Legal status: Active

Abstract

The application provides a data table establishing method, device and equipment. The method comprises: acquiring data table information; for a database in a database set, determining performance parameters of the database according to the data table information, wherein the database set comprises a plurality of types of databases; selecting a target database from the database set according to the performance parameters; and establishing a data table in the target database according to the data table information. With this technical solution, the selected target database can meet the user's requirements, which improves user experience and provides the user with a reliable target database in scenarios involving multiple, heterogeneous databases.

Description

Data table establishing method, device and equipment
Technical Field
The present application relates to the field of internet technologies, and in particular, to a method, an apparatus, and a device for establishing a data table.
Background
Data lake analysis (Data Lake Analytics) provides users with a serverless query and analysis service. It can analyze and query massive data in any dimension and supports high concurrency, low latency (millisecond response), real-time online analysis, massive data query, and similar functions. A data lake analysis system may include databases and front-end nodes: the databases store large amounts of data, and a front-end node, after receiving a query request, queries the databases for the data corresponding to the request.
Currently, data lake analysis can support multiple types of databases whose performance differs. When a data table is created for a user, a database must be selected from the multiple types of databases and the data table created in the selected database. However, conventional approaches offer no effective way to select the database, so the selected database may fail to meet the user's requirements and the user experience is poor.
Disclosure of Invention
The application provides a data table establishing method, which comprises the following steps:
acquiring data table information;
for a database in a database set, determining performance parameters of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
selecting a target database from the database set according to the performance parameters;
and establishing a data table in the target database according to the data table information.
The application provides a data table establishing method, which comprises the following steps:
acquiring data table information; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
selecting a database from a plurality of types of databases in a database set, and generating an alternative plan for the SQL statement, wherein the alternative plan comprises a corresponding relation between the data table and the database;
determining performance parameters of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
and establishing the data table in the target database according to the data table information.
The application provides a data table establishing device, the device includes:
the acquisition module is used for acquiring data table information;
the determining module is used for determining, for a database in a database set, the performance parameters of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
the selection module is used for selecting a target database from the database set according to the performance parameters;
and the establishing module is used for establishing a data table in the target database according to the data table information.
The application provides a data table establishing device, the device includes:
the acquisition module is used for acquiring data table information; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
the generating module is used for selecting a database from a plurality of types of databases in a database set and generating an alternative plan for the SQL statement, wherein the alternative plan comprises the corresponding relation between the data table and the database;
the determining module is used for determining the performance parameters of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
and the establishing module is used for establishing the data table in the target database according to the data table information.
The application provides data table establishing equipment, including:
a processor and a machine-readable storage medium having stored thereon a plurality of computer instructions, wherein the processor, when executing the computer instructions, performs:
acquiring data table information;
for a database in a database set, determining performance parameters of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
selecting a target database from the database set according to the performance parameters;
and establishing a data table in the target database according to the data table information.
The application provides data table establishing equipment, including:
a processor and a machine-readable storage medium having stored thereon a plurality of computer instructions, the processor when executing the computer instructions performs:
acquiring data table information; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
selecting a database from a plurality of types of databases in a database set, and generating an alternative plan for the SQL statement, wherein the alternative plan comprises a corresponding relation between the data table and the database;
determining performance parameters of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
and establishing the data table in the target database according to the data table information.
Based on the above technical solution, in the embodiments of the present application a target database can be selected from the database set according to the performance parameters of the databases, and the data table is established in that target database. When a data table is created for a user, the target database can thus be chosen from among multiple types of databases. This provides an effective way to select a database, lets the target database meet the user's requirements, improves user experience, and gives the user a reliable target database in scenarios with multiple, heterogeneous databases.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some of the embodiments of the present application, and those skilled in the art can derive other drawings from them.
FIG. 1A is a schematic diagram of a system architecture in one embodiment of the present application;
FIG. 1B is a schematic diagram of a system configuration in another embodiment of the present application;
FIG. 2 is a flow diagram of a data table creation method in one embodiment of the present application;
FIG. 3 is a flow chart of a data table creation method in another embodiment of the present application;
FIGS. 4A and 4B are schematic diagrams of a data table setup in an embodiment of the present application;
FIG. 5 is a block diagram of a data table creation device in one embodiment of the present application;
FIG. 6 is a block diagram of a data table creation device in another embodiment of the present application.
Detailed Description
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Moreover, depending on the context, the word "if" as used herein may be interpreted as "upon ...", "when ...", or "in response to determining".
The embodiments of the present application provide a data table establishing method that can be applied to a system comprising a client, a load balancing device, front-end nodes (also called front-end servers) and databases, such as a database system implementing data lake analysis. FIG. 1A and FIG. 1B are schematic diagrams of application scenarios of this embodiment. FIG. 1A includes a client, a load balancing device, front-end nodes, and databases; FIG. 1B includes a client, a load balancing device, front-end nodes, computing nodes (also called computing servers), and databases. Of course, FIG. 1A and FIG. 1B may also include other servers, such as a resource scheduling server; the system structure is not limited.
FIG. 1A and FIG. 1B take 3 front-end nodes as an example; in practice there may be any other number of front-end nodes, which is not limited here. FIG. 1B takes 5 computing nodes as an example; in practice there may be any other number of computing nodes, which is likewise not limited. Because every front-end node follows the same processing flow, and every computing node follows the same processing flow, the following embodiments describe the flow of one front-end node and one computing node for convenience.
The client may be an APP (application) on a terminal device (e.g., a personal computer (PC), a notebook computer, or a mobile terminal), or a browser on the terminal device, which is not limited here. The load balancing device load-balances query requests: for example, after receiving query requests, it distributes them across the front-end nodes, which is not limited here.
The plurality of front-end nodes provide the same function and form a front-end node resource pool. Each front-end node in the resource pool receives a query request sent by a client, performs SQL (Structured Query Language) parsing on the query request, generates an execution plan from the parsing result, and processes the execution plan. In the application scenario shown in FIG. 1A, the front-end node processes the execution plan itself. In the application scenario shown in FIG. 1B, the front-end node sends the execution plan to a computing node, which processes it. For example, the whole execution plan may be sent to one computing node, which processes it; alternatively, the execution plan is split into multiple sub-plans that are sent to multiple computing nodes, each of which processes one sub-plan.
The plurality of computing nodes provide the same function and form a computing node resource pool. Each computing node in the resource pool processes an execution plan if it receives one from a front-end node, or processes a sub-plan if it receives one from a front-end node.
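The splitting of an execution plan into sub-plans that are handed to computing nodes can be sketched as follows. This is only an illustration: the function name `dispatch`, the string representation of sub-plans, and the round-robin assignment are assumptions, since the text does not say how sub-plans are assigned to nodes.

```python
from typing import Dict, List


def dispatch(sub_plans: List[str], compute_nodes: List[str]) -> Dict[str, List[str]]:
    """Assign sub-plans to computing nodes in round-robin order (an assumption)."""
    if not compute_nodes:
        raise ValueError("at least one computing node is required")
    assignment: Dict[str, List[str]] = {node: [] for node in compute_nodes}
    for index, sub_plan in enumerate(sub_plans):
        # Cycle through the node pool so the work is spread evenly.
        node = compute_nodes[index % len(compute_nodes)]
        assignment[node].append(sub_plan)
    return assignment
```

With one computing node the whole plan lands on that node, which matches the first case described above.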
The database is used to store various types of data, and the data types are not limited, for example, the data may be user data, commodity data, map data, video data, image data, audio data, and the like.
FIG. 1A and FIG. 1B take 5 databases as an example; in practice there may be any other number of databases, which is not limited here. In this embodiment, the data sources may be heterogeneous, that is, the databases 101 to 105 may be the same type of database or different types of databases. For example, databases 101 and 102 may be the same type of database, and databases 101 and 103 may be different types of databases.
The databases in this embodiment may be relational or non-relational databases, which is not limited here. The type of each database may include, but is not limited to: OSS (Object Storage Service), TableStore, HBase (Hadoop Database), HDFS (Hadoop Distributed File System), MySQL, and so on. These are only a few examples of database types and are not limiting.
In the above application scenario, FIG. 2 shows a flowchart of a data table establishing method provided in an embodiment of the present application. The method may be applied to a front-end node and may include the following steps:
Step 201, obtaining data table information. In addition, user expectation parameters may also be obtained.
In one example, obtaining the data table information and the user expectation parameters may include, but is not limited to: receiving data table information and user expectation parameters input by a first-class user; or receiving user expectation parameters input by a second-class user, and collecting the data table information of the second-class user from the database.
In one example, the first class of users may include users who have not established a data table in the database; in addition, the second class of users may include users who have already built a data table in the database.
Step 202, for a database (such as each database) in a database set, determining performance parameters of the database according to the data table information; wherein the database set may include a plurality of types of databases.
In one example, the data table information may include, but is not limited to: an SQL statement, the data table corresponding to the SQL statement, and the data volume of the data table. Based on this, determining the performance parameters of the database according to the data table information may include: querying a performance information table according to the SQL information of the data table, the data volume, and the type of the database, to obtain the performance parameters of the database. The performance information table records the correspondence among SQL information, data volume, database type, and performance parameters.
Further, the performance information table may be generated before the performance parameters of the database are determined according to the data table information. Specifically, the data table corresponding to an executed SQL statement, the data volume of the data table, the type of the database where the data table is located, and the performance parameters consumed when the SQL statement was executed may be determined; the correspondence among the SQL information of the data table, the data volume of the data table, the type of the database where the data table is located, and the performance parameters may then be recorded in the performance information table.
In the above embodiments, the SQL information may include, but is not limited to: statement type or operator type.
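The lookup in Step 202 can be sketched as a dictionary keyed by SQL information, data volume, and database type. The sample entries, the `Perf` fields, and the rule of matching the smallest recorded volume that covers the requested volume are all assumptions made for illustration; the text only states that the table records the correspondence.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Perf(NamedTuple):
    overhead: float  # hypothetical cost units
    seconds: float   # hypothetical execution time


# Hypothetical performance information table:
# (SQL info, data volume in MB, database type) -> performance parameters.
PERF_TABLE: Dict[Tuple[str, int, str], Perf] = {
    ("join", 100, "OSS"): Perf(overhead=5.0, seconds=2.0),
    ("join", 1000, "OSS"): Perf(overhead=20.0, seconds=9.0),
    ("join", 100, "MySQL"): Perf(overhead=8.0, seconds=0.5),
}


def lookup_perf(sql_info: str, volume_mb: int, db_type: str) -> Optional[Perf]:
    """Look up performance parameters by SQL info, data volume and database type.

    Assumption: recorded volumes act as upper bounds of size buckets, so the
    smallest recorded volume that is >= volume_mb is used.
    """
    buckets = sorted(
        vol for (info, vol, db) in PERF_TABLE
        if info == sql_info and db == db_type and vol >= volume_mb
    )
    return PERF_TABLE[(sql_info, buckets[0], db_type)] if buckets else None
```

A missing entry returns `None`, leaving it to the caller to decide how to treat a database type for which nothing has been recorded yet.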
Step 203, selecting a target database from the database set according to the performance parameters. Specifically, the target database may be selected from the database set according to the performance parameters and the user expectation parameters.
The user expectation parameters include a user expected overhead and a user expected time, and the performance parameters include an overhead parameter and a time parameter; or the user expectation parameters include a user expected overhead, and the performance parameters include an overhead parameter; or the user expectation parameters include a user expected time, and the performance parameters include a time parameter.
In one example, the data table information may include, but is not limited to: an SQL statement and the data table corresponding to the SQL statement. Based on this, selecting a target database from the database set according to the performance parameters may include, but is not limited to: determining the performance parameters corresponding to the SQL statement according to the performance parameters of the database, and determining total performance parameters according to the performance parameters corresponding to the SQL statement; the target database may then be selected from the database set according to the total performance parameters and the user expectation parameters.
The determining the performance parameter corresponding to the SQL statement according to the performance parameter of the database may include, but is not limited to: if the SQL statement corresponds to one data table, determining the performance parameters corresponding to the SQL statement by using the performance parameters of the database corresponding to the data table; or, if the SQL statement corresponds to multiple data tables, the performance parameter corresponding to the SQL statement may be determined by using the performance parameter of the database corresponding to each of the multiple data tables. Of course, in practical application, the performance parameter corresponding to the SQL statement may also be determined in other manners, and the determination manner is not limited.
The determining the total performance parameter according to the performance parameter corresponding to the SQL statement may include, but is not limited to: if the data table information comprises an SQL statement, determining a total performance parameter by using the performance parameter corresponding to the SQL statement; or, if the data table information includes a plurality of SQL statements, the total performance parameter may be determined by using the performance parameter corresponding to each of the plurality of SQL statements.
Selecting the target database from the database set according to the total performance parameters and the user expectation parameters may include: selecting, from all the total performance parameters, a total performance parameter that meets the user expectation parameters, determining the database corresponding to that total performance parameter, and determining that database as the target database.
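A minimal sketch of Step 203 follows. It assumes that per-statement parameters are combined by summation, that "meeting the expectation" means staying at or below the user's expected overhead and time, and that ties are broken by lowest overhead and then lowest time. None of these rules is fixed by the text, and the names `total_perf` and `select_target` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Perf = Tuple[float, float]  # (overhead, seconds), hypothetical units


def total_perf(per_statement: List[Perf]) -> Perf:
    """Combine per-statement parameters into a total (summation is an assumption)."""
    return (sum(p[0] for p in per_statement), sum(p[1] for p in per_statement))


def select_target(
    candidates: Dict[str, List[Perf]], max_overhead: float, max_seconds: float
) -> Optional[str]:
    """Return a database whose total parameters meet the user expectation."""
    totals = {db: total_perf(perfs) for db, perfs in candidates.items()}
    acceptable = {
        db: total for db, total in totals.items()
        if total[0] <= max_overhead and total[1] <= max_seconds
    }
    if not acceptable:
        return None
    # Assumed tie-break: lowest overhead first, then lowest time.
    return min(acceptable, key=lambda db: acceptable[db])
```

When only an expected overhead or only an expected time is supplied, the unused bound can be passed as `float("inf")`.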
Step 204, establishing a data table in the target database according to the data table information.
Wherein, establishing the data table in the target database according to the data table information may include but is not limited to: if the data table information includes a data table name and a data table field, a data table corresponding to the data table name and the data table field may be established in the target database, which is not limited to this.
In one example, creating a data table in the target database according to the data table information may further include: establishing a data table in a target database for the first class of users according to the data table information; or establishing a data table in the target database for the second class user according to the data table information, and migrating the data corresponding to the second class user to the data table of the target database; the first class of users comprise users who do not establish a data table in the database; the second category of users includes users who have already built a data table in the database.
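Step 204 can be illustrated with an in-memory SQLite connection standing in for the target database. This is a sketch under stated assumptions: the helper names `create_table` and `migrate_rows` are hypothetical, real targets such as OSS or HBase would need their own table-creation calls, and table and field names are assumed to come from a trusted source because they are interpolated into the statement.

```python
import sqlite3
from typing import Dict, Iterable, Tuple


def create_table(conn: sqlite3.Connection, name: str, fields: Dict[str, str]) -> None:
    """Create a table from a data table name and its fields (name -> SQL type)."""
    columns = ", ".join(f'"{col}" {sql_type}' for col, sql_type in fields.items())
    conn.execute(f'CREATE TABLE "{name}" ({columns})')


def migrate_rows(conn: sqlite3.Connection, name: str, rows: Iterable[Tuple]) -> int:
    """Copy existing rows into the new table, as done for second-class users."""
    rows = list(rows)
    if not rows:
        return 0
    placeholders = ", ".join("?" for _ in rows[0])
    conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', rows)
    return len(rows)


# Usage: a first-class user only needs create_table; a second-class user
# additionally has existing data migrated into the new table.
conn = sqlite3.connect(":memory:")
create_table(conn, "orders", {"id": "INTEGER", "amount": "REAL"})
migrated = migrate_rows(conn, "orders", [(1, 9.5), (2, 3.0)])
```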
The above execution order is only an example given for convenience of description; in practical applications, the execution order of the steps may be changed and is not limited. Moreover, in other embodiments, the steps of the respective methods need not be performed in the order shown and described herein, and the methods may include more or fewer steps than described herein. A single step described in this specification may be split into multiple steps in other embodiments, and multiple steps described in this specification may be combined into a single step in other embodiments.
Based on the above technical solution, in the embodiments of the present application a target database can be selected from the database set according to the performance parameters of the databases and the user expectation parameters, and the data table is established in that target database. When a data table is created for a user, the target database can thus be chosen from among multiple types of databases. This provides an effective way to select a database, lets the target database meet the user's requirements, improves user experience, and gives the user a reliable target database in scenarios with multiple, heterogeneous databases.
FIG. 3 shows a flowchart of another data table establishing method provided in an embodiment of the present application. The method may be applied to a front-end node and may include the following steps:
Step 301, obtaining data table information; the data table information may include, but is not limited to, an SQL statement and a data table corresponding to the SQL statement. In addition, user expectation parameters may also be obtained.
In one example, obtaining the data table information and the user expectation parameters may include, but is not limited to: receiving data table information and user expectation parameters input by a first-class user; or receiving user expectation parameters input by a second-class user, and collecting the data table information of the second-class user from the database.
In one example, the first class of users may include users who have not established a data table in the database; in addition, the second class of users may include users who have already built a data table in the database.
Step 302, selecting a database from a plurality of types of databases in a database set, and generating an alternative plan for the SQL statement, where the alternative plan includes a corresponding relationship between the data table and the database.
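One way to picture Step 302 is to enumerate every assignment of the statement's data tables to database types, each assignment being one alternative plan. Exhaustive enumeration is an assumption made for illustration (the text does not say how many plans are generated), and the function name `alternative_plans` is hypothetical.

```python
from itertools import product
from typing import Dict, List


def alternative_plans(tables: List[str], db_types: List[str]) -> List[Dict[str, str]]:
    """Return every table -> database-type mapping for one SQL statement."""
    return [
        dict(zip(tables, combination))
        for combination in product(db_types, repeat=len(tables))
    ]
```

The number of plans grows as the number of database types raised to the number of tables, so a real system would likely prune this space.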
Step 303, determining the performance parameters of the database according to the data table information.
In one example, the data table information may include, but is not limited to: an SQL statement, the data table corresponding to the SQL statement, and the data volume of the data table. Based on this, determining the performance parameters of the database according to the data table information may include: querying a performance information table according to the SQL information of the data table, the data volume, and the type of the database, to obtain the performance parameters of the database. The performance information table records the correspondence among SQL information, data volume, database type, and performance parameters.
Further, the performance information table may be generated before the performance parameters of the database are determined according to the data table information. Specifically, the data table corresponding to an executed SQL statement, the data volume of the data table, the type of the database where the data table is located, and the performance parameters consumed when the SQL statement was executed may be determined; the correspondence among the SQL information of the data table, the data volume of the data table, the type of the database where the data table is located, and the performance parameters may then be recorded in the performance information table.
In the above embodiments, the SQL information may include, but is not limited to: statement type or operator type.
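Generating the performance information table from executed statements might look like the sketch below. The key layout and the function name `record_execution` are hypothetical. How a statement-level measurement is apportioned among several tables is not specified at this point in the text, so the sketch simply stores the same measurement for every table the statement touched.

```python
from typing import Dict, Tuple

Key = Tuple[str, int, str]   # (SQL info, data volume in MB, database type)
Perf = Tuple[float, float]   # (overhead, seconds), hypothetical units


def record_execution(
    perf_table: Dict[Key, Perf],
    sql_info: str,
    tables: Dict[str, Tuple[int, str]],
    perf: Perf,
) -> None:
    """Record one executed statement; tables maps name -> (volume_mb, db_type)."""
    for volume_mb, db_type in tables.values():
        # Assumption: later executions overwrite earlier ones for the same key.
        perf_table[(sql_info, volume_mb, db_type)] = perf
```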
Step 304, determining the performance parameters of the alternative plan according to the performance parameters of the database.
In one example, determining the performance parameters of the alternative plan according to the performance parameters of the database may include, but is not limited to: if the SQL statement corresponds to one data table, the performance parameters of the alternative plan may be determined using the performance parameters of the database corresponding to that data table; or, if the SQL statement corresponds to a plurality of data tables, the performance parameters of the alternative plan may be determined using the performance parameters of the database corresponding to each of the plurality of data tables. Of course, in practical applications, the performance parameters of the alternative plan may also be determined in other ways, and the manner of determination is not limited.
Step 305, determining a target alternative plan according to the performance parameters of the alternative plans.
In one example, determining the target alternative plan according to the performance parameters of the alternative plans may include, but is not limited to: selecting, according to the user expectation parameters, the performance parameters that meet the expectation from the performance parameters of all the alternative plans corresponding to the SQL statement, and then determining the alternative plan corresponding to those performance parameters as the target alternative plan for the SQL statement.
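Steps 304 and 305 can be sketched together. The sketch assumes a plan's parameters are the sum of the per-table database parameters and that the expectation is an upper bound on overhead and time; both rules, and the names `plan_perf` and `target_plan`, are illustrative assumptions rather than the method fixed by the text.

```python
from typing import Callable, Dict, List, Optional, Tuple

Perf = Tuple[float, float]  # (overhead, seconds), hypothetical units
Plan = Dict[str, str]       # data table -> database type


def plan_perf(plan: Plan, db_perf: Callable[[str, str], Perf]) -> Perf:
    """Sum the database parameters of every table in the plan (an assumption)."""
    parts = [db_perf(table, db) for table, db in plan.items()]
    return (sum(p[0] for p in parts), sum(p[1] for p in parts))


def target_plan(
    plans: List[Plan],
    db_perf: Callable[[str, str], Perf],
    max_overhead: float,
    max_seconds: float,
) -> Optional[Plan]:
    """Pick the cheapest plan whose parameters meet the user expectation."""
    scored = [(plan_perf(plan, db_perf), plan) for plan in plans]
    acceptable = [
        (perf, plan) for perf, plan in scored
        if perf[0] <= max_overhead and perf[1] <= max_seconds
    ]
    if not acceptable:
        return None
    return min(acceptable, key=lambda item: item[0])[1]
```

The databases named in the returned plan then become the target databases for the corresponding data tables.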
Step 306, determining the database in the target alternative plan as the target database corresponding to the data table.
Step 307, establishing a data table in the target database according to the data table information.
Wherein, establishing the data table in the target database according to the data table information may include but is not limited to: if the data table information includes a data table name and a data table field, a data table corresponding to the data table name and the data table field may be established in the target database, which is not limited to this.
In one example, creating a data table in the target database according to the data table information may further include: establishing a data table in a target database for the first class of users according to the data table information; or establishing a data table in the target database for the second class user according to the data table information, and migrating the data corresponding to the second class user to the data table of the target database; the first class of users comprise users who do not establish a data table in the database; the second category of users includes users who have already built a data table in the database.
In the above embodiments, the user expectation parameters may include, but are not limited to, a user expected overhead and a user expected time, and the performance parameters may include, but are not limited to, an overhead parameter and a time parameter; alternatively, the user expectation parameters may include a user expected overhead, and the performance parameters may include an overhead parameter; alternatively, the user expectation parameters may include a user expected time, and the performance parameters may include a time parameter.
The above execution order is only an example given for convenience of description; in practical applications, the execution order of the steps may be changed and is not limited. Moreover, in other embodiments, the steps of the respective methods need not be performed in the order shown and described herein, and the methods may include more or fewer steps than described herein. A single step described in this specification may be split into multiple steps in other embodiments, and multiple steps described in this specification may be combined into a single step in other embodiments.
Based on the above technical solution, in the embodiments of the present application a target database can be selected from the database set according to the performance parameters of the databases and the user expectation parameters, and the data table is established in that target database. When a data table is created for a user, the target database can thus be chosen from among multiple types of databases. This provides an effective way to select a database, lets the target database meet the user's requirements, improves user experience, and gives the user a reliable target database in scenarios with multiple, heterogeneous databases.
The above technical solution is further described below with reference to several specific application scenarios.
Application scenario 1, generation process of the performance information table.
For a user who has already established a data table, when the user executes an SQL statement, the front-end node may determine the data table corresponding to the executed SQL statement, the data volume of the data table, the type of the database where the data table is located, and the performance parameters consumed when executing the SQL statement. Then, the front-end node may record, in the performance information table, the correspondence among the SQL information of the data table (such as a statement type or an operator type), the data volume of the data table, the type of the database in which the data table is located, and the performance parameters.
For example, when SQL statement 1 is executed, the statement type of SQL statement 1 is determined. If SQL statement 1 is a join statement, the statement type is join; if SQL statement 1 is a group by statement, the statement type is group by; if SQL statement 1 is an order by statement, the statement type is order by; if SQL statement 1 is a distinct statement, the statement type is distinct; if SQL statement 1 is a count statement, the statement type is count; if SQL statement 1 is a window statement, the statement type is window. Of course, the above is merely an example, and no limitation is made thereto.
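The specification does not say how the statement type is derived from the SQL text. As a rough sketch only, one could match keywords in a fixed order; the function name `classify_statement`, the regular expressions, the precedence order, and the fallback value `other` are all assumptions made for illustration.

```python
import re

# Keyword patterns checked in order. Both the patterns and their precedence
# are illustrative assumptions, not taken from the specification.
STATEMENT_PATTERNS = [
    ("window", r"\bover\s*\("),
    ("join", r"\bjoin\b"),
    ("group by", r"\bgroup\s+by\b"),
    ("order by", r"\border\s+by\b"),
    ("distinct", r"\bdistinct\b"),
    ("count", r"\bcount\s*\("),
]


def classify_statement(sql: str) -> str:
    """Return the first matching statement type, or 'other' if none match."""
    text = sql.lower()
    for statement_type, pattern in STATEMENT_PATTERNS:
        if re.search(pattern, text):
            return statement_type
    return "other"
```

A real front-end node would more likely inspect the parsed query plan than the raw text; keyword matching is used here only to keep the sketch short.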
Then, the data table corresponding to SQL statement 1, the data volume of the data table, and the type of the database where the data table is located are determined. For example, if data table A and data table B need to be operated on when SQL statement 1 is executed, it is determined that the data tables corresponding to SQL statement 1 are data table A and data table B. The data volume of data table A is determined, e.g., 100M, and the data volume of data table B is determined, e.g., 200M. The type of database A in which data table A is located is determined, e.g., OSS, and the type of database B in which data table B is located is determined, e.g., HDFS.
Then, the performance parameters consumed when executing SQL statement 1 may be determined; for example, the performance parameters may include a time parameter (e.g., the execution time). For example, when SQL statement 1 is executed, the execution time for data table A, that is, the time consumed on data table A, such as 5 seconds, may be counted; likewise, the execution time for data table B, that is, the time consumed on data table B, such as 6 seconds, may also be counted.
Through the above processing, the following can be obtained: data table A and data table B corresponding to SQL statement 1, the data volume 100M of data table A, the data volume 200M of data table B, the type OSS of database A in which data table A is located, the type HDFS of database B in which data table B is located, the 5 seconds consumed on data table A, and the 6 seconds consumed on data table B. These contents can then be recorded in the performance information table. Table 1 is an example of the performance information table, in which SQL statement 1 is a join statement.
TABLE 1
Statement type | Data volume of data table | Type of database | Execution time
join | 100M | OSS | 5 seconds
join | 200M | HDFS | 6 seconds
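The performance information table can be pictured as a mapping from (SQL information, data volume, database type) to the measured performance parameter. The following is a minimal sketch under that assumption; the names `performance_table`, `record_execution` and `lookup_execution_time` are hypothetical, and only the two rows of Table 1 are taken from the text.

```python
from typing import Dict, Optional, Tuple

# Key: (SQL information, data volume, database type).
# Value: execution time in seconds.
PerfKey = Tuple[str, str, str]
performance_table: Dict[PerfKey, float] = {}


def record_execution(sql_info: str, data_volume: str, db_type: str, seconds: float) -> None:
    """Record one observed execution in the performance information table."""
    performance_table[(sql_info, data_volume, db_type)] = seconds


def lookup_execution_time(sql_info: str, data_volume: str, db_type: str) -> Optional[float]:
    """Return the recorded execution time, or None if no matching row exists."""
    return performance_table.get((sql_info, data_volume, db_type))


# The two rows of Table 1.
record_execution("join", "100M", "OSS", 5.0)
record_execution("join", "200M", "HDFS", 6.0)
```

An exact-match lookup is the simplest reading of the text; how the front-end node handles a data volume that has no recorded row is not described, so the sketch simply returns `None`.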
For another example, when SQL statement 1 is executed, the operator type of SQL statement 1 may be determined, such as a Table Scan Operator, a Filter Operator, a Join Operator, an Agg Operator, an Output Operator, and the like; the above are only examples and no limitation is made thereto. If SQL statement 1 uses a Table Scan Operator, the operator type may be Table Scan Operator.
Then, the data table corresponding to SQL statement 1, the data volume of the data table, the type of the database where the data table is located, and the performance parameters consumed when SQL statement 1 is executed are determined. For the specific process, refer to the above example, which is not described herein again. Finally, data table A and data table B corresponding to SQL statement 1, the data volume 100M of data table A, the data volume 200M of data table B, the type OSS of database A in which data table A is located, the type HDFS of database B in which data table B is located, the 5 seconds consumed on data table A, and the 6 seconds consumed on data table B may be obtained, and these contents are recorded in the performance information table, as shown in table 2.
TABLE 2
[Table 2 is provided as an image in the original publication: Figure BDA0001804117270000121]
Application scenario 2, the data table creation process for the first class of users. The first class of users are users who have not established a data table in the database, that is, new users. In the following, the first-class user is assumed to be user A.
Referring to fig. 4A, a flow diagram of the front-end node establishing a data table for user a is shown.
Step 411, the front-end node receives a data table establishment message sent by the user a through the client, where the data table establishment message may carry data table information and user expectation parameters, and the content is not limited.
Wherein, the data table information may include but is not limited to: SQL statements (e.g., SQL statement 1 and SQL statement 2, with SQL statement 1 being a join statement and SQL statement 2 being an order by statement); the data table corresponding to the SQL statement (e.g. SQL statement 1 corresponds to data table 11 and data table 12, and SQL statement 2 corresponds to data table 21 and data table 22); the data size of the data table (e.g., the data size of data table 11 is 100M, the data size of data table 12 is 200M, the data size of data table 21 is 200M, and the data size of data table 22 is 300M); the name of the data table and the fields of the data table (for example, the name of the data table 11 is RRR, the fields of the data table 11 include field 1, field 2, etc., and the names and fields of other data tables are not described herein again).
The user desired parameters may include, but are not limited to: user desired overhead and user desired time. The user desired overhead may be a desired maximum cost, i.e., the actual cost is no greater than the maximum cost; the user desired time may be a desired maximum execution elapsed time, i.e., the actual execution elapsed time is not greater than the maximum execution elapsed time.
In step 412, the front-end node selects a database from the plurality of types of databases in the database set, and generates an alternative plan for the SQL statement, where the alternative plan includes a corresponding relationship between the data table and the database.
For example, assuming that the database set includes database A of type OSS, database B of type HDFS, and database C of type MySQL, the alternative plans generated by the front-end node for SQL statement 1 are shown in table 3. The alternative plans generated by the front-end node for SQL statement 2 are similar to table 3 and are not described herein again.
TABLE 3
[Table 3 is provided as an image in the original publication: Figure BDA0001804117270000131]
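Since Table 3 is not reproduced in this text, the following sketch only illustrates one plausible way to enumerate alternative plans: every assignment of each data table to one database in the set. The function name `generate_alternative_plans` and the enumeration order are assumptions; with two data tables and three databases this yields nine plans, which agrees with the nine alternative plans mentioned for SQL statement 1.

```python
from itertools import product
from typing import Dict, List


def generate_alternative_plans(tables: List[str], databases: List[str]) -> List[Dict[str, str]]:
    """Enumerate every mapping of data table -> database (one plan per mapping)."""
    return [
        dict(zip(tables, assignment))
        for assignment in product(databases, repeat=len(tables))
    ]


# Data tables 11 and 12 of SQL statement 1, databases A (OSS), B (HDFS), C (MySQL).
plans = generate_alternative_plans(["data table 11", "data table 12"], ["A", "B", "C"])
```

With this ordering, the first plan places both tables in database A and the second places data table 11 in database A and data table 12 in database B, matching the description of alternative plans 1 and 2.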
In step 413, the front-end node determines performance parameters (such as overhead parameters and time parameters) of the database according to the data table information, and determines performance parameters of the alternative plan according to the performance parameters of the database.
For example, the front-end node may determine the performance parameter of database a according to the data table information of alternative plan 1, and determine the performance parameter of alternative plan 1 according to the performance parameter of database a; and determining the performance parameters of the database A and the database B according to the data table information of the alternative plan 2, determining the performance parameters of the alternative plan 2 according to the performance parameters of the database A and the database B, and so on.
Taking the processing procedure for alternative plan 2 as an example, the front-end node may determine the SQL information of data table 11 (for example, if SQL statement 1 is a join statement, the SQL information is join; or, if SQL statement 1 uses a Table Scan Operator, the SQL information is Table Scan Operator), and determine the data volume 100M of data table 11 and the type OSS of database A. The front-end node can then query the performance information table shown in table 1 by join, 100M and OSS to obtain the performance parameter of database A (e.g., the execution time), which is 5 seconds. Alternatively, the front-end node may query the performance information table shown in table 2 by Table Scan Operator, 100M and OSS to obtain the execution time of database A, which is 5 seconds.
The front-end node may determine the SQL information of data table 12 (e.g., join or Table Scan Operator), and determine the data volume 200M of data table 12 and the type HDFS of database B. The front-end node can then query the performance information table shown in table 1 by join, 200M and HDFS to obtain the execution time of database B, which is 6 seconds. Alternatively, the front-end node may query the performance information table shown in table 2 by Table Scan Operator, 200M and HDFS to obtain the execution time of database B, which is 6 seconds.
Since the alternative plan 2 for the SQL statement 1 includes the data table 11 and the data table 12, the execution time of the alternative plan 2 may be determined by using the execution time of the database a corresponding to the data table 11 and the execution time of the database B corresponding to the data table 12, for example, the execution time of the alternative plan 2 is the sum of the execution time of the database a and the execution time of the database B, that is, the execution time of the alternative plan 2 is 11 seconds.
Since the overhead parameter of database A (e.g., the cost of database A) is known, e.g., 5 yuan, and the overhead parameter of database B (e.g., the cost of database B) is known, e.g., 7 yuan, the overhead parameter of alternative plan 2 may also be determined according to the overhead parameter of database A corresponding to data table 11 and the overhead parameter of database B corresponding to data table 12. For example, the overhead parameter of alternative plan 2 is the sum of the overhead parameter of database A and the overhead parameter of database B, i.e., the overhead parameter of alternative plan 2 may be 12 yuan.
In summary, the performance parameters of alternative plan 2 can be determined, namely an execution time of 11 seconds and an overhead of 12 yuan. Similarly, the performance parameters of the other alternative plans may be determined in the above manner, and the determination process is not described again. Finally, the performance parameters of all alternative plans for SQL statement 1 may be as shown in table 4.
TABLE 4
[Table 4 is provided as an image in the original publication: Figure BDA0001804117270000151]
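The calculation for alternative plan 2 described above can be written out as a short sketch. The `Performance` class and its field names are hypothetical; the numeric values (5 and 6 seconds, overheads of 5 and 7) come from the worked example, and summation is the aggregation rule the text gives as an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Performance:
    seconds: float  # time parameter (execution time)
    cost: float     # overhead parameter

    def __add__(self, other: "Performance") -> "Performance":
        # The text gives summation as one example of combining parameters.
        return Performance(self.seconds + other.seconds, self.cost + other.cost)


# Alternative plan 2: data table 11 in database A, data table 12 in database B.
table_11_on_a = Performance(seconds=5, cost=5)
table_12_on_b = Performance(seconds=6, cost=7)
plan_2 = table_11_on_a + table_12_on_b  # 11 seconds, overhead 12
```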
In addition, after the front-end node generates the alternative plans for the SQL statement 2, the overhead parameters and execution time of each alternative plan may also be determined, and the specific determination manner is referred to the above embodiments and is not described herein again.
In step 414, the front-end node determines a total performance parameter according to the performance parameter corresponding to the SQL statement (i.e., the performance parameter corresponding to the candidate plan of the SQL statement), determines a target candidate plan according to the total performance parameter and the user expectation parameter, and determines a database in the target candidate plan as a target database.
For example, assuming that SQL statement 1 corresponds to alternative plan 1-alternative plan 9, and SQL statement 2 corresponds to alternative plan a-alternative plan F, the sum of the performance parameter of alternative plan 1 and the performance parameter of alternative plan a may be determined as total performance parameter 1, the sum of the performance parameter of alternative plan 1 and the performance parameter of alternative plan B may be determined as total performance parameter 2, and so on, the sum of the performance parameter of alternative plan 9 and the performance parameter of alternative plan F may be determined as total performance parameter 54, so that a total of 54 total performance parameters may be obtained.
When selecting an alternative plan for SQL statement 1 and an alternative plan for SQL statement 2, if the data tables corresponding to SQL statement 1 and the data tables corresponding to SQL statement 2 do not overlap, the alternative plans may be combined freely. For example, when SQL statement 1 corresponds to data tables 11 and 12 and SQL statement 2 corresponds to data tables 21 and 22, the above 54 combinations of alternative plans may be obtained, yielding 54 total performance parameters. If a data table corresponding to SQL statement 1 also corresponds to SQL statement 2, only non-conflicting (orthogonal) alternative plan combinations may be selected. For example, if SQL statement 1 corresponds to data table 11 and data table 12, and SQL statement 2 corresponds to data table 11 and data table 22, data table 11 is repeated. Assuming that alternative plan 1 is selected, since data table 11 in alternative plan 1 corresponds to database A, only an alternative plan in which data table 11 corresponds to database A may be selected from alternative plans A to F; an alternative plan in which data table 11 corresponds to database B or database C cannot be selected.
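The rule for combining alternative plans across two SQL statements can be sketched as a filter over all pairs: a pair is kept only if every shared data table is assigned to the same database in both plans. The helper names below are hypothetical. Note that this sketch enumerates nine plans for each statement, whereas the example in the text assumes six plans (A to F) for SQL statement 2, so the counts differ from the 54 quoted above.

```python
from itertools import product
from typing import Dict, List, Tuple

Plan = Dict[str, str]  # data table -> database


def enumerate_plans(tables: List[str], databases: List[str]) -> List[Plan]:
    """All assignments of each data table to one database."""
    return [dict(zip(tables, dbs)) for dbs in product(databases, repeat=len(tables))]


def is_consistent(plan_1: Plan, plan_2: Plan) -> bool:
    """True if every data table shared by both plans maps to the same database."""
    shared_tables = plan_1.keys() & plan_2.keys()
    return all(plan_1[t] == plan_2[t] for t in shared_tables)


def combine_plans(plans_1: List[Plan], plans_2: List[Plan]) -> List[Tuple[Plan, Plan]]:
    """Keep only the non-conflicting pairs of plans."""
    return [(p1, p2) for p1, p2 in product(plans_1, plans_2) if is_consistent(p1, p2)]


dbs = ["A", "B", "C"]
plans_stmt_1 = enumerate_plans(["table_11", "table_12"], dbs)
# No shared data table: every pair is allowed.
disjoint = combine_plans(plans_stmt_1, enumerate_plans(["table_21", "table_22"], dbs))
# table_11 is shared: pairs that place it in different databases are dropped.
overlapping = combine_plans(plans_stmt_1, enumerate_plans(["table_11", "table_22"], dbs))
```

In the disjoint case all 9 x 9 pairs survive; in the overlapping case each plan for statement 1 is compatible with only the three plans for statement 2 that place the shared table in the same database.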
After obtaining the total performance parameters, the front-end node may determine a target candidate plan according to the total performance parameters and the user expectation parameters, for example, may select a total performance parameter that meets expectations from all the total performance parameters according to the user expectation parameters, and determine a candidate plan corresponding to the total performance parameter as the target candidate plan.
For example, if the user desired parameters are the user desired overhead and the user desired time, then for each total performance parameter (such as total performance parameter 1 to total performance parameter 54), when its overhead parameter is less than or equal to the user desired overhead and its execution time is less than or equal to the user desired time, the total performance parameter satisfies the user requirements; otherwise, it does not satisfy the user requirements. Further, if exactly one total performance parameter satisfies the user requirements, it may be determined as the total performance parameter that meets expectations; if multiple total performance parameters satisfy the user requirements, one of them may be selected and determined as the total performance parameter that meets expectations.
For example, assuming that total performance parameter 1 is the total performance parameter that meets expectations, alternative plan 1 and alternative plan A corresponding to total performance parameter 1 may be determined as the target alternative plans. The front-end node may then determine the databases in alternative plan 1 as target databases, that is, may determine database A as the target database of data table 11 and database A as the target database of data table 12. Similarly, the databases in alternative plan A may be determined as target databases; for example, database A may be determined as the target database of data table 21, and database C may be determined as the target database of data table 22.
In the above embodiment, if multiple total performance parameters satisfy the user requirements, the front-end node may randomly select one of them; alternatively, it may select the optimal total performance parameter, i.e., the smallest one; alternatively, it may send the multiple total performance parameters to the user so that the user selects one of them. Of course, the above selection modes are only a few examples, and the selection mode is not limited.
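The filtering and selection described above might look like the following sketch. The function name `select_target` and the dictionary layout are hypothetical. Because "minimal" is ambiguous when both overhead and time are involved, the sketch assumes the smallest overhead wins, with execution time as a tie-breaker; the text does not specify this.

```python
from typing import Dict, List, Optional


def select_target(totals: List[Dict], max_cost: float, max_seconds: float) -> Optional[Dict]:
    """Return the cheapest total performance parameter within both user limits."""
    feasible = [
        t for t in totals
        if t["cost"] <= max_cost and t["seconds"] <= max_seconds
    ]
    if not feasible:
        return None  # no combination satisfies the user requirements
    # Assumed ordering: smallest overhead first, then shortest execution time.
    return min(feasible, key=lambda t: (t["cost"], t["seconds"]))


# Hypothetical total performance parameters, for illustration only.
totals = [
    {"name": "total 1", "cost": 12, "seconds": 11},
    {"name": "total 2", "cost": 10, "seconds": 20},
    {"name": "total 3", "cost": 30, "seconds": 5},
]
```

For instance, with a desired overhead of 15 and a desired time of 15 only "total 1" qualifies, while relaxing the desired time to 25 makes "total 2" the cheapest qualifying choice.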
In the above embodiment, after the front-end node determines the target alternative plan, the front-end node may further send the target alternative plan to the user, and the user determines whether to adopt the target alternative plan. If the front-end node receives the consent instruction, the database in the target alternative plan may be determined to be the target database. If the front-end node receives the rejection instruction, the target alternative plan may be re-determined, for example, the total performance parameter meeting the expectation is re-selected from the plurality of total performance parameters meeting the user requirement, and the target alternative plan is re-determined, and then the re-determined target alternative plan is sent to the user, and so on, and the details are not repeated here.
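The confirm-or-reject interaction can be sketched as a loop over the candidate plans that meet the user requirements. The function `confirm_target_plan` and the `user_accepts` callback are hypothetical stand-ins for sending a plan to the user and receiving a consent or rejection instruction; returning `None` when every candidate is rejected is an assumption, since the text does not cover that case.

```python
from typing import Callable, Iterable, Optional, TypeVar

PlanT = TypeVar("PlanT")


def confirm_target_plan(
    candidates: Iterable[PlanT],
    user_accepts: Callable[[PlanT], bool],
) -> Optional[PlanT]:
    """Offer candidate plans one by one until the user consents to one."""
    for plan in candidates:
        if user_accepts(plan):  # consent instruction received
            return plan
        # Rejection instruction received: fall through to the next candidate.
    return None  # assumed behavior when all candidates are rejected


chosen = confirm_target_plan(
    ["plan 1 + plan A", "plan 2 + plan B"],
    user_accepts=lambda plan: plan.startswith("plan 2"),
)
```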
In step 415, the front-end node builds a data table in the target database based on the data table information. Specifically, a data table corresponding to the data table name and the data table field may be established in the target database.
For example, assuming that database A is determined to be the target database of data table 11, data table 12, and data table 21, and database C is determined to be the target database of data table 22, the front-end node may establish data table 11, data table 12, and data table 21 in database A, and may establish data table 22 in database C.
The data table information may include a data table name and data table fields. For example, if the name of data table 11 is RRR and the fields of data table 11 include field 1, field 2, and the like, then data table 11 is established in database A with the name RRR and with fields including field 1, field 2, and the like.
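As a sketch of establishing a table from its name and fields, the example below builds a `CREATE TABLE` statement and runs it against an in-memory SQLite database, which stands in for the target database purely for illustration. The function name `build_create_table_sql` and the column types are assumptions; the text names the table RRR and its fields but gives no types.

```python
import sqlite3
from typing import List, Tuple


def build_create_table_sql(table_name: str, fields: List[Tuple[str, str]]) -> str:
    """Build a CREATE TABLE statement from a table name and (field, type) pairs."""
    columns = ", ".join(f'"{name}" {col_type}' for name, col_type in fields)
    return f'CREATE TABLE "{table_name}" ({columns})'


# In-memory SQLite is only a stand-in for the target database (database A).
target_db = sqlite3.connect(":memory:")
ddl = build_create_table_sql("RRR", [("field1", "TEXT"), ("field2", "INTEGER")])
target_db.execute(ddl)
```

A real deployment would need a dialect-specific statement for each database type (OSS, HDFS, MySQL and so on), which this sketch does not attempt.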
In one example, the front-end node may generate a data table setup plan for use in setting up data tables in the target database. In the application scenario shown in fig. 1A, the data table creation plan is executed by the front-end node to create a data table in the target database based on the data table information. In the application scenario shown in fig. 1B, the front-end node may send the data table creation plan to the compute node, which executes the data table creation plan to create a data table in the target database according to the data table information.
In the above embodiment, after the front-end node generates the data table establishment plan, the visual analysis graph of the data table establishment plan may be displayed to the user through the user interface, so that the user can know the current data table establishment plan, thereby providing a good interactive experience and improving the use experience of the user.
In the above embodiment, the database may be a cloud database, that is, a database service provided by a cloud computing platform. Furthermore, the databases may be multiple databases serving as heterogeneous data sources of a Data Lake.
Based on the above technical solution, in the embodiments of the present application, when a data table is created for a user, the target database can be selected from multiple types of databases. This provides an effective way of selecting a database, allows the target database to meet the user's requirements, improves the user experience, and provides the user with a reliable target database in scenarios with multiple databases and heterogeneous databases. Moreover, in a cloud Data Lake scenario, the method can give the user more choices and service optimization in terms of price (cost), performance, and the like.
Application scenario 3, data table creation process for the second class of users. Wherein, the second type of user is the user who has already established the data table in the database, i.e. an existing user, and it is assumed that the second type of user is user B.
Referring to fig. 4B, a flow diagram of the front-end node establishing a data table for user B is shown.
In step 421, the front-end node collects the data table information of the user B from the database, and receives the user expectation parameters input by the user B, such as the user expectation overhead and the user expectation time, which are not limited herein.
For example, after the data table of user B has been deployed, the front-end node may also monitor the actual operating condition of user B. If the actual operating condition of user B is found to be poor, for example, the execution time is long, the data table information of user B may be collected. For the specific content, refer to step 411, which is not described herein again.
When the actual operating condition of user B is poor, the data table of user B may need to be migrated. Therefore, the front-end node may request the user expectation parameters from user B and receive the user expectation parameters input by user B.
Of course, in practical application, it may also be other situations to trigger the front-end node to collect the data table information of the user B, such as the front-end node periodically collecting the data table information of the user B, which is not limited herein.
In step 422, the front-end node selects a database from the multiple types of databases in the database set, and generates an alternative plan for the SQL statement, where the alternative plan includes a corresponding relationship between the data table and the database.
In step 423, the front-end node determines performance parameters (such as overhead parameters and time parameters) of the database according to the data table information, and determines performance parameters of the candidate plan according to the performance parameters of the database.
In step 424, the front-end node determines a total performance parameter according to the performance parameter corresponding to the SQL statement (i.e., the performance parameter corresponding to the candidate plan of the SQL statement), determines a target candidate plan according to the total performance parameter and the user expectation parameter, and determines a database in the target candidate plan as a target database.
Step 425, the front-end node builds a data table in the target database based on the data table information.
The implementation process of step 422 to step 425 can refer to step 412 to step 415, and will not be described herein again.
Step 426, the front-end node migrates the data corresponding to user B to the data table of the target database.
For example, before step 421, data table 3 has been established in database A for user B, and in step 425, data table 3 is established in database B for user B. Therefore, all data in data table 3 of database A may be migrated to data table 3 of database B; the migration process is not described again. Then, all the data is deleted from data table 3 of database A, and data table 3 of database A is deleted.
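The migration itself is not detailed in the text. A minimal sketch, assuming a simple copy-then-drop approach and using two in-memory SQLite databases as stand-ins for database A and database B, could look as follows; the function `migrate_table`, the table name `data_table_3`, and the column definitions are hypothetical.

```python
import sqlite3


def migrate_table(source: sqlite3.Connection, target: sqlite3.Connection,
                  table: str, columns_ddl: str) -> int:
    """Copy all rows of `table` from source to target, then drop it at the source."""
    # Corresponds to step 425: the table exists in the target database.
    target.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns_ddl})')
    rows = source.execute(f'SELECT * FROM "{table}"').fetchall()
    if rows:
        placeholders = ", ".join("?" * len(rows[0]))
        target.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    target.commit()
    # Dropping the source table removes both its data and the table itself.
    source.execute(f'DROP TABLE "{table}"')
    source.commit()
    return len(rows)


database_a = sqlite3.connect(":memory:")
database_b = sqlite3.connect(":memory:")
database_a.execute('CREATE TABLE "data_table_3" (id INTEGER, value TEXT)')
database_a.executemany('INSERT INTO "data_table_3" VALUES (?, ?)', [(1, "x"), (2, "y")])
database_a.commit()

moved = migrate_table(database_a, database_b, "data_table_3", "id INTEGER, value TEXT")
```

The sketch copies everything before dropping the source table so that a failure during the copy leaves the original data in place; between heterogeneous databases a real migration would also need type mapping and batching.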
Based on the same inventive concept as the above method, an embodiment of the present application further provides a data table creating apparatus, which can be applied to a front-end node. Fig. 5 is a structure diagram of the apparatus, and the apparatus includes:
an obtaining module 501, configured to obtain information of a data table;
a determining module 502, configured to determine, for a database in a database set, a performance parameter of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
a selecting module 503, configured to select a target database from the database set according to the performance parameter;
an establishing module 504, configured to establish a data table in the target database according to the data table information.
The data table information comprises SQL statements, data tables corresponding to the SQL statements, and data volumes of the data tables; the determining module 502 is specifically configured to, when determining the performance parameter of the database according to the data table information: query a performance information table according to the SQL information of the data table, the data volume, and the type of the database to obtain the performance parameters of the database; the performance information table is used for recording the correspondence among the SQL information, the data volume, the type of the database, and the performance parameters of the database.
The data table information comprises SQL statements and data tables corresponding to the SQL statements; the selecting module 503 is specifically configured to, when selecting the target database from the database set according to the performance parameter: determine the performance parameters corresponding to the SQL statements according to the performance parameters of the database; determine total performance parameters according to the performance parameters corresponding to the SQL statements; and select a target database from the database set according to the total performance parameters and the user expectation parameters.
Based on the same inventive concept as the above method, an embodiment of the present application further provides a data table creating device, including: a processor and a machine-readable storage medium having stored thereon a plurality of computer instructions, wherein the processor, when executing the computer instructions, performs:
acquiring data table information;
for a database in a database set, determining performance parameters of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
selecting a target database from a database set according to the performance parameters;
and establishing a data table in the target database according to the data table information.
The embodiment of the application also provides a machine-readable storage medium, wherein a plurality of computer instructions are stored on the machine-readable storage medium; the computer instructions when executed perform the following:
acquiring data table information;
for a database in a database set, determining performance parameters of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
selecting a target database from a database set according to the performance parameters;
and establishing a data table in the target database according to the data table information.
Based on the same inventive concept as the above method, an embodiment of the present application further provides a data table creating apparatus, which can be applied to a front-end node. Fig. 6 is a structure diagram of the apparatus, and the apparatus includes:
an obtaining module 601, configured to obtain information of a data table; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
a generating module 602, configured to select a database from multiple types of databases in a database set, and generate an alternative plan for the SQL statement, where the alternative plan includes a correspondence between the data table and the database;
a determining module 603, configured to determine a performance parameter of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
the establishing module 604 is configured to establish a data table in the target database according to the data table information.
Based on the same inventive concept as the above method, an embodiment of the present application further provides a data table creating device, including: a processor and a machine-readable storage medium having stored thereon a plurality of computer instructions, wherein the processor, when executing the computer instructions, performs:
acquiring data table information; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
selecting a database from a plurality of types of databases in a database set, and generating an alternative plan for the SQL statement, wherein the alternative plan comprises a corresponding relation between the data table and the database;
determining performance parameters of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
and establishing the data table in the target database according to the data table information.
The embodiment of the application also provides a machine-readable storage medium, wherein a plurality of computer instructions are stored on the machine-readable storage medium; the computer instructions when executed perform the following:
acquiring data table information; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
selecting a database from a plurality of types of databases in a database set, and generating an alternative plan for the SQL statement, wherein the alternative plan comprises a corresponding relation between the data table and the database;
determining performance parameters of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
and establishing the data table in the target database according to the data table information.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing the present application, the functionality of the units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (23)

1. A method for creating a data table, the method comprising:
acquiring data table information;
for a database in a database set, determining performance parameters of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
selecting a target database from a database set according to the performance parameters;
and establishing a data table in the target database according to the data table information.
2. The method of claim 1,
the method further comprises the following steps: acquiring a user expected parameter;
selecting a target database from a database set according to the performance parameters comprises:
and selecting a target database from a database set according to the performance parameters and the user expected parameters.
3. The method of claim 1,
the acquiring data table information comprises:
receiving data table information input by a first type of user; or collecting the data table information of a second type of user from the database; wherein the first type of user comprises users who have not established a data table in a database;
the second type of user comprises users who have already established a data table in the database.
4. The method according to claim 1, wherein the data table information includes SQL statements, data tables corresponding to the SQL statements, and data volumes of the data tables;
the determining the performance parameters of the database according to the data table information comprises:
querying a performance information table according to the SQL information of the data table, the data volume and the type of the database to obtain the performance parameters of the database; the performance information table is used for recording the correspondence among the SQL information, the data volume, the type of the database and the performance parameters of the database.
5. The method of claim 4,
before determining the performance parameters of the database according to the data table information, the method further comprises the following steps:
determining a data table corresponding to an executed SQL statement, the data volume of the data table, the type of the database where the data table is located, and the performance parameters consumed when the SQL statement is executed;
and recording, in a performance information table, the correspondence among the SQL information of the data table, the data volume of the data table, the type of the database where the data table is located, and the performance parameters.
6. The method of claim 1,
the data table information comprises SQL statements and data tables corresponding to the SQL statements;
selecting a target database from a database set according to the performance parameters comprises:
determining performance parameters corresponding to the SQL statements according to the performance parameters of the database;
determining total performance parameters according to the performance parameters corresponding to the SQL statements;
and selecting a target database from the database set according to the total performance parameters and the user expected parameters.
7. The method of claim 6,
the determining the performance parameters corresponding to the SQL statements according to the performance parameters of the database comprises the following steps:
if the SQL statement corresponds to one data table, determining the performance parameters corresponding to the SQL statement by using the performance parameters of the database corresponding to the data table; or
if the SQL statement corresponds to a plurality of data tables, determining the performance parameters corresponding to the SQL statement by using the performance parameters of the database corresponding to each data table in the plurality of data tables.
8. The method of claim 6,
the determining the total performance parameter according to the performance parameter corresponding to the SQL statement includes:
if the data table information comprises an SQL statement, determining a total performance parameter by using the performance parameter corresponding to the SQL statement; or, if the data table information includes a plurality of SQL statements, determining a total performance parameter by using a performance parameter corresponding to each of the plurality of SQL statements.
9. The method of claim 6, wherein selecting the target database from the set of databases based on the total performance parameters and the user desired parameters further comprises:
selecting a total performance parameter which meets the expectation from all the total performance parameters according to the user expectation parameter;
and determining a database corresponding to the total performance parameters, and determining the database as a target database.
10. The method of claim 1,
the establishing of the data table in the target database according to the data table information comprises:
and if the data table information comprises a data table name and a data table field, establishing a data table corresponding to the data table name and the data table field in the target database.
11. The method of claim 1,
the establishing of the data table in the target database according to the data table information comprises:
establishing a data table in the target database for a first type of user according to the data table information; or
establishing a data table in the target database for a second type of user according to the data table information, and migrating data corresponding to the second type of user to the data table of the target database;
wherein the first type of user comprises users who have not established a data table in a database;
the second type of user comprises users who have already established a data table in the database.
12. A method for creating a data table, the method comprising:
acquiring data table information; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
selecting a database from a plurality of types of databases in a database set, and generating an alternative plan for the SQL statement, wherein the alternative plan comprises a corresponding relation between the data table and the database;
determining performance parameters of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
and establishing the data table in the target database according to the data table information.
13. The method of claim 12, wherein the data table information includes a data volume of the data table; determining performance parameters of the database according to the data table information, including:
querying a performance information table according to the SQL information of the data table, the data volume and the type of the database to obtain the performance parameters of the database; the performance information table is used for recording the correspondence among the SQL information, the data volume, the type of the database and the performance parameters of the database.
14. The method of claim 13,
before determining the performance parameters of the database according to the data table information, the method further comprises the following steps:
determining a data table corresponding to an executed SQL statement, the data volume of the data table, the type of the database where the data table is located, and the performance parameters consumed when the SQL statement is executed;
and recording, in a performance information table, the correspondence among the SQL information of the data table, the data volume of the data table, the type of the database where the data table is located, and the performance parameters.
15. The method of claim 12,
the determining the performance parameters of the alternative plans according to the performance parameters of the database includes:
if the SQL statement corresponds to one data table, determining the performance parameters of the alternative plan by using the performance parameters of the database corresponding to the data table; or
if the SQL statement corresponds to a plurality of data tables, determining the performance parameters of the alternative plan by using the performance parameters of the database corresponding to each data table in the plurality of data tables.
16. The method of claim 12,
the determining a target alternative plan according to the performance parameters of the alternative plan includes:
selecting, according to the user expected parameters, performance parameters that meet the expectation from the performance parameters of all the alternative plans corresponding to the SQL statement, and determining the alternative plan corresponding to the performance parameters that meet the expectation as the target alternative plan corresponding to the SQL statement.
17. The method of claim 12,
the establishing of the data table in the target database according to the data table information comprises:
establishing a data table in the target database for a first type of user according to the data table information; or
establishing a data table in the target database for a second type of user according to the data table information, and migrating data corresponding to the second type of user to the data table of the target database;
wherein the first type of user comprises users who have not established a data table in a database;
the second type of user comprises users who have already established a data table in the database.
18. A data table creation apparatus, the apparatus comprising:
the acquisition module is used for acquiring data table information;
the determining module is used for determining, for a database in a database set, the performance parameters of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
the selection module is used for selecting a target database from the database set according to the performance parameters;
and the establishing module is used for establishing a data table in the target database according to the data table information.
19. The apparatus according to claim 18, wherein the data table information includes SQL statements, data tables corresponding to the SQL statements, and data volumes of the data tables;
the determining module is specifically configured to, when determining the performance parameters of the database according to the data table information: query a performance information table according to the SQL information of the data table, the data volume and the type of the database to obtain the performance parameters of the database; the performance information table is used for recording the correspondence among the SQL information, the data volume, the type of the database and the performance parameters of the database.
20. The apparatus of claim 18,
the data table information comprises SQL statements and data tables corresponding to the SQL statements; the selection module is specifically configured to, when selecting the target database from the database set according to the performance parameters:
determining performance parameters corresponding to the SQL statements according to the performance parameters of the database;
determining total performance parameters according to the performance parameters corresponding to the SQL statements;
and selecting a target database from the database set according to the total performance parameters and the user expected parameters.
21. A data table creation apparatus, the apparatus comprising:
the acquisition module is used for acquiring data table information; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
the generating module is used for selecting a database from a plurality of types of databases in a database set and generating an alternative plan for the SQL statement, wherein the alternative plan comprises the corresponding relation between the data table and the database;
the determining module is used for determining the performance parameters of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
and the establishing module is used for establishing the data table in the target database according to the data table information.
22. A data table creation device, comprising:
a processor and a machine-readable storage medium having stored thereon a plurality of computer instructions, wherein the processor, when executing the computer instructions, performs:
acquiring data table information;
for a database in a database set, determining performance parameters of the database according to the data table information; wherein the database set comprises a plurality of types of databases;
selecting a target database from a database set according to the performance parameters;
and establishing a data table in the target database according to the data table information.
23. A data table creation device, comprising:
a processor and a machine-readable storage medium having stored thereon a plurality of computer instructions, wherein the processor, when executing the computer instructions, performs:
acquiring data table information; the data table information comprises a Structured Query Language (SQL) statement and a data table corresponding to the SQL statement;
selecting a database from a plurality of types of databases in a database set, and generating an alternative plan for the SQL statement, wherein the alternative plan comprises a corresponding relation between the data table and the database;
determining performance parameters of the database according to the data table information;
determining the performance parameters of the alternative plan according to the performance parameters of the database;
determining a target alternative plan according to the performance parameters of the alternative plan;
determining a database in the target alternative plan as a target database corresponding to the data table;
and establishing the data table in the target database according to the data table information.
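As a non-limiting illustration, the performance information table of claims 4 and 5 and the selection by total performance parameter of claims 6 to 9 can be sketched as follows. This is a minimal sketch under stated assumptions: the class and function names, the string encoding of the SQL information and data volume, the use of summation for the total performance parameter, and the treatment of the user expected parameter as an upper bound are all hypothetical and are not specified by the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfKey:
    sql_info: str     # assumed: a normalized label for the SQL statement
    data_volume: str  # assumed: a bucketed data volume such as "1M-10M"
    db_type: str      # type of the database, e.g. "row_store"


class PerformanceInfoTable:
    """Records the correspondence among SQL information, data volume,
    database type and the performance parameter observed at execution."""

    def __init__(self):
        self._rows = {}

    def record(self, sql_info, data_volume, db_type, perf):
        self._rows[PerfKey(sql_info, data_volume, db_type)] = perf

    def query(self, sql_info, data_volume, db_type):
        # Returns None when no matching execution has been recorded.
        return self._rows.get(PerfKey(sql_info, data_volume, db_type))


def select_target(table, statements, data_volume, db_types, expected_max):
    """Return (db_type, total) with the lowest total within expected_max, or None."""
    best = None
    for db in db_types:
        perfs = [table.query(s, data_volume, db) for s in statements]
        if any(p is None for p in perfs):
            continue  # no recorded data for this database type
        total = sum(perfs)  # assumed aggregation into a total performance parameter
        if total <= expected_max and (best is None or total < best[1]):
            best = (db, total)
    return best


pit = PerformanceInfoTable()
pit.record("SELECT_BY_ID", "1M-10M", "row_store", 2.0)
pit.record("SELECT_BY_ID", "1M-10M", "column_store", 9.0)
pit.record("AGG_BY_DAY", "1M-10M", "row_store", 40.0)
pit.record("AGG_BY_DAY", "1M-10M", "column_store", 6.0)

result = select_target(
    pit, ["SELECT_BY_ID", "AGG_BY_DAY"], "1M-10M",
    ["row_store", "column_store"], expected_max=20.0,
)
```

With these made-up values the row store totals 42.0 and is rejected by the expected bound of 20.0, while the column store totals 15.0 and is selected.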
CN201811090063.5A 2018-09-18 2018-09-18 Data table establishment method, device and equipment Active CN110909072B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811090063.5A CN110909072B (en) 2018-09-18 2018-09-18 Data table establishment method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811090063.5A CN110909072B (en) 2018-09-18 2018-09-18 Data table establishment method, device and equipment

Publications (2)

Publication Number Publication Date
CN110909072A (en) 2020-03-24
CN110909072B (en) 2023-07-18

Family

ID=69812983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811090063.5A Active CN110909072B (en) 2018-09-18 2018-09-18 Data table establishment method, device and equipment

Country Status (1)

Country Link
CN (1) CN110909072B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111666263A (en) * 2020-05-12 2020-09-15 埃睿迪信息技术(北京)有限公司 Method for realizing heterogeneous data management in data lake environment
WO2023138665A1 (en) * 2022-01-24 2023-07-27 北京奥星贝斯科技有限公司 Query optimization method and apparatus for distributed database

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090132718A1 (en) * 2005-08-12 2009-05-21 Agent Mobile Pty Ltd Content Filtering System for a Mobile Communication Device and Method of Using Same
CN103810219A (en) * 2012-11-15 2014-05-21 中国移动通信集团公司 Line storage database-based data processing method and device
CN107016019A (en) * 2015-10-23 2017-08-04 阿里巴巴集团控股有限公司 Database index creation method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
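Liu Rujiu; Zhang Zhenshan; Chai Tianyou: "A general method for data extraction between multiple databases and its application", Journal of Beijing Jiaotong University *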

Also Published As

Publication number Publication date
CN110909072B (en) 2023-07-18

Similar Documents

Publication Publication Date Title
CN109309596B (en) Pressure testing method and device and server
CN111177222A (en) Model testing method and device, computing equipment and storage medium
JP2021519460A (en) Data query methods, devices, and devices
CN107291770B (en) Mass data query method and device in distributed system
CN110347515B (en) Resource optimization allocation method suitable for edge computing environment
WO2020211717A1 (en) Data processing method, apparatus and device
CN111782404A (en) Data processing method and related equipment
CN110909072B (en) Data table establishment method, device and equipment
CN111400301B (en) Data query method, device and equipment
CN110377611B (en) Method and device for ranking scores
CN112328865A (en) Information processing and recommending method, device, equipment and storage medium
CN107193749B (en) Test method, device and equipment
CN110505276B (en) Object matching method, device and system, electronic equipment and storage medium
CN112506887A (en) Vehicle terminal CAN bus data processing method and device
CN115563160A (en) Data processing method, data processing device, computer equipment and computer readable storage medium
CN109389271B (en) Application performance management method and system
CN107679096B (en) Method and device for sharing indexes among data marts
CN116028696A (en) Resource information acquisition method and device, electronic equipment and storage medium
CN110928895B (en) Data query and data table establishment method, device and equipment
CN115033616A (en) Data screening rule verification method and device based on multi-round sampling
CN110866052A (en) Data analysis method, device and equipment
CN110309177B (en) Data processing method and related device
CN111221858B (en) Data processing method, device and equipment
CN110427390A (en) Data query method and device, storage medium, electronic device
CN111831425A (en) Data processing method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant