CN111639078A - Data query method and device, electronic equipment and readable storage medium - Google Patents

Info

Publication number
CN111639078A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010450068.5A
Other languages
Chinese (zh)
Inventor
池阳
薛景福
封磊
徐鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/2445 Data retrieval commands; View definitions
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices
    • G06F16/2246 Trees, e.g. B+trees
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G06F16/24553 Query execution of query operations
    • G06F16/24558 Binary matching operations

Abstract

The application discloses a data query method and device, an electronic device and a readable storage medium, and relates to the technical field of big data. The specific implementation is as follows: acquiring query request parameters input by a user; parsing the query request parameters to obtain a plurality of logical tables required by the query and the field information included in each logical table; generating a plurality of inner-layer SQL statements according to the plurality of logical tables and the field information included in each logical table; splicing the plurality of inner-layer SQL statements to obtain a target SQL statement; and performing the data query using the target SQL statement. The scheme improves the timeliness of data queries and meets the need for rapid business iteration.

Description

Data query method and device, electronic equipment and readable storage medium
Technical Field
The application relates to the technical field of computers, in particular to the technical field of big data.
Background
At present, the B end of the content ecosystem stores massive amounts of data, such as Baijiahao articles, authors, task systems, end-side logs and the like. To extract the desired data from this mass of data, a specialist has to develop a MapReduce (MR) task or a Spark compute-engine task for each requirement. Because the development cycle of MR tasks and the like is long, the timeliness of data queries is poor and cannot keep up with rapid business iteration.
Disclosure of Invention
The disclosure provides a data query method, a data query device, an electronic device and a readable storage medium.
According to an aspect of the present disclosure, there is provided a data query method including:
acquiring a query request parameter input by a user;
parsing the query request parameters to obtain a plurality of logical tables required by the query and the field information included in each logical table;
generating a plurality of inner-layer SQL statements according to the plurality of logical tables and the field information included in each logical table;
splicing the plurality of inner-layer SQL statements to obtain a target SQL statement;
and performing a data query using the target SQL statement.
In this way, the target SQL statement is generated from the query request parameters input by the user and used to perform the data query, so developers do not need to develop a dedicated task for each query; this improves the timeliness of data queries and meets the need for rapid business iteration.
According to another aspect of the present disclosure, there is provided an apparatus for data query, including:
the acquisition module is used for acquiring the query request parameters input by a user;
the parsing module is used for parsing the query request parameters to obtain a plurality of logical tables required by the query and the field information included in each logical table;
the generating module is used for generating a plurality of inner-layer SQL statements according to the plurality of logical tables and the field information included in each logical table;
the splicing module is used for splicing the plurality of inner-layer SQL statements to obtain a target SQL statement;
and the query module is used for performing a data query using the target SQL statement.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a data query method as described above.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the data query method as described above.
The technology of the application addresses the poor timeliness of existing data query methods and improves the timeliness of data queries, thereby meeting the need for rapid business iteration.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is an overall architecture diagram of a data query system in an embodiment of the present application;
FIG. 2 is a schematic diagram of a hierarchical structure of a data query system in an embodiment of the present application;
FIG. 3 is a flow chart of a data query method in an embodiment of the present application;
FIG. 4 is a schematic diagram of an SQL splicing process in the embodiment of the present application;
FIG. 5 is a flow chart of the construction of the inner SQL in the embodiment of the present application;
FIG. 6 is a flow chart of the construction of outer SQL in the embodiment of the present application;
FIG. 7 is a schematic structural diagram of a data query device in an embodiment of the present application;
FIG. 8 is a block diagram of an electronic device for implementing the data query method according to the embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The terms first, second and the like in the description and in the claims of the present application are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. In the description and in the claims "and/or" means at least one of the connected objects.
To facilitate understanding of the embodiments of the present application, the following is first explained.
Referring to fig. 1, fig. 1 is an overall architecture diagram of a data query system in an embodiment of the present application. As shown in fig. 1, the data query system in the embodiment of the present application mainly relates to the following parts:
Data warehouse: business data, event-tracking (buried-point) logs, Hadoop Distributed File System (HDFS) data, File Transfer Protocol (FTP) data and/or other data may be obtained from data sources through an Extract-Transform-Load (ETL) process. The type of data warehouse may be selected from, but is not limited to: Turing, UDW (a massively parallel processing data warehouse), Alluxio (a memory-centric virtual distributed storage system), Baidu's data warehouse Palo, ES (Elasticsearch), and the like.
Front-end interface: for example, a web user interface (UI). On the web UI for a business topic (such as the author topic, the article topic or author portraits), the user checks dimension and index selection boxes. A dimension represents a characteristic of the thing being described; for example, under the Baijiahao article topic, the available dimensions may include article identifier (id), article type, article title, article category, and so on. An index (metric) is a unit and method for measuring how far something has developed, and can be obtained through aggregate statistics such as summing and averaging; for example, under the Baijiahao article topic, the available indexes may include reading volume, distribution volume, recommendation volume, comment volume, and so on.
Server: receives the list of parameters in the query request that the user enters via the web UI. The Server can perform the following management functions: 1) user management, such as permission verification based on the information the user enters; 2) association-condition management and metadata management, which mainly concern the Structured Query Language (SQL) construction process; 3) task management (also called job management), which mainly concerns query task submission, query status polling, and the like. Optionally, after obtaining the query parameter list input by the user, the Server may execute the following process: user permission verification, concurrency control, SQL construction, engine routing, engine adaptation based on user input information, query task submission (Submit), query status polling, query result unloading, and the like.
Execution engine: the system is designed for multiple engines, including but not limited to the Turing engine, the Zeppelin engine, the Pingo4 engine, the Presto engine, a JDBC engine, and so on. The Server queries the required data from the data warehouse through the adapted engine.
Referring to fig. 2, fig. 2 is a schematic diagram of a hierarchical structure of a data query system according to an embodiment of the present application. As shown in fig. 2, the data query system in the present application is mainly divided into the following four levels:
1. Data layer: the data layer mainly comprises modules for user information, metadata information, task (job) information, query result information, and the like. The user information records the identity of the user, including department, team and so on, and provides data support for permission control. The metadata information records the topic information defined by the query platform, the business tables it depends on, and other information; SQL construction translates the parameter fields input by the user into an SQL statement according to this metadata. The job information records the whole life cycle of a query task from submission to termination, so that the task can be traced. The query result information records the result returned by the query, which is finally displayed to the user at the front end.
2. Engine layer: the system can cover, for example, 4 query engines and can meet the requirements of different scenarios such as offline query and ad hoc query. The engines are pluggable: each query engine is packaged independently as a service, which gives strong extensibility. The 4 query engines may be, but are not limited to:
1) The Zeppelin engine: all submitted tasks share the same Spark queue resource, so resource usage is controllable; data is produced at the minute level.
2) The Pingo4 engine: an internal engine for certain scenarios. Different tasks use Spark queue resources independently, task query efficiency is high, resource usage is controlled by limiting the number of concurrent tasks, and data is produced at the minute level.
3) The Turing engine: an internal engine for certain scenarios. Thanks to its indexing and caching mechanisms, the underlying storage has extremely high query efficiency, and data is produced at the second level. The index can be an external index that does not intrude on the file format; two index types, B+Tree and BitMap, are supported, and indexes can be cached in the off-heap memory of the Executor. The cache can be a column-granularity cache that supports both passive and active loading strategies.
4) The Palo engine: supports building data cubes, suits statistical-analysis scenarios in ad hoc query, and can return query results within seconds.
3. Service layer: the system encapsulates common services and supports quick external access. The service layer may implement the following functions:
1) SQL splicing: after the user selects the required dimensions and indexes on the front-end interface and submits a task, the server obtains the parameter list from the request parameters and performs SQL splicing. SQL splicing can follow a divide-and-conquer approach: it is split into one or more "inner-layer splices" and a final "outer-layer splice", which together assemble an executable target SQL statement.
2) Routing: different query engines are selected to query data according to conditions such as the user's identity information, the query scenario and the business table meta-information.
3) Permission management: access to business tables is managed per user role.
4. Application layer: the system can open an Application Programming Interface (API) and support data query by submitting SQL statements; by checking dimensions and indexes at the front end, it supports sampling requirements such as direct sampling, random sampling and weighted sampling, and statistical requirements such as roll-up and drill-down.
In order to solve the problem of poor timeliness of the existing data query method, the embodiment of the application provides a method for querying data through a constructed executable SQL statement, which is described as follows.
Referring to fig. 3, fig. 3 is a flowchart of a data query method provided by an embodiment of the present application, where the method is applied to an electronic device, and as shown in fig. 3, the method includes the following steps:
Step 301: acquire the query request parameters input by the user.
In this embodiment, the user inputs the query request parameters by checking dimension and index selection boxes on a front-end page (or front-end interface). Different front-end pages can be provided for different business topics, such as the author topic, the article topic, author portraits and the like. Each front-end page may include several dimension and index selection boxes. A dimension represents a characteristic of the thing being described; for example, under the Baijiahao article topic, the available dimensions may include article identifier (id), article type, article title, article category, and so on. An index (metric) is a unit and method for measuring how far something has developed, and can be obtained through aggregate statistics such as summing and averaging; for example, under the Baijiahao article topic, the available indexes may include reading volume, distribution volume, recommendation volume, comment volume, and so on.
Step 302: parse the query request parameters to obtain a plurality of logical tables required by the query and the field information included in each logical table.
In this embodiment, the logical tables may be preset. A logical table generally represents a logical grouping of query request parameters and is related to the query request parameters input by the user. The field information represents the query request parameter fields, which include, for example, dimension fields and index fields. Both dimension fields and index fields can serve as display fields and as filter fields.
For example, on the article topic page, suppose the user checks the article B-end ID, article C-end ID and article publication time selection boxes among the dimension fields, selects a publication time range from 2020-05-04 to 2020-05-05, and checks the reading volume and distribution volume selection boxes among the index fields. The corresponding logical table can then be determined by parsing the query request parameter fields. For instance, in the parameter {"k": "article_dim-publish_at", "startTime": "2020-05-04", "endTime": "2020-05-05"}, the k value "article_dim-publish_at" denotes the publish_at field in the article_dim logical table, and the startTime and endTime values are the start and end times, so the corresponding logical table is article_dim.
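The parsing step can be sketched as follows. This is a minimal illustration, assuming each parameter carries a k value of the form "<logical table>-<field>" as in the example above. The function name group_fields_by_table, the logical table names article_dim and article_fac, and the sample parameter list are assumptions made for illustration, not the implementation described in the application.

```python
from collections import defaultdict


def group_fields_by_table(params):
    """Group requested fields by logical table.

    Assumes every parameter has a "k" value shaped like
    "<logical_table>-<field>", e.g. "article_dim-publish_at".
    """
    grouped = defaultdict(list)
    for param in params:
        # Split only on the first hyphen: table name, then field name.
        table, _, field = param["k"].partition("-")
        if field not in grouped[table]:
            grouped[table].append(field)
    return dict(grouped)


# Hypothetical request: three dimension fields and two index fields.
params = [
    {"k": "article_dim-bid"},
    {"k": "article_dim-cid"},
    {"k": "article_dim-publish_at",
     "startTime": "2020-05-04", "endTime": "2020-05-05"},
    {"k": "article_fac-reprint_view_count"},
    {"k": "article_fac-click_pv"},
]
grouped = group_fields_by_table(params)
```

Each key of the resulting dictionary then corresponds to one inner-layer SQL splicing task.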
Step 303: generate a plurality of inner-layer SQL statements according to the plurality of logical tables and the field information included in each logical table.
In this embodiment, each logical table may correspond to one inner-layer SQL statement, that is, to one inner-layer SQL splicing task. If the query request parameters entered by the user involve multiple logical tables, the inner-layer SQL statements can be constructed in a for loop, one iteration per logical table. Each inner-layer SQL statement is obtained by splicing the relevant field information of the corresponding logical table.
Step 304: splice the plurality of inner-layer SQL statements to obtain a target SQL statement.
Optionally, in this embodiment, multiple inner-layer SQL statements may be directly spliced to obtain a required executable target SQL statement. The target SQL statement can be called an outer SQL statement and is obtained by splicing a plurality of inner SQL statements.
Step 305: perform the data query using the target SQL statement.
With the data query method of this embodiment, the target SQL statement is generated from the query request parameters input by the user and used to perform the data query, so developers do not need to develop dedicated tasks to query data; this improves the timeliness of data queries and meets the need for rapid business iteration. Furthermore, development and communication costs can be reduced to zero, and the data acquisition cycle can be shortened from days to minutes, greatly improving data timeliness.
In the embodiment of the application, in order to meet different query requirements, a plurality of query engines can exist, and a target query engine is selected from the query engines to query required data. Alternatively, the target query engine may be selected from the plurality of query engines according to user input information (such as user identity information, required service information, and the like). Further, the process of performing data query by using the target SQL statement may include: and performing data query by using the target SQL statement through the target query engine. In this way, the accuracy of data query can be improved by querying the data through the selected target query engine.
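As a rough illustration of selecting a target query engine, the sketch below picks an engine name from a few request attributes. The rule set, the attribute names (scenario, needs_cube, is_internal_user) and the function name choose_engine are all hypothetical: the text only states that the choice depends on user input information such as identity and required service, and does not specify the policy.

```python
def choose_engine(scenario, needs_cube=False, is_internal_user=False):
    """Return the name of a query engine for a request.

    Illustrative rules only; the actual routing policy is not given
    in the source text.
    """
    if scenario == "adhoc" and needs_cube:
        # Palo is described as suited to statistical analysis in ad hoc query.
        return "palo"
    if scenario == "offline" and is_internal_user:
        # Pingo4 is described as an internal engine for certain scenarios.
        return "pingo4"
    # Fallback: Zeppelin, where all submitted tasks share one Spark queue.
    return "zeppelin"


engine = choose_engine("adhoc", needs_cube=True)
```

Keeping the rules in one function mirrors the pluggable design: adding an engine means adding a rule and a service, without touching SQL construction.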
In this embodiment, each of the plurality of logic tables may correspond to an inner SQL statement. The process of generating a plurality of inner-layer SQL statements in step 303 may include:
for each logical table and the field information included in the logical table, respectively executing the following processes:
associating the logic table with prestored metadata information to obtain an actual target table name, and associating the field information with the prestored metadata information to obtain an actual target field name;
and generating an inner-layer SQL statement corresponding to the logic table according to the target table name, the target field name and the filtering condition information in the query request parameters.
The pre-stored metadata information may be data layer metadata information, and records pre-divided theme information, dependent service tables, and other information. The target field names may be understood as field names that need to be presented. Therefore, by means of the prestored metadata information, the query request parameters input by the user can be translated into the SQL sentences, and the executable SQL sentences meeting the requirements are obtained.
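The association with pre-stored metadata can be pictured as a lookup from logical names to actual names. The sketch below is a minimal illustration; the METADATA dictionary layout, the function name resolve and the mapping of the article_dim logical table to bjh_data.bjh_dim_essay_df are assumptions for illustration, and a real metadata store would presumably hold more information (topics, association conditions, permissions).

```python
# Hypothetical metadata: logical table -> actual table name and field names.
METADATA = {
    "article_dim": {
        "table": "bjh_data.bjh_dim_essay_df",
        "fields": {
            "rid": "rid",
            "bid": "bid",
            "cid": "cid",
            "publish_at": "publish_at",
        },
    },
}


def resolve(logical_table, logical_fields, metadata=METADATA):
    """Translate a logical table and its fields into actual names.

    Raises KeyError if the table or a field is not registered, which
    also prevents unknown identifiers from reaching the SQL text.
    """
    entry = metadata[logical_table]
    target_fields = [entry["fields"][name] for name in logical_fields]
    return entry["table"], target_fields


target_table, target_fields = resolve("article_dim", ["rid", "bid", "publish_at"])
```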
Optionally, generating the inner-layer SQL statement corresponding to the logical table according to the target table name, the target field names and the filter condition information in the query request parameters may include: storing the target field names in a first linked list, and storing the filter condition information from the query request parameters in a second linked list; and generating the inner-layer SQL statement according to the first linked list, the second linked list and the target table name. The first linked list may be, for example, a selectColumList linked list, and the second a whereList linked list. In this way, the field information can be spliced simply and effectively by following the pointer order of the linked lists to obtain the required inner-layer SQL statement.
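A minimal sketch of this inner-layer splicing is given below. Plain Python lists stand in for the two linked lists, and the function name build_inner_sql is an assumption; the table name, fields and conditions are taken from the worked example later in this description.

```python
def build_inner_sql(select_columns, where_conditions, table_name):
    """Splice one inner-layer SQL statement.

    select_columns plays the role of the first (select) list and
    where_conditions the role of the second (where) list.
    """
    sql = "SELECT " + ",".join(select_columns) + " FROM " + table_name
    if where_conditions:
        # Filter conditions are combined with AND, in list order.
        sql += " WHERE " + " AND ".join(where_conditions)
    return sql


inner_sql = build_inner_sql(
    ["rid", "bid", "cid", "publish_at"],
    [
        "event_day=20200508",
        "to_date(publish_at)>='2020-05-04'",
        "to_date(publish_at)<='2020-05-05'",
    ],
    "bjh_data.bjh_dim_essay_df",
)
```

Because the statement is built by string concatenation, this sketch assumes that field names and condition values have already been validated, for example against the metadata information.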
Optionally, splicing the plurality of inner-layer SQL statements to obtain the target SQL statement may include: determining association condition information among the logical tables according to the logical tables and the pre-stored metadata information; and splicing the plurality of inner-layer SQL statements according to the target field names and the association condition information among the logical tables to obtain the target SQL statement. To obtain the target SQL statement, the inner-layer SQL statements are first assembled by association based on the association condition information among the logical tables; the assembled statement is then processed based on the target field names, that is, the fields to be displayed, to obtain the final target SQL statement.
In one embodiment, the association assembly of a plurality of inner-layer SQL statements may be implemented with JOIN ... ON. For example, as shown in fig. 4, if three inner-layer SQL statements are generated, namely SELECT a, … FROM A; SELECT b, … FROM B; and SELECT c, … FROM C, then the outer SQL statement (i.e. the target SQL statement) can be obtained using the following JOIN expression:
SELECT a.x, b.y, c.z
FROM
(SELECT a, … FROM A) a
JOIN
(SELECT b, … FROM B) b
ON a.x = b.x  -- association condition of the first two inner-layer SQL statements
JOIN
(SELECT c, … FROM C) c
ON b.y = c.y  -- association condition of the last two inner-layer SQL statements
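The JOIN assembly above can be sketched as a small helper. This is an illustration under stated assumptions: the function name build_outer_sql, the (alias, sql) pair representation and the placeholder inner statements are hypothetical, and each inner statement is assumed to be joined to the preceding ones by exactly one ON condition.

```python
def build_outer_sql(display_fields, inner_sqls, join_conditions):
    """Assemble inner-layer statements into one outer statement.

    inner_sqls: list of (alias, sql) pairs, in join order.
    join_conditions: one ON condition per JOIN, so its length must be
    len(inner_sqls) - 1.
    """
    if not inner_sqls or len(join_conditions) != len(inner_sqls) - 1:
        raise ValueError("need exactly one join condition per JOIN")
    first_alias, first_sql = inner_sqls[0]
    parts = [
        "SELECT " + ",".join(display_fields),
        "FROM",
        f"({first_sql}) {first_alias}",
    ]
    for (alias, sql), condition in zip(inner_sqls[1:], join_conditions):
        parts += ["JOIN", f"({sql}) {alias}", "ON " + condition]
    return "\n".join(parts)


outer_sql = build_outer_sql(
    ["a.x", "b.y", "c.z"],
    [
        ("a", "SELECT x FROM A"),
        ("b", "SELECT x,y FROM B"),
        ("c", "SELECT y,z FROM C"),
    ],
    ["a.x = b.x", "b.y = c.y"],
)
```

With a single inner statement and no join conditions, the helper simply wraps that statement as a subquery, which matches the case where the query involves only one logical table.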
In one embodiment, the server may obtain the outer SQL statement as follows.
First, the inner-layer splicing process (as shown in fig. 5):
1) the server obtains the query request parameters input by the user through a collectParam operation;
2) the query request parameters are parsed to obtain the logical tables and field information used by the query; each table corresponds to one inner-layer SQL splicing task, and if the query involves multiple tables, the inner-layer SQL statements are constructed in a for loop;
3) each logical table and its field information are associated with the metadata information of the data layer to obtain the actual target table name and target field names in the data warehouse;
4) the target field names are stored in the selectColumList linked list and also written to the outFielddNameBuiled linked list for outer-layer splicing;
5) the query filter condition information from the query request parameters is stored in the whereList linked list;
6) an SQL statement is generated from the selectColumList linked list, the whereList linked list and the target table name, and the SQL splicing result is stored in the innerSQLList linked list.
Second, the outer-layer splicing process (as shown in fig. 6):
7) the inner-layer splicing results, namely the inner-layer SQL statements, are obtained from the innerSQLList linked list;
8) the target field names, namely the fields to be displayed, are obtained from the outFielddNameBuiled linked list;
9) the association condition information among the logical tables is looked up according to the logical tables and the metadata information of the data layer;
10) the inner-layer SQL statements are assembled by association using JOIN ON, based on the fields to be displayed and the association condition information among the logical tables, to generate the final query SQL.
The present application will be described in detail with reference to specific examples.
In this particular example, it is assumed that the reading volume and the distribution volume generated by the articles published from 2020-05-04 to 2020-05-05 are counted. The process of generating query SQL specifically may be:
S1: On the article topic page, the user checks the article B-end ID, article C-end ID and article publication time selection boxes among the dimension fields, selects a publication time range from 2020-05-04 to 2020-05-05, checks the reading volume and distribution volume selection boxes among the index fields, and submits a query request. The request parameters may be as follows:
Figure BDA0002507195380000091
S2: The server parses the request parameters. For example, in the parameter {"k": "article_dim-publish_at", "startTime": "2020-05-04", "endTime": "2020-05-05"}, the k value "article_dim-publish_at" denotes the publish_at field in the article_dim logical table, and the startTime and endTime values are the start and end times. The server thereby determines that the logical tables involved in the query are article_dim and article_fac, and that the required fields are rid, bid, cid, publish_at and rid, reprint_view_count, click_pv, respectively.
S3: The fields involved in the query are grouped by logical table name. This example involves two logical tables, article_dim and article_fac, so the article_dim logical table and its rid, bid, cid and publish_at fields form one group, and the article_fac logical table and its rid, reprint_view_count and click_pv fields form another.
S4: Take the article_dim logical table and its rid, bid, cid and publish_at fields, associate them with the metadata information of the data layer, and obtain the actual table name and field names in the data warehouse.
S5: Store the actual field names in the selectColumList linked list, and also write them to the outFielddNameBuiled linked list for outer-layer splicing.
S6: Store the publish_at filter condition in the whereList linked list.
S7: Generate an SQL query statement from the selectColumList linked list, the whereList linked list and the actual table name, and store the current SQL splicing result in the innerSQLList linked list.
S8: Traverse all groups, repeating S5 through S7. At this point, the inner-layer SQL splicing is complete.
For example, the inner SQL concatenation results are as follows:
(1)
SELECT rid,bid,cid,publish_at
FROM bjh_data.bjh_dim_essay_df
WHERE event_day=20200508
AND to_date(publish_at)>='2020-05-04'
AND to_date(publish_at)<='2020-05-05'
(2)
SELECT rid,reprint_view_count,click_pv
FROM bjh_data.bjh_essay_pre_aggregated
WHERE event_day=20200508
S9: Obtain the inner SQL statements from the innerSQLList linked list.
S10: Obtain the fields to be displayed in the final result from outFieldNameBuild, including: a.bid, a.cid, a.publish_at, b.reprint_view_count, and b.click_pv.
S11: Query the association condition information between the logical tables according to the logical tables of the inner SQL statements and the metadata information of the data layer. In this example, the association condition is that the rid values are equal.
S12: Perform association assembly (for example, by using a JOIN ... ON expression) on the inner-layer SQL statements based on the fields to be displayed and the association condition information between the logical tables, to generate the final query SQL.
For example, the outer query SQL in this example can be as follows:
Figure BDA0002507195380000111
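Steps S2 to S12 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation described in the application: the METADATA dictionary layout, the function names, the single-letter aliases, the identity mapping between logical and physical field names, and the article_dim / article_fac spellings of the logical table names are all assumed. The physical table names, field names, and filter values are taken from the example above. Plain string concatenation is used for brevity; a real system would need to validate user-supplied values before embedding them in SQL.

```python
# Hypothetical metadata layer: logical table -> physical table and join key.
METADATA = {
    "article_dim": {"table": "bjh_data.bjh_dim_essay_df", "join_key": "rid"},
    "article_fac": {"table": "bjh_data.bjh_essay_pre_aggregated", "join_key": "rid"},
}


def parse_time_filter(param):
    """S2: split the k value "table-field" and build the date-range conditions."""
    table, field = param["k"].split("-", 1)
    return table, [
        "to_date({})>='{}'".format(field, param["startTime"]),
        "to_date({})<='{}'".format(field, param["endTime"]),
    ]


def group_fields(keys):
    """S3: group requested "table-field" keys by logical table name."""
    groups = {}
    for key in keys:
        table, field = key.split("-", 1)
        groups.setdefault(table, []).append(field)
    return groups


def build_inner_sql(table, fields, where_list):
    """S4-S7: resolve the physical table and assemble one inner SELECT."""
    meta = METADATA[table]
    select_colum_list = [meta["join_key"]] + fields  # the join key is always selected
    return "SELECT {} FROM {} WHERE {}".format(
        ",".join(select_colum_list), meta["table"], " AND ".join(where_list))


def build_target_sql(keys, time_param, event_day):
    """S8-S12: build every inner query, then join them on the shared key."""
    filter_table, conditions = parse_time_filter(time_param)
    inner_sql_list, out_fields = [], []
    # Aliases a, b, c, ... are an assumption; this sketch supports up to 8 tables.
    for alias, (table, fields) in zip("abcdefgh", group_fields(keys).items()):
        where_list = ["event_day={}".format(event_day)]
        if table == filter_table:
            where_list += conditions
        inner_sql_list.append((alias, table, build_inner_sql(table, fields, where_list)))
        out_fields += ["{}.{}".format(alias, f) for f in fields]
    first_alias, first_table, first_sql = inner_sql_list[0]
    sql = "SELECT {} FROM ({}) {}".format(", ".join(out_fields), first_sql, first_alias)
    for alias, table, inner_sql in inner_sql_list[1:]:
        sql += " JOIN ({}) {} ON {}.{} = {}.{}".format(
            inner_sql, alias,
            first_alias, METADATA[first_table]["join_key"],
            alias, METADATA[table]["join_key"])
    return sql


keys = [
    "article_dim-bid", "article_dim-cid", "article_dim-publish_at",
    "article_fac-reprint_view_count", "article_fac-click_pv",
]
time_param = {"k": "article_dim-publish_at",
              "startTime": "2020-05-04", "endTime": "2020-05-05"}
target_sql = build_target_sql(keys, time_param, 20200508)
print(target_sql)
```

Apart from line breaks, the two inner queries generated here correspond to statements (1) and (2) above. The shape of the outer JOIN is an assumption, since the outer query in the original is available only as a figure.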
Referring to Fig. 7, which is a schematic structural diagram of a data query device according to an embodiment of the present application, the data query device 70 includes:
an obtaining module 71, configured to obtain a query request parameter input by a user;
a parsing module 72, configured to parse the query request parameter to obtain a plurality of logical tables required by the query and the field information included in each logical table;
a generating module 73, configured to generate a plurality of inner-layer SQL statements according to the plurality of logical tables and the field information included in each of the logical tables;
a splicing module 74, configured to splice the plurality of inner-layer SQL statements to obtain a target SQL statement;
and a query module 75, configured to perform a data query by using the target SQL statement.
Optionally, each of the plurality of logical tables corresponds to an inner SQL statement; the generating module 73 includes:
an execution unit, configured to perform the following processes for each logical table and field information included in the logical table, respectively:
associating the logical table with pre-stored metadata information to obtain an actual target table name, and associating the field information with the pre-stored metadata information to obtain an actual target field name;
and generating an inner-layer SQL statement corresponding to the logical table according to the target table name, the target field name, and the filter condition information in the query request parameters.
Optionally, the execution unit includes:
a storage subunit, configured to store the target field names in a first linked list and to store the filter condition information in the query request parameters in a second linked list;
and a generating subunit, configured to generate the inner-layer SQL statement according to the first linked list, the second linked list, and the target table name.
Optionally, the splicing module 74 includes:
a determining unit, configured to determine association condition information between the plurality of logical tables according to the plurality of logical tables and the pre-stored metadata information;
and a splicing unit, configured to splice the plurality of inner-layer SQL statements according to the target field names and the association condition information between the logical tables to obtain the target SQL statement.
Optionally, the data query device 70 further includes:
a selection module, configured to select a target query engine from a plurality of query engines according to user input information;
the query module 75 is specifically configured to:
perform the data query by using the target SQL statement through the target query engine.
It can be understood that the data query device 70 according to the embodiment of the present application can implement the processes implemented in the method embodiment shown in fig. 3 and achieve the same beneficial effects, and for avoiding repetition, the details are not repeated here.
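The optional selection module can be illustrated with a small dispatch table. The sketch below is hypothetical: the engine names, the registry layout, and the stand-in engines that merely echo the SQL are assumptions made for illustration, and no real query engine API is implied.

```python
from typing import Callable, Dict, List

# A query engine is modelled as any callable that takes SQL and returns rows.
QueryEngine = Callable[[str], List[dict]]


def select_engine(engines: Dict[str, QueryEngine], user_choice: str) -> QueryEngine:
    """Pick the target query engine named in the user input."""
    try:
        return engines[user_choice]
    except KeyError:
        raise ValueError("unknown query engine: {!r}".format(user_choice)) from None


def fake_engine(name: str) -> QueryEngine:
    """Stand-in engine that records which engine received which SQL."""
    def run(sql: str) -> List[dict]:
        return [{"engine": name, "sql": sql}]
    return run


# Hypothetical registry of available engines.
engines = {"engine_a": fake_engine("engine_a"), "engine_b": fake_engine("engine_b")}

# The query module runs the target SQL statement through the selected engine.
rows = select_engine(engines, "engine_b")("SELECT 1")
print(rows)
```

Keeping the engines behind a common callable signature means the splicing logic that produces the target SQL statement does not need to know which engine will execute it.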
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 8 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in Fig. 8, the electronic device includes: one or more processors 801, a memory 802, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories and multiple types of memory, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). Fig. 8 takes one processor 801 as an example.
The memory 802 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the data query method provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the data query method provided herein.
The memory 802, as a non-transitory computer readable storage medium, may be used for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (for example, the obtaining module 71, the parsing module 72, the generating module 73, the splicing module 74, and the query module 75 shown in fig. 7) corresponding to the data query method in the embodiment of the present application. The processor 801 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory 802, that is, implements the data query method in the above-described method embodiments.
The memory 802 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device for data inquiry, and the like. Further, the memory 802 may include high speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 802 optionally includes memory located remotely from the processor 801, which may be connected via a network to an electronic device for implementing the data query method. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for implementing the data query method may further include: an input device 803 and an output device 804. The processor 801, the memory 802, the input device 803, and the output device 804 may be connected by a bus or by other means; connection by a bus is taken as an example in Fig. 8.
The input device 803 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device implementing the data query method. Examples of the input device include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 804 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, the target SQL statement can be generated from the query request parameters input by the user, and the data query is performed by using the target SQL statement. Developers therefore do not need to develop a dedicated task for each data query, which improves the timeliness of data queries and meets the needs of rapid business iteration.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (12)

1. A method of data query, comprising:
acquiring a query request parameter input by a user;
parsing the query request parameters to obtain a plurality of logical tables required by the query and the field information included in each logical table;
generating a plurality of inner-layer Structured Query Language (SQL) statements according to the plurality of logical tables and the field information included in each logical table;
splicing the plurality of inner-layer SQL statements to obtain a target SQL statement;
and performing a data query by using the target SQL statement.
2. The method of claim 1, wherein each of the plurality of logical tables corresponds to an inner SQL statement; the generating a plurality of inner-layer SQL statements according to the plurality of logical tables and the field information included in each logical table comprises:
for each logical table and the field information included in the logical table, respectively executing the following processes:
associating the logic table with prestored metadata information to obtain an actual target table name, and associating the field information with the prestored metadata information to obtain an actual target field name;
and generating an inner-layer SQL statement corresponding to the logic table according to the target table name, the target field name and the filtering condition information in the query request parameters.
3. The method of claim 2, wherein the generating the inner-layer SQL statement corresponding to the logical table according to the target table name, the target field name, and the filter condition information in the query request parameter comprises:
storing the target field names into a first linked list, and storing the filter condition information in the query request parameters into a second linked list;
and generating the inner-layer SQL statement according to the first linked list, the second linked list and the target table name.
4. The method of claim 2, wherein the splicing the plurality of inner-layer SQL statements to obtain a target SQL statement comprises:
determining association condition information between the logical tables according to the logical tables and the pre-stored metadata information;
and splicing the plurality of inner-layer SQL statements according to the target field names and the association condition information between the logical tables to obtain the target SQL statement.
5. The method of claim 1, further comprising:
selecting a target query engine from a plurality of query engines according to user input information;
the data query by using the target SQL statement comprises the following steps:
and performing data query by using the target SQL statement through the target query engine.
6. An apparatus for data querying, comprising:
an acquisition module, configured to acquire a query request parameter input by a user;
a parsing module, configured to parse the query request parameter to obtain a plurality of logical tables required by the query and the field information included in each logical table;
a generating module, configured to generate a plurality of inner-layer SQL statements according to the plurality of logical tables and the field information included in each logical table;
a splicing module, configured to splice the plurality of inner-layer SQL statements to obtain a target SQL statement;
and a query module, configured to perform a data query by using the target SQL statement.
7. The apparatus of claim 6, wherein each of the plurality of logical tables corresponds to an inner SQL statement; the generation module comprises:
an execution unit, configured to perform the following processes for each logical table and field information included in the logical table, respectively:
associating the logic table with prestored metadata information to obtain an actual target table name, and associating the field information with the prestored metadata information to obtain an actual target field name;
and generating an inner-layer SQL statement corresponding to the logic table according to the target table name, the target field name and the filtering condition information in the query request parameters.
8. The apparatus of claim 7, wherein the execution unit comprises:
a storage subunit, configured to store the target field names in a first linked list and to store the filter condition information in the query request parameters in a second linked list;
and a generating subunit, configured to generate the inner-layer SQL statement according to the first linked list, the second linked list, and the target table name.
9. The apparatus of claim 7, wherein the splicing module comprises:
a determining unit, configured to determine association condition information between the plurality of logical tables according to the plurality of logical tables and the pre-stored metadata information;
and a splicing unit, configured to splice the plurality of inner-layer SQL statements according to the target field names and the association condition information between the logical tables to obtain the target SQL statement.
10. The apparatus of claim 6, further comprising:
a selection module, configured to select a target query engine from a plurality of query engines according to user input information;
the query module is specifically configured to:
perform the data query by using the target SQL statement through the target query engine.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
CN202010450068.5A 2020-05-25 2020-05-25 Data query method and device, electronic equipment and readable storage medium Pending CN111639078A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010450068.5A CN111639078A (en) 2020-05-25 2020-05-25 Data query method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN111639078A true CN111639078A (en) 2020-09-08

Family

ID=72331331

Country Status (1)

Country Link
CN (1) CN111639078A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050262048A1 (en) * 2004-05-05 2005-11-24 International Business Machines Corporation Dynamic database access via standard query language and abstraction technology
CN103093000A (en) * 2013-02-25 2013-05-08 用友软件股份有限公司 Database query modeling system and database query modeling method
CN106649828A (en) * 2016-12-29 2017-05-10 中国银联股份有限公司 Data query method and system
CN107609130A (en) * 2017-09-18 2018-01-19 链家网(北京)科技有限公司 A kind of method and server for selecting data query engine
CN107798026A (en) * 2016-09-05 2018-03-13 北京京东尚科信息技术有限公司 Data query method and apparatus
CN108197277A (en) * 2018-01-09 2018-06-22 福建星瑞格软件有限公司 A kind of unified data base administration querying method and device
CN109033123A (en) * 2018-05-31 2018-12-18 康键信息技术(深圳)有限公司 Querying method, device, computer equipment and storage medium based on big data
CN109947788A (en) * 2017-10-30 2019-06-28 北京京东尚科信息技术有限公司 Data query method and apparatus
CN110704479A (en) * 2019-09-12 2020-01-17 新华三大数据技术有限公司 Task processing method and device, electronic equipment and storage medium
CN110717319A (en) * 2019-09-24 2020-01-21 车智互联(北京)科技有限公司 Self-service report generation method and device, computing equipment and system
CN111177174A (en) * 2018-11-09 2020-05-19 百度在线网络技术(北京)有限公司 SQL statement generation method, device, equipment and computer readable storage medium

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112416964A (en) * 2020-11-17 2021-02-26 深圳依时货拉拉科技有限公司 Data processing method, device and system, computer equipment and computer readable storage medium
CN112364051A (en) * 2020-11-25 2021-02-12 腾讯科技(深圳)有限公司 Data query method and device
CN112364051B (en) * 2020-11-25 2024-03-15 腾讯科技(深圳)有限公司 Data query method and device
CN112364025A (en) * 2020-11-30 2021-02-12 中国银行股份有限公司 Data processing method and device, electronic equipment and computer readable storage medium
CN112364025B (en) * 2020-11-30 2023-09-22 中国银行股份有限公司 Data processing method and device, electronic equipment and computer readable storage medium
CN112506948A (en) * 2020-12-03 2021-03-16 中国人寿保险股份有限公司 Index query method of service information and related equipment
CN112988781A (en) * 2021-02-02 2021-06-18 北京金山云网络技术有限公司 Data query method and device, electronic equipment and computer readable storage medium
CN112860728A (en) * 2021-02-19 2021-05-28 深圳市极致科技股份有限公司 Method and device for adding display columns to data query table based on user environment
CN112860727A (en) * 2021-02-20 2021-05-28 平安科技(深圳)有限公司 Data query method, device, equipment and medium based on big data query engine
CN112860727B (en) * 2021-02-20 2024-01-12 平安科技(深圳)有限公司 Data query method, device, equipment and medium based on big data query engine
CN112905627A (en) * 2021-03-23 2021-06-04 金岭教育科技(北京)有限公司 Data processing method, data processing device, computer equipment and storage medium
CN113064914A (en) * 2021-04-22 2021-07-02 中国工商银行股份有限公司 Data extraction method and device
CN113282569A (en) * 2021-05-06 2021-08-20 南京苏宁软件技术有限公司 System architecture, design method and data query method for data application
CN113282569B (en) * 2021-05-06 2022-11-11 南京苏宁软件技术有限公司 System architecture, design method and data query method for data application
CN113468208A (en) * 2021-07-19 2021-10-01 网易(杭州)网络有限公司 Method and device for generating data query statement, server and storage medium
WO2023029752A1 (en) * 2021-08-31 2023-03-09 深圳市兆珑科技有限公司 Data query method and apparatus, server, and computer-readable storage medium
CN113821533B (en) * 2021-09-30 2023-09-08 北京鲸鹳科技有限公司 Method, device, equipment and storage medium for data query
CN113821533A (en) * 2021-09-30 2021-12-21 北京鲸鹳科技有限公司 Data query method, device, equipment and storage medium
CN114020765A (en) * 2021-11-04 2022-02-08 广州易方信息科技股份有限公司 Performance data extraction method and device, computer equipment and storage medium
CN115168408A (en) * 2022-08-16 2022-10-11 北京永洪商智科技有限公司 Query optimization method, device, equipment and storage medium based on reinforcement learning
CN116775680A (en) * 2023-05-31 2023-09-19 北京龙软科技股份有限公司 SQL-based method for operating MongoDB database

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination