CN112487036A - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN112487036A
Authority
CN
China
Prior art keywords
data processing
data
hive
target sql
configuration page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011397493.9A
Other languages
Chinese (zh)
Inventor
季振宇
顾晨波
赵文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guotai Epoint Software Co Ltd
Original Assignee
Guotai Epoint Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guotai Epoint Software Co Ltd
Priority to CN202011397493.9A
Publication of CN112487036A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275Synchronous replication
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/283Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a data processing method and device in the technical field of computers. The method comprises: displaying a visual configuration page; performing data modeling through a data import control in the configuration page to import data to be processed into a hive library; configuring a data processing model according to data processing requirements through a data processing control in the configuration page to obtain target SQL; when the execution requirement of the target SQL is online execution, executing the target SQL using presto; and when the execution requirement of the target SQL is timing (scheduled) execution, executing the target SQL using hive. This solves the problem that executing SQL only through hive offers a single data processing mode that cannot meet the user's data processing requirements. Automatic switching between hive SQL and presto SQL is realized, so that the user's data processing requirements are met.

Description

Data processing method and device
Technical Field
The application relates to a data processing method and device, and belongs to the technical field of computers.
Background
hive is a data warehouse tool based on Hadoop. Hive is used for data extraction, transformation and loading, and is a mechanism capable of storing, querying and analyzing large-scale data stored in Hadoop. The hive data warehouse tool can map the structured data file into a database table, provide SQL query function and convert SQL sentences into MapReduce tasks for execution.
However, hive is not suited to online transaction processing and does not provide real-time queries, so it cannot meet real-time online data processing requirements.
Disclosure of Invention
The application provides a data processing method and device, which can solve the problem that executing SQL only through hive offers a single data processing mode that cannot meet the user's data processing requirements. The application provides the following technical solutions:
in a first aspect, a data processing method is provided, the method including:
displaying a visual configuration page;
performing data modeling through a data import control in the configuration page to import data to be processed into a hive library;
configuring a data processing model according to data processing requirements through a data processing control in the configuration page to obtain target SQL;
when the execution requirement of the target SQL is online execution, executing the target SQL by using presto;
when the execution requirement of the target SQL is timing execution, using hive to execute the target SQL.
Optionally, the data modeling by the data import control in the configuration page includes:
configuring a database connection through the data import control, and synchronizing the table structure into platform metadata by querying the directly connected database;
and/or,
configuring a mapping relation through the data import control, and importing data from a relational database into the hive library;
and/or,
configuring a mapping relation through the data import control, and importing data from an unstructured file into the hive library;
and/or,
importing a library table resource through the data import control, and, when the library table resource supports configuring a timing task, synchronizing data to the hive library on a schedule.
Optionally, the configuring, by the data processing control in the configuration page, the data processing model according to the data processing requirement to obtain the target SQL, includes:
displaying fields in an input stream in the configuration page; and receiving the selection operation of the user on the field to obtain the target SQL comprising the selected field.
Optionally, the configuring, by the data processing control in the configuration page, the data processing model according to the data processing requirement to obtain the target SQL, includes:
receiving an association (join) operation performed on two input tables in the configuration page to obtain the target SQL.
Optionally, the configuring, by the data processing control in the configuration page, the data processing model according to the data processing requirement to obtain the target SQL, includes:
when the default functions of the hive library do not support the data processing requirement, acquiring a custom function package, wherein the custom function package comprises a custom function supporting the data processing requirement;
registering the custom function package with the hive library, and then triggering execution of the step of configuring a data processing model according to data processing requirements through the data processing control in the configuration page to obtain the target SQL, wherein the data processing control corresponds to the functions supported by the hive library.
In a second aspect, there is provided a data processing apparatus, the apparatus comprising:
the page display module is used for displaying a visual configuration page;
the data modeling module is used for carrying out data modeling through the data import control in the configuration page so as to import the data to be processed into the hive library;
the model establishing module is used for configuring a data processing model according to data processing requirements through the data processing control in the configuration page to obtain target SQL;
the first execution module is used for executing the target SQL by using presto when the execution requirement of the target SQL is online execution;
and the second execution module is used for executing the target SQL by using hive when the execution requirement of the target SQL is timing execution.
Optionally, the data modeling module is configured to:
configuring a database connection through the data import control, and synchronizing the table structure into platform metadata by querying the directly connected database;
and/or,
configuring a mapping relation through the data import control, and importing data from a relational database into the hive library;
and/or,
configuring a mapping relation through the data import control, and importing data from an unstructured file into the hive library;
and/or,
importing a library table resource through the data import control, and, when the library table resource supports configuring a timing task, synchronizing data to the hive library on a schedule.
Optionally, the model building module is configured to:
displaying fields in an input stream in the configuration page; and receiving the selection operation of the user on the field to obtain the target SQL comprising the selected field.
Optionally, the model building module is configured to:
receiving an association (join) operation performed on two input tables in the configuration page to obtain the target SQL.
Optionally, the model building module is configured to:
when the default functions of the hive library do not support the data processing requirement, acquiring a custom function package, wherein the custom function package comprises a custom function supporting the data processing requirement;
registering the custom function package with the hive library, and then triggering execution of the step of configuring a data processing model according to data processing requirements through the data processing control in the configuration page to obtain the target SQL, wherein the data processing control corresponds to the functions supported by the hive library.
The beneficial effects of the present application are as follows: a visual configuration page is displayed; data modeling is performed through a data import control in the configuration page to import data to be processed into a hive library; a data processing model is configured according to data processing requirements through a data processing control in the configuration page to obtain target SQL; when the execution requirement of the target SQL is online execution, presto is used to execute the target SQL; when the execution requirement of the target SQL is timing (scheduled) execution, hive is used to execute the target SQL. This solves the problem that executing SQL only through hive offers a single data processing mode that cannot meet the user's data processing requirements. The target SQL can be executed using presto when a model is previewed or run directly, and execution is automatically switched to hive when the model is executed on a schedule after being released. Automatic switching between hive SQL and presto SQL is thus realized, meeting the user's data processing requirements.
In addition, a set of visual web interface enables a user to quickly arrange and generate SQL, real-time data preview is achieved, the user does not need to write complex SQL sentences, and operation difficulty is reduced.
The foregoing description is only an overview of the technical solutions of the present application. To make these technical solutions clearer and to enable them to be implemented according to the content of the description, a detailed description is given below with reference to the preferred embodiments of the present application and the accompanying drawings.
Drawings
FIG. 1 is a flow chart of a data processing method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of an interface for data import according to an embodiment of the present application;
FIG. 3 is a schematic interface diagram of a hive library provided by one embodiment of the present application;
FIG. 4 is a schematic diagram of a column select interface provided in one embodiment of the present application;
FIG. 5 is a schematic interface diagram of table associations provided by one embodiment of the present application;
FIG. 6 is an interface diagram of all data models provided by one embodiment of the present application;
FIG. 7 is an interface diagram of a data model provided by an embodiment of the present application;
FIG. 8 is a block diagram of a data processing apparatus provided in one embodiment of the present application;
FIG. 9 is a block diagram of a data processing apparatus according to an embodiment of the present application.
Detailed Description
The following detailed description of embodiments of the present application will be described in conjunction with the accompanying drawings and examples. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
First, several terms referred to in the present application will be described.
Structured Query Language (SQL): a special-purpose language for database querying and programming, used to access data and to query, update and manage relational database systems.
hive: a data warehouse analysis system built on Hadoop. It provides a rich SQL query approach for analyzing data stored in the Hadoop distributed file system. hive can map structured data files to database tables and provide full SQL query functionality. hive can also convert SQL statements into MapReduce tasks for execution; the required content is queried and analyzed through its own SQL dialect, which is referred to as Hive SQL.
Presto: a distributed SQL query engine designed specifically for high-speed, real-time data analysis. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions.
Presto's operating model is fundamentally different from that of hive or MapReduce. hive translates a query into a multi-stage MapReduce task whose stages run one after another; each stage reads its input data from disk and writes intermediate results to disk. The Presto engine does not use MapReduce. Instead, it uses a custom query and execution engine with corresponding operators to support the SQL syntax. In addition to an improved scheduling algorithm, all data processing is done in memory, and the different processing nodes form a processing pipeline over the network. This avoids unnecessary disk reads and writes and the latency they add. Such a pipelined execution model runs multiple data processing stages at the same time, passing data from one stage to the next as soon as it is available, which greatly reduces the end-to-end response time of queries.
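The contrast between the two execution models can be illustrated with a toy sketch. The following Python is only an analogy and does not reflect how hive or Presto is actually implemented: `staged` materializes each intermediate result in full before the next stage starts, while `pipelined` passes each row on to the next stage as soon as it is available. The function names and the filter and double stages are hypothetical.

```python
from itertools import count
from typing import Iterable, Iterator, List


def staged(rows: List[int]) -> List[int]:
    """Run stage by stage; each stage finishes before the next begins."""
    filtered = [r for r in rows if r % 2 == 0]  # stage 1 output is fully materialized
    doubled = [r * 2 for r in filtered]         # stage 2 reads the whole stage 1 output
    return doubled


def pipelined(rows: Iterable[int]) -> Iterator[int]:
    """Chain lazy stages; a row flows through every stage before the next row is read."""
    filtered = (r for r in rows if r % 2 == 0)
    return (r * 2 for r in filtered)


if __name__ == "__main__":
    print(staged(list(range(6))))
    print(list(pipelined(range(6))))
    # The pipelined version can emit a first result from an unbounded input,
    # because it never waits for a stage to finish.
    print(next(pipelined(count())))
```

Both versions compute the same result on finite input; the difference lies only in when intermediate data becomes available to the next stage.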
Metadata: data that describes data (data about data), also called intermediary data or relay data. It mainly describes the properties of data and supports functions such as indicating storage locations, recording history, resource lookup, and file recording. Metadata can be regarded as an electronic catalog.
Optionally, the present application is described taking a computer device as the execution subject of each embodiment. The computer device may be a desktop computer, a notebook computer, a tablet computer, a mobile phone, or the like; this embodiment does not limit the device type of the computer device.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present application. The method at least comprises the following steps:
Step 101, displaying a visual configuration page.
The configuration page is a set of visual web interfaces that lets a user quickly arrange and generate SQL and preview data in real time. The user therefore does not need to write complex SQL statements, which reduces the operation difficulty.
Step 102, performing data modeling through a data import control in the configuration page to import the data to be processed into the hive library.
In one example, data modeling through the data import control in the configuration page includes: configuring a database connection through the data import control, and synchronizing the table structure into platform metadata by querying the directly connected database; and/or configuring a mapping relation through the data import control, and importing data from a relational database into the hive library; and/or configuring a mapping relation through the data import control, and importing data from an unstructured file into the hive library; and/or importing a library table resource through the data import control, and, when the library table resource supports configuring a timing task, synchronizing data to the hive library on a schedule.
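As a minimal sketch of the mapping-relation option, a configured mapping between source columns and hive columns could be turned into a hive table definition and an extraction query. The application does not specify how the mapping is stored or applied; the `ColumnMapping` type, the function names, and the table and column names below are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ColumnMapping:
    source_column: str  # column name in the relational source table
    hive_column: str    # column name in the target hive table
    hive_type: str      # hive column type, e.g. STRING or DOUBLE


def build_hive_ddl(hive_table: str, mappings: List[ColumnMapping]) -> str:
    """Build a CREATE TABLE statement for the target hive table."""
    if not mappings:
        raise ValueError("at least one column mapping is required")
    columns = ", ".join(f"{m.hive_column} {m.hive_type}" for m in mappings)
    return f"CREATE TABLE IF NOT EXISTS {hive_table} ({columns})"


def build_extract_query(source_table: str, mappings: List[ColumnMapping]) -> str:
    """Build the query that reads the mapped columns from the source table."""
    if not mappings:
        raise ValueError("at least one column mapping is required")
    columns = ", ".join(f"{m.source_column} AS {m.hive_column}" for m in mappings)
    return f"SELECT {columns} FROM {source_table}"


if __name__ == "__main__":
    example = [
        ColumnMapping("usr_code", "user_code", "STRING"),
        ColumnMapping("water_amt", "water_usage", "DOUBLE"),
    ]
    print(build_hive_ddl("ods.water_usage", example))
    print(build_extract_query("billing.usage", example))
```

The sketch trusts its inputs; a real platform would also need to validate names and map source types to hive types.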
Optionally, referring to the configuration page shown in fig. 2, the data import control in the configuration page includes a data source classification selection control 21, a data source type selection control 22, a data source selection control 23 and a table selection control 24; data import is realized through these different controls.
Referring to the hive library shown in fig. 3, the hive library provides a data search function: a data name input area 31 and a search control 32 are displayed in the configuration page, and the user searches by entering the name of the data to be searched in the data name input area 31 and clicking the search control 32. In addition, the hive library displayed in the configuration page includes file resources 33 and library table resources 34. Fig. 3 takes the library table resources 34 as an example; they include information such as a serial number, the source table name, the database to which the table belongs, the hive table name, import state, update state, number of records, import time, creator, and scheduling configuration. Fig. 3 is only an illustration: in practical implementations the library table resources 34 may include more or less information, and this embodiment does not limit their content.
Step 103, configuring a data processing model according to data processing requirements through a data processing control in the configuration page to obtain the target SQL.
Optionally, configuring the data processing model according to the data processing requirement through the data processing control in the configuration page to obtain the target SQL includes: when the default functions of the hive library do not support the data processing requirement, acquiring a custom function package, wherein the custom function package comprises a custom function supporting the data processing requirement; and registering the custom function package with the hive library, and then triggering execution of the step of configuring the data processing model according to data processing requirements through the data processing control in the configuration page to obtain the target SQL, wherein the data processing control corresponds to the functions supported by the hive library.
Illustratively, a user can define a custom hive function by writing java code, thereby meeting complex data processing requirements. In this case, the user uploads the compiled udf jar package to hdfs; a udf component is then created and associated with the udf function, and the function is automatically registered with the hive library.
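The registration step can be sketched as generating a hive `CREATE FUNCTION ... USING JAR` statement that points at the uploaded jar. This is an assumption about how the automatic registration might be realized, since the application does not give the statement it issues; the helper name, function name, class name and hdfs path are all hypothetical.

```python
import re

# Conservative patterns for a function name and a fully qualified java class name.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_CLASS_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def build_register_udf_sql(function_name: str, class_name: str, jar_uri: str) -> str:
    """Build a hive statement that registers a permanent function from a jar."""
    if not _IDENTIFIER.match(function_name):
        raise ValueError(f"invalid function name: {function_name!r}")
    if not _CLASS_NAME.match(class_name):
        raise ValueError(f"invalid class name: {class_name!r}")
    if "'" in jar_uri:
        raise ValueError("jar URI must not contain a single quote")
    return f"CREATE FUNCTION {function_name} AS '{class_name}' USING JAR '{jar_uri}'"


if __name__ == "__main__":
    print(build_register_udf_sql(
        "mask_phone",                  # hypothetical function name
        "com.example.udf.MaskPhone",   # hypothetical implementing class
        "hdfs:///udf/mask-phone.jar",  # hypothetical jar location on hdfs
    ))
```

Validating the names before building the statement keeps user-supplied values from altering the generated SQL.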
In one example, configuring, by a data processing control in a configuration page, a data processing model according to a data processing requirement to obtain a target SQL, includes: displaying fields in the input stream in a configuration page; and receiving the selection operation of the user on the fields to obtain the target SQL comprising the selected fields.
Referring to the schematic interface diagram of fig. 4 for obtaining target SQL through column selection, a configuration page displays a model step name input area 41 and a field 42 in an input stream; after the user selects the "user code" field and the "water usage" field, the target SQL for the "column select" model step is generated.
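The column selection step can be sketched as follows: the fields the user ticks become the select list of the generated statement. This is a minimal illustration under assumed names; `build_column_select_sql`, the table `water_bill`, and the column names `user_code` and `water_usage` (standing in for the "user code" and "water usage" fields) are hypothetical.

```python
import re
from typing import Sequence

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_column_select_sql(input_table: str, selected_fields: Sequence[str]) -> str:
    """Build the target SQL for a 'column select' model step."""
    if not selected_fields:
        raise ValueError("at least one field must be selected")
    for name in [input_table, *selected_fields]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    return f"SELECT {', '.join(selected_fields)} FROM {input_table}"


if __name__ == "__main__":
    print(build_column_select_sql("water_bill", ["user_code", "water_usage"]))
```

Identifiers are left unquoted here on purpose, since hive and presto use different identifier quoting and the same target SQL is meant to run on both.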
In another example, configuring the data processing model according to the data processing requirement through the data processing control in the configuration page to obtain the target SQL includes: receiving an association (join) operation performed on two input tables in the configuration page to obtain the target SQL.
Referring to the interface schematic diagram of obtaining target SQL through table association shown in fig. 5, the configuration page displays a model step name input area 51, an association type selection control 52, and the two tables 53 selected by the user. After the association operation "left join" input by the user is received, the target SQL is obtained. The target SQL includes the two associated tables 53.
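The table association step can be sketched in the same way: the chosen association type and the key columns of the two tables determine the join clause. This is an illustrative sketch, not the application's actual generator; the function name, the supported association types, the aliases `l` and `r`, and the example tables are assumptions.

```python
from typing import Sequence, Tuple

# Association types offered by a hypothetical selection control, mapped to SQL keywords.
JOIN_KEYWORDS = {
    "inner": "INNER JOIN",
    "left": "LEFT JOIN",
    "right": "RIGHT JOIN",
    "full": "FULL OUTER JOIN",
}


def build_join_sql(
    left_table: str,
    right_table: str,
    join_type: str,
    key_pairs: Sequence[Tuple[str, str]],
) -> str:
    """Build the target SQL for a 'table association' model step."""
    if join_type not in JOIN_KEYWORDS:
        raise ValueError(f"unsupported association type: {join_type!r}")
    if not key_pairs:
        raise ValueError("at least one pair of key columns is required")
    condition = " AND ".join(f"l.{lk} = r.{rk}" for lk, rk in key_pairs)
    return (
        f"SELECT * FROM {left_table} l "
        f"{JOIN_KEYWORDS[join_type]} {right_table} r ON {condition}"
    )


if __name__ == "__main__":
    print(build_join_sql("users", "water_bill", "left", [("user_code", "user_code")]))
```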
FIG. 6 shows all of the data processing models displayed in the configuration page, and FIG. 7 shows the detail page of any one of the data processing models.
Step 104, when the execution requirement of the target SQL is online execution, executing the target SQL by using presto.
For example: when the target SQL needs to be previewed or run directly, presto is used to execute the target SQL.
Step 105, when the execution requirement of the target SQL is timing execution, using hive to execute the target SQL.
For example: when the model is executed on a schedule after being released, hive is used to execute the target SQL.
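Steps 104 and 105 amount to a dispatch on the execution requirement. A minimal sketch is given below, assuming the requirement is known when the target SQL is submitted. The enum, the function names and the `runners` mapping are hypothetical; the runners stand in for real presto and hive clients, which are not shown.

```python
from enum import Enum
from typing import Callable, Dict


class ExecutionRequirement(Enum):
    ONLINE = "online"  # preview or direct run
    TIMING = "timing"  # scheduled run after the model is released


# Engine chosen for each execution requirement, as described in steps 104 and 105.
ENGINE_BY_REQUIREMENT = {
    ExecutionRequirement.ONLINE: "presto",
    ExecutionRequirement.TIMING: "hive",
}


def choose_engine(requirement: ExecutionRequirement) -> str:
    """Return the name of the engine that should execute the target SQL."""
    return ENGINE_BY_REQUIREMENT[requirement]


def execute_target_sql(
    target_sql: str,
    requirement: ExecutionRequirement,
    runners: Dict[str, Callable[[str], object]],
) -> object:
    """Dispatch the target SQL to the runner registered for the chosen engine."""
    engine = choose_engine(requirement)
    return runners[engine](target_sql)


if __name__ == "__main__":
    # Stand-in runners that only report which engine received the statement.
    fake_runners = {
        "presto": lambda sql: ("presto", sql),
        "hive": lambda sql: ("hive", sql),
    }
    print(execute_target_sql("SELECT 1", ExecutionRequirement.ONLINE, fake_runners))
    print(execute_target_sql("SELECT 1", ExecutionRequirement.TIMING, fake_runners))
```

Keeping the engine choice in one table means the user-facing model never has to name an engine; only the execution requirement changes.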
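In summary, the data processing method provided in this embodiment displays a visual configuration page; performs data modeling through the data import control in the configuration page to import the data to be processed into the hive library; configures a data processing model according to data processing requirements through the data processing control in the configuration page to obtain the target SQL; executes the target SQL using presto when the execution requirement of the target SQL is online execution; and executes the target SQL using hive when the execution requirement of the target SQL is timing execution. This solves the problem that executing SQL only through hive offers a single data processing mode that cannot meet the user's data processing requirements, and realizes automatic switching between hive SQL and presto SQL.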
Fig. 8 is a block diagram of a data processing apparatus according to an embodiment of the present application. The device at least comprises the following modules: a page display module 810, a data modeling module 820, a model building module 830, a first execution module 840, and a second execution module 850.
A page display module 810, configured to display a visualized configuration page;
the data modeling module 820 is used for performing data modeling through the data import control in the configuration page so as to import the data to be processed into the hive library;
the model establishing module 830 is configured to configure a data processing model according to a data processing requirement through the data processing control in the configuration page, so as to obtain a target SQL;
a first execution module 840, configured to execute the target SQL using presto when the execution requirement of the target SQL is online execution;
a second executing module 850, configured to execute the target SQL using hive when the execution requirement of the target SQL is timing execution.
Optionally, the data modeling module 820 is configured to:
configuring a database connection through the data import control, and synchronizing the table structure into platform metadata by querying the directly connected database;
and/or,
configuring a mapping relation through the data import control, and importing data from a relational database into the hive library;
and/or,
configuring a mapping relation through the data import control, and importing data from an unstructured file into the hive library;
and/or,
importing a library table resource through the data import control, and, when the library table resource supports configuring a timing task, synchronizing data to the hive library on a schedule.
Optionally, the model building module 830 is configured to:
displaying fields in an input stream in the configuration page; and receiving the selection operation of the user on the field to obtain the target SQL comprising the selected field.
Optionally, the model building module 830 is configured to:
receiving an association (join) operation performed on two input tables in the configuration page to obtain the target SQL.
Optionally, the model building module 830 is configured to:
when the default functions of the hive library do not support the data processing requirement, acquiring a custom function package, wherein the custom function package comprises a custom function supporting the data processing requirement;
registering the custom function package with the hive library, and then triggering execution of the step of configuring a data processing model according to data processing requirements through the data processing control in the configuration page to obtain the target SQL, wherein the data processing control corresponds to the functions supported by the hive library.
For relevant details reference is made to the above-described method embodiments.
It should be noted that the division of the data processing apparatus of the above embodiment into the functional modules described is only an example. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the data processing apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the data processing apparatus and the data processing method provided by the above embodiments belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
Fig. 9 is a block diagram of a data processing apparatus according to an embodiment of the present application. The apparatus comprises at least a processor 901 and a memory 902.
Processor 901 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 901 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 901 may also include a main processor and a coprocessor, where the main processor, also called a Central Processing Unit (CPU), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 901 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 901 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 902 may include one or more computer-readable storage media, which may be non-transitory. The memory 902 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 902 is used to store at least one instruction for execution by processor 901 to implement the data processing methods provided by the method embodiments herein.
In some embodiments, the data processing apparatus may further include: a peripheral interface and at least one peripheral. The processor 901, memory 902 and peripheral interfaces may be connected by buses or signal lines. Each peripheral may be connected to the peripheral interface via a bus, signal line, or circuit board. Illustratively, peripheral devices include, but are not limited to: radio frequency circuit, touch display screen, audio circuit, power supply, etc.
Of course, the data processing apparatus may also include fewer or more components, which is not limited in this embodiment.
Optionally, the present application further provides a computer-readable storage medium, in which a program is stored, and the program is loaded and executed by a processor to implement the data processing method of the above-mentioned method embodiment.
Optionally, the present application further provides a computer product, which includes a computer-readable storage medium, in which a program is stored, and the program is loaded and executed by a processor to implement the data processing method of the above-mentioned method embodiment.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method of data processing, the method comprising:
displaying a visual configuration page;
performing data modeling through a data import control in the configuration page to import data to be processed into a hive library;
configuring a data processing model according to data processing requirements through a data processing control in the configuration page to obtain target SQL;
when the execution requirement of the target SQL is online execution, executing the target SQL by using presto;
when the execution requirement of the target SQL is timing execution, using hive to execute the target SQL.
2. The method of claim 1, wherein performing data modeling through the data import control in the configuration page comprises:
configuring a database connection through the data import control, and synchronizing a table structure into platform metadata by querying the database over a direct connection;
and/or,
configuring a mapping relation through the data import control, and importing data from a relational database to the hive library;
and/or,
configuring a mapping relation through the data import control, and importing data from an unstructured file to the hive library;
and/or,
importing a library table resource through the data import control; and, when the library table resource supports configuration of a timed task, synchronizing data to the hive library on a schedule.
3. The method of claim 1, wherein configuring, by the data processing control in the configuration page, the data processing model according to the data processing requirement to obtain the target SQL comprises:
displaying fields of an input stream in the configuration page; and receiving a user's selection operation on the fields to obtain the target SQL comprising the selected fields.
4. The method of claim 1, wherein configuring, by the data processing control in the configuration page, the data processing model according to the data processing requirement to obtain the target SQL comprises:
receiving an association operation performed on two input tables in the configuration page to obtain the target SQL.
5. The method of claim 1, wherein configuring, by the data processing control in the configuration page, the data processing model according to the data processing requirement to obtain the target SQL comprises:
when the default functions of the hive library do not support the data processing requirement, acquiring a user-defined function package, wherein the user-defined function package comprises a user-defined function supporting the data processing requirement;
registering the user-defined function package to the hive library, and triggering execution of the step of configuring the data processing model according to data processing requirements through the data processing control in the configuration page to obtain the target SQL, wherein the data processing control corresponds to the functions supported by the hive library.
6. A data processing apparatus, characterized in that the apparatus comprises:
a page display module, configured to display a visual configuration page;
a data modeling module, configured to perform data modeling through a data import control in the configuration page so as to import data to be processed into a hive library;
a model establishing module, configured to configure a data processing model according to data processing requirements through a data processing control in the configuration page to obtain target SQL;
a first execution module, configured to execute the target SQL by using presto when the execution requirement of the target SQL is online execution; and
a second execution module, configured to execute the target SQL by using hive when the execution requirement of the target SQL is timed execution.
7. The apparatus of claim 6, wherein the data modeling module is configured to:
configure a database connection through the data import control, and synchronize a table structure into platform metadata by querying the database over a direct connection;
and/or,
configure a mapping relation through the data import control, and import data from a relational database to the hive library;
and/or,
configure a mapping relation through the data import control, and import data from an unstructured file to the hive library;
and/or,
import a library table resource through the data import control; and, when the library table resource supports configuration of a timed task, synchronize data to the hive library on a schedule.
8. The apparatus of claim 6, wherein the model establishing module is configured to:
display fields of an input stream in the configuration page; and receive a user's selection operation on the fields to obtain the target SQL comprising the selected fields.
9. The apparatus of claim 6, wherein the model establishing module is configured to:
receive an association operation performed on two input tables in the configuration page to obtain the target SQL.
10. The apparatus of claim 6, wherein the model establishing module is configured to:
when the default functions of the hive library do not support the data processing requirement, acquire a user-defined function package, wherein the user-defined function package comprises a user-defined function supporting the data processing requirement;
register the user-defined function package to the hive library, and trigger execution of the step of configuring the data processing model according to data processing requirements through the data processing control in the configuration page to obtain the target SQL, wherein the data processing control corresponds to the functions supported by the hive library.
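
The flow recited in claims 1, 3, 4 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `ExecutionRequirement`, `JoinSpec`, `build_target_sql`, `register_udf_sql` and `choose_engine` are hypothetical, identifiers are assumed to be already validated (no quoting or escaping is done), and actually submitting the SQL to presto or hive is left out. The only engine-specific syntax used is Hive's `CREATE FUNCTION ... AS ... USING JAR ...` statement.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class ExecutionRequirement(Enum):
    ONLINE = "online"  # interactive execution requested from the page
    TIMED = "timed"    # scheduled execution


@dataclass(frozen=True)
class JoinSpec:
    """Hypothetical record of an association between two input tables."""
    right_table: str
    left_key: str
    right_key: str
    join_type: str = "INNER"


def build_target_sql(table: str, fields: Sequence[str],
                     join: Optional[JoinSpec] = None) -> str:
    """Turn the fields selected on the page (and an optional join) into SQL."""
    if not fields:
        raise ValueError("at least one field must be selected")
    sql = f"SELECT {', '.join(fields)} FROM {table}"
    if join is not None:
        sql += (f" {join.join_type} JOIN {join.right_table}"
                f" ON {table}.{join.left_key} = {join.right_table}.{join.right_key}")
    return sql


def register_udf_sql(function_name: str, class_name: str, jar_uri: str) -> str:
    """Build the Hive statement that registers a function from a JAR."""
    return f"CREATE FUNCTION {function_name} AS '{class_name}' USING JAR '{jar_uri}'"


def choose_engine(requirement: ExecutionRequirement) -> str:
    """Pick the engine by execution requirement, as in claim 1."""
    if requirement is ExecutionRequirement.ONLINE:
        return "presto"
    if requirement is ExecutionRequirement.TIMED:
        return "hive"
    raise ValueError(f"unsupported execution requirement: {requirement!r}")


if __name__ == "__main__":
    join = JoinSpec(right_table="customers", left_key="customer_id", right_key="id")
    sql = build_target_sql("orders", ["orders.id", "customers.name"], join)
    print(choose_engine(ExecutionRequirement.ONLINE), "->", sql)
```

In this sketch the same target SQL string is handed to either engine; only the dispatch decision depends on whether the user asked for online or timed execution.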

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011397493.9A CN112487036A (en) 2020-12-04 2020-12-04 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011397493.9A CN112487036A (en) 2020-12-04 2020-12-04 Data processing method and device

Publications (1)

Publication Number Publication Date
CN112487036A (en) 2021-03-12

Family

ID=74939309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011397493.9A Pending CN112487036A (en) 2020-12-04 2020-12-04 Data processing method and device

Country Status (1)

Country Link
CN (1) CN112487036A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787119A (en) * 2016-03-25 2016-07-20 盛趣信息技术(上海)有限公司 Hybrid engine based big data processing method and system
CN110008232A (en) * 2019-04-11 2019-07-12 北京启迪区块链科技发展有限公司 Generation method, device, server and the medium of structured query sentence

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113590217A (en) * 2021-07-26 2021-11-02 北京百度网讯科技有限公司 Function management method and device based on engine, electronic equipment and storage medium
CN113590217B (en) * 2021-07-26 2022-12-02 北京百度网讯科技有限公司 Function management method and device based on engine, electronic equipment and storage medium
CN115563183A (en) * 2022-09-22 2023-01-03 北京百度网讯科技有限公司 Query method, device and program product
CN115563183B (en) * 2022-09-22 2024-04-09 北京百度网讯科技有限公司 Query method, query device and program product
CN117093589A (en) * 2023-10-16 2023-11-21 北京国基科技股份有限公司 Unstructured data warehousing method and device
CN117093589B (en) * 2023-10-16 2024-01-16 北京国基科技股份有限公司 Unstructured data warehousing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210312