CN116955386A - Data processing method and device and data processing task generation method and device - Google Patents

Publication number
CN116955386A
Authority
CN
China
Prior art keywords
target
task
data
statement
data processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210380778.4A
Other languages
Chinese (zh)
Inventor
熊峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202210380778.4A
Publication of CN116955386A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • G06F16/2435Active constructs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24534Query rewriting; Transformation
    • G06F16/24535Query rewriting; Transformation of sub-queries or views
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • G06F16/24558Binary matching operations
    • G06F16/2456Join operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/288Entity relationship models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842Selection of displayed objects or displayed text elements
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to a data processing method, apparatus, computer device, storage medium and computer program product. The method comprises: when a data processing task is obtained, determining target task information of the data processing task; determining a sentence construction template matched with the data processing task, and filling the sentence construction template according to the target task information to obtain a target structured statement; determining a target computing engine matched with the data processing task according to the target task information; and routing the target structured statement to the target computing engine, where the routed target structured statement triggers the target computing engine to execute the data processing task and obtain a task processing result. The method can improve the efficiency of data processing.

Description

Data processing method and device and data processing task generation method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, and a data processing task generating method and apparatus.
Background
With the development of science and technology, massive data are generated, and data analysis is generally required to be performed on the data at present so as to optimize corresponding services and further meet the demands of users.
In the related art, a developer is usually required to write a structured sentence for data processing, then select a corresponding computing engine according to the use principle of each computing engine, and process data based on the written structured sentence by the computing engine.
However, writing structured statements and selecting a compute engine by hand can consume a significant amount of time, making data processing inefficient.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data processing method, apparatus, computer device, computer-readable storage medium, and computer program product that can improve data processing efficiency.
In a first aspect, the present application provides a data processing method, the method comprising:
when a data processing task is obtained, determining target task information of the data processing task;
determining a sentence construction template matched with the data processing task, and filling the sentence construction template according to the target task information to obtain a target structured sentence;
determining a target computing engine matched with the data processing task according to the target task information;
performing, by the target computing engine, data processing on a target data source in the target task information based on the target structured statement, to obtain a task processing result of the data processing task.
In a second aspect, the present application also provides a data processing apparatus, the apparatus comprising:
the task information acquisition module is used for determining target task information of the data processing task when acquiring the data processing task;
the sentence construction module is used for determining a sentence construction template matched with the data processing task and filling the sentence construction template according to the target task information to obtain a target structured sentence;
the engine determining module is used for determining a target computing engine matched with the data processing task according to the target task information; routing the target structured statement to the target compute engine; the target structured statement is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
In one embodiment, the statement construction module is further configured to determine a target data source and a target processing condition in the target task information; fill the target data source and the target processing condition into the corresponding positions in the sentence construction template to obtain an initial structured statement; and optimize the initial structured statement to obtain a target structured statement.
In one embodiment, the statement construction module is further configured to determine first structure information corresponding to the target data source; determine second structure information involved in the initial structured statement; construct a data clipping statement according to the difference between the first structure information and the second structure information; and add the data clipping statement to the initial structured statement to obtain a target structured statement; the data clipping statement is used for clipping the data to be processed in the target data source.
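The clipping step above can be read as column pruning: the columns offered by the data source (first structure information) are compared with the columns the initial statement actually references (second structure information), and only the latter are read. The following is a minimal Python sketch under that reading. The function names, the `user_registry` table and its columns are hypothetical, and the text does not specify what form the clipping statement takes; a subquery is assumed here.

```python
def build_clipping_statement(table, source_columns, used_columns):
    """Return a subquery that keeps only the columns the statement needs.

    source_columns: columns present in the data source (first structure information).
    used_columns:   columns referenced by the initial statement (second structure information).
    """
    used = set(used_columns)
    # Keep the source's column order and drop every column the statement never reads.
    needed = [c for c in source_columns if c in used]
    if not needed:
        raise ValueError("initial statement references no column of the source")
    return f"(SELECT {', '.join(needed)} FROM {table}) AS {table}_clipped"


def add_clipping(initial_sql, table, source_columns, used_columns):
    """Swap the bare table reference for the clipped subquery (illustrative only)."""
    clipped = build_clipping_statement(table, source_columns, used_columns)
    return initial_sql.replace(f"FROM {table}", f"FROM {clipped}", 1)


# Hypothetical initial statement and source schema.
initial = "SELECT region, COUNT(user_id) FROM user_registry GROUP BY region"
optimized = add_clipping(
    initial,
    "user_registry",
    ["user_id", "region", "signup_time", "device", "referrer"],
    ["user_id", "region"],
)
print(optimized)
```

Plain string replacement is used only to keep the sketch short; a real rewrite would operate on a parsed statement.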
In one embodiment, the statement construction module is further configured to determine a conditional filtering statement and a data source connection statement in the initial structured statement; and when the conditional filtering statement is located after the data source connection statement, move the conditional filtering statement into the data source connection statement to obtain a target structured statement.
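One plausible reading of this adjustment is what query optimizers usually call predicate pushdown: a filter that is written after a join is applied to its own data source before the join, so fewer rows are joined. The sketch below is an assumption-laden illustration in Python, not the method of the application. `JoinQuery`, the table names and the filter are hypothetical, and SQLite is used only to check that both forms return the same rows for an inner join.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class JoinQuery:
    left: str
    right: str
    on: str
    # table name -> filter expression that touches only that table
    filters: dict = field(default_factory=dict)


def render_filter_after_join(q):
    """Initial form: join first, then filter the joined rows."""
    where = " AND ".join(q.filters[t] for t in sorted(q.filters))
    sql = f"SELECT * FROM {q.left} JOIN {q.right} ON {q.on}"
    return f"{sql} WHERE {where}" if where else sql


def render_filter_in_join(q):
    """Rewritten form: each single-table filter runs before the join."""
    def source(table):
        cond = q.filters.get(table)
        return f"(SELECT * FROM {table} WHERE {cond}) AS {table}" if cond else table
    return f"SELECT * FROM {source(q.left)} JOIN {source(q.right)} ON {q.on}"


q = JoinQuery("orders", "users", "orders.user_id = users.id", {"orders": "amount > 100"})

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users(id INTEGER, name TEXT);
CREATE TABLE orders(user_id INTEGER, amount INTEGER);
INSERT INTO users VALUES (1, 'a'), (2, 'b');
INSERT INTO orders VALUES (1, 50), (1, 150), (2, 200);
""")
rows_after = sorted(conn.execute(render_filter_after_join(q)).fetchall())
rows_in_join = sorted(conn.execute(render_filter_in_join(q)).fetchall())
conn.close()
print(rows_after == rows_in_join)
```

The equivalence shown here holds for inner joins; outer joins need more care before a filter can be moved.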
In one embodiment, the statement construction module is further configured to, when there are a plurality of target data sources and the initial structured statement includes a data source connection statement, determine the data volume corresponding to each of the target data sources; and adjust the position of each target data source in the data source connection statement according to the data volume to obtain a target structured statement.
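As an illustration of ordering data sources by volume, the following Python sketch sorts hypothetical tables by row count and rebuilds the connection statement in that order. The text does not say whether the smaller or the larger source should come first, so the smallest-first default is an assumption, as are the table names, the row counts and the `user_id` join key.

```python
def order_sources_by_volume(row_counts, largest_first=False):
    """Return table names sorted by their row count."""
    return sorted(row_counts, key=row_counts.get, reverse=largest_first)


def build_join(sources, key):
    """Chain the sources into one connection (join) statement in the given order."""
    first, *rest = sources
    sql = f"SELECT * FROM {first}"
    for table in rest:
        sql += f" JOIN {table} USING ({key})"
    return sql


# Hypothetical data volumes for three target data sources.
row_counts = {"event_log": 5_000_000, "user_profile": 20_000, "vip_list": 300}
ordered = order_sources_by_volume(row_counts)
join_sql = build_join(ordered, "user_id")
print(join_sql)
```

Which order is actually faster depends on the computing engine's join strategy, which is why the direction is left as a parameter.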
In one embodiment, the engine determination module is further configured to determine a number of target data sources in the target task information; and determining a target computing engine matched with the data processing task according to the number of the target data sources.
In one embodiment, the engine determining module is further configured to determine whether the target data source has been converted to a data source identifiable by the first computing engine when the number of target data sources is less than or equal to a preset number threshold; when the target data source is converted to a data source which can be identified by a first computing engine, the first computing engine is used as a target computing engine; and when the target data source is not converted to the data source which can be identified by the first computing engine, the second computing engine is used as the target computing engine.
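The threshold rule above can be sketched as a small Python function. The engine names, the default threshold of 2, and the choice to require that every target data source has been converted are assumptions made for illustration; the text only states that the decision depends on whether the target data source has been converted to a form the first computing engine can identify.

```python
def select_engine(source_names, converted_sources, threshold=2,
                  first_engine="engine_a", second_engine="engine_b"):
    """Pick an engine for a task with few data sources.

    converted_sources: names of sources already converted to a form the
    first engine can identify (hypothetical bookkeeping).
    Returns None when the source count exceeds the threshold, meaning this
    rule does not apply and another selection path must decide.
    """
    if len(source_names) > threshold:
        return None
    # Assumption: every source must be converted for the first engine to be usable.
    if all(name in converted_sources for name in source_names):
        return first_engine
    return second_engine


print(select_engine(["user_registry"], {"user_registry"}))
print(select_engine(["user_registry", "orders"], {"user_registry"}))
```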
In one embodiment, the engine determining module is further configured to obtain a pre-trained engine determining model when the number of the target data sources is greater than a preset number threshold; the engine determination model comprises a plurality of determination sub-models; determining metadata information corresponding to each target data source to obtain a metadata information set; respectively carrying out information processing on the metadata information set through each determination sub-model to obtain a result output by each determination sub-model; and synthesizing the respective output results of each determination sub-model to obtain a target calculation engine matched with the data processing task.
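A minimal Python sketch of combining several sub-model outputs follows. Majority voting is assumed as the way of synthesizing the results, since the text does not name one. The three rule-based stand-ins for trained sub-models, the metadata fields and the engine names are all hypothetical.

```python
from collections import Counter


def combine_by_vote(sub_models, metadata_set):
    """Run every sub-model on the metadata set and return the most common answer."""
    outputs = [model(metadata_set) for model in sub_models]
    # most_common orders by count; equal counts keep first-seen order.
    return Counter(outputs).most_common(1)[0][0]


# Hypothetical stand-ins for pre-trained sub-models.
def by_total_rows(meta):
    return "engine_b" if sum(m["rows"] for m in meta) > 1_000_000 else "engine_a"


def by_widest_table(meta):
    return "engine_b" if max(m["columns"] for m in meta) > 30 else "engine_a"


def by_source_count(meta):
    return "engine_a" if len(meta) <= 3 else "engine_b"


metadata_set = [
    {"table": "event_log", "rows": 10_000_000, "columns": 40},
    {"table": "user_profile", "rows": 5_000, "columns": 8},
    {"table": "vip_list", "rows": 200, "columns": 3},
]
chosen = combine_by_vote([by_total_rows, by_widest_table, by_source_count], metadata_set)
print(chosen)
```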
In one embodiment, the data processing apparatus is further configured to determine, by the target computing engine, at least one of a data clipping statement and a conditional filtering statement included in the target structured statement; filtering the data in the target data source by the target calculation engine based on at least one of the data clipping statement and the conditional filtering statement to obtain filtered data; and performing data processing on the filtered data according to the target index and the target dimension by the target calculation engine to obtain a task processing result of the data processing task.
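The filter-then-aggregate order described above can be illustrated in plain Python, independent of any particular computing engine. In this sketch the clipping statement is modeled as a list of columns to keep, the conditional filtering statement as a predicate, the target index as a distinct count of `user_id`, and the target dimension as `region`; all of these names and the sample rows are assumptions.

```python
from collections import defaultdict


def process(rows, keep_columns, predicate, dimension, index):
    """Filter and clip the rows, then count distinct index values per dimension value."""
    # Step 1: apply the conditional filter, then keep only the needed columns.
    filtered = [{c: r[c] for c in keep_columns} for r in rows if predicate(r)]
    # Step 2: group by the dimension and aggregate the index.
    groups = defaultdict(set)
    for r in filtered:
        groups[r[dimension]].add(r[index])
    return {value: len(members) for value, members in groups.items()}


rows = [
    {"user_id": 1, "region": "north", "day": 1, "device": "ios"},
    {"user_id": 2, "region": "north", "day": 2, "device": "android"},
    {"user_id": 2, "region": "north", "day": 3, "device": "android"},
    {"user_id": 3, "region": "south", "day": 2, "device": "ios"},
    {"user_id": 4, "region": "south", "day": 9, "device": "ios"},
]
result = process(
    rows,
    keep_columns=["user_id", "region"],
    predicate=lambda r: 1 <= r["day"] <= 3,
    dimension="region",
    index="user_id",
)
print(result)
```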
In a third aspect, the present application also provides a computer device, where the computer device includes a memory and a processor, where the memory stores a computer program, and where the processor implements steps in any one of the data processing methods provided by the embodiments of the present application when the computer program is executed.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of any of the data processing methods provided by the embodiments of the present application.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of any of the data processing methods provided by the embodiments of the present application.
The data processing method, the data processing device, the computer equipment, the storage medium and the computer program product can determine the statement construction template matched with the data processing task by acquiring the data processing task. By determining target task information of the data processing task, the sentence construction template can be automatically filled according to the target task information to obtain a target structured sentence, and a target calculation engine suitable for processing the data processing task is automatically determined based on the target task information. The target structured statement and the target computing engine are automatically determined, and the target structured statement can be routed to the target computing engine, so that the target computing engine processes data through the target structured statement, and a task processing result of a data processing task is obtained. Because the target structured sentence and the target calculation engine are automatically determined, compared with the traditional method of manually compiling the target structured sentence and importing the compiled target structured sentence into the manually determined target calculation engine, the method can save time consumed by manually compiling the sentence and manually determining the calculation engine, thereby improving the efficiency of data processing.
It is also desirable to provide a method, apparatus, computer device, computer-readable storage medium, and computer program product for generating a data processing task that can improve data processing efficiency.
In a first aspect, the present application provides a method for generating a data processing task, the method comprising:
displaying a task information set; the task information set comprises a plurality of task information;
responsive to an index adding operation for a task information set, moving the task information from the task information set to an index presentation area;
responsive to a dimension add operation for a set of task information, moving the task information from the set of task information to a dimension presentation area;
responding to the triggering operation of the task creation control, and displaying the creation result of the data processing task; the data processing task is created according to the task information moving to the index display area and the task information moving to the dimension display area.
In one embodiment, the method further comprises:
in response to a triggering operation for a statement preview control, displaying a target structured statement generated according to the task information in the index display area and the task information in the dimension display area;
displaying a detection result of grammar detection performed on the target structured statement;
when the detection result indicates that the target structured statement does not conform to the grammar rules, displaying the adjusted target structured statement in response to a statement adjustment operation for the target structured statement.
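The text does not say how grammar detection is performed. As one hedged illustration, the Python sketch below asks SQLite to compile a statement with `EXPLAIN`, which parses the statement without running the query, and reports any error. SQLite is only a stand-in for the real engine's dialect, and the `check_syntax` helper and the schema are hypothetical.

```python
import sqlite3


def check_syntax(sql, schema_ddl):
    """Return (ok, message) after asking SQLite to compile the statement."""
    conn = sqlite3.connect(":memory:")
    try:
        # The referenced tables must exist, otherwise a valid statement is rejected.
        conn.executescript(schema_ddl)
        conn.execute("EXPLAIN " + sql)
        return True, ""
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = "CREATE TABLE user_registry(user_id INTEGER, region TEXT);"
ok_valid, _ = check_syntax(
    "SELECT region, COUNT(user_id) FROM user_registry GROUP BY region", schema
)
# Missing closing parenthesis: the statement is adjusted by the user after this check fails.
ok_broken, message = check_syntax(
    "SELECT region, COUNT(user_id FROM user_registry", schema
)
print(ok_valid, ok_broken, message)
```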
In a second aspect, the present application further provides a device for generating a data processing task, where the device includes:
the index determining module is used for displaying the task information set; the task information set comprises a plurality of task information; responsive to an index adding operation for a task information set, moving the task information from the task information set to an index presentation area;
the dimension determining module is used for responding to dimension adding operation aiming at a task information set and moving the task information from the task information set to a dimension display area;
the task creation module is used for responding to the triggering operation of the task creation control and displaying the creation result of the data processing task; the data processing task is created according to the task information moving to the index display area and the task information moving to the dimension display area.
In one embodiment, the generating device of the data processing task is further configured to display at least one product type, and, in response to a selection operation for the at least one product type, display the selected target product type and at least one service type included in the target product type; in response to a selection operation for the at least one service type, display at least one data source that matches the selected target service type; in response to a selection operation for the at least one data source, display the selected target data source; and display a task information set corresponding to the target data source.
In one embodiment, the generating device of the data processing task is further configured to, when there are a plurality of target data sources, display an association relationship editing list and an association field selection list in response to an association editing operation for the target data sources; in response to a selection operation for the association relationship editing list, display the selected target association relationship; and in response to a selection operation for the association field selection list, display the selected target association field; the target association relationship and the target association field are used for creating a data processing task.
In one embodiment, the generating device of the data processing task is further configured to display at least one task type, and respond to a selection operation for the at least one task type to display a selected target task type; the target task type is used for creating a data processing task and determining a statement construction template matched with the data processing task.
In one embodiment, the generating device of the data processing task is further configured to, in response to a triggering operation for a statement preview control, display a target structured statement generated according to the task information in the index display area and the task information in the dimension display area; display a detection result of grammar detection performed on the target structured statement; and when the detection result indicates that the target structured statement does not conform to the grammar rules, display the adjusted target structured statement in response to a statement adjustment operation for the target structured statement.
In one embodiment, the generating device of the data processing task is further configured to determine target task information of the data processing task; the target task information comprises task information in an index display area and task information in a dimension display area; determining a sentence construction template matched with the data processing task, and filling the sentence construction template according to the target task information to obtain a target structured sentence; determining a target computing engine matched with the data processing task according to the target task information; routing the target structured statement to the target compute engine; the target structured statement is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
In a third aspect, the present application further provides a computer device, where the computer device includes a memory and a processor, where the memory stores a computer program, and the processor implements steps in any one of the methods for generating a data processing task provided by the embodiments of the present application when the computer program is executed by the processor.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium stores a computer program thereon, which when executed by a processor implements steps in any of the methods for generating data processing tasks provided by the embodiments of the present application.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of any of the methods for generating data processing tasks provided by the embodiments of the application.
The data processing task generating method, the data processing task generating device, the data processing device, the computer equipment, the storage medium and the computer program product can respond to the index adding operation aiming at the task information set by displaying the task information set, display the task information added by the index adding operation in the index display area, and respond to the dimension adding operation aiming at the task information set, display the task information added by the dimension adding operation in the dimension display area. By exposing the task creation control, a corresponding data processing task may be generated based on the task information exposed in the index exposure area and the task information exposed in the dimension exposure area in response to a trigger operation for the task creation control. The corresponding data processing task can be generated only by simple configuration based on the index adding operation and the dimension adding operation, so that the configuration threshold of the data processing task is reduced, the creation flow of the data processing task is simplified, the creation efficiency of the data processing task is improved, and the processing efficiency of the data processing is further improved.
Drawings
FIG. 1 is a diagram of an application environment for a data processing method in one embodiment;
FIG. 2 is a flow diagram of a data processing method in one embodiment;
FIG. 3 is a schematic diagram of a generation process of a target structured statement in one embodiment;
FIG. 4 is a flow diagram of the determination of a target compute engine in one embodiment;
FIG. 5 is a flow diagram of a method of generating data processing tasks in one embodiment;
FIG. 6 is a schematic diagram of a set of task information in one embodiment;
FIG. 7 is a schematic illustration of a data source in one embodiment;
FIG. 8 is a schematic diagram of an association editing interface in one embodiment;
FIG. 9 is a flow diagram of the generation of data processing tasks in one embodiment;
FIG. 10 is a flow chart of a data processing method in one embodiment;
FIG. 11 is a schematic structural framework of data processing in one embodiment;
FIG. 12 is a block diagram of a data processing apparatus in one embodiment;
FIG. 13 is a block diagram of an apparatus for generating data processing tasks in one embodiment;
FIG. 14 is an internal block diagram of a computer device in one embodiment;
FIG. 15 is an internal structural view of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The data processing method provided by the embodiments of the present application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or on other servers. The terminal 102 and the server 104 may each perform the data processing method of the present application alone, or may perform it cooperatively. The following description takes as an example the case in which the terminal 102 and the server 104 perform the data processing method cooperatively. The terminal 102 may display the task information set, generate a data processing task according to the target task information in the task information set, and send the data processing task to the server 104. When the server 104 receives the data processing task, the server 104 may determine a corresponding sentence construction template and a target computing engine based on the target task information carried by the data processing task, and fill the sentence construction template according to the target task information to obtain a target structured statement. The server 104 sends the target structured statement to the target computing engine, so that the target computing engine processes the data pointed to by the target task information to obtain a task processing result of the data processing task. The server 104 returns the task processing result to the terminal 102, so that the terminal 102 can display it.
The terminal 102 may be, but not limited to, various desktop computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, where the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
In one embodiment, as shown in fig. 2, a data processing method is provided, and the method is applied to a computer device, which may be a terminal or a server in fig. 1, for example. The data processing method comprises the following steps:
step S202, when the data processing task is obtained, determining target task information of the data processing task.
Specifically, when the data processing task is obtained, the computer device may parse the data processing task to obtain target task information of the data processing task. For example, the data processing task may carry the target task information, so that the computer device may extract the target task information from the data processing task.
The target task information comprises a target task type, a target data source and target processing information. The target task type refers to the task type to which the data processing task belongs. Task types include a retention analysis type, a funnel analysis type, a multidimensional analysis type, and the like. Retention analysis refers to analysis of user retention; for example, the ratio of unstable users to stable users may be determined by retention analysis. Funnel analysis refers to analysis of the user conversion of a product at different stages; for example, the conversion of users during an update from a first version of an application to a second version of the application may be determined by funnel analysis. Multidimensional analysis refers to analysis of data along multiple dimensions, such as a time dimension and a region dimension.
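To make two of these task types concrete, the Python sketch below computes a day-N retention rate for a sign-up cohort and the step-to-step conversion rates of a funnel. These are common textbook definitions used here as assumptions; the text does not define the exact formulas, and the sample users and stage names are hypothetical.

```python
def retention_rate(cohort_users, active_users_by_day, day):
    """Share of the sign-up cohort that is still active on the given day."""
    cohort = set(cohort_users)
    if not cohort:
        return 0.0
    retained = cohort & set(active_users_by_day.get(day, ()))
    return len(retained) / len(cohort)


def funnel_conversion(stage_counts):
    """Return (from_stage, to_stage, rate) for each consecutive pair of stages."""
    steps = []
    for (prev_name, prev_n), (next_name, next_n) in zip(stage_counts, stage_counts[1:]):
        rate = next_n / prev_n if prev_n else 0.0
        steps.append((prev_name, next_name, rate))
    return steps


cohort = [1, 2, 3, 4]
active = {1: [1, 2, 3], 7: [1, 3, 9]}
day7 = retention_rate(cohort, active, 7)

funnel = funnel_conversion([("visit", 1000), ("signup", 400), ("purchase", 100)])
print(day7, funnel)
```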
The target data source refers to a data table that provides the data required for data processing. The target processing information refers to information required for data analysis, and comprises target indexes and target dimensions. The target index refers to a data index required for data analysis, and may be a unit or a method for measuring the development degree of things; for example, the target index may specifically be the number of people, the number of users, the sales of products, and the like. The target dimension refers to a data dimension required for data analysis, and may be some characteristic of things or phenomena; for example, the target dimension may specifically be a time dimension, a gender dimension, or a region dimension.
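The pieces of target task information named so far (task type, data source, index, dimension) can be pictured as a small record extracted from the task. The Python sketch below is one possible shape, not a format defined by the application; the `TargetTaskInfo` class, the `target_task_info` key and the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TargetTaskInfo:
    task_type: str            # e.g. retention, funnel or multidimensional analysis
    data_sources: List[str]   # data tables that provide the data
    indexes: List[str]        # what is measured, e.g. number of users
    dimensions: List[str]     # how it is broken down, e.g. time or region


def parse_task(task):
    """Extract the target task information carried by a data processing task."""
    info = task["target_task_info"]
    return TargetTaskInfo(
        task_type=info["task_type"],
        data_sources=list(info["data_sources"]),
        indexes=list(info["indexes"]),
        dimensions=list(info["dimensions"]),
    )


task = {
    "target_task_info": {
        "task_type": "multidimensional",
        "data_sources": ["user_registry"],
        "indexes": ["user_count"],
        "dimensions": ["signup_date", "region"],
    }
}
info = parse_task(task)
print(info)
```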
In one embodiment, the target processing information may further include a target filtering condition in addition to the target index and the target dimension. The target filtering condition may be used to filter the data in the target data source; for example, the target filtering condition may be "select the data collected from time A to time B". It is readily understood that when a structured statement is generated based on the target filtering condition, the generated structured statement may include a conditional filtering statement.
In one embodiment, a user may trigger the computer device to generate a corresponding data processing task according to the target to be analyzed. For example, a target application for generating the data processing task may run in the terminal, so that the user may configure task information through the target application according to the target to be analyzed to obtain the target task information, and the target application may generate the corresponding data processing task based on the target task information configured by the user. For instance, when a user wishes to analyze the user retention of application A, the user can configure, through the target application, the target data source as the user registry of application A, the target index as the number of users, the target dimension as the time dimension, and the target task type as the retention analysis type.
In one embodiment, when the user configures the target dimension and target index only through the target application, the target application may also employ a default target data source and a default target task type to generate the data processing task.
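The structure of the target task information described above can be sketched as follows. This is only an illustrative sketch: the field names, the `parse_task` helper, the `target_task_info` key and the two default values are assumptions made for the example and are not specified in the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical defaults; the source only states that defaults may be used.
DEFAULT_DATA_SOURCES = ["default_table"]
DEFAULT_TASK_TYPE = "multidimensional"


@dataclass
class TargetTaskInfo:
    """Target task information carried by a data processing task."""
    indexes: List[str]      # target indexes, e.g. number of users
    dimensions: List[str]   # target dimensions, e.g. time or region
    data_sources: List[str] = field(
        default_factory=lambda: list(DEFAULT_DATA_SOURCES))
    task_type: str = DEFAULT_TASK_TYPE
    filter_condition: Optional[str] = None  # optional target filtering condition


def parse_task(task: dict) -> TargetTaskInfo:
    """Extract the target task information from a task given as key-value pairs."""
    info = task["target_task_info"]
    optional = {
        key: info[key]
        for key in ("data_sources", "task_type", "filter_condition")
        if key in info
    }
    return TargetTaskInfo(
        indexes=info["indexes"], dimensions=info["dimensions"], **optional)
```

When only the index and dimension are configured, the defaults fill in the data source and the task type, which mirrors the embodiment in which the target application falls back to a default target data source and a default target task type.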
Step S204, determining a sentence construction template matched with the data processing task, and filling the sentence construction template according to the target task information to obtain a target structured sentence.
In particular, when a data processing task is obtained, the computer device may determine a sentence construction template that matches the data processing task. The sentence construction template refers to a template for generating a structured sentence; for example, the sentence construction template may be a sentence frame written in advance by a programmer, so that the structured sentence can be obtained simply by filling the corresponding parameters into the sentence frame. The structured statement refers to a statement that can be recognized by the computer device for data processing; for example, the structured statement may be an SQL (Structured Query Language) statement. Further, when the sentence construction template is obtained, the computer device may fill the target task information into the sentence construction template to obtain a target structured sentence. For example, the computer device may fill the target data source and the target processing information in the target task information into the sentence construction template.
In one embodiment, the sentence construction templates corresponding to each task type can be generated in advance, and the corresponding relation between the task type and the sentence construction templates is obtained, so that the computer equipment can determine the sentence construction templates matched with the target task type in the target task information according to the corresponding relation, and the sentence construction templates matched with the target task type are used as the sentence construction templates matched with the data processing task.
In one embodiment, when the user configures the target task information through the target application, the user may further configure the sentence construction template, for example, select a sentence construction template to be used from a plurality of sentence construction templates, so that the target task information may carry the template identifier of the selected sentence construction template, and further the computer device may determine the sentence construction template matched with the data processing task directly based on the template identifier, that is, the computer device may use the sentence construction template corresponding to the template identifier as the sentence construction template matched with the data processing task.
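The two ways of matching a template described above (by task type, or directly by a template identifier carried in the target task information) might be combined as in the following sketch. The template identifiers, the placeholder template bodies and the `match_template` function are hypothetical names chosen for illustration.

```python
from typing import Optional

# Hypothetical templates written in advance; bodies are placeholders only.
TEMPLATES_BY_ID = {
    "tpl_retention": "-- retention frame using $data_source, $index, $dimension",
    "tpl_funnel": "-- funnel frame using $data_source, $index, $dimension",
}

# Correspondence between task types and sentence construction templates.
TEMPLATE_ID_BY_TASK_TYPE = {
    "retention": "tpl_retention",
    "funnel": "tpl_funnel",
}


def match_template(task_type: str, template_id: Optional[str] = None) -> str:
    """Return the template matched with a data processing task.

    A template identifier selected by the user takes precedence; otherwise
    the template is looked up through the task type correspondence.
    """
    if template_id is not None:
        return TEMPLATES_BY_ID[template_id]
    return TEMPLATES_BY_ID[TEMPLATE_ID_BY_TASK_TYPE[task_type]]
```

Giving the explicit identifier precedence is a design assumption; the source describes the two lookups as separate embodiments and does not say how they interact.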
In one embodiment, referring to fig. 3, when obtaining the target task information and the sentence construction template matched with the data processing task, the computer device may determine a first location in the sentence construction template where the data source needs to be filled, and fill the target data source in the target task information to the first location; the computer device may determine a second location in the sentence construction template where the index is to be filled and fill the target index in the target task information to the second location, and the computer device may determine a third location in the sentence construction template where the dimension is to be filled and fill the target dimension in the target task information to the third location, thereby obtaining the target structured sentence. FIG. 3 illustrates a process for generating a target structured statement in one embodiment.
In one embodiment, when the sentence construction template is written, the target task information can be specified as formal parameters; that is, the target data source, the target index and the target dimension are each represented by a formal parameter. When a specific target data source, target index and target dimension are to be filled into the sentence construction template, they simply replace the corresponding formal parameters.
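Filling a template by replacing formal parameters can be illustrated with Python's standard `string.Template`. This is a minimal sketch under assumptions: the SQL frame, the parameter names `$data_source`, `$index`, `$dimension` and the `fill_template` function are invented for the example and do not come from the source.

```python
from string import Template

# Hypothetical sentence frame for a multidimensional analysis task.
# $data_source, $index and $dimension play the role of formal parameters.
MULTIDIMENSIONAL_TEMPLATE = Template(
    "SELECT $dimension, $index FROM $data_source GROUP BY $dimension"
)


def fill_template(template: Template, data_source: str,
                  index: str, dimension: str) -> str:
    """Replace each formal parameter with the concrete target value."""
    # substitute() raises KeyError if a formal parameter is left unfilled.
    return template.substitute(
        data_source=data_source, index=index, dimension=dimension)


sql = fill_template(
    MULTIDIMENSIONAL_TEMPLATE,
    data_source="user_registry",
    index="COUNT(user_id)",
    dimension="register_date",
)
print(sql)
```

Plain string substitution is used here only to show the mechanism; a production system would also need to validate the filled values, since they end up inside an executable statement.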
Step S206, determining a target computing engine matched with the data processing task according to the target task information.
Specifically, when the target task information is obtained, the computer device may also automatically determine a target computing engine suitable for processing the data processing task based on the target task information. The computing engine refers to a server for processing big data. For example, the computing engine may specifically be TDW-spark (a stable computing engine suitable for ultra-large data sources but with low analysis efficiency), ClickHouse (a computing engine suitable for single data source analysis, with high analysis efficiency), or Presto (a computing engine suitable for multi-data-source association analysis, with lower efficiency than ClickHouse but higher efficiency than TDW-spark).
In one embodiment, the target computing engine may also be determined based on the target task type in the target task information. For example, when the target task type is the retention analysis type, the computing engine suitable for the retention analysis is determined to be the target computing engine, and when the target task type is the funnel analysis type, the computing engine suitable for the funnel analysis is determined to be the target computing engine.
In one embodiment, determining a target computing engine that matches a data processing task based on target task information includes: determining the number of target data sources in the target task information; a target compute engine is determined that matches the data processing task based on the number of target data sources.
Specifically, when obtaining the target task information, the computer device may determine the number of target data sources included in the target task information. For example, the target task information may exist in the form of key-value pairs; when the target task information is acquired, the computer device may determine the key corresponding to the data source, count the number of values corresponding to that key in the target task information, and use the counted number as the number of target data sources. Further, when the number of target data sources is obtained, the computer device may determine a target computing engine that matches the data processing task based on the number of target data sources. For example, since ClickHouse is suitable for analyzing a single data source, ClickHouse may be determined as the target computing engine when the number of target data sources is 1; since Presto is suitable for performing association analysis on multiple data sources, Presto may be determined as the target computing engine when the number of target data sources is 2 or 3; and since TDW-spark is suitable for ultra-large data sources, TDW-spark may be used as the target computing engine when the number of target data sources exceeds 3.
In the above embodiment, since the number of the target data sources reflects the task characteristics of the data processing task, by determining the target computing engine based on the task characteristics of the data processing task, the determined target computing engine can be adapted to process the data processing task, thereby improving the processing efficiency of the data processing task.
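The count-based selection in this embodiment can be sketched as below. The `data_sources` key and both function names are assumptions for illustration; the thresholds (1, 2 to 3, more than 3) follow the example given in the text.

```python
def count_data_sources(task_info: dict) -> int:
    """Count the values stored under the (assumed) data source key."""
    return len(task_info.get("data_sources", []))


def choose_engine_by_count(num_sources: int) -> str:
    """Map the number of target data sources to a computing engine."""
    if num_sources < 1:
        raise ValueError("a task needs at least one target data source")
    if num_sources == 1:
        return "ClickHouse"   # single data source analysis
    if num_sources <= 3:
        return "Presto"       # association analysis over 2 or 3 data sources
    return "TDW-spark"        # more than 3 data sources


task_info = {"data_sources": ["user_registry", "login_log"], "task_type": "funnel"}
print(choose_engine_by_count(count_data_sources(task_info)))
```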
Step S208, routing the target structured statement to a target computing engine; the target structured statement of the route is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
Specifically, when the target computing engine is determined, the computer device may route the target structured sentence to the target computing engine, so that the target computing engine determines the data pointed to by the target task information through the target structured sentence and performs data processing on that data to obtain a task processing result of the data processing task. For example, because the target data source in the target task information is filled into the sentence construction template, the target structured sentence generated from the sentence construction template includes the target data source, so the target computing engine can parse the target structured sentence to obtain the target data source and use the target data source as the data pointed to by the target task information. The target computing engine can then process the data pointed to by the target task information, that is, process the data in the target data source, to obtain the task processing result of the data processing task.
In the data processing method, the sentence construction template matched with the data processing task can be determined by acquiring the data processing task. By determining the target task information of the data processing task, the sentence construction template can be automatically filled according to the target task information to obtain a target structured sentence, and a target computing engine suitable for processing the data processing task can be automatically determined based on the target task information. Once the target structured statement and the target computing engine have been determined, the target structured statement can be routed to the target computing engine, so that the target computing engine processes data through the target structured statement and a task processing result of the data processing task is obtained. Because the target structured sentence and the target computing engine are determined automatically, compared with the traditional approach of manually writing the target structured sentence and importing it into a manually chosen computing engine, the method saves the time consumed by manually writing the sentence and manually choosing the computing engine, thereby improving the efficiency of data processing.
Further, since the target computing engine is determined based on the target task information, the determined target computing engine can be made suitable for the data processing task, thereby further improving the efficiency of the data processing by the target computing engine suitable for the data processing task. In addition, compared with the manual research on the calculation engine and the determination of the target calculation engine based on the research result, the method and the device can further improve the determination accuracy of the target calculation engine.
In one embodiment, filling the sentence construction template according to the target task information to obtain the target structured sentence includes: determining a target data source and a target processing condition in the target task information; filling the target data source and the target processing condition into the corresponding positions in the sentence construction template respectively to obtain an initial structured sentence; and optimizing the initial structured statement to obtain the target structured statement.
Specifically, when obtaining the target task information and the sentence construction template matched with the data processing task, the computer device may extract the target data source and the target processing condition from the target task information, determine the position in the sentence construction template for filling the target data source, and fill the target data source into that position. The computer device may also determine the position in the sentence construction template for filling the target processing condition and fill the target processing condition into that position. When the target data source and the target processing condition have been filled into the corresponding positions in the sentence construction template, an initial structured sentence is obtained. At this point, the computer device may either use the initial structured sentence directly as the target structured sentence, or further optimize the initial structured sentence to obtain the target structured sentence. For example, the computer device may perform grammar detection on the initial structured statement and modify the parts that do not conform to the grammar rules to obtain the target structured statement.
In this embodiment, by optimizing the initial structured statement, a more accurate structured statement may be obtained, so that the execution success rate of the data processing task is improved based on the more accurate structured statement.
In one embodiment, the target task information includes a target data source; optimizing the initial structured statement to obtain the target structured statement includes: determining first structure information corresponding to the target data source; determining second structure information related to the initial structured statement; constructing a data clipping statement according to the difference between the first structure information and the second structure information; and adding the data clipping statement to the initial structured statement to obtain the target structured statement, where the data clipping statement is used for clipping the data to be processed in the target data source.
The structure information specifically includes a partition table identifier of the partition table and a column identifier of the data column, where the partition table identifier refers to information that uniquely identifies one partition table, and the column identifier refers to an identifier that uniquely identifies one data column.
Specifically, when obtaining the target task information, the computer device may determine a target data source in the target task information and determine first structure information of the target data source. For example, when the target data source includes a plurality of partition tables, the computer device may determine the partition table included in the target data source and the data column included in each of the partition tables, and use the partition table identification of the determined partition table and the column identification of the data column as the first structure information. When the target data source does not include a plurality of partition tables, but only one data table, the computer apparatus may determine a data column included in the target data source, and use a column identification of the determined data column as the first structure information. The partition table refers to a sub-table obtained by dividing the data table according to a preset dividing manner, for example, the data table may be divided according to months, so as to obtain a plurality of partition tables. A data column refers to a column in a data table.
Further, when the initial structured statement is obtained, the computer device may determine second structure information related to the initial structured statement. For example, the computer device may determine the column identifiers and partition table identifiers that appear in the initial structured statement, and take those column identifiers and partition table identifiers as the second structure information. Further, the computer device may determine the difference between the first structure information and the second structure information, generate a data clipping statement according to the difference, and add the generated data clipping statement at a preset position in the initial structured statement to obtain the target structured statement. By adding the data clipping statement to the initial structured statement, the target computing engine can execute the data clipping statement first when executing the target structured statement, so as to clip the data in the target data source, and then access and analyze the clipped data to obtain the task processing result of the data processing task.
Illustratively, suppose the first structure information includes partition table 1, partition table 2, and partition table 3, where partition table 2 includes column 1 and column 2 and partition table 3 includes column 4, and the second structure information includes column 1 in partition table 2 and column 4 in partition table 3. Based on the first structure information and the second structure information, the computer device may determine that the computing engine actually only needs to perform data processing on column 1 in partition table 2 and column 4 in partition table 3, without accessing and processing all the data in partition table 1, partition table 2 and partition table 3. The computer device therefore constructs a data clipping statement according to the difference between the first structure information and the second structure information, so that the computing engine may first clip partition table 1 and column 2 in partition table 2 from the data source based on the data clipping statement, read only column 1 in partition table 2 and column 4 in partition table 3 from the data source, and perform subsequent processing on the read data.
In the embodiment, the data clipping statement is constructed, so that the partition table and the data column which are not required to be accessed can be clipped based on the data clipping statement, thereby avoiding accessing all partition tables and data columns in the data source, further reducing the frequency of access requests and improving the data processing efficiency.
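The difference between the first and second structure information can be computed as in the sketch below. The source does not specify the syntax of the data clipping statement, so the sketch stops at computing what should be clipped; the dictionary representation, the `diff_structure` name and the single column assumed for partition table 1 are illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

# Structure information: partition table identifier -> set of column identifiers.
StructureInfo = Dict[str, Set[str]]


def diff_structure(first: StructureInfo, second: StructureInfo
                   ) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return (partition tables to clip, columns to clip per kept partition)."""
    # Partition tables in the data source that the statement never references.
    clipped_partitions = sorted(p for p in first if p not in second)
    # Columns of referenced partition tables that the statement never uses.
    clipped_columns = {
        p: sorted(first[p] - second[p])
        for p in first
        if p in second and first[p] - second[p]
    }
    return clipped_partitions, clipped_columns


# First structure information: everything present in the target data source.
first_info = {
    "partition_1": {"column_3"},  # column name assumed; not given in the text
    "partition_2": {"column_1", "column_2"},
    "partition_3": {"column_4"},
}
# Second structure information: what the initial structured statement uses.
second_info = {"partition_2": {"column_1"}, "partition_3": {"column_4"}}

partitions, columns = diff_structure(first_info, second_info)
print(partitions, columns)
```

With the example from the text, partition table 1 and column 2 of partition table 2 are clipped, leaving column 1 of partition table 2 and column 4 of partition table 3 to be read.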
In one embodiment, optimizing the initial structured statement to obtain the target structured statement includes: determining a conditional filtering statement and a data source connection statement in the initial structured statement; and when the conditional filtering statement is located after the data source connection statement, moving the conditional filtering statement into the data source connection statement to obtain the target structured statement.
In one embodiment, the conditional filtering statement may be a WHERE statement in SQL; for example, the conditional filtering statement is "WHERE name = 'C'" in "SELECT * FROM Websites WHERE name = 'C'". Here, "SELECT * FROM Websites WHERE name = 'C'" selects all websites named "C" from the "Websites" table.
The data source connection statement refers to a statement for combining two or more data sources. In one embodiment, the data source connection statement may specifically be a JOIN statement in SQL, for example, "SELECT * FROM table1 INNER JOIN table2 ON a.id = b.id". Here, "table1 INNER JOIN table2 ON a.id = b.id" takes the intersection of table1 and table2, that is, it returns the data whose id is the same in table1 and table2.
Specifically, when the initial structured statement is obtained, the computer device may determine whether a conditional filtering statement and a data source connection statement exist in the initial structured statement, for example, by determining whether a conditional filtering statement exists according to the keyword "where" and whether a data source connection statement exists according to the keyword "join". When both exist, the computer device determines whether the conditional filtering statement is located after the data source connection statement. When the conditional filtering statement is determined to be located after the data source connection statement, the computer device adjusts the position of the conditional filtering statement in the initial structured statement, moving the conditional filtering statement into the data source connection statement, so that the target structured statement is obtained.
For example, when the initial structured statement is "SELECT * FROM table1 INNER JOIN table2 ON a.id = b.id … WHERE name = 'C'", the position of the conditional filtering statement may be adjusted to obtain the target structured statement "SELECT * FROM table1 INNER JOIN (WHERE name = 'C') table2 ON a.id = b.id …".
In this embodiment, the position of the conditional filtering statement is adjusted from after the data source connection statement to inside the data source connection statement, so that when the computing engine runs the target structured statement, the conditional filtering statement is executed first: the data in the target data source is filtered based on the conditional filtering statement, and the filtered data is then processed based on the data source connection statement. Compared with the traditional approach of first executing the data source connection statement over all the data in the target data source and then executing the conditional filtering statement, this greatly reduces the amount of data that the data source connection statement has to process and improves the processing efficiency of data processing.
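The effect of moving the filter into the join can be checked with a small experiment using Python's built-in `sqlite3` module. This is a sketch under assumptions: the table contents, the aliases `a` and `b`, the subquery form of the rewritten statement and both helper functions are chosen for illustration, and SQLite merely stands in for the computing engines named in the text.

```python
import sqlite3


def join_then_filter(left: str, right: str, key: str, predicate: str) -> str:
    """Initial form: the WHERE clause comes after the JOIN."""
    return (f"SELECT * FROM {left} a INNER JOIN {right} b "
            f"ON a.{key} = b.{key} WHERE b.{predicate}")


def filter_then_join(left: str, right: str, key: str, predicate: str) -> str:
    """Rewritten form: the filter is applied to the right table before joining."""
    return (f"SELECT * FROM {left} a INNER JOIN "
            f"(SELECT * FROM {right} WHERE {predicate}) b "
            f"ON a.{key} = b.{key}")


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, score INTEGER);
CREATE TABLE table2 (id INTEGER, name TEXT);
INSERT INTO table1 VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO table2 VALUES (1, 'C'), (2, 'D'), (3, 'C');
""")

before = conn.execute(
    join_then_filter("table1", "table2", "id", "name = 'C'")).fetchall()
after = conn.execute(
    filter_then_join("table1", "table2", "id", "name = 'C'")).fetchall()

# Both forms return the same rows; only the amount of data joined differs.
print(sorted(before) == sorted(after))
```

Many query optimizers perform this rewrite on their own; doing it at statement generation time, as the embodiment describes, does not depend on the engine doing so.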
In one embodiment, optimizing the initial structured statement to obtain the target structured statement includes: when there are a plurality of target data sources and the initial structured statement includes a data source connection statement, determining the data volume corresponding to each target data source; and adjusting the position of each target data source in the data source connection statement according to the data volume to obtain the target structured statement.
Specifically, when the target task information includes a plurality of target data sources and the initial structured statement generated based on the target task information includes a data source connection statement, the computer device may determine the data volume corresponding to each target data source; for example, the computer device may determine the data volume according to the amount of data included in the data source. Further, since two or more data sources are connected in the data source connection statement, the computer device can adjust the position of each target data source in the data source connection statement according to its data volume, so as to obtain the target structured statement.
In one embodiment, a modification rule for data source positions may be preset, and the computer device may modify the position of each target data source in the data source connection statement based on the modification rule. For example, in the data source connection statement, a target data source with a small data volume is placed before a target data source with a large data volume, so as to obtain the target structured statement. Placing the table with the small data volume before the table with the large data volume can improve the execution efficiency of the data source connection statement.
In the above embodiment, by adjusting the position of each target data source in the data source connection statement based on the size of the data volume, the processing efficiency of the computing engine for processing the data through the data source connection statement can be improved.
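The "small data source first" rule can be sketched as a sort on data volume followed by rebuilding the connection statement. The table names, row counts, join key and both function names are hypothetical, and joining every table to the first one on a shared key is a simplification made for the example.

```python
from typing import Dict, List


def order_sources_for_join(volumes: Dict[str, int]) -> List[str]:
    """Order target data sources from smallest to largest data volume."""
    return sorted(volumes, key=volumes.get)


def build_join(volumes: Dict[str, int], key: str) -> str:
    """Build a connection statement with the smallest data source first."""
    ordered = order_sources_for_join(volumes)
    first = ordered[0]
    sql = f"SELECT * FROM {first}"
    for name in ordered[1:]:
        # Simplification: every table joins to the first one on the same key.
        sql += f" INNER JOIN {name} ON {first}.{key} = {name}.{key}"
    return sql


# Hypothetical data volumes (row counts) for two target data sources.
print(build_join({"orders": 1_000_000, "users": 5_000}, "id"))
```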
In one embodiment, determining a target computing engine that matches the data processing task based on the number of target data sources includes: when the number of target data sources is less than or equal to a preset number threshold, determining whether the target data source has been converted to a data source that can be identified by a first computing engine; when the target data source has been converted to a data source that can be identified by the first computing engine, using the first computing engine as the target computing engine; and when the target data source has not been converted to a data source that can be identified by the first computing engine, using a second computing engine as the target computing engine.
The number of data sources to which the first computing engine is applicable is less than or equal to the number of data sources to which the second computing engine is applicable. For example, the first computing engine may specifically be a ClickHouse engine suited to processing a single data source, and the second computing engine may specifically be a Presto engine suited to processing 2 to 3 data sources.
In particular, the computer device may determine a target compute engine based on the number of target data sources in the target task information. When the target task information is obtained, the computer device may count the number of target data sources included in the target task information, and determine whether the number of target data sources is less than or equal to a preset number threshold. The preset number threshold may be determined by the characteristics of each computing engine, for example, when a first computing engine is adapted to process a single data source and a second computing engine is adapted to process 2 to 3 data sources, the computer device may determine that the preset number threshold is 1.
When the number of the target data sources is smaller than or equal to the preset number threshold, in order to further improve the efficiency of data processing, the computer device may further determine whether the target data sources have been converted to the data sources that can be identified by the first computing engine, and if the target data sources have been converted to the data sources that can be identified by the first computing engine, the computer device uses the first computing engine as the target computing engine. If the target data source is not converted to the data source that can be identified by the first computing engine, it may be considered that the time taken to convert the target data source to the data source that can be identified by the first computing engine may be greater than the time taken for the second computing engine to perform data processing on the target data source, so in order to improve the processing efficiency of data processing, the computer device may use the second computing engine as the target computing engine.
In one embodiment, the computer device has stored therein a format conversion rule for converting the target data source to a data source identifiable by the first computing engine, such that the computer device can format convert the target data source based on the format conversion rule.
In the above embodiment, since the preset number threshold is determined by the features of the computing engine, the target computing engine is determined by the difference between the number of target data sources and the preset number threshold, so that the determined target computing engine is suitable for processing the target data sources in the target task information, thereby improving the processing efficiency of data processing.
In one embodiment, determining a target compute engine that matches a data processing task based on a number of target data sources includes: when the number of the target data sources is greater than a preset number threshold, acquiring a pre-trained engine determination model; the engine determination model comprises a plurality of determination sub-models; determining metadata information corresponding to each target data source to obtain a metadata information set; respectively carrying out information processing on the metadata information set through each determination sub-model to obtain a result output by each determination sub-model; and synthesizing the results output by each determination sub-model to obtain a target calculation engine matched with the data processing task.
In particular, the computer device may also determine the target computing engine by a pre-trained engine determination model when the number of target data sources is greater than a preset number threshold. The computer equipment can acquire metadata information corresponding to each target data source to obtain a metadata information set, the metadata information set is input into an engine determination model, and the engine determination model analyzes the metadata set to obtain the target calculation engine. The metadata information refers to data information related to a data source, for example, the size of the data source, a partition table included in the data source, an association relationship between the data sources, and the like. The engine determination model can comprise a plurality of determination sub-models, each sub-model can output a result based on the metadata information set, so that the computer equipment synthesizes the result output by each determination sub-model to obtain a target calculation engine matched with the data processing task. For example, the result output by each determination sub-model may be a probability value of each calculation engine in the preset calculation engine set as the target calculation engine, so that the computer device superimposes the probability values output by the plurality of determination sub-models corresponding to each calculation engine to obtain a total probability value, and the computer device uses the calculation engine with the highest total probability value as the target calculation engine.
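The step of combining the sub-model outputs (summing, per engine, the probability values output by each determination sub-model and taking the engine with the highest total) can be sketched as follows. The two sub-models here are hand-written rules standing in for trained models, and the metadata fields `size_gb` and `partitions` as well as all probability values are invented for the example.

```python
from typing import Callable, Dict, List

Probabilities = Dict[str, float]
SubModel = Callable[[List[dict]], Probabilities]


def combine_sub_models(sub_models: List[SubModel],
                       metadata_set: List[dict]) -> str:
    """Sum each engine's probabilities over all sub-models and pick the max."""
    totals: Dict[str, float] = {}
    for model in sub_models:
        for engine, probability in model(metadata_set).items():
            totals[engine] = totals.get(engine, 0.0) + probability
    return max(totals, key=totals.get)


# Stand-in sub-models; a real system would use trained models instead.
def by_total_size(metadata_set: List[dict]) -> Probabilities:
    total = sum(m["size_gb"] for m in metadata_set)
    if total < 1000:
        return {"Presto": 0.7, "TDW-spark": 0.3}
    return {"Presto": 0.2, "TDW-spark": 0.8}


def by_partition_count(metadata_set: List[dict]) -> Probabilities:
    if any(m["partitions"] > 100 for m in metadata_set):
        return {"Presto": 0.4, "TDW-spark": 0.6}
    return {"Presto": 0.6, "TDW-spark": 0.4}


metadata = [{"size_gb": 50, "partitions": 12},
            {"size_gb": 80, "partitions": 200}]
print(combine_sub_models([by_total_size, by_partition_count], metadata))
```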
In one implementation, the determining sub-model may specifically be a decision tree, where when the metadata information set is input, the decision tree may make a decision on the metadata information set to obtain an output result.
In one implementation, a developer may obtain a large number of data sources and label each data source to obtain a data source label, where the data source label includes the target computing engine to which the data source is suited. The developer may then determine the metadata information corresponding to each data source and input the metadata information into the engine determination model to be trained, and the engine determination model outputs a predicted computing engine based on the input metadata information. The difference between the predicted computing engine and the data source label is determined, and the model parameters are adjusted based on the difference until a preset training stop condition is reached, so that a trained engine determination model is obtained. The engine determination model may be a model trained based on the XGBoost algorithm (an ensemble learning algorithm).
In one embodiment, referring to FIG. 4, the preset computing engines include a first computing engine, a second computing engine and a third computing engine, where the first computing engine may specifically be ClickHouse, the second computing engine may specifically be Presto, and the third computing engine may specifically be TDW-spark. When the target task information is obtained in S401, the computer device may determine the number of target data sources based on the target task information in S402. When the number of target data sources is less than or equal to a first number threshold, for example, when there is a single target data source, the computer device determines in S404 whether the target data source has been converted into a data source that the first computing engine can recognize. If so, the first computing engine is used as the target computing engine in S405; if not, the second computing engine is used as the target computing engine in S406. If the number of target data sources is greater than the first number threshold and less than or equal to a second number threshold (S407), for example, if there are two or three target data sources, the computer device determines in S408 the metadata information set corresponding to the target data sources, inputs the metadata information set into the engine determination model, and determines through the engine determination model whether the second computing engine or the third computing engine is used as the target computing engine.
If the number of target data sources is greater than the second number threshold (S409), for example, if there are more than three target data sources, the computer device may use the third computing engine as the target computing engine in S410. FIG. 4 illustrates a flow diagram for determining a target computing engine in one embodiment.
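The branching of FIG. 4 can be sketched as follows. This is a minimal sketch under assumptions: the threshold values 1 and 3 are inferred from the examples ("a single target data source", "two or three target data sources") rather than stated as fixed values, and the function name `determine_engine`, the key names `converted_for_first_engine` and `metadata`, and the engine labels are hypothetical.

```python
from typing import Callable, List, Optional

FIRST_ENGINE, SECOND_ENGINE, THIRD_ENGINE = "first", "second", "third"


def determine_engine(
    data_sources: List[dict],
    first_threshold: int = 1,    # assumed value
    second_threshold: int = 3,   # assumed value
    model: Optional[Callable[[List[dict]], str]] = None,
) -> str:
    """Pick a computing engine from the number of target data sources."""
    count = len(data_sources)
    if count == 0:
        raise ValueError("at least one target data source is required")
    if count <= first_threshold:
        # S404 to S406: use the first engine only if the data is already converted.
        converted = all(
            src.get("converted_for_first_engine", False) for src in data_sources
        )
        return FIRST_ENGINE if converted else SECOND_ENGINE
    if count <= second_threshold:
        # S407 to S408: the engine determination model decides.
        if model is None:
            raise ValueError("an engine determination model is required")
        return model([src["metadata"] for src in data_sources])
    # S409 to S410: many data sources go to the third engine.
    return THIRD_ENGINE


def stub_model(metadata_set: List[dict]) -> str:
    """Hypothetical stand-in for the trained engine determination model."""
    big = sum(m["rows"] for m in metadata_set) > 1_000_000
    return THIRD_ENGINE if big else SECOND_ENGINE
```

For example, `determine_engine([{"converted_for_first_engine": True}])` takes the S405 branch, while a list of four data sources goes straight to the third engine without consulting the model.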
In the above embodiment, since the metadata information set reflects the characteristics of the data sources, a target computing engine determined from the metadata information set by a pre-trained machine learning model is better adapted to the characteristics of the target data sources, and the efficiency of data processing performed by the target computing engine can therefore be improved.
In one embodiment, the method further comprises: determining, by the target calculation engine, at least one of a data clipping statement and a conditional filtering statement included in the target structured statement; filtering the data in the target data source by the target calculation engine based on at least one of the data clipping statement and the condition filtering statement to obtain filtered data; and performing data processing on the filtered data according to the target index and the target dimension by using the target calculation engine to obtain a task processing result of the data processing task.
Specifically, when the target structured statement is routed to the target computing engine, the target computing engine may determine whether the target structured statement includes a data clipping statement and a conditional filtering statement. If at least one of the data clipping statement and the conditional filtering statement exists, the target computing engine may first execute the data clipping statement and/or the conditional filtering statement to filter the data in the target data source and obtain filtered data, and then perform subsequent processing on the filtered data, for example, multi-dimensional analysis. Filtering the data first and then performing multi-dimensional analysis on the filtered data reduces the amount of data subject to multi-dimensional analysis and thereby improves the efficiency of the analysis.
In one embodiment, when the data clipping statement and the conditional filtering statement both exist in the target structured statement, they may be executed sequentially in the order of their positions in the target structured statement.
In the above embodiment, data in the target data source that does not need to be processed is first filtered out based on at least one of the data clipping statement and the conditional filtering statement, and only the filtered data is processed. This reduces the amount of data involved in subsequent processing and improves the efficiency of data processing.
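The filter-first order described above can be illustrated in a few lines. This is a sketch under assumptions: the data clipping statement and the conditional filtering statement are both modeled as row predicates applied in statement order, the aggregation is a plain sum, and all names (`filter_then_aggregate`, `clip_to_april`, `positive_only`, the column names) are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


def filter_then_aggregate(
    rows: Iterable[Row],
    filters: List[Predicate],
    dimension: str,
    index: str,
) -> Dict[object, float]:
    """Apply the filters in statement order, then sum `index` per `dimension`."""
    remaining = list(rows)
    for keep in filters:
        remaining = [row for row in remaining if keep(row)]

    totals: Dict[object, float] = defaultdict(float)
    for row in remaining:
        totals[row[dimension]] += float(row[index])
    return dict(totals)


def clip_to_april(row: Row) -> bool:
    """Stand-in for a data clipping statement (keeps rows from April 2022 on)."""
    return row["dt"] >= "2022-04-01"


def positive_only(row: Row) -> bool:
    """Stand-in for a conditional filtering statement."""
    return row["amount"] > 0


rows = [
    {"dt": "2022-04-01", "city": "A", "amount": 10.0},
    {"dt": "2022-04-01", "city": "B", "amount": -5.0},
    {"dt": "2022-04-02", "city": "A", "amount": 7.0},
    {"dt": "2022-03-31", "city": "A", "amount": 99.0},
]
result = filter_then_aggregate(rows, [clip_to_april, positive_only], "city", "amount")
```

Only two of the four rows reach the aggregation step, which is the reduction in processed data volume that the embodiment relies on.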
In one embodiment, as shown in FIG. 5, a method for generating a data processing task is provided. The method is described by taking its application to the terminal in FIG. 1 as an example, and comprises the following steps:
Step S502, a task information set is displayed; the task information set includes a plurality of task information.
Specifically, a target application used for creating a data processing task can be operated in the terminal, and when the data processing task needs to be created, a user can trigger the target application in the terminal to display a task information set. The task information set comprises a plurality of task information, and a user can select target task information from the task information set, so that the terminal generates corresponding data processing tasks based on the target task information selected by the user.
In one embodiment, referring to FIG. 6, FIG. 6 illustrates a schematic diagram of a set of task information in one embodiment. The user may trigger the terminal to display the task information set 601, where the task information in the task information set may specifically be a data index, a data dimension, or a filtering condition.
In one embodiment, the task information included in the task information set may be preset task information, for example, a plurality of data indexes, a plurality of data dimensions and a plurality of filtering conditions may be preset, so that a user may select a target index, a target dimension or a target filtering condition from the task information set.
In one embodiment, when determining a target data source for generating a data processing task, the terminal may obtain a preset set of task information that matches the target data source. The target data source may be a default data source, or may be a data source selected by the user from the data source set.
Step S504, in response to the index adding operation for the task information set, the task information is moved from the task information set to the index presentation area.
Specifically, the user may trigger an index addition operation with respect to the task information set, so that the terminal may move task information from the task information set to the index presentation area based on the index addition operation. For example, the user may drag the task information in the task information set to the index display area, or when the user clicks the task information in the task information set, the terminal may move the data index clicked by the user to the index display area. As will be readily appreciated, the task information presented in the index presentation area may be the target index.
In one embodiment, referring to fig. 6, when a user drags task information from the task information set to the index display area 602, the terminal may display the dragged task information in the index display area 602 and use the task information as a target index.
Step S506, in response to the dimension adding operation for the task information set, the task information is moved from the task information set to the dimension showing area.
Specifically, the user may trigger a dimension addition operation for the task information set, so that the terminal may move the task information from the task information set to the dimension presentation area based on the dimension addition operation. For example, the user may drag the task information in the task information set to the dimension display area, or when the user clicks the task information in the task information set, the terminal may move the task information clicked by the user to the dimension display area. As will be readily appreciated, the task information presented in the dimension presentation area may be the target dimension.
In one embodiment, referring to fig. 6, when a user drags task information from a task information set to a dimension display area 603, the terminal may display the dragged task information in the dimension display area 603 and take the task information as a target dimension.
Step S508, responding to the triggering operation of the task creation control, and displaying the creation result of the data processing task; the data processing task is created according to the task information moving to the index display area and the task information moving to the dimension display area.
Specifically, the terminal can also display a task creation control, and when the user triggers the task creation control, the terminal can respond to the triggering operation of the user on the task creation control and generate a corresponding data processing task based on the task information of the index display area and the task information in the dimension display area. That is, based on the determined target dimension and target index, a data processing task is generated and a creation result of the data processing task is presented.
In one embodiment, when the user clicks on the task creation control, the terminal may also display a task name input control so that the user may input the task name of the data processing task through the task name input control.
In one embodiment, when it is determined that the user clicks the task creation control, the terminal may further acquire a default target data source and a default target task type. The terminal uses the target index and the target dimension selected by the user as target processing information, and creates a corresponding data processing task based on the target processing information, the default target data source and the default target task type.
In the above method for generating a data processing task, by displaying a task information set, task information added by the index adding operation may be displayed in the index display area in response to the index adding operation for the task information set, and task information added by the dimension adding operation may be displayed in the dimension display area in response to the dimension adding operation for the task information set. By exposing the task creation control, a corresponding data processing task may be generated based on the task information exposed in the index exposure area and the task information exposed in the dimension exposure area in response to a trigger operation for the task creation control. The corresponding data processing task can be generated only by simple configuration based on the index adding operation and the dimension adding operation, so that the configuration threshold of the data processing task is reduced, the creation flow of the data processing task is simplified, the creation efficiency of the data processing task is improved, and the processing efficiency of the data processing is further improved.
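The terminal-side state behind steps S502 to S508 can be modeled roughly as below. This is a hedged sketch rather than the embodiment's code: the class `TaskBuilder`, its method names, the default data source `default_source` and the default task type `multi_dimensional_analysis` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskBuilder:
    """Holds the task information set and the two display areas."""

    task_info_set: List[str]
    index_area: List[str] = field(default_factory=list)
    dimension_area: List[str] = field(default_factory=list)

    def _move(self, info: str, area: List[str]) -> None:
        if info not in self.task_info_set:
            raise ValueError(f"{info!r} is not in the task information set")
        self.task_info_set.remove(info)
        area.append(info)

    def add_index(self, info: str) -> None:
        """Index adding operation (S504), e.g. a drag or a click."""
        self._move(info, self.index_area)

    def add_dimension(self, info: str) -> None:
        """Dimension adding operation (S506)."""
        self._move(info, self.dimension_area)

    def create_task(
        self,
        name: str,
        data_source: str = "default_source",             # assumed default
        task_type: str = "multi_dimensional_analysis",   # assumed default
    ) -> dict:
        """Triggering the task creation control (S508)."""
        return {
            "name": name,
            "task_type": task_type,
            "data_source": data_source,
            "indexes": list(self.index_area),
            "dimensions": list(self.dimension_area),
        }


builder = TaskBuilder(["order_amount", "city", "date"])
builder.add_index("order_amount")
builder.add_dimension("city")
task = builder.create_task("daily sales by city")
```

The user only performs two add operations and one click; everything else in the resulting task comes from defaults, which is what keeps the configuration threshold low.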
In one embodiment, the method further comprises: displaying at least one product type, and displaying a selected target product type and at least one service type included in the target product type in response to a selection operation for the at least one product type; in response to a selection operation for the at least one service type, displaying at least one data source that matches the selected target service type; in response to a selection operation for the at least one data source, displaying the selected target data source. Displaying the task information set comprises: displaying a task information set corresponding to the target data source.
Specifically, the user may trigger the terminal to display data sources by selecting a product type and a service type. For example, when a target data source needs to be determined, the user may trigger the terminal to display a data source display interface as shown in FIG. 7, and at least one product type 701 is displayed through the data source display interface. The product type refers to the type of a product, for example, a merchant product type, an industry application type, a security architecture type, and the like. The user may select the target product type from the at least one product type as needed. For example, when the user wishes to process data of a service belonging to the merchant product type, the user may select the merchant product type from the at least one product type, so that the terminal determines and displays the at least one service type 702 included in the target product type in response to the user's selection operation. The service type refers to the type of a service, and the user may select a target service type from the at least one service type as needed. For example, when the user wishes to process data of merchant bills belonging to the merchant product type, the user may select the merchant bill service type from the at least one service type.
Further, when the target service type is determined, the terminal may acquire and present at least one data source 703 matching the target service type, and the user may select a target data source for data processing from the at least one data source presented.
In one embodiment, referring to FIG. 7, the terminal may also display a data source description 704, which presents information describing the data source. For example, the data source description may be "business transaction summary table", "bill summary table", and the like, so that the user may select a target data source based on the data source description.
In one embodiment, since a data source may specifically be a data table in a database, the data source may be presented as a combination of a database identification and a data table identification. For example, the displayed data source may be "database name: library A; data table name: table B".
In one embodiment, the terminal may display the popularity 705 of each data source, based on which the user may select the target data source. The popularity of a data source may be determined by the number of times the data source was selected by users within a preset historical period. For example, the more times a data source was selected during the historical period, the higher its popularity. FIG. 7 shows a schematic diagram of the presentation of data sources in one embodiment.
In one embodiment, a first correspondence between product types and service types may be stored in the server, so that when the target product type is determined, the server may determine at least one service type matching the target product type based on the first correspondence and send the matched service type to the terminal for display. The server may further store a second correspondence between service types and data sources, so that when the target service type is determined, the server may determine at least one data source matching the target service type based on the second correspondence and send the matched data source to the terminal for display.
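The two correspondences held by the server amount to two lookup tables chained one after the other. The sketch below assumes they are kept as in-memory dictionaries; the table contents and the function names `service_types_for` and `data_sources_for` are hypothetical examples.

```python
from typing import Dict, List

# First correspondence: product type -> service types (example data).
PRODUCT_TO_SERVICE_TYPES: Dict[str, List[str]] = {
    "merchant products": ["merchant bill", "merchant transaction"],
    "industry applications": ["industry report"],
}

# Second correspondence: service type -> data sources (example data).
SERVICE_TYPE_TO_DATA_SOURCES: Dict[str, List[str]] = {
    "merchant bill": ["library_a.bill_summary"],
    "merchant transaction": ["library_a.transaction_summary"],
}


def service_types_for(product_type: str) -> List[str]:
    """Service types to display once a target product type is selected."""
    return list(PRODUCT_TO_SERVICE_TYPES.get(product_type, []))


def data_sources_for(service_type: str) -> List[str]:
    """Data sources to display once a target service type is selected."""
    return list(SERVICE_TYPE_TO_DATA_SOURCES.get(service_type, []))


services = service_types_for("merchant products")
sources = data_sources_for("merchant bill")
```

Each selection narrows the next list, so the terminal never has to display every data source at once.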
In the above embodiment, displaying the product type and the service type allows the user to narrow the range of candidate data sources step by step, which reduces the number of data sources displayed. With fewer data sources displayed, the user can determine the target data source more quickly, and the efficiency of determining the target data source is improved.
In one embodiment, the method further comprises: when there are a plurality of target data sources, displaying an association relation editing list and an association field selection list in response to an association relation editing operation for the target data sources; in response to a selection operation for the association relation editing list, displaying the selected target association relation; in response to a selection operation for the association field selection list, displaying the selected target association field; the target association relation and the target association field are used to create a data processing task.
Specifically, when there are a plurality of target data sources, the user may trigger an association relation editing operation for the target data sources, so that the terminal displays an association relation editing interface based on the operation, and the user can edit the association relation among the plurality of target data sources through the displayed interface.
For example, when the user determines the target data source through the interface shown in FIG. 7, the terminal may display the target data source 604 selected by the user in FIG. 6. When the user clicks the target data source 604, the terminal may treat the click as the association relation editing operation for the target data source and display the association relation editing interface 801 shown in FIG. 8 in a pop-up window.
The association relation editing interface may display an association relation editing list 802. The user may select the association relation between the plurality of target data sources through the association relation editing list 802 as needed, so that the terminal displays the selected target association relation in response to the user's selection operation for the list. For example, the user may decide through the association relation editing list whether to perform intersection processing or union processing on the plurality of target data sources.
Further, an association field selection list 803 may be displayed in the association relation editing interface, and the user may select the fields to be associated in the plurality of target data sources through the association field selection list, so that the terminal displays the selected target association fields in response to the user's selection operation for the list. For example, when the user wishes to perform intersection processing on field 1 in the a0 data source and field 2 in the a1 data source, the user may determine through the association relation editing list that the target association relation is "intersection", and determine through the association field selection list that the target association field in the a0 data source is "field 1" and that the target association field in the a1 data source is "field 2". FIG. 8 illustrates a schematic diagram of an association relation editing interface in one embodiment.
In one embodiment, when determining the target association relationship and the target association field, the terminal may generate a data processing task based on the target association relationship and the target association field, and send the data processing task to the server, so that the server generates a data source connection statement based on the target association relationship and the target association field in the data processing task, and generates a corresponding target structured statement based on the data source connection statement.
In the embodiment, the data processing task can be created only by carrying out simple association relation configuration, and the generation efficiency of the data processing task is improved.
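One way the server might turn the target association relation and target association fields into a data source connection statement is sketched below. The mapping of "intersection" to `INNER JOIN` and of "union" to `FULL OUTER JOIN` is an assumption made for illustration; the text does not specify which SQL construct is generated, and the function name `build_connection_statement` is hypothetical.

```python
# Assumed mapping from association relation to SQL join keyword.
JOIN_KEYWORDS = {
    "intersection": "INNER JOIN",
    "union": "FULL OUTER JOIN",
}


def build_connection_statement(
    left_source: str,
    left_field: str,
    right_source: str,
    right_field: str,
    relation: str,
) -> str:
    """Build a FROM ... JOIN ... ON ... clause for two target data sources."""
    try:
        keyword = JOIN_KEYWORDS[relation]
    except KeyError:
        raise ValueError(f"unsupported association relation: {relation!r}") from None
    return (
        f"FROM {left_source} {keyword} {right_source} "
        f"ON {left_source}.{left_field} = {right_source}.{right_field}"
    )


clause = build_connection_statement("a0", "field1", "a1", "field2", "intersection")
```

With the example from FIG. 8 (field 1 of a0 intersected with field 2 of a1), `clause` is `FROM a0 INNER JOIN a1 ON a0.field1 = a1.field2`.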
In one embodiment, the method further comprises: in response to a filtering condition adding operation for the task information set, moving the task information from the task information set to a filtering condition display area; responding to the editing operation aiming at the task information in the filtering condition display area, and displaying the task information comprising the data filtering range obtained by editing; the edited task information is used to create a data processing task.
Specifically, since the generated target structured sentence may include a conditional filtering sentence, the terminal may display the task information set and simultaneously display a filtering condition display area 605 as shown in fig. 6, so that the user may select an initial filtering condition from the task information set according to the requirement and trigger the terminal to display the initial filtering condition in the filtering condition display area. For example, the user may drag the task information in the task information set, at this time, the terminal takes the drag operation of the user for the task information as a filtering condition adding operation, and when determining that the end point of the drag operation is the filtering condition display area, displays the task information dragged by the user in the filtering condition display area, and takes the task information displayed in the filtering condition display area as an initial filtering condition.
Further, the user can edit the task information displayed in the filtering condition display area, that is, edit the initial filtering condition to obtain the target filtering condition, so that the terminal displays the edited target filtering condition. The displayed target filtering condition includes the edited data filtering range. For example, the user may click the task information displayed in the filtering condition display area, and the terminal treats the click as the editing operation and displays a filtering condition editing page. The content of the filtering condition editing page differs for different task information. For example, when the task information is "date", that is, when the initial filtering condition is "date", a calendar may be displayed in the filtering condition editing page so that the user can determine a date range through the calendar. When the date range is determined, the terminal may display the target filtering condition including the date range in the filtering condition display area. For example, when "date (X year X month X day - Y year Y month Y day)" is displayed in the filtering condition display area, the target filtering condition can be read as "select from the data source the data whose date falls between X year X month X day and Y year Y month Y day". When the computer device obtains the conditional filtering statement generated based on the target filtering condition, it can select the data in this date range from the target data source by running the conditional filtering statement.
In the above embodiment, since task information that can be displayed in the filtering condition display area can be selected autonomously based on the requirement, not only flexibility but also user experience is improved.
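The date example above involves two renderings of the same range: the label shown in the filtering condition display area and the conditional filtering statement run by the engine. A minimal sketch follows, assuming a SQL `BETWEEN` predicate and ISO-formatted dates; the function names and the column name `dt` are hypothetical.

```python
from datetime import date


def filter_label(name: str, start: date, end: date) -> str:
    """Text shown in the filtering condition display area."""
    return f"{name} ({start.isoformat()} - {end.isoformat()})"


def filter_clause(column: str, start: date, end: date) -> str:
    """Conditional filtering statement for an inclusive date range."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return f"{column} BETWEEN '{start.isoformat()}' AND '{end.isoformat()}'"


start, end = date(2022, 4, 1), date(2022, 4, 30)
label = filter_label("date", start, end)
clause = filter_clause("dt", start, end)
```

A production version would pass the dates as bound query parameters rather than formatting them into the statement text.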
In one embodiment, the method further comprises: displaying at least one task type, and displaying a selected target task type in response to a selection operation for the at least one task type; the target task type is used to create a data processing task and determine a statement construction template that matches the data processing task.
Specifically, the user may further configure, through the terminal, the task type of the data processing task to be generated. For example, referring to FIG. 6, at least one task type 606 may be displayed in the terminal, the user may select a target task type from the displayed at least one task type as needed, and the terminal may display the selected target task type in response to the user's selection operation, for example, by highlighting the target task type. The target task type is used for creating the data processing task and for determining the corresponding statement construction template in the process of executing the data processing task.
In the embodiment, the target task type is displayed, so that the user can conveniently determine the task type selected by the user according to the displayed target task type, and the user experience is improved.
In one embodiment, the method further comprises: in response to a triggering operation for the statement preview control, displaying a target structured statement generated according to the task information in the index display area and the task information in the dimension display area; displaying a detection result of grammar detection on the target structured statement; when the grammar detection result represents that the target structured sentence does not accord with the grammar rule, responding to the sentence adjustment operation aiming at the target structured sentence, and displaying the target structured sentence after sentence adjustment.
Specifically, the terminal can also display a statement preview control, and when the statement preview control is determined to be triggered by the user, for example, when the statement preview control is determined to be clicked by the user, the terminal can generate a target structured statement through the configured target dimension and target index and display the target structured statement. When the target structured sentence is generated, the terminal can also carry out grammar detection on the generated target structured sentence according to a preset grammar rule, and a detection result is obtained and displayed. When the grammar detection result indicates that the target structured sentence does not accord with the grammar rule, the user can conduct sentence adjustment on the target structured sentence, so that the terminal can respond to sentence adjustment operation on the target structured sentence and display the target structured sentence after sentence adjustment.
In one embodiment, the displayed grammar detection result may include the statement lines in the target structured statement that do not conform to the preset grammar rules, together with adjustment suggestions for those lines, and the user may adjust the non-conforming statement lines according to the adjustment suggestions.
In one embodiment, when the user clicks on the statement preview control, the terminal may generate the target structured statement prior to creating the data processing task. When the user does not click on the statement preview control, and directly clicks on the task creation control, the target structured statement may be generated after the data processing task is created.
In one embodiment, when a user clicks a statement preview control and performs statement adjustment on a target structured statement to obtain a target structured statement after statement adjustment, the terminal may generate a data processing task based on the target structured statement after statement adjustment and send the data processing task to the server, so that the server determines a corresponding target computing engine based on target task information in the data processing task and sends the target structured statement after statement adjustment to the target computing engine.
In the above embodiment, by previewing and grammar detecting the generated target structured sentence, sentence adjustment can be performed in time when the target structured sentence does not conform to the grammar rule, thereby improving the accuracy of the generated target structured sentence.
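A grammar detection result that reports offending lines together with adjustment suggestions could look like the sketch below. The two rules shown (unbalanced parentheses, and a trailing comma directly before `FROM`, `WHERE` or `GROUP BY`) are illustrative assumptions, not the preset grammar rules of the embodiment, and the names `Issue` and `check_statement` are hypothetical.

```python
from typing import List, NamedTuple


class Issue(NamedTuple):
    line_no: int      # 1-based line number within the statement
    line: str         # offending statement line
    suggestion: str   # adjustment suggestion shown to the user


def check_statement(statement: str) -> List[Issue]:
    """Run two simple, illustrative grammar rules over a structured statement."""
    issues: List[Issue] = []
    lines = statement.strip().splitlines()
    depth = 0
    for no, line in enumerate(lines, start=1):
        # Rule 1: parentheses must be balanced.
        for ch in line:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    issues.append(Issue(no, line, "remove the unmatched ')'"))
                    depth = 0
        # Rule 2: no trailing comma right before a clause keyword or the end.
        if line.rstrip().endswith(","):
            next_line = lines[no].lstrip().upper() if no < len(lines) else ""
            if next_line == "" or next_line.startswith(("FROM", "WHERE", "GROUP BY")):
                issues.append(Issue(no, line, "remove the trailing comma"))
    if depth > 0:
        issues.append(Issue(len(lines), lines[-1], "add the missing ')'"))
    return issues


good = "SELECT city, SUM(amount)\nFROM orders\nGROUP BY city"
bad = "SELECT city,\nFROM orders\nWHERE (amount > 0"
good_issues = check_statement(good)
bad_issues = check_statement(bad)
```

For `bad`, the checker reports line 1 (trailing comma) and line 3 (missing closing parenthesis), which is the kind of per-line result the user would act on before the statement is sent to the server.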
In one embodiment, the terminal may generate the target structured statement based on at least one of a target dimension, a target index, a target filtering condition, a target association relationship, a target association field, a target data source, and a target task type. For example, when the user is configured with the target dimension and the target index, the terminal can generate a corresponding target structuring statement based on the target dimension and the target index configured by the user and a default target data source and target task type; when the user is configured with a target dimension, a target index, a target filtering condition, a target association relationship, a target association field, a target data source and a target task type, the terminal can generate a corresponding target structuring statement based on the target dimension, the target index, the target filtering condition, the target association relationship, the target association field, the target data source and the target task type.
In one embodiment, the method further comprises: determining target task information of a data processing task; the target task information comprises task information located in an index display area and task information located in a dimension display area; determining a sentence construction template matched with the data processing task, and filling the sentence construction template according to target task information to obtain a target structured sentence; determining a target computing engine matched with the data processing task according to the target task information; routing the target structured statement to a target compute engine; the target structured statement of the route is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
Specifically, when the task information located in the index display area and the task information located in the dimension display area are determined, the terminal may use them as the target task information corresponding to the data processing task; that is, the target task information includes the target index located in the index display area and the target dimension located in the dimension display area. When the target task information corresponding to the data processing task is determined, the terminal may also determine a statement construction template matched with the data processing task and fill the target task information into the statement construction template to obtain a target structured statement. The terminal may determine a corresponding target computing engine based on the target task information and route the target structured statement to the target computing engine, so that the target computing engine processes the data pointed to by the target task information based on the received target structured statement and obtains a task processing result of the data processing task.
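Template filling and routing as described above can be sketched with the standard library's `string.Template`. The template text, the task type key `multi_dimensional_analysis`, the function names `build_statement` and `route`, and the stand-in engine are assumptions for illustration; the actual statement construction templates are not given in the text.

```python
from string import Template
from typing import Callable, Dict

# Hypothetical statement construction templates, keyed by task type.
TEMPLATES: Dict[str, Template] = {
    "multi_dimensional_analysis": Template(
        "SELECT $dimensions, $indexes FROM $source GROUP BY $dimensions"
    ),
}


def build_statement(task_info: dict) -> str:
    """Fill the template matching the task type with the target task information."""
    template = TEMPLATES[task_info["task_type"]]
    return template.substitute(
        dimensions=", ".join(task_info["dimensions"]),
        indexes=", ".join(task_info["indexes"]),
        source=task_info["data_source"],
    )


def route(statement: str, engine: str, engines: Dict[str, Callable[[str], dict]]) -> dict:
    """Hand the statement to the chosen engine and return its result."""
    return engines[engine](statement)


def fake_engine(statement: str) -> dict:
    """Stand-in engine that only echoes what it was asked to run."""
    return {"executed": statement}


task_info = {
    "task_type": "multi_dimensional_analysis",
    "data_source": "orders",
    "dimensions": ["city"],
    "indexes": ["SUM(amount)"],
}
statement = build_statement(task_info)
result = route(statement, "second", {"second": fake_engine})
```

Keeping the template separate from the task information means a new task type only needs a new template entry, not a new code path.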
In one embodiment, the target task information includes at least one of a target task type, a target data source, and target processing information; the target task type at least comprises one of a multi-dimensional analysis type, a retention analysis type and a funnel analysis type; the target processing information at least comprises one of target filtering conditions, target indexes, target dimensions, target association relations and target association fields; the target filtering condition at least comprises one of a time filtering condition, a list filtering condition and a numerical filtering condition; the target association relationship includes at least one of an intersection relationship and a union relationship.
In a specific embodiment, referring to fig. 9, there is provided a method for generating a data processing task, the method comprising:
S902, displaying at least one product type through the terminal, and in response to a selection operation for the at least one product type, displaying the selected target product type and at least one service type included in the target product type.
S904, at least one data source matched with the selected target service type is displayed through the terminal in response to the selection operation for at least one service type.
S906, displaying a task information set corresponding to the target data source through the terminal; the task information set includes a plurality of task information.
S908, the terminal responds to the index adding operation aiming at the task information set, and the task information is moved from the task information set to the index display area, so that the target index is obtained.
S910, the terminal responds to dimension adding operation aiming at the task information set, and the task information is moved from the task information set to the dimension display area, so that the target dimension is obtained.
S912, moving task information from the task information set to a filtering condition display area through a terminal in response to a filtering condition adding operation for the task information set to obtain an initial filtering condition; and displaying the target filter condition including the edited data filter range in response to the editing operation for the initial filter condition in the filter condition display area.
S914, displaying at least one task type through the terminal, and responding to the selection operation aiming at the at least one task type, displaying the selected target task type.
S916, when there are a plurality of target data sources, the association edit list and the association field selection list are presented by the terminal in response to the association edit operation for the target data sources.
S918, displaying the selected target association relationship through the terminal in response to the selection operation for the association relationship editing list, and displaying the selected target association field through the terminal in response to the selection operation for the association field selection list.
S920, determining target task information by at least one of a target dimension, a target index, a target filtering condition, a target association relationship, a target association field, a target data source and a target task type through the terminal in response to the triggering operation of the task creation control, creating a data processing task based on the target task information, and displaying the creation result of the data processing task.
In the above method for generating a data processing task, by displaying a task information set, task information added by the index adding operation may be displayed in the index display area in response to the index adding operation for the task information set, and task information added by the dimension adding operation may be displayed in the dimension display area in response to the dimension adding operation for the task information set. By displaying the task creation control, a corresponding data processing task may be generated based on the task information displayed in the index display area and the task information displayed in the dimension display area in response to a trigger operation for the task creation control. Because the corresponding data processing task can be generated through simple configuration based on the index adding operation and the dimension adding operation, the configuration threshold of the data processing task is reduced, the generation flow of the data processing task is simplified, the generation efficiency of the data processing task is improved, and the efficiency of data processing is improved accordingly.
In one particular embodiment, referring to FIG. 10, a data processing method is provided, the method comprising:
S1002, when a data processing task is obtained, determining target task information of the data processing task through the computer equipment.
S1004, determining a statement construction template matched with the target task type in the target task information through the computer equipment.
S1006, determining a target data source and target processing conditions in the target task information through the computer equipment; and filling the target data source and the target processing conditions into the corresponding positions in the statement construction template respectively to obtain an initial structured statement.
S1008, determining, by the computer device, first structural information corresponding to the target data source and second structural information related to the initial structural statement.
S1010, constructing a data clipping statement by the computer equipment according to the difference between the first structure information and the second structure information, and adding the data clipping statement to the initial structural statement.
S1012, determining a conditional filtering statement and a data source connection statement in the initial structured statement through the computer equipment, and moving the conditional filtering statement into the data source connection statement when the conditional filtering statement is located after the data source connection statement.
S1014, when there are a plurality of target data sources and the initial structured statement comprises a data source connection statement, determining the data volume corresponding to each target data source through the computer equipment.
S1016, adjusting the positions of the target data sources in the data source connection statement according to the data volume by the computer equipment to obtain the target structured statement.
S1018, determining, by the computer device, a number of target data sources in the target task information, determining a target compute engine that matches the data processing task based on the number of target data sources, and routing the target structured statement to the target compute engine.
S1020, determining, by the target computing engine, at least one of a data clipping statement and a conditional filtering statement included in the target structured statement.
S1022, filtering the data in the target data source by the target calculation engine based on at least one of the data clipping statement and the conditional filtering statement to obtain filtered data.
S1024, performing data processing on the filtered data according to the target index and the target dimension by the target calculation engine to obtain a task processing result of the data processing task.
In the data processing method, the sentence construction template matched with the data processing task can be determined by acquiring the data processing task. By determining target task information for the data processing task, the sentence construction template may be automatically populated according to the target task information to obtain a target structured sentence, and a target computing engine adapted to process the data processing task may be automatically determined based on the target task information. The target structured statement and the target computing engine are automatically obtained, and the target structured statement can be routed to the target computing engine, so that the target computing engine processes data through the target structured statement, and a task processing result of a data processing task is obtained. Because the target structured sentence and the target calculation engine are automatically obtained, compared with the traditional method of manually compiling the target structured sentence and importing the compiled target structured sentence into the manually determined target calculation engine, the method can save time consumed by manually compiling the sentence and manually determining the calculation engine, thereby improving the efficiency of data processing.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the steps are not strictly limited to that order of execution and may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The application also provides an application scene, which applies the data processing method. Specifically, the application of the data processing method in the application scene is as follows:
Referring to FIG. 11, when a user desires to compile statistics on the refund rate of product A, the user may select product A as the target product type and select refund as the target service type, so that the terminal may display data sources matching the refunds of product A, for example, a detail table recording refund details. Further, the user may select a target data source from the displayed data sources and trigger the terminal to display a task information set associated with the target data source. The user can drag task information in the task information set to the index display area and the dimension display area as required to obtain a target index and a target dimension; for example, the user can take the time in the task information set as the target dimension and the number of refunds in the task information set as the target index. Further, the user may drag task information in the task information set to the filter condition display area as required and edit the task information displayed there to obtain the target filter condition, for example, editing the target filter condition to "date (X year X month-Y year Y month)". When the terminal displays a plurality of task types, the user may also select multidimensional analysis, so that the terminal takes the multidimensional analysis selected by the user as the target task type. When the target index, the target dimension, the target filtering condition, the target data source and the target task type are determined, the terminal can generate a data processing task based on them and send the generated data processing task to the server.
When the server receives the data processing task, the server may determine the target task information in the data processing task, determine the target data source in the target task information, and perform authority verification and data format verification on the target data source, for example, verify whether the user has authority to process the data in the target data source and whether the format of the data in the target data source meets a preset format requirement. Further, the server may perform parameter verification on the target task information to determine whether the target task information includes illegal information, and may determine a metadata set of the target data source when it determines that the target task information does not include illegal information, that the authority verification passes, and that the data format verification passes. The server determines a target computing engine matched with the metadata set of the target data source and sends the target structured statement generated based on the target task information to the target computing engine, so that the target computing engine can compile statistics on the refund rate of product A to obtain a statistical result. FIG. 11 illustrates a structural framework diagram of data processing in one embodiment.
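The server-side checks in this scenario can be sketched roughly as follows. The sketch covers only authority verification and a simple form of parameter verification; the dictionary layout, the function name `validate_task`, and the error messages are hypothetical.

```python
def validate_task(task, user_permissions, allowed_fields):
    """Return a list of validation errors; an empty list means the task passed."""
    errors = []
    source = task["data_source"]
    # Authority verification: may this user process the target data source?
    if source not in user_permissions:
        errors.append("permission denied for " + source)
    # Parameter verification: every index and dimension must be a known field.
    known = allowed_fields.get(source, set())
    for name in task["metrics"] + task["dimensions"]:
        if name not in known:
            errors.append("unknown field " + name)
    return errors


task = {
    "data_source": "refund_detail",
    "metrics": ["refund_count"],
    "dimensions": ["refund_date"],
}
errors = validate_task(
    task,
    user_permissions={"refund_detail"},
    allowed_fields={"refund_detail": {"refund_count", "refund_date"}},
)
```

Only when such a check returns no errors would the server go on to look up the metadata set and select a computing engine.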
The application further provides an application scene, and the application scene applies the data processing method.
Specifically, the application of the data processing method in the application scene is as follows:
When the user desires to perform data analysis, the user can start the instant messaging application and start the target application through the instant messaging application, wherein the target application is a sub-application in the instant messaging application. When the target application is started, the user can configure target task information through the target application, trigger the target application to generate a data processing task matched with the target task information, and send the data processing task to a background server of the target application. When the background server of the target application receives the data processing task, the background server can generate a target structured statement and send it to a target computing engine, and the target computing engine executes the target structured statement to obtain a task processing result of the data processing task.
The above application scenario is only illustrative, and it is to be understood that the application of the data processing method provided by the embodiments of the present application is not limited to the above scenario.
Based on the same inventive concept, the embodiment of the application also provides a data processing device for realizing the above related data processing method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation of one or more embodiments of the data processing device provided below may refer to the limitation of the data processing method hereinabove, and will not be repeated herein.
In one embodiment, as shown in FIG. 12, there is provided a data processing apparatus 1200 comprising: a task information acquisition module 1202, a statement construction module 1204, and an engine determination module 1206, wherein:
The task information acquisition module 1202 is configured to determine target task information of a data processing task when acquiring the data processing task.
The sentence construction module 1204 is configured to determine a sentence construction template that matches the data processing task, and fill the sentence construction template according to the target task information, so as to obtain a target structured sentence.
An engine determination module 1206 for determining a target computing engine that matches the data processing task based on the target task information; routing the target structured statement to a target compute engine; the target structured statement of the route is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
In one embodiment, the statement construction module 1204 is further configured to determine a target data source and a target processing condition in the target processing task information; filling the target data source and the target processing condition to corresponding positions in the sentence structure template respectively to obtain an initial structured sentence; and carrying out optimization treatment on the initial structured statement to obtain the target structured statement.
In one embodiment, the sentence construction module 1204 is further configured to determine first structural information corresponding to the target data source; determining second structure information related to the initial structured statement; constructing a data clipping statement according to the difference between the first structure information and the second structure information; adding the data clipping statement to the initial structuring statement to obtain a target structuring statement; the data clipping statement is used for clipping the data to be processed in the target data source.
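The data clipping step can be read as a form of column pruning. The sketch below assumes that the first structure information is the list of columns in the data source and that the second structure information is the set of columns the initial structured statement actually references; both readings, and the function name, are assumptions made for illustration.

```python
def build_clipping_statement(source, source_columns, referenced_columns):
    """Return a sub-query that keeps only the columns the statement needs.

    If every source column is referenced there is nothing to clip, so the
    bare source name is returned unchanged.
    """
    unused = [c for c in source_columns if c not in referenced_columns]
    if not unused:
        return source
    kept = [c for c in source_columns if c in referenced_columns]
    return "(SELECT {} FROM {}) AS {}".format(", ".join(kept), source, source)


clipped = build_clipping_statement(
    "refund_detail",
    source_columns=["id", "refund_date", "refund_count", "note"],
    referenced_columns={"refund_date", "refund_count"},
)
print(clipped)
```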
In one embodiment, the statement construction module 1204 is further configured to determine a conditional filtering statement and a data source connection statement in the initial structured statement; when the conditional filtering statement is located after the data source connection statement, the conditional filtering statement is moved into the data source connection statement to obtain the target structured statement.
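This optimization resembles what query optimizers call predicate pushdown, on the assumption that the filter is relocated inside the join so that each data source is filtered before it is connected. The sketch below works on a simplified dictionary representation of a query rather than on real SQL text; that representation is hypothetical.

```python
def push_down_filters(query):
    """Move single-table WHERE predicates into the inputs of the join.

    Each WHERE entry is a (table, predicate) pair; a table of None marks a
    predicate that spans several tables and therefore stays after the join.
    """
    tables = query["join"]["tables"]
    pushed = {name: [] for name in tables}
    remaining = []
    for table, predicate in query["where"]:
        if table in pushed:
            pushed[table].append(predicate)
        else:
            remaining.append((table, predicate))
    # Build a new join description instead of mutating the input query.
    join = dict(query["join"], filters=pushed)
    return {"join": join, "where": remaining}


query = {
    "join": {"tables": ["orders", "refunds"], "on": "orders.id = refunds.order_id"},
    "where": [
        ("refunds", "refund_date >= '2022-01-01'"),
        (None, "orders.amount > refunds.amount"),
    ],
}
optimized = push_down_filters(query)
```

Filtering before the join reduces the number of rows that take part in the connection, which is the usual motivation for this rewrite.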
In one embodiment, the statement construction module 1204 is further configured to, when there are a plurality of target data sources and the initial structured statement includes a data source connection statement, determine a data size corresponding to each of the target data sources; and adjusting the position of each target data source in the data source connection statement according to the data volume to obtain a target structured statement.
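A minimal sketch of reordering the data sources by data volume follows. The paragraph above does not state the direction of the ordering; placing the smallest source first is an assumption, and the preferred order in practice depends on the computing engine. Source names and row counts are made up.

```python
def order_sources_by_volume(volumes):
    """Return data source names sorted by ascending data volume.

    `volumes` maps a data source name to its estimated row count. Ties are
    broken by name so that the result is deterministic.
    """
    return sorted(volumes, key=lambda name: (volumes[name], name))


join_order = order_sources_by_volume(
    {"orders": 5_000_000, "users": 20_000, "refunds": 300_000}
)
print(join_order)
```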
In one embodiment, the engine determination module 1206 is further configured to determine a number of target data sources in the target task information; a target compute engine is determined that matches the data processing task based on the number of target data sources.
In one embodiment, the engine determination module 1206 is further configured to determine whether the target data source has been converted to a data source identifiable by the first computing engine when the number of target data sources is less than or equal to the preset number threshold; when the target data source is converted to the data source which can be identified by the first computing engine, the first computing engine is used as the target computing engine; the second compute engine is configured to act as a target compute engine when the target data source is not converted to a data source identifiable by the first compute engine.
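The rule-based branch for a small number of data sources might look like the sketch below. The engine labels `first_engine` and `second_engine` are placeholders, and treating the check as "all target data sources have been converted" is an assumption about how several sources are handled.

```python
def choose_engine_for_few_sources(sources, converted_sources, threshold):
    """Pick an engine when the number of sources is within the threshold."""
    if len(sources) > threshold:
        raise ValueError("too many data sources for the rule-based branch")
    # Use the first engine only if every source is already in a form it can read.
    if all(source in converted_sources for source in sources):
        return "first_engine"
    return "second_engine"


engine = choose_engine_for_few_sources(
    sources=["refund_detail"],
    converted_sources={"refund_detail", "orders"},
    threshold=2,
)
```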
In one embodiment, the engine determination module 1206 is further configured to obtain a pre-trained engine determination model when the number of target data sources is greater than a preset number threshold; the engine determination model comprises a plurality of determination sub-models; determining metadata information corresponding to each target data source to obtain a metadata information set; respectively carrying out information processing on the metadata information set through each determination sub-model to obtain a result output by each determination sub-model; and synthesizing the results output by each determination sub-model to obtain a target calculation engine matched with the data processing task.
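Combining the outputs of several determination sub-models can be sketched as a majority vote. The voting rule, the three toy sub-models, and the metadata fields `rows` and `format` are all assumptions; the paragraph above says only that the results are synthesized and that the sub-models are pre-trained.

```python
from collections import Counter


def choose_engine_by_ensemble(sub_models, metadata_set):
    """Ask every sub-model for an engine and return the most common answer."""
    votes = [model(metadata_set) for model in sub_models]
    return Counter(votes).most_common(1)[0][0]


# Toy stand-ins for trained sub-models; each maps the metadata set to an engine.
def by_total_rows(meta):
    return "engine_b" if sum(m["rows"] for m in meta) > 1_000_000 else "engine_a"


def by_source_count(meta):
    return "engine_b" if len(meta) > 3 else "engine_a"


def by_format(meta):
    return "engine_b" if any(m["format"] == "text" for m in meta) else "engine_a"


metadata_set = [{"rows": r, "format": "columnar"} for r in (10, 20, 30, 40)]
engine = choose_engine_by_ensemble(
    [by_total_rows, by_source_count, by_format], metadata_set
)
```

Here two of the three sub-models vote for `engine_a`, so that engine is selected even though the source-count rule disagrees.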
In one embodiment, the data processing apparatus 1200 is further configured to determine, by the target computing engine, at least one of a data clipping statement and a conditional filtering statement included in the target structured statement; filtering the data in the target data source by the target calculation engine based on at least one of the data clipping statement and the condition filtering statement to obtain filtered data; and performing data processing on the filtered data according to the target index and the target dimension by using the target calculation engine to obtain a task processing result of the data processing task.
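What the target computing engine does with the statement, filtering first and then aggregating the target index by the target dimension, can be imitated in plain Python. The row layout, the use of a sum as the aggregation, and the function name are illustrative assumptions; a real engine would execute the structured statement itself.

```python
from collections import defaultdict


def run_task(rows, keep_columns, predicate, dimension, metric):
    """Filter rows, clip unused columns, then sum the metric per dimension value."""
    filtered = [
        {column: row[column] for column in keep_columns}
        for row in rows
        if predicate(row)
    ]
    totals = defaultdict(int)
    for row in filtered:
        totals[row[dimension]] += row[metric]
    return dict(totals)


rows = [
    {"month": "2022-01", "refund_count": 3, "note": "a"},
    {"month": "2022-01", "refund_count": 2, "note": "b"},
    {"month": "2022-02", "refund_count": 0, "note": "c"},
    {"month": "2022-02", "refund_count": 4, "note": "d"},
]
result = run_task(
    rows,
    keep_columns=["month", "refund_count"],
    predicate=lambda row: row["refund_count"] > 0,
    dimension="month",
    metric="refund_count",
)
```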
In one embodiment, as shown in fig. 13, there is provided a generating apparatus 1300 of a data processing task, including: an index determination module 1302, a dimension determination module 1304, and a task creation module 1306, wherein:
An index determination module 1302 for displaying a set of task information; the task information set comprises a plurality of task information; in response to an index adding operation for the task information set, the task information is moved from the task information set to the index display area.
The dimension determination module 1304 is configured to move the task information from the task information set to the dimension presentation area in response to a dimension addition operation for the task information set.
A task creation module 1306, configured to respond to a trigger operation for a task creation control, and display a creation result of a data processing task; the data processing task is created according to the task information moving to the index display area and the task information moving to the dimension display area.
In one embodiment, the generating device 1300 of the data processing task is further configured to display at least one product type, and in response to a selection operation for the at least one product type, display the selected target product type and at least one service type included in the target product type; in response to a selection operation for the at least one service type, display at least one data source that matches the selected target service type; in response to a selection operation for the at least one data source, display the selected target data source; and display a task information set corresponding to the target data source.
In one embodiment, the generating device 1300 of the data processing task is further configured to, when there are a plurality of target data sources, display an association editing list and an association field selection list in response to an association editing operation for the target data sources; responding to the selection operation aiming at the association relation editing list, and displaying the selected target association relation; responsive to a selection operation for the association field selection list, presenting the selected target association field; the target association and the target association field are used to create a data processing task.
In one embodiment, the generating device 1300 of the data processing task is further configured to display at least one task type, and respond to a selection operation for the at least one task type to display a selected target task type; the target task type is used to create a data processing task and to determine a statement construction template that matches the data processing task.
In one embodiment, the generating device 1300 of the data processing task is further configured to, in response to a triggering operation for the statement preview control, display a target structured statement generated according to the task information in the index display area and the task information in the dimension display area; display a detection result of grammar detection on the target structured statement; and, when the grammar detection result indicates that the target structured statement does not conform to the grammar rules, display the adjusted target structured statement in response to a statement adjustment operation for the target structured statement.
In one embodiment, the generating device 1300 of the data processing task is further configured to determine target task information of the data processing task; the target task information comprises task information located in an index display area and task information located in a dimension display area; determining a sentence construction template matched with the data processing task, and filling the sentence construction template according to target task information to obtain a target structured sentence; determining a target computing engine matched with the data processing task according to the target task information; routing the target structured statement to a target compute engine; the target structured statement of the route is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
The above-described data processing apparatus, and the respective modules in the generation apparatus of the data processing task may be realized in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 14. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing data processing data. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to carry out a method of generating data processing tasks.
In one embodiment, a computer device is provided, which may be a terminal, and an internal structure diagram thereof may be as shown in fig. 15. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input means. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface, the display unit and the input device are connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by a processor to carry out a method of generating data processing tasks. The display unit of the computer equipment is used for forming a visual picture, and can be a display screen, a projection device or a virtual reality imaging device, wherein the display screen can be a liquid crystal display screen or an electronic ink display screen, the input device of the computer equipment can be a touch layer covered on the display screen, can also be a key, a track ball or a touch pad arranged on a shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by persons skilled in the art that the structures shown in FIGS. 14 and 15 are block diagrams of only the portions of the structures relevant to the present solution and do not limit the computer device to which the present solution is applied; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In an embodiment, there is also provided a computer device including a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of any of the data processing method embodiments described above when the computer program is executed.
In one embodiment, a computer readable storage medium is provided, storing a computer program which, when executed by a processor, performs the steps of any of the data processing method embodiments described above.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the computer device to perform the steps of any of the data processing method embodiments described above.
In an embodiment, there is also provided a computer device including a memory and a processor, the memory storing a computer program, the processor implementing the steps of any of the data processing task generating method embodiments described above when executing the computer program.
In one embodiment, a computer readable storage medium is provided, storing a computer program which, when executed by a processor, performs the steps of any of the above embodiments of the method for generating a data processing task.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the computer device to perform the steps of any of the above embodiments of the method for generating a data processing task.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to fall within the scope of this description.
The foregoing examples illustrate only a few embodiments of the application and are described in specific detail, but they are not thereby to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of protection of the application. Accordingly, the scope of protection of the application shall be determined by the appended claims.

Claims (20)

1. A data processing method, the method comprising:
when a data processing task is obtained, determining target task information of the data processing task;
determining a statement construction template matched with the data processing task, and filling the statement construction template according to the target task information to obtain a target structured statement;
determining a target computing engine matched with the data processing task according to the target task information; and
routing the target structured statement to the target computing engine;
wherein the target structured statement is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
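For illustration only, the flow of claim 1 can be sketched as follows. The template text, the `task_info` keys, the engine names, and the selection rule in `choose_engine` are all hypothetical; the claim does not specify any of them.

```python
from string import Template

# Hypothetical template registry keyed by task type (an assumption, not from the claims).
TEMPLATES = {
    "aggregate": Template(
        "SELECT $dimensions, $metrics FROM $source "
        "WHERE $condition GROUP BY $dimensions"
    ),
}


def build_statement(task_info):
    """Fill the template matched to the task with the target task information."""
    template = TEMPLATES[task_info["task_type"]]
    return template.substitute(
        dimensions=", ".join(task_info["dimensions"]),
        metrics=", ".join(task_info["metrics"]),
        source=task_info["source"],
        condition=task_info["condition"],
    )


def choose_engine(task_info):
    """Placeholder selection rule: a flag in the task information picks the engine."""
    return "engine_b" if task_info.get("large", False) else "engine_a"


def route(task_info, engines):
    """Build the structured statement and hand it to the selected engine."""
    statement = build_statement(task_info)
    name = choose_engine(task_info)
    # Each engine is modelled as a callable that accepts the statement.
    return name, engines[name](statement)
```

In this sketch the engines are plain callables, so `route` can be exercised without a real query engine. Plain string substitution is used only to keep the example short; a real system would need to validate identifiers and conditions before building a statement.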
2. The method according to claim 1, wherein the filling the statement construction template according to the target task information to obtain a target structured statement comprises:
determining a target data source and a target processing condition in the target task information;
filling the target data source and the target processing condition into corresponding positions in the statement construction template, respectively, to obtain an initial structured statement; and
optimizing the initial structured statement to obtain the target structured statement.
3. The method according to claim 2, wherein the optimizing the initial structured statement to obtain the target structured statement comprises:
determining first structure information corresponding to the target data source;
determining second structure information involved in the initial structured statement;
constructing a data clipping statement according to the difference between the first structure information and the second structure information; and
adding the data clipping statement to the initial structured statement to obtain the target structured statement;
wherein the data clipping statement is used for clipping the data to be processed in the target data source.
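One possible reading of the data clipping in claim 3 is column pruning: the first structure information is the full column list of the data source, the second is the set of columns the statement actually uses, and the difference determines which columns to drop. The sketch below assumes that reading; the function names and the subquery form of the clipping statement are hypothetical.

```python
def build_clipping_statement(source, source_columns, used_columns):
    """Return a subquery that keeps only the columns the statement needs.

    Returns None when nothing can be clipped (no unused columns, or no
    overlap at all between the two column sets).
    """
    kept = [c for c in source_columns if c in used_columns]
    unused = [c for c in source_columns if c not in used_columns]
    if not unused or not kept:
        return None
    return f"(SELECT {', '.join(kept)} FROM {source}) AS {source}"


def add_clipping(initial_statement, source, source_columns, used_columns):
    """Splice the clipping subquery into the initial statement's FROM clause."""
    clip = build_clipping_statement(source, source_columns, used_columns)
    if clip is None:
        return initial_statement
    # Naive textual splice; assumes the source appears once as "FROM <source>".
    return initial_statement.replace(f"FROM {source}", f"FROM {clip}", 1)
```

The textual splice is a simplification chosen for brevity. An implementation working on a parsed statement would avoid matching on raw text.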
4. The method according to claim 2, wherein the optimizing the initial structured statement to obtain the target structured statement comprises:
determining a conditional filtering statement and a data source connection statement in the initial structured statement; and
when the conditional filtering statement is located after the data source connection statement, moving the conditional filtering statement into the data source connection statement to obtain the target structured statement.
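The optimization in claim 4 resembles what query planners commonly call predicate pushdown. As a minimal sketch, assuming an inner join and a query held in a small structured form rather than as raw text, each table's filter can be moved into a subquery on that side of the join. The `JoinQuery` class and its fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class JoinQuery:
    """A two-table inner join with per-table filters applied after the join."""
    left: str
    right: str
    on: str
    where: dict = field(default_factory=dict)  # table name -> predicate text


def push_filters_into_join(query):
    """Render the query with each filter moved inside its side of the join."""

    def side(table):
        predicate = query.where.get(table)
        if predicate is None:
            return table
        # Filter the table before joining so fewer rows reach the join.
        return f"(SELECT * FROM {table} WHERE {predicate}) AS {table}"

    return f"SELECT * FROM {side(query.left)} JOIN {side(query.right)} ON {query.on}"
```

The sketch is limited to inner joins on purpose: moving a filter below an outer join can change the result, so a general rewrite would need to check the join type first.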
5. The method according to claim 2, wherein the optimizing the initial structured statement to obtain the target structured statement comprises:
when there are a plurality of target data sources and the initial structured statement comprises a data source connection statement, determining the data volume corresponding to each target data source; and
adjusting the position of each target data source in the data source connection statement according to the data volume to obtain the target structured statement.
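The reordering in claim 5 can be sketched as sorting the data sources by volume before the join clause is rendered. The claim does not state the direction of the ordering; placing the smallest source first is an assumption made here, as are the function names and the `USING` join form.

```python
def order_sources_for_join(volumes):
    """Return source names sorted by data volume, smallest first (assumed order)."""
    return sorted(volumes, key=volumes.get)


def build_join_clause(volumes, key):
    """Render a chain of joins over the sources in volume order."""
    ordered = order_sources_for_join(volumes)
    clause = ordered[0]
    for name in ordered[1:]:
        clause += f" JOIN {name} USING ({key})"
    return clause
```

Whether small-first or large-first is preferable depends on the engine's join strategy, so the ordering key would in practice be chosen per engine.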
6. The method of claim 1, wherein determining a target computing engine that matches the data processing task based on the target task information comprises:
determining the number of target data sources in the target task information;
and determining a target computing engine matched with the data processing task according to the number of the target data sources.
7. The method of claim 6, wherein determining a target computing engine that matches the data processing task based on the number of target data sources comprises:
when the number of the target data sources is less than or equal to a preset number threshold, determining whether the target data source has been converted to a data source that can be identified by a first computing engine;
when the target data source has been converted to a data source that can be identified by the first computing engine, using the first computing engine as the target computing engine; and
when the target data source has not been converted to a data source that can be identified by the first computing engine, using a second computing engine as the target computing engine.
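The branch in claim 7 can be sketched as a small decision function. The `DataSource` class, the engine labels, the default threshold, and the rule that every source must be converted before the first engine is chosen are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    converted_for_first_engine: bool  # True once the first engine can read it


def select_engine(sources, threshold=2):
    """Pick an engine for tasks whose source count is within the threshold.

    Returns None when the count exceeds the threshold, since that case is
    handled by a separate branch (the model-based selection of claim 8).
    """
    if len(sources) > threshold:
        return None
    if all(s.converted_for_first_engine for s in sources):
        return "first_engine"
    return "second_engine"
```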
8. The method of claim 6, wherein determining a target computing engine that matches the data processing task based on the number of target data sources comprises:
when the number of the target data sources is greater than a preset number threshold, acquiring a pre-trained engine determination model, the engine determination model comprising a plurality of determination sub-models;
determining metadata information corresponding to each target data source to obtain a metadata information set;
processing the metadata information set through each determination sub-model, respectively, to obtain a result output by each determination sub-model; and
combining the results output by the determination sub-models to obtain the target computing engine matched with the data processing task.
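Claim 8 does not state how the sub-model outputs are combined. A minimal sketch, assuming a simple majority vote and using hand-written rules as stand-ins for the pre-trained sub-models, might look like this. All function names, metadata keys, and engine labels are hypothetical.

```python
from collections import Counter


def model_by_rows(metadata_set):
    """Stand-in sub-model: large total row counts favour engine_b."""
    total = sum(m["rows"] for m in metadata_set)
    return "engine_b" if total > 1_000_000 else "engine_a"


def model_by_count(metadata_set):
    """Stand-in sub-model: many sources favour engine_b."""
    return "engine_b" if len(metadata_set) > 2 else "engine_a"


def model_by_format(metadata_set):
    """Stand-in sub-model: any CSV source favours engine_a."""
    has_csv = any(m["format"] == "csv" for m in metadata_set)
    return "engine_a" if has_csv else "engine_b"


def combine_votes(metadata_set, sub_models):
    """Run every sub-model on the metadata set and return the majority answer."""
    votes = [model(metadata_set) for model in sub_models]
    return Counter(votes).most_common(1)[0][0]
```

With an odd number of sub-models and two candidate engines a strict majority always exists; other combination schemes, such as weighted voting, would fit the claim language equally well.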
9. The method according to any one of claims 1 to 8, further comprising:
determining, by the target computing engine, at least one of a data clipping statement and a conditional filtering statement included in the target structured statement;
filtering, by the target computing engine, the data in the target data source based on at least one of the data clipping statement and the conditional filtering statement to obtain filtered data; and
performing, by the target computing engine, data processing on the filtered data according to the target index and the target dimension to obtain a task processing result of the data processing task.
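The engine-side steps of claim 9 can be illustrated with in-memory rows: clip the columns, apply the filter condition, then aggregate the index per dimension value. Summation as the aggregation, and the parameter names below, are assumptions; the claim leaves the kind of processing open.

```python
from collections import defaultdict


def process(rows, keep_columns, predicate, dimension, metric):
    """Filter and clip the rows, then sum the metric for each dimension value."""
    # The predicate sees the full row; clipping then drops unneeded columns.
    filtered = [
        {column: row[column] for column in keep_columns}
        for row in rows
        if predicate(row)
    ]
    totals = defaultdict(float)
    for row in filtered:
        totals[row[dimension]] += row[metric]
    return dict(totals)
```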
10. A method for generating a data processing task, the method comprising:
displaying a task information set, the task information set comprising a plurality of pieces of task information;
in response to an index adding operation for the task information set, moving task information from the task information set to an index display area;
in response to a dimension adding operation for the task information set, moving task information from the task information set to a dimension display area; and
in response to a triggering operation on a task creation control, displaying a creation result of a data processing task; wherein the data processing task is created according to the task information moved to the index display area and the task information moved to the dimension display area.
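The interaction in claim 10 can be modelled, without any user interface, as state that moves items between three lists. The `TaskBuilder` class and its method names are hypothetical and stand in for whatever front-end state the display areas are backed by.

```python
from dataclasses import dataclass, field


@dataclass
class TaskBuilder:
    """State behind the task information set and the two display areas."""
    available: list
    indexes: list = field(default_factory=list)
    dimensions: list = field(default_factory=list)

    def _move(self, item, target):
        # list.remove raises ValueError if the item is not in the set.
        self.available.remove(item)
        target.append(item)

    def add_index(self, item):
        """Index adding operation: move an item to the index display area."""
        self._move(item, self.indexes)

    def add_dimension(self, item):
        """Dimension adding operation: move an item to the dimension display area."""
        self._move(item, self.dimensions)

    def create_task(self):
        """Task creation control: build the task from the two display areas."""
        return {"indexes": list(self.indexes), "dimensions": list(self.dimensions)}
```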
11. The method according to claim 10, wherein the method further comprises:
displaying at least one product type, and, in response to a selection operation for the at least one product type, displaying a selected target product type and at least one service type included in the target product type;
in response to a selection operation for the at least one service type, displaying at least one data source that matches the selected target service type; and
in response to a selection operation for the at least one data source, displaying a selected target data source;
wherein the displaying a task information set comprises:
displaying a task information set corresponding to the target data source.
12. The method according to claim 10, wherein the method further comprises:
when there are a plurality of target data sources, in response to an association relationship editing operation for the target data sources, displaying an association relationship editing list and an association field selection list;
in response to a selection operation for the association relationship editing list, displaying a selected target association relationship; and
in response to a selection operation for the association field selection list, displaying a selected target association field; wherein the target association relationship and the target association field are used for creating the data processing task.
13. The method according to claim 10, wherein the method further comprises:
in response to a filtering condition adding operation for the task information set, moving task information from the task information set to a filtering condition display area; and
in response to an editing operation for the task information in the filtering condition display area, displaying task information that includes the data filtering range obtained by the editing; wherein the task information including the edited data filtering range is used for creating the data processing task.
14. The method according to claim 10, wherein the method further comprises:
displaying at least one task type, and displaying a selected target task type in response to a selection operation for the at least one task type; the target task type is used for creating a data processing task and determining a statement construction template matched with the data processing task.
15. The method according to any one of claims 10 to 14, further comprising:
determining target task information of the data processing task, the target task information comprising the task information in the index display area and the task information in the dimension display area;
determining a statement construction template matched with the data processing task, and filling the statement construction template according to the target task information to obtain a target structured statement;
determining a target computing engine matched with the data processing task according to the target task information; and
routing the target structured statement to the target computing engine; wherein the target structured statement is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
16. A data processing apparatus, the apparatus comprising:
a task information acquisition module, configured to determine target task information of a data processing task when the data processing task is obtained;
a statement construction module, configured to determine a statement construction template matched with the data processing task, and to fill the statement construction template according to the target task information to obtain a target structured statement; and
an engine determining module, configured to determine a target computing engine matched with the data processing task according to the target task information, and to route the target structured statement to the target computing engine; wherein the target structured statement is used for triggering the target computing engine to execute the data processing task to obtain a task processing result.
17. An apparatus for generating a data processing task, the apparatus comprising:
an index determining module, configured to display a task information set, the task information set comprising a plurality of pieces of task information, and, in response to an index adding operation for the task information set, to move task information from the task information set to an index display area;
a dimension determining module, configured to move task information from the task information set to a dimension display area in response to a dimension adding operation for the task information set; and
a task creation module, configured to display a creation result of a data processing task in response to a triggering operation on a task creation control; wherein the data processing task is created according to the task information moved to the index display area and the task information moved to the dimension display area.
18. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 9 or the steps of the method of any one of claims 10 to 15.
19. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any one of claims 1 to 9, or the steps of the method of any one of claims 10 to 15.
20. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, realizes the steps of the method of any one of claims 1 to 9 or the steps of the method of any one of claims 10 to 15.
CN202210380778.4A 2022-04-12 2022-04-12 Data processing method and device and data processing task generation method and device Pending CN116955386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210380778.4A CN116955386A (en) 2022-04-12 2022-04-12 Data processing method and device and data processing task generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210380778.4A CN116955386A (en) 2022-04-12 2022-04-12 Data processing method and device and data processing task generation method and device

Publications (1)

Publication Number Publication Date
CN116955386A true CN116955386A (en) 2023-10-27

Family

ID=88455151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210380778.4A Pending CN116955386A (en) 2022-04-12 2022-04-12 Data processing method and device and data processing task generation method and device

Country Status (1)

Country Link
CN (1) CN116955386A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination