CN113688602A - Task processing method and device - Google Patents


Info

Publication number
CN113688602A
Authority
CN
China
Prior art keywords
task
processed
fingerprint
configuration parameters
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111244929.5A
Other languages
Chinese (zh)
Inventor
许时昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CLP Cloud Digital Intelligence Technology Co Ltd
Original Assignee
CLP Cloud Digital Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CLP Cloud Digital Intelligence Technology Co Ltd filed Critical CLP Cloud Digital Intelligence Technology Co Ltd
Priority to CN202111244929.5A priority Critical patent/CN113688602A/en
Publication of CN113688602A publication Critical patent/CN113688602A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application discloses a task processing method, which comprises the following steps: acquiring a task to be processed; generating a task fingerprint according to the task parameter of the task to be processed; according to the task fingerprint, determining a target historical task of which the similarity with the task fingerprint meets a first preset condition; and executing the task to be processed according to the configuration parameters of the target historical task. Therefore, the configuration parameters of the target historical task can be understood as the optimized configuration parameters of the target historical task, and the similarity between the target historical task and the task fingerprint of the task to be processed meets the first preset condition (namely the similarity is higher), so that the task to be processed can be executed according to the configuration parameters of the target historical task, the task processing efficiency and precision are improved, and the user experience is further improved.

Description

Task processing method and device
Technical Field
The present application relates to the field of data processing, and in particular, to a method and an apparatus for processing a task.
Background
In the prior art, Hive runtime parameters are for the most part set uniformly through configuration files, such as Hive-site. This approach has several drawbacks: it offers no automatic task-level optimization; manual tuning is costly and hard to maintain; parameters cannot be tuned intelligently for different tasks; and session-level parameters are adjusted by hand according to subjective experience, which makes the result subjective, one-sided, and limited. Manual adjustment can also introduce errors, so parameters are prone to being set inaccurately or sub-optimally through oversight or other human factors.
Disclosure of Invention
The application provides a task processing method, so that the efficiency and the accuracy of task processing can be improved, and further the user experience is improved.
In a first aspect, the present application provides a task processing method, including:
acquiring a task to be processed;
generating a task fingerprint according to the task parameter of the task to be processed;
according to the task fingerprint, determining a target historical task of which the similarity with the task fingerprint meets a first preset condition;
and executing the task to be processed according to the configuration parameters of the target historical task.
Optionally, the acquiring the task to be processed includes:
acquiring the task to be processed through SQL service; the task to be processed is a Hive task.
Optionally, the generating a task fingerprint according to the task parameter of the task to be processed includes:
constructing a task vector according to the task parameters of the task to be processed;
generating a task fingerprint according to the task vector;
the task parameters comprise an input file format, a file size, a shuffle/join number and a table schema field number of the task to be processed.
Optionally, the first preset condition is that the historical task has the highest similarity to the task fingerprint among all historical tasks.
Optionally, the executing the task to be processed according to the configuration parameters of the target historical task includes:
parsing the syntax tree of the task to be processed according to the configuration parameters to obtain task execution plan information;
and controlling each task node to execute the task to be processed according to the task execution plan information.
Optionally, the method further includes:
and if the task to be processed fails to be executed, continuing to execute the step of determining a target historical task with the similarity meeting preset conditions with the task fingerprint according to the task fingerprint until the task to be processed is executed completely.
Optionally, the method further includes:
if the task to be processed is executed, adjusting the configuration parameters according to the operation parameters of the task to be processed and the operation parameters of the historical task with the similarity degree with the task fingerprint meeting a second preset condition to obtain adjusted configuration parameters;
and taking the adjusted configuration parameters as the configuration parameters corresponding to the tasks to be processed.
In a second aspect, the present application provides a task processing device, the device comprising:
the acquisition unit is used for acquiring the tasks to be processed;
the generating unit is used for generating a task fingerprint according to the task parameter of the task to be processed;
the determining unit is used for determining a target historical task of which the similarity with the task fingerprint meets a first preset condition according to the task fingerprint;
and the execution unit is used for executing the acquired task to be processed according to the configuration parameters of the target historical task.
Optionally, the obtaining unit is configured to:
acquiring the task to be processed through SQL service; the task to be processed is a Hive task.
Optionally, the generating unit is configured to:
constructing a task vector according to the task parameters of the task to be processed;
generating a task fingerprint according to the task vector;
the task parameters comprise an input file format, a file size, a shuffle/join number and a table schema field number of the task to be processed.
Optionally, the first preset condition is that the historical task has the highest similarity to the task fingerprint among all historical tasks.
Optionally, the execution unit is configured to:
parsing the syntax tree of the task to be processed according to the configuration parameters to obtain task execution plan information;
and controlling each task node to execute the task to be processed according to the task execution plan information.
Optionally, the execution unit is further configured to:
and if the task to be processed fails to be executed, continuing to execute the step of determining a target historical task with the similarity meeting preset conditions with the task fingerprint according to the task fingerprint until the task to be processed is executed completely.
Optionally, the apparatus further includes an adjusting unit, configured to:
if the task to be processed is executed, adjusting the configuration parameters according to the operation parameters of the task to be processed and the operation parameters of the historical task with the similarity degree with the task fingerprint meeting a second preset condition to obtain adjusted configuration parameters;
and taking the adjusted configuration parameters as the configuration parameters corresponding to the tasks to be processed.
In a third aspect, the present application provides a readable medium comprising executable instructions, which when executed by a processor of an electronic device, perform the method according to any of the first aspect.
In a fourth aspect, the present application provides an electronic device comprising a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor performs the method according to any one of the first aspect.
According to the above technical solution, the present application provides a task processing method. In this embodiment, the task to be processed is first acquired; a task fingerprint is then generated according to the task parameters of the task to be processed; next, a target historical task whose similarity to the task fingerprint meets a first preset condition is determined according to the task fingerprint; finally, the task to be processed is executed according to the configuration parameters of the target historical task. Because the configuration parameters of the target historical task can be understood as the optimized configuration parameters of that task, and the similarity between the target historical task and the task fingerprint of the task to be processed meets the first preset condition (that is, the similarity is high), the task to be processed can be executed according to the configuration parameters of the target historical task. The configuration parameters therefore need not be adjusted manually according to subjective experience as in the prior art. This avoids the low accuracy, low efficiency, time and labor consumption, instability, and high cost caused by operation and evaluation errors during manual adjustment, as well as the subjectivity, one-sidedness, and limitation of manually adjusted configuration parameters, thereby improving the efficiency and accuracy of task processing and, in turn, the user experience.
Further effects of the above optional implementations are described below in conjunction with specific embodiments.
Drawings
In order to more clearly illustrate the embodiments or prior art solutions of the present application, the drawings needed for describing the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and that other drawings can be obtained by those skilled in the art without inventive exercise.
Fig. 1 is a schematic flowchart of a task processing method according to an embodiment of the present application;
Fig. 2 is a schematic system architecture diagram of a task processing system according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a task processing device according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following embodiments and accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the prior art, Hive runtime parameters are for the most part set uniformly through configuration files, such as Hive-site. This approach has several drawbacks: it offers no automatic task-level optimization; manual tuning is costly and hard to maintain; parameters cannot be tuned intelligently for different tasks; and session-level parameters are adjusted by hand according to subjective experience, which makes the result subjective, one-sided, and limited. Manual adjustment can also introduce errors, so parameters are prone to being set inaccurately or sub-optimally through oversight or other human factors.
The present application provides a task processing method. In this embodiment, the task to be processed is first acquired; a task fingerprint is then generated according to the task parameters of the task to be processed; next, a target historical task whose similarity to the task fingerprint meets a first preset condition is determined according to the task fingerprint; finally, the task to be processed is executed according to the configuration parameters of the target historical task. Because the configuration parameters of the target historical task can be understood as the optimized configuration parameters of that task, and the similarity between the target historical task and the task fingerprint of the task to be processed meets the first preset condition (that is, the similarity is high), the task to be processed can be executed according to the configuration parameters of the target historical task. The configuration parameters therefore need not be adjusted manually according to subjective experience as in the prior art. This avoids the low accuracy, low efficiency, time and labor consumption, instability, and high cost caused by operation and evaluation errors during manual adjustment, as well as the subjectivity, one-sidedness, and limitation of manually adjusted configuration parameters, thereby improving the efficiency and accuracy of task processing and, in turn, the user experience. It should be noted that the embodiments of the present application may be applied to an electronic device (such as a mobile phone, a tablet, or a computer) or to a server. Other implementations beyond those mentioned above are also possible and are not limited herein.
Various non-limiting embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to fig. 1, a task processing method in an embodiment of the present application is shown, and in this embodiment, the method may include the following steps:
S101: Acquire the task to be processed.
In this embodiment, the task to be processed is a Hive task. The task to be processed may be an SQL task corresponding to Hive, and may include, for example, an SQL statement, a library table to be queried, an engine identifier to be used, and metadata information (such as the number of shuffle/join (i.e., table association) and the number of table schema fields) corresponding to the library table.
In one implementation, the task to be processed may be obtained through an SQL service. In the task processing system shown in Fig. 2, the SQL service may be the aql submit module: a user submits the task to be processed as SQL, and the aql submit module obtains it.
S102: Generate a task fingerprint according to the task parameters of the task to be processed.
In this embodiment, the task parameters of the task to be processed can be understood as information that reflects the task. In one implementation, they may include the input file format, the file size, the number of shuffles/joins (i.e., table associations), and the number of table schema fields of the task to be processed. The task fingerprint can be understood as reflecting the characteristics of the task to be processed; that is, it represents the key information of the task (for example, the content of the task, the library table to be queried, the engine identifier to be used, and task parameters such as the metadata information corresponding to the library table) and allows the degree of similarity between tasks to be measured, so that the task to be processed can be queried or identified by its task fingerprint.
In this embodiment, a task vector may be constructed according to the task parameters of the task to be processed. It should be noted that the key parameters (i.e., key information) in the parameter list (i.e., the task parameters) may be identified through experience, machine learning, and the like; the vector constructed from these key parameters is referred to as the task vector. A task fingerprint may then be generated from the task vector, for example by building a similarity index over the task vector.
As shown in fig. 2, in the Task processing system, a Task Fingerprint module in the system may obtain Task parameters (such as table schema, file format, file size, shuffle, and the like) of the to-be-processed Task from an aql submit module, generate a Task Fingerprint according to the Task parameters of the to-be-processed Task, and send the Task Fingerprint to a Driver module.
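The construction of a task vector and fingerprint described in S102 can be sketched as follows. This is a minimal illustration, not the method of the application: the file-format list, the log scaling of the file size, and the use of a unit-length vector as the fingerprint are all assumptions, and the function names `build_task_vector` and `make_fingerprint` are hypothetical.

```python
import math

# Hypothetical set of input file formats; the source does not list any.
FILE_FORMATS = ["text", "orc", "parquet", "avro"]


def build_task_vector(file_format, file_size_bytes, shuffle_join_count, schema_field_count):
    """Build a numeric task vector from the four task parameters named in the text."""
    # One-hot encode the input file format (assumed encoding).
    one_hot = [1.0 if file_format == name else 0.0 for name in FILE_FORMATS]
    # Log-scale the size so very large inputs do not dominate the vector (assumed scaling).
    size_feature = math.log10(file_size_bytes + 1)
    return one_hot + [size_feature, float(shuffle_join_count), float(schema_field_count)]


def make_fingerprint(vector):
    """Normalize the vector to unit length so fingerprints can be compared by dot product."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return tuple(vector)
    return tuple(x / norm for x in vector)


# Example: a 5 GiB ORC input with 3 shuffles/joins and 42 schema fields.
fingerprint = make_fingerprint(build_task_vector("orc", 5 * 1024**3, 3, 42))
print(len(fingerprint))
```

Normalizing to unit length is one simple choice that makes a later similarity comparison a plain dot product; a learned embedding or a hashed index would serve the same role.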
S103: Determine, according to the task fingerprint, a target historical task whose similarity to the task fingerprint meets a first preset condition.
In this embodiment, a plurality of historical tasks and the configuration parameters corresponding to each of them may be stored in advance. It should be noted that the configuration parameters corresponding to a historical task are the optimal configuration parameters of that task, that is, the parameters that give the historical task its highest execution efficiency; for example, they may give the historical task the shortest execution time and/or the lowest consumption of execution resources. As shown in Fig. 2, the Config optimization module (i.e., an optimizer) of the task processing system stores the plurality of historical tasks and their corresponding configuration parameters in advance. The configuration parameters of a task can be understood as the parameters of each task node that executes the task, such as the session-level parameters of the task node, which may include, for example, the number of processes that run the task engine.
In one implementation, the first preset condition may be that the historical task has the highest similarity to the task fingerprint among all historical tasks. That is, in this embodiment, the similarity between the task fingerprint of the task to be processed and each historical task may be determined first, and the historical task with the highest similarity value is then taken as the target historical task.
As shown in fig. 2, in the task processing system, according to the task fingerprint, the Hooks interface in the Driver module in the system may determine, from the Config optimization module, a target historical task whose similarity to the task fingerprint satisfies a first preset condition, and acquire a configuration parameter of the target historical task.
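As a rough sketch of S103, the lookup can be written as a nearest-neighbour search over stored fingerprints. The source does not name a similarity measure; cosine similarity is assumed here, and the record layout (`task_id`, `fingerprint`, `config`) and the parameter name `executor_processes` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_target_historical_task(fingerprint, history):
    """Return the stored historical task most similar to the fingerprint, or None."""
    if not history:
        return None
    return max(history, key=lambda task: cosine_similarity(fingerprint, task["fingerprint"]))


# Hypothetical stored history: each entry pairs a fingerprint with its tuned configuration.
history = [
    {"task_id": "job-a", "fingerprint": (1.0, 0.0, 0.2), "config": {"executor_processes": 4}},
    {"task_id": "job-b", "fingerprint": (0.1, 1.0, 0.9), "config": {"executor_processes": 16}},
]

target = find_target_historical_task((0.0, 0.9, 1.0), history)
print(target["task_id"], target["config"])
```

A linear scan is enough for illustration; with many historical tasks, an approximate nearest-neighbour index would be a natural substitute.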
S104: Execute the task to be processed according to the configuration parameters of the target historical task.
Because the target historical task has the highest similarity to the task fingerprint of the task to be processed, that is, the task parameters of the target historical task are the most similar to those of the task to be processed, the configuration parameters corresponding to the target historical task can be expected to give the task to be processed its highest execution efficiency as well; for example, they may give the task to be processed the shortest execution time and/or the lowest consumption or occupation of resources.
In this embodiment, syntax tree parsing may be performed on the task to be processed according to the configuration parameters to obtain task execution plan information. The task execution plan information can be understood as the execution information of each task node, such as the execution order, timing, and resource configuration parameters. For example, the Antlr4 toolkit may be invoked on the SQL statement to generate a syntax tree structure, from which the task execution plan information is obtained. Each task node can then be controlled to execute the task to be processed according to the task execution plan information.
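The source does not say how the configuration parameters reach the parser. One possibility, assuming they are session-level Hive settings, is to prepend them to the SQL text as `SET key=value;` statements before the statement is parsed. The sketch below illustrates only that assumption; `render_session_settings` and `prepare_task` are hypothetical helpers, and the two property names are used purely as examples.

```python
def render_session_settings(config):
    """Render configuration parameters as Hive session-level SET statements."""
    lines = []
    for key in sorted(config):
        value = config[key]
        # Hive expects lowercase booleans in SET statements.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"SET {key}={value};")
    return lines


def prepare_task(sql, config):
    """Prefix the SQL statement with the session settings taken from the target historical task."""
    statement = sql.strip().rstrip(";") + ";"
    return "\n".join(render_session_settings(config) + [statement])


# Example configuration borrowed from a (hypothetical) target historical task.
script = prepare_task(
    "SELECT 1",
    {"mapreduce.job.reduces": 8, "hive.exec.parallel": True},
)
print(script)
```

Sorting the keys keeps the generated script deterministic, which makes it easier to compare the settings used across runs.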
In the task processing system shown in Fig. 2, the Hooks interface in the Driver module may send the configuration parameters to the AST module, so that the AST module generates an abstract syntax tree from the SQL statements according to the configuration parameters. The PLAN module then performs relation-based optimization on the abstract syntax tree and generates a physical execution plan (i.e., the task execution plan information) based on cost optimization. During this process, the AST module and the PLAN module send the task fingerprint, the configuration parameters, and the operation parameters of the task to be processed to the Driver module for collection. The operation parameters may include, for example, the time consumed by each part and the resource usage, where the parts may include syntax parsing time, syntax tree optimization time, Spark engine time, and the time consumed by each stage, and the resource usage may cover resources such as machine memory, CPU, and network bandwidth.
The Exec module controls each task node to execute the task to be processed according to the task execution plan information. During this process, the Exec module likewise sends the task fingerprint, the configuration parameters, and the operation parameters (such as the time consumed by each part and the resource usage) of the task to be processed to the Driver module for collection.
It should be noted that the Driver module may store the collected operation parameters in the Job history module. Specifically, the Job history module may store the historical execution records of SQL tasks and may compile statistics on all historical SQL task parameters, time and resource costs, and the like. The task fingerprint generated from the SQL statement and the metadata information may be stored in the Task Fingerprint in the Job history module; the Config module may store the parameter configuration used when the SQL engine executes; and the presentation module in the Job history module may store the time cost incurred during execution of the SQL task.
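The collection of operation parameters described above can be pictured with a small in-memory sketch. It is only an illustration under stated assumptions: the `RunRecord` and `Driver` classes, their fields, and the stage names are hypothetical, and a plain list stands in for the Job history module.

```python
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """One execution of a task: fingerprint, configuration used, and operation parameters."""
    fingerprint: tuple
    config: dict
    stage_seconds: dict = field(default_factory=dict)  # time consumed by each part
    resources: dict = field(default_factory=dict)      # peak resource usage

    def total_seconds(self):
        return sum(self.stage_seconds.values())


class Driver:
    """Collects reports from the parsing, planning, and execution stages."""

    def __init__(self):
        self.job_history = []  # stand-in for the Job history module
        self._current = None

    def start(self, fingerprint, config):
        self._current = RunRecord(fingerprint, dict(config))

    def report(self, stage, seconds, **resources):
        record = self._current
        record.stage_seconds[stage] = record.stage_seconds.get(stage, 0.0) + seconds
        for name, value in resources.items():
            # Keep the peak value seen for each resource.
            record.resources[name] = max(value, record.resources.get(name, 0))

    def finish(self):
        record, self._current = self._current, None
        self.job_history.append(record)
        return record


driver = Driver()
driver.start((0.1, 0.9), {"executor_processes": 8})
driver.report("parse", 0.4)
driver.report("plan", 1.1)
driver.report("exec", 30.0, memory_mb=2048, cpu_cores=4)
record = driver.finish()
print(record.total_seconds(), record.resources)
```

Storing the fingerprint and configuration alongside the timings is what later lets similar runs be compared against one another.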
According to the above technical solution, the present application provides a task processing method. In this embodiment, the task to be processed is first acquired; a task fingerprint is then generated according to the task parameters of the task to be processed; next, a target historical task whose similarity to the task fingerprint meets a first preset condition is determined according to the task fingerprint; finally, the task to be processed is executed according to the configuration parameters of the target historical task. Because the configuration parameters of the target historical task can be understood as the optimized configuration parameters of that task, and the similarity between the target historical task and the task fingerprint of the task to be processed meets the first preset condition (that is, the similarity is high), the task to be processed can be executed according to the configuration parameters of the target historical task. The configuration parameters therefore need not be adjusted manually according to subjective experience as in the prior art. This avoids the low accuracy, low efficiency, time and labor consumption, instability, and high cost caused by operation and evaluation errors during manual adjustment, as well as the subjectivity, one-sidedness, and limitation of manually adjusted configuration parameters, thereby improving the efficiency and accuracy of task processing and, in turn, the user experience.
In one implementation, the method further comprises:
and if the task to be processed fails to be executed, continuing to execute the step of determining a target historical task with the similarity meeting preset conditions with the task fingerprint according to the task fingerprint until the task to be processed is executed completely.
In this embodiment, if the task to be processed fails to execute, the configuration parameters did not match the task, and the configuration parameters of a new historical task may be queried. For example, the historical task that ranks next in similarity to the task fingerprint, after the one whose configuration parameters were just used, may be selected: if the previous attempt used the configuration parameters of the historical task with the highest similarity, the configuration parameters of the historical task with the second-highest similarity may be used next. In other words, the step of determining, according to the task fingerprint, a target historical task whose similarity to the task fingerprint meets a preset condition is repeated until execution of the task to be processed is completed.
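The fallback described above can be sketched as a loop over historical tasks ranked by similarity. This is an illustration under assumptions: cosine similarity is assumed as the measure, `execute` is a hypothetical callback that returns `True` on success, and the sketch stops and returns `None` once every candidate has been tried, a case the source does not address.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def run_with_fallback(fingerprint, history, execute):
    """Try configurations from the most similar historical task downward until one succeeds."""
    ranked = sorted(
        history,
        key=lambda task: cosine_similarity(fingerprint, task["fingerprint"]),
        reverse=True,
    )
    for candidate in ranked:
        if execute(candidate["config"]):
            return candidate  # the historical task whose configuration worked
    return None  # assumption: give up when no candidate is left


history = [
    {"task_id": "job-a", "fingerprint": (1.0, 0.0), "config": {"executor_processes": 2}},
    {"task_id": "job-b", "fingerprint": (0.8, 0.6), "config": {"executor_processes": 8}},
]


def fake_execute(config):
    # Stand-in for real execution: pretend runs with fewer than 4 processes fail.
    return config["executor_processes"] >= 4


used = run_with_fallback((1.0, 0.1), history, fake_execute)
print(used["task_id"])
```

Here `job-a` is the closest match but its configuration fails, so the loop moves on to `job-b`, the second-closest match.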
In one embodiment of the present application, the method further comprises: if the task to be processed is executed, adjusting the configuration parameters according to the operation parameters of the task to be processed and the operation parameters of the historical task with the similarity degree with the task fingerprint meeting a second preset condition to obtain adjusted configuration parameters; and taking the adjusted configuration parameters as the configuration parameters corresponding to the tasks to be processed.
The second preset condition can be understood as the similarity between the task fingerprint of the historical task and the task fingerprint of the task to be processed being greater than a preset threshold.
It should be noted that similar tasks (i.e., historical tasks whose similarity to the task fingerprint meets the second preset condition) have run with different configuration parameters and produced different operation parameters. The configuration parameters corresponding to the task to be processed (i.e., the configuration parameters used to execute it) may therefore be optimized on the basis of the operation parameters and configuration parameters of the task to be processed together with those of the similar historical tasks, in combination with machine learning, an optimal parameter algorithm, or the like. The result is the optimal parameters, i.e., the adjusted configuration parameters, which are stored as the configuration parameters corresponding to the task to be processed.
As shown in Fig. 2, the Config optimizer module of the task processing system may obtain the task fingerprint, configuration parameters, and operation parameters of the task to be processed from the Job history module. Using the operation parameters and configuration parameters of the task to be processed, together with those of the historical tasks whose similarity to the task fingerprint meets the second preset condition, and in combination with machine learning, an optimal parameter algorithm, or the like, the Config optimizer module may optimize the configuration parameters corresponding to the task to be processed (i.e., the configuration parameters used to execute it) to obtain the optimal parameters, i.e., the adjusted configuration parameters, and store them as the configuration parameters corresponding to the task to be processed. In this way, the running state of historical tasks is analyzed to derive the parameters for the next run of similar tasks, which achieves task-level parameter optimization and improves the efficiency of subsequent task processing.
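The source leaves the optimization algorithm open ("machine learning or an optimal parameter algorithm"). As a deliberately simple stand-in, the sketch below keeps the configuration of the fastest run among the current run and the historical runs whose similarity exceeds a threshold. The record layout, the `elapsed_seconds` field, the threshold value, and the use of cosine similarity are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def adjust_config(current_run, history, threshold=0.9):
    """Pick the configuration of the fastest run among the current run and similar past runs."""
    # Second preset condition (assumed form): similarity greater than a preset threshold.
    similar = [
        run for run in history
        if cosine_similarity(current_run["fingerprint"], run["fingerprint"]) > threshold
    ]
    best = min([current_run] + similar, key=lambda run: run["elapsed_seconds"])
    return dict(best["config"])


current_run = {
    "fingerprint": (1.0, 0.0),
    "config": {"executor_processes": 4},
    "elapsed_seconds": 120.0,
}
history = [
    {"fingerprint": (0.99, 0.05), "config": {"executor_processes": 8}, "elapsed_seconds": 75.0},
    {"fingerprint": (0.0, 1.0), "config": {"executor_processes": 32}, "elapsed_seconds": 10.0},
]

adjusted = adjust_config(current_run, history)
print(adjusted)
```

The dissimilar run is ignored even though it was the fastest, because its fingerprint falls below the threshold; only runs of comparable tasks inform the adjustment.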
Fig. 3 shows a specific embodiment of a task processing device according to the present application. The apparatus of this embodiment is a physical apparatus for executing the method of the above embodiment. The technical solution is essentially the same as the above embodiments, and the apparatus in this embodiment includes:
an obtaining unit 301, configured to obtain a task to be processed;
a generating unit 302, configured to generate a task fingerprint according to a task parameter of the to-be-processed task;
a determining unit 303, configured to determine, according to the task fingerprint, a target historical task whose similarity to the task fingerprint satisfies a first preset condition;
an executing unit 304, configured to execute the task to be processed according to the configuration parameter of the target historical task.
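As a rough illustration of how the four units cooperate, the following sketch chains them in the order listed above. The class name `TaskProcessingDevice` and the toy callables passed to it are hypothetical placeholders; the application does not prescribe any particular programming interface.

```python
class TaskProcessingDevice:
    """Chains the four units in the order the embodiment lists them."""

    def __init__(self, obtain, generate, determine, execute):
        self.obtain = obtain        # obtaining unit 301
        self.generate = generate    # generating unit 302
        self.determine = determine  # determining unit 303
        self.execute = execute      # executing unit 304

    def process(self):
        task = self.obtain()
        fingerprint = self.generate(task)
        target_config = self.determine(fingerprint)
        return self.execute(task, target_config)


# Toy stand-ins (hypothetical) so the sketch runs end to end.
device = TaskProcessingDevice(
    obtain=lambda: {"sql": "SELECT 1", "joins": 2},
    generate=lambda task: ("joins", task["joins"]),
    determine=lambda fingerprint: {"reducers": 4 * fingerprint[1]},
    execute=lambda task, config: f"ran {task['sql']!r} with {config}",
)
result = device.process()
```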
Optionally, the obtaining unit 301 is configured to:
acquire the task to be processed through an SQL service, wherein the task to be processed is a Hive task.
Optionally, the generating unit 302 is configured to:
construct a task vector according to the task parameters of the task to be processed;
generate a task fingerprint according to the task vector;
wherein the task parameters comprise the input file format, the file size, the number of shuffle/join operations, and the number of table schema fields of the task to be processed.
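A minimal sketch of the two steps, vector construction and fingerprint generation, is given below. The application does not state how the vector is encoded or how the fingerprint is derived from it, so every choice here is an assumption: the format vocabulary `FORMATS`, the one-hot encoding, the logarithmic scaling of the file size, and the use of the unit-length vector as the fingerprint.

```python
import math

# Hypothetical vocabulary of input file formats.
FORMATS = ("text", "orc", "parquet")


def build_task_vector(input_format, file_size_bytes, shuffle_join_count, schema_field_count):
    """Encode the four task parameters as a list of floats."""
    one_hot = [1.0 if input_format == fmt else 0.0 for fmt in FORMATS]
    return one_hot + [
        math.log10(file_size_bytes + 1),  # compress the wide range of file sizes
        float(shuffle_join_count),
        float(schema_field_count),
    ]


def make_fingerprint(vector):
    """Scale the vector to unit length so fingerprints can be compared directly."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return tuple(vector)
    return tuple(x / norm for x in vector)


vector = build_task_vector("orc", 999, shuffle_join_count=3, schema_field_count=12)
fingerprint = make_fingerprint(vector)
```

A fingerprint that preserves closeness between vectors, as this one does, keeps the later similarity comparison meaningful; a cryptographic hash of the vector would not.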
Optionally, the first preset condition is that the historical task has the highest similarity to the task fingerprint among all historical tasks.
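Selecting the most similar historical task can be sketched as a nearest-neighbour lookup. The application does not name a similarity measure; cosine similarity is assumed here, and the function names and the `history` mapping (task identifier to fingerprint) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_target_task(fingerprint, history):
    """Return (task_id, similarity) of the most similar historical task, or None."""
    if not history:
        return None
    best_id = max(history, key=lambda tid: cosine_similarity(fingerprint, history[tid]))
    return best_id, cosine_similarity(fingerprint, history[best_id])


history = {
    "job-a": (1.0, 0.0, 0.0),
    "job-b": (0.9, 0.1, 0.0),
    "job-c": (0.0, 1.0, 0.0),
}
target_id, similarity = find_target_task((0.8, 0.2, 0.0), history)
```

The linear scan is adequate for a small history; a large Job history store would presumably need an index, which is outside the scope of this sketch.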
Optionally, the execution unit 304 is configured to:
parse the syntax tree of the task to be processed according to the configuration parameters to obtain task execution plan information;
and control each task node to execute the task to be processed according to the task execution plan information.
Optionally, the execution unit 304 is further configured to:
if execution of the task to be processed fails, return to the step of determining, according to the task fingerprint, a target historical task whose similarity to the task fingerprint satisfies the preset condition, and repeat until execution of the task to be processed is completed.
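One way to read this retry behaviour is as a loop over candidate configurations ordered by similarity: when a run fails, the next candidate is tried. That reading, the function name `execute_with_fallback`, the use of `RuntimeError` to signal failure, and the toy executor are all assumptions made for illustration.

```python
def execute_with_fallback(task, ranked_configs, execute):
    """Try each candidate configuration in order until one run succeeds.

    ranked_configs is assumed to be sorted from most to least similar.
    Returns the configuration that worked together with the run's result.
    """
    errors = []
    for config in ranked_configs:
        try:
            return config, execute(task, config)
        except RuntimeError as exc:
            errors.append((config, str(exc)))  # remember why this candidate failed
    raise RuntimeError(f"all {len(errors)} candidate configurations failed")


def fake_execute(task, config):
    """Toy executor: fails when the assumed memory setting is too small."""
    if config["map_memory_mb"] < 2048:
        raise RuntimeError("out of memory")
    return f"{task} finished"


ranked = [{"map_memory_mb": 1024}, {"map_memory_mb": 4096}]
used_config, outcome = execute_with_fallback("job-42", ranked, fake_execute)
```

Bounding the loop by the candidate list avoids retrying forever when no historical configuration is suitable.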
Optionally, the apparatus further includes an adjusting unit, configured to:
if execution of the task to be processed is completed, adjust the configuration parameters according to the operation parameters of the task to be processed and the operation parameters of the historical tasks whose similarity to the task fingerprint satisfies a second preset condition, to obtain adjusted configuration parameters;
and take the adjusted configuration parameters as the configuration parameters corresponding to the task to be processed.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. At the hardware level, the electronic device comprises a processor and, optionally, an internal bus, a network interface, and a memory. The memory may include internal memory, such as random-access memory (RAM), and may further include non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 4, but that does not indicate only one bus or one type of bus.
The memory is used for storing execution instructions, specifically a computer program that can be executed. The memory may include both internal memory and non-volatile storage, and provides execution instructions and data to the processor.
In a possible implementation, the processor reads the corresponding execution instructions from the non-volatile memory into the internal memory and then runs them; the corresponding execution instructions may also be obtained from other devices. The task processing device is thereby formed at the logical level. The processor executes the execution instructions stored in the memory, so that the task processing method provided by any embodiment of the application is implemented.
The method executed by the task processing device according to the embodiment shown in fig. 1 of the present application may be applied to a processor, or may be implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Such a processor may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with hardware of the processor.
The embodiment of the present application further provides a readable storage medium storing execution instructions. When the stored execution instructions are executed by a processor of an electronic device, the electronic device is caused to execute the task processing method provided in any embodiment of the present application.
The electronic device described in the foregoing embodiments may be a computer.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for processing a task, the method comprising:
acquiring a task to be processed;
generating a task fingerprint according to the task parameter of the task to be processed;
according to the task fingerprint, determining a target historical task of which the similarity with the task fingerprint meets a first preset condition;
and executing the task to be processed according to the configuration parameters of the target historical task.
2. The method of claim 1, wherein the obtaining the task to be processed comprises:
acquiring the task to be processed through SQL service; the task to be processed is a Hive task.
3. The method according to claim 1, wherein generating a task fingerprint according to the task parameters of the task to be processed comprises:
constructing a task vector according to the task parameters of the task to be processed;
generating a task fingerprint according to the task vector;
the task parameters comprise an input file format, a file size, a shuffle/join number and a table schema field number of the task to be processed.
4. The method according to claim 1, wherein the first preset condition is that the historical task has the highest similarity to the task fingerprint among all historical tasks.
5. The method of claim 1, wherein executing the task to be processed according to the configuration parameters of the target historical task comprises:
parsing the syntax tree of the task to be processed according to the configuration parameters to obtain task execution plan information;
and controlling each task node to execute the task to be processed according to the task execution plan information.
6. The method according to any one of claims 1-4, further comprising:
if execution of the task to be processed fails, returning to the step of determining, according to the task fingerprint, a target historical task whose similarity to the task fingerprint satisfies the preset condition, and repeating until execution of the task to be processed is completed.
7. The method according to any one of claims 1-4, further comprising:
if execution of the task to be processed is completed, adjusting the configuration parameters according to the operation parameters of the task to be processed and the operation parameters of the historical tasks whose similarity to the task fingerprint satisfies a second preset condition, to obtain adjusted configuration parameters;
and taking the adjusted configuration parameters as the configuration parameters corresponding to the task to be processed.
8. A task processing apparatus, characterized in that the apparatus comprises:
the acquisition unit is used for acquiring the tasks to be processed;
the generating unit is used for generating a task fingerprint according to the task parameter of the task to be processed;
the determining unit is used for determining a target historical task of which the similarity with the task fingerprint meets a first preset condition according to the task fingerprint;
and the execution unit is used for executing the acquired task to be processed according to the configuration parameters of the target historical task.
9. A readable medium, characterized in that the readable medium comprises executable instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the method of any of claims 1-7.
10. An electronic device comprising a processor and a memory storing execution instructions, wherein the processor performs the method of any one of claims 1-7 when the processor executes the execution instructions stored by the memory.
CN202111244929.5A 2021-10-26 2021-10-26 Task processing method and device Pending CN113688602A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111244929.5A CN113688602A (en) 2021-10-26 2021-10-26 Task processing method and device

Publications (1)

Publication Number Publication Date
CN113688602A true CN113688602A (en) 2021-11-23

Family

ID=78588064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111244929.5A Pending CN113688602A (en) 2021-10-26 2021-10-26 Task processing method and device

Country Status (1)

Country Link
CN (1) CN113688602A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103605662A (en) * 2013-10-21 2014-02-26 华为技术有限公司 Distributed computation frame parameter optimizing method, device and system
CN106250233A (en) * 2016-07-21 2016-12-21 鄞州浙江清华长三角研究院创新中心 MapReduce performance optimization system and optimization method
CN107463434A (en) * 2017-08-11 2017-12-12 恒丰银行股份有限公司 Distributed task processing method and device
CN111367591A (en) * 2020-03-30 2020-07-03 中国工商银行股份有限公司 Spark task processing method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115471168A (en) * 2021-12-14 2022-12-13 国网上海市电力公司 Automatic flow processing method and device, electronic equipment and computer readable medium
CN117234711A (en) * 2023-09-05 2023-12-15 合芯科技(苏州)有限公司 Dynamic allocation method, system, equipment and medium for Flink system resources
CN117234711B (en) * 2023-09-05 2024-05-07 合芯科技(苏州)有限公司 Dynamic allocation method, system, equipment and medium for Flink system resources

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211123