CN112099933B - Task operation and query method and device, electronic equipment and storage medium - Google Patents

Publication number
CN112099933B
Authority
CN
China
Prior art keywords
data
target
intermediate data
task
acquisition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010997697.XA
Other languages
Chinese (zh)
Other versions
CN112099933A (en)
Inventor
蔡杰
叶青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010997697.XA
Publication of CN112099933A
Application granted
Publication of CN112099933B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/547 Messaging middleware
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a task job method, a task query method, corresponding devices, electronic equipment, and a storage medium. It relates to the field of cloud computing and can be applied in particular, but not exclusively, to the field of big data. The specific implementation is as follows: a data type tag is added to each piece of intermediate data generated while the target task is running, and each piece of intermediate data is stored locally; in response to receiving, while the target task is running, an intermediate data acquisition instruction for the target task, the target data type in the instruction is obtained; and the target acquisition data is obtained and fed back to the sender of the intermediate data acquisition instruction. The application solves the problem that the server can obtain intermediate data only after the job node has finished the task job. Intermediate data can be sent to the server while the job node is still running the task, which provides a basis for subsequently running multiple tasks simultaneously and improving resource utilization.

Description

Task operation and query method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of cloud computing technologies and can be applied in particular, but not exclusively, to the field of big data technologies. Specifically, it relates to a task job method, a task query method, devices, an electronic device, and a storage medium.
Background
Job flow computation control is a technique for efficiently managing big data jobs, and asynchronously processing their data transmission, in a distributed cluster. The data processing of a typical big data job task comprises three key steps: data extraction, transformation, and loading.
At present, for a big data task, the job node typically writes all data generated by the job task into a particular database in the final stage (the load stage). However, as big data platforms continue to develop and improve, intermediate data generated during task jobs (e.g., status data, log data, and job meta information) also needs to be processed, converted, and provided to users or other job nodes.
At present, however, the user side or other job nodes can obtain the intermediate data only after the job node has finished the task job. As a result, only one task can be processed in a given period, multiple tasks cannot run simultaneously, and resource utilization is reduced.
Disclosure of Invention
The disclosure provides a task job method, a task query method, devices, electronic equipment, and a storage medium.
According to an aspect of the present disclosure, there is provided a task job method, applied to a job node, including:
adding a data type tag to each piece of intermediate data generated while running the current target task, and storing each piece of intermediate data locally;
in response to receiving, while the target task is running, an intermediate data acquisition instruction for the target task, obtaining the target data type in the intermediate data acquisition instruction;
and matching the target data type against the data type tags of the locally stored intermediate data of the target task, obtaining target acquisition data, and feeding the target acquisition data back to the sender of the intermediate data acquisition instruction.
According to another aspect of the present disclosure, there is provided a task query method including:
in the process of operating a target task by a first operation node, sending an intermediate data acquisition instruction aiming at the target task to the first operation node;
the intermediate data acquisition instruction is used for indicating the first operation node to acquire target acquisition data matched with the target data type in the intermediate data acquisition instruction in the locally stored intermediate data of the target task;
and receiving the target acquisition data fed back by the first operation node.
According to another aspect of the present disclosure, there is provided a task job device applied to a job node, including:
the data type tag adding module is used for adding a data type tag to each piece of intermediate data generated while running the current target task, and storing each piece of intermediate data locally;
the target data type acquisition module is used for obtaining, in response to receiving an intermediate data acquisition instruction for the target task while the target task is running, the target data type in the intermediate data acquisition instruction;
the target acquisition data acquisition module is used for matching the target data type against the data type tags of the locally stored intermediate data of the target task, obtaining target acquisition data, and feeding the target acquisition data back to the sender of the intermediate data acquisition instruction.
According to another aspect of the present disclosure, there is provided a task query apparatus including:
the intermediate data acquisition instruction sending module is used for sending an intermediate data acquisition instruction aiming at a target task to a first operation node in the process of operating the target task by the first operation node;
the intermediate data acquisition instruction is used for indicating the first operation node to acquire target acquisition data matched with the target data type in the intermediate data acquisition instruction in the locally stored intermediate data of the target task;
and the target acquisition data receiving module is used for receiving the target acquisition data fed back by the first operation node.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the task job method of any one of the embodiments of the present application or the task query method of any one of the embodiments of the present application.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the task job method of any one of the embodiments of the present application, or the task query method of any one of the embodiments of the present application.
The technology disclosed in the application solves the problem that, at present, the server can obtain intermediate data only after the job node has finished the task job. Intermediate data can be sent to the server while the job node is still running the task, which provides a basis for subsequently running multiple tasks simultaneously and improving resource utilization.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application. Wherein:
FIG. 1 is a schematic diagram of a task job method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a task job method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a task job method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a task job method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a task query method according to an embodiment of the application;
FIG. 6 is a schematic diagram of a task query method according to an embodiment of the application;
FIG. 7 is a schematic diagram of a task operating system according to an embodiment of the application;
FIG. 8 is a schematic diagram of a task job method according to an embodiment of the present application;
FIG. 9 is a block diagram of a task work device according to an embodiment of the present application;
FIG. 10 is a block diagram of a task querying device according to an embodiment of the application;
FIG. 11 is a block diagram of an electronic device for implementing a task job method or a task query method of an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of a task job method according to an embodiment of the present application. The embodiment applies to the case where a job node receives an intermediate data acquisition instruction and feeds back the matching target acquisition data to the sender of the instruction. The method may be executed by a task job device, which may be implemented in software and/or hardware and is configured in the job node. The job node in this embodiment may be an electronic device such as a server, a computer, or a tablet computer; it may also be a cloud server, which is not limited in this embodiment. Referring to fig. 1, the method includes the following steps:
S110, adding a data type label into each piece of intermediate data generated according to the current job target task, and locally storing each piece of intermediate data.
The target task currently being run may be any data processing task being executed, for example, an image data classification task, a speech data understanding task, or a log data cleaning task, which is not limited in this embodiment. Optionally, the intermediate data generated while running the target task may be log data, audit data, or index (metrics) data. Accordingly, the data types of the intermediate data in this embodiment may include log data, audit data, index data, and the like, which is not limited in this embodiment.
In an alternative implementation manner of this embodiment, while running the target task, the job node may process the generated intermediate data in real time, for example, by adding a type tag to each piece of generated intermediate data and persistently storing the type-tagged intermediate data in the local file system or the local database.
In an alternative implementation of this embodiment, the same tag may be added to intermediate data of the same type. For example, if 10 pieces of intermediate data are generated while running the target task, comprising 5 pieces of log data and 5 pieces of audit data, a data type tag corresponding to log data (for example, the character "a" or the number "1", which is not limited in this embodiment) may be added to each of the 5 pieces of log data, and a data type tag corresponding to audit data may be added to each of the 5 pieces of audit data. The 5 pieces of log data and 5 pieces of audit data carrying data type tags are then persistently stored in the local file system or local database.
The advantage of this arrangement is that, because intermediate data of different data types carries different data type tags when it is stored, intermediate data of different types can be stored in separate partitions, which provides a basis for subsequently acquiring intermediate data of a specified type.
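Step S110 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the tag values in `TYPE_TAGS`, the record layout, the helper names, and the use of a JSON Lines file as the "local file system" store are all assumptions made for illustration.

```python
import json
import os
import tempfile

# Hypothetical tag values; the text only requires one tag per data type.
TYPE_TAGS = {"log": "A", "audit": "B", "metrics": "C"}


def tag_record(payload, data_type):
    """Return a record that carries a data type tag next to its payload."""
    return {"tag": TYPE_TAGS[data_type], "type": data_type, "payload": payload}


def store_locally(records, path):
    """Append tagged records to a local JSON Lines file (assumed store)."""
    with open(path, "a", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def load_local(path):
    """Read every stored record back from the local file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Five pieces of log data and five pieces of audit data, as in the example.
records = [tag_record(f"log line {i}", "log") for i in range(5)]
records += [tag_record(f"audit event {i}", "audit") for i in range(5)]

store_path = os.path.join(tempfile.mkdtemp(), "intermediate.jsonl")
store_locally(records, store_path)
stored = load_local(store_path)
```

Because every stored record carries its tag, a later reader can select one data type without inspecting the payloads.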
S120, responding to the intermediate data acquisition instruction aiming at the target task in the process of target task operation, and acquiring the target data type in the intermediate data acquisition instruction.
The intermediate data acquisition instruction for the target task in this embodiment may be sent to the job node by the user side at a set acquisition interval (for example, every 1 minute, 5 minutes, or 10 minutes, which is not limited in this embodiment); it may also be sent to the current job node (the node running the target task) by another job node, which is not limited in this embodiment.
It should be noted that the target data type in the intermediate data acquisition instruction in this embodiment may include one or more types of intermediate data; for example, it may be one or more of status data, log data, audit data, or index data, which is not limited in this embodiment.
In an optional implementation manner of this embodiment, if the job node receives an intermediate data acquisition instruction for the target task during the process of performing the job on the target task, the job node may further acquire the target data type in the received intermediate data acquisition instruction.
For example, if the job node receives intermediate data acquisition instruction A for the target task while running the target task, the job node may then obtain the target data type in instruction A; the obtained target data type may be, for example, status data, or it may be both log data and audit data.
S130, matching the target data type with the data type labels of the intermediate data of the locally stored target task, acquiring target acquisition data, and feeding the target acquisition data back to the sender of the intermediate data acquisition instruction.
The sender of the intermediate data acquisition instruction may be a user side, or may be a job node running another task, which is not limited in this embodiment.
In an optional implementation manner of this embodiment, after receiving the intermediate data acquisition instruction for the target task and obtaining the target data type in the instruction, the job node may match the obtained target data type against the data type tags of the locally stored intermediate data of the target task, so as to obtain the target acquisition data, and feed the target acquisition data back to the sender of the intermediate data acquisition instruction.
For example, if the job node determines that the target data type in the intermediate data acquisition instruction for the target task is log data, the job node may take the locally stored intermediate data of the target task that is tagged as log data as the target acquisition data, and feed the target acquisition data back to the sender of the intermediate data acquisition instruction.
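The matching in steps S120 and S130 can be sketched as follows. This is an illustrative sketch only: the instruction fields `task_id` and `target_types`, the record layout, and the function name `handle_acquisition` are hypothetical and do not come from the patent text.

```python
def handle_acquisition(instruction, local_store):
    """Return the stored records whose type tag matches a requested type.

    `instruction["target_types"]` may name one or several data types.
    """
    target_types = set(instruction["target_types"])
    return [record for record in local_store if record["type"] in target_types]


# Intermediate data already stored by the job node (assumed layout).
local_store = [
    {"type": "log", "payload": "started"},
    {"type": "audit", "payload": "config read"},
    {"type": "log", "payload": "50% done"},
    {"type": "status", "payload": "RUNNING"},
]

# An instruction that asks only for log data, as in the example above.
instruction = {"task_id": "task-1", "target_types": ["log"]}
target_data = handle_acquisition(instruction, local_store)
```

In this sketch the returned list is what the job node would feed back to the sender; the transport between the two sides is left out.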
According to the scheme of this embodiment, the job node adds a data type tag to each piece of intermediate data generated while running the current target task and stores each piece of intermediate data locally; in response to receiving, while the target task is running, an intermediate data acquisition instruction for the target task, it obtains the target data type in the instruction; it then matches the target data type against the data type tags of the locally stored intermediate data of the target task, obtains the target acquisition data, and feeds it back to the sender of the instruction. This solves the problem that, at present, the server can obtain intermediate data only after the job node has finished the task job. Intermediate data can be sent to the server while the job node is still running the task, which provides a basis for subsequently running multiple tasks simultaneously and improving resource utilization.
Fig. 2 is a schematic diagram of a task operation method according to an embodiment of the present application, which is a further refinement of the foregoing technical solutions, where the technical solutions in this embodiment may be combined with each of the alternatives in one or more embodiments described above. As shown in fig. 2, the task job method includes the following:
S210, performing content analysis on the currently generated target intermediate data, and adding a data type tag to the target intermediate data according to the content analysis result; writing the target intermediate data into a message queue; and sequentially writing each piece of intermediate data in the message queue into the local storage space.
The target intermediate data may be intermediate data generated by the job node at the current time in the process of performing the job on the target task, for example, intermediate data generated when the target task job starts, or intermediate data generated when the target task job reaches a certain time, which is not limited in this embodiment.
In an optional implementation manner of this embodiment, after the job node performs the job target task and generates the target intermediate data, the job node may further analyze the content of the generated target intermediate data, and add a data type tag into the target intermediate data according to the content analysis result; and writing the target intermediate data added with the tag into a message queue, and sequentially and permanently writing each intermediate data in the message queue into a local storage space.
For example, if the job node generates 20 pieces of intermediate data at the 10th minute of running the target task, those 20 pieces are the target intermediate data. Content analysis is then performed on the 20 pieces of target intermediate data; if the result shows that they comprise 10 pieces of log data and 10 pieces of audit data, the log data type tag is added to each of the 10 pieces of log data and the audit data type tag is added to each of the 10 pieces of audit data. The 10 pieces of log data and 10 pieces of audit data carrying data type tags are then written into a message queue, and the 20 pieces of target intermediate data in the message queue are written persistently into the local storage space in order (for example, in first-in, first-out order). The local storage space in this embodiment may be a hard disk or an SSD (Solid State Disk) of the job node, which is not limited in this embodiment.
In an optional implementation manner of this embodiment, the content parsing of the currently generated target intermediate data may include: storing the target intermediate data in a JSON file or an XML file, and further, realizing the content analysis of the target intermediate data in a JSON data analysis or XML data analysis mode. It should be noted that, in this embodiment, the content analysis of the target intermediate data may also be implemented in other manners, and in this embodiment, details thereof will not be described herein.
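The parse, tag, queue, and store pipeline of step S210 might look like the Python sketch below. The field-based rule in `infer_type` is a hypothetical classification rule (the text does not say how a type is derived from the parsed content), and an in-memory list stands in for the hard disk or SSD.

```python
import json
import queue


def infer_type(raw):
    """Parse a JSON document and attach a data type tag.

    The rule used here (which fields are present) is an assumption.
    """
    doc = json.loads(raw)
    if "level" in doc:
        doc["type"] = "log"
    elif "actor" in doc:
        doc["type"] = "audit"
    else:
        doc["type"] = "metrics"
    return doc


raw_items = [
    '{"level": "INFO", "msg": "step 1 done"}',
    '{"actor": "scheduler", "action": "assign"}',
    '{"name": "rows_processed", "value": 1200}',
]

# queue.Queue is first-in, first-out, which matches the ordering example.
message_queue = queue.Queue()
for raw in raw_items:
    message_queue.put(infer_type(raw))

# Drain the queue in order; the list stands in for local disk storage.
local_storage = []
while not message_queue.empty():
    local_storage.append(message_queue.get())
```

Placing a queue between tagging and storage decouples the two steps, so a slow write to disk need not hold up the code that produces intermediate data.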
S220, responding to the intermediate data acquisition instruction aiming at the target task in the target task operation process, and acquiring the target data type in the intermediate data acquisition instruction.
S230, matching the target data type with the data type labels of the intermediate data of the locally stored target task, obtaining target acquisition data, and feeding the target acquisition data back to the sender of the intermediate data acquisition instruction.
According to the scheme of the embodiment, the operation node can add a data type label into the intermediate data according to the content analysis result of the intermediate data; after each intermediate data added with the tag is written into the message queue, each intermediate data is locally stored, so that the intermediate data of different data types can be stored in a partitioned mode, and a basis is provided for the subsequent acquisition of the intermediate data of the specified type.
Fig. 3 is a schematic diagram of a task operation method according to an embodiment of the present application, which is a further refinement of the foregoing technical solutions, where the technical solutions in this embodiment may be combined with each of the alternatives in one or more embodiments described above. As shown in fig. 3, the task job method includes the following:
S310, adding a data serial number into each intermediate data according to the generation sequence of each intermediate data.
Optionally, the intermediate data acquisition instruction sent by the sender may further include a target data sequence number.
In an optional implementation manner of this embodiment, during the operation of the job node on the target task, the job node may add a data sequence number to each intermediate data according to the generation sequence of each intermediate data.
Illustratively, the job node sequentially generates first intermediate data, second intermediate data and third intermediate data in the process of working on the target task; a data sequence number of "1" may be added to the first intermediate data, a data sequence number of "2" may be added to the second intermediate data, and a data sequence number of "3" may be added to the third intermediate data.
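A minimal sketch of step S310, assuming a single producer and a counter that starts at 1; the class name `SequenceTagger` and the `seq` field are hypothetical.

```python
import itertools


class SequenceTagger:
    """Assign increasing sequence numbers in generation order."""

    def __init__(self):
        self._counter = itertools.count(1)  # 1, 2, 3, ...

    def tag(self, record):
        """Return a copy of the record with the next sequence number."""
        return {**record, "seq": next(self._counter)}


tagger = SequenceTagger()
tagged = [tagger.tag({"payload": p}) for p in ("first", "second", "third")]
```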
S320, adding a data type tag to each piece of intermediate data generated while running the current target task, and storing each piece of intermediate data locally.
S330, in response to receiving the intermediate data acquisition instruction for the target task in the target task operation process, acquiring the target data type and the target data serial number in the intermediate data acquisition instruction.
The intermediate data acquisition instruction in this embodiment may include a target data type and a target data sequence number; the target data sequence number may be any one or more sequence numbers, for example, sequence numbers 1 and 3, sequence number 2, or sequence number 5, which is not limited in this embodiment.
S340, matching the target data type against the data type tags of the locally stored intermediate data of the target task to obtain candidate acquisition data; and obtaining, from the candidate acquisition data, the target acquisition data that matches the target data sequence number.
In an optional implementation manner of this embodiment, after the job node receives, while running the target task, the intermediate data acquisition instruction for the target task and obtains the target data type in the instruction, the target data type may be matched against the data type tags of the locally stored intermediate data of the target task, so as to obtain the candidate acquisition data.
Further, target acquisition data matched with the target data serial number can be acquired from each candidate acquisition data.
For example, if the target data type in the intermediate data acquisition instruction for the target task received by the job node is index data, and the target data sequence numbers are 1 and 2, the index data in each intermediate data of the target task stored locally can be acquired, and each acquired index data is the candidate acquisition data; further, index data with a serial number of 1 and a serial number of 2 are obtained from the obtained index data, wherein the obtained index data with the serial numbers of 1 and 2 are target acquisition data.
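The two-stage selection in step S340 can be sketched as below. The record layout and the function name `select_target_data` are assumptions; the sketch also assumes that sequence numbers are assigned across all intermediate data in generation order.

```python
def select_target_data(store, target_type, target_seqs):
    """Filter by data type first, then by sequence number."""
    # Stage 1: candidate acquisition data, matched on the type tag.
    candidates = [r for r in store if r["type"] == target_type]
    # Stage 2: keep only the candidates with a requested sequence number.
    wanted = set(target_seqs)
    return [r for r in candidates if r["seq"] in wanted]


store = [
    {"type": "metrics", "seq": 1, "payload": "rows=100"},
    {"type": "metrics", "seq": 2, "payload": "rows=250"},
    {"type": "log", "seq": 3, "payload": "checkpoint written"},
    {"type": "metrics", "seq": 4, "payload": "rows=400"},
]

# Request index (metrics) data with sequence numbers 1 and 2.
target = select_target_data(store, "metrics", [1, 2])
```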
S350, feeding the target acquisition data back to a sender of the intermediate data acquisition instruction.
According to the scheme of this embodiment, the job node adds a data sequence number to each piece of intermediate data according to the order in which the intermediate data is generated; matches the target data type against the data type tags of the locally stored intermediate data of the target task to obtain candidate acquisition data; and obtains, from the candidate acquisition data, the target acquisition data that matches the target data sequence number. The target acquisition data can thus be obtained accurately and in real time while the job node is running the task and fed back to the sender, which provides a basis for other job nodes to subsequently run their corresponding tasks.
Fig. 4 is a schematic diagram of a task operation method according to an embodiment of the present application, which is a further refinement of the foregoing technical solutions, where the technical solutions in this embodiment may be combined with each of the alternatives in one or more embodiments described above. As shown in fig. 4, the task job method includes the following:
S410, adding a data sequence number to each piece of intermediate data according to the order in which the intermediate data is generated.
S420, adding a data type tag to each piece of intermediate data generated while running the current target task, and storing each piece of intermediate data locally.
S430, in response to receiving an intermediate data acquisition instruction for the target task in the target task operation process, acquiring a target data type, a target data serial number and a quota instruction in the intermediate data acquisition instruction.
In an alternative implementation of this embodiment, the intermediate data acquisition instruction may include a target data type, a target data sequence number, and a quota instruction. The quota instruction may define the data size of the target acquisition data. For example, if the quota instruction is 10MB, the data size of the target acquisition data should be less than or equal to 10MB.
S440, matching the target data type against the data type tags of the locally stored intermediate data of the target task to obtain candidate acquisition data; and obtaining, from the candidate acquisition data, the target acquisition data that matches the target data sequence number.
S450, compressing the target acquisition data according to the quota instruction contained in the intermediate data acquisition instruction.
In an optional implementation manner of this embodiment, after the job node obtains the target acquisition data, the job node may compress the target acquisition data according to the quota instruction contained in the intermediate data acquisition instruction.
In a specific example of the present embodiment, if the quota instruction contained in the intermediate data acquisition instruction is 10MB, the job node may compress the target acquisition data into Avro-format data of 10MB or less.
The advantage of this arrangement is that it reduces the data processing load on the sender that receives the target acquisition data, so that the sender's work efficiency is not reduced by an excessive data volume.
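The quota check in step S450 can be sketched with the Python standard library. Note the substitution: the example above names the Avro format, whereas this sketch serializes to JSON and compresses with `zlib` so that it runs without third-party packages. Raising an error when the compressed data still exceeds the quota is also an assumption; the text does not say what the job node does in that case.

```python
import json
import zlib


def compress_within_quota(records, quota_bytes):
    """Serialize and compress records, enforcing a size quota in bytes.

    JSON plus zlib stands in for the Avro format named in the text.
    """
    raw = json.dumps(records).encode("utf-8")
    packed = zlib.compress(raw, level=9)
    if len(packed) > quota_bytes:
        # Assumed behavior: reject rather than silently truncate.
        raise ValueError("compressed data exceeds the quota")
    return packed


records = [{"type": "log", "payload": "x" * 1000} for _ in range(100)]
quota = 10 * 1024 * 1024  # 10MB, as in the example above

packed = compress_within_quota(records, quota)
restored = json.loads(zlib.decompress(packed))
```

A real implementation might instead split the data into several responses or lower its fidelity when the quota cannot be met; that choice is outside what the text specifies.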
S460, feeding the target acquisition data back to the sender of the intermediate data acquisition instruction.
S470, after the operation of the target task is completed, constructing a result data packet according to the execution result of the target task and various intermediate data of the target task stored currently; and sending the result data packet to a server.
In an optional implementation manner of this embodiment, after the job node completes the job on the target task, a result data packet may be configured according to the execution result of the target task and various intermediate data of the target task that are currently stored; and transmitting the constructed result data packet to the server.
For example, the target task execution result and various intermediate data of the currently stored target task may be packaged according to a certain data format (for example, ". Zip" format or ". Rar" format, etc.), and the packaged data packet is sent to the server; or storing the execution result of the target task and various intermediate data of the currently stored target task in a JSON file or an XML file according to the generated sequence, and sending the JSON file or the XML file to a server.
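The JSON variant described above can be sketched as follows. The field names (`task_id`, `result`, `intermediate_data`, `seq`) and the function name `build_result_packet` are illustrative assumptions; the embodiment only requires that the execution result and the intermediate data be stored in generation order.

```python
import json


def build_result_packet(task_id, execution_result, intermediate_records):
    """Serialize the execution result and intermediate data, in generation order, as JSON."""
    # Assumption: each record carries a "seq" field reflecting its generation order.
    ordered = sorted(intermediate_records, key=lambda r: r["seq"])
    return json.dumps(
        {"task_id": task_id, "result": execution_result, "intermediate_data": ordered}
    )


packet = build_result_packet(
    "task-1",
    {"status": "COMPLETED"},
    [
        {"seq": 2, "tag": "metric", "payload": "cpu=0.42"},
        {"seq": 1, "tag": "log", "payload": "job started"},
    ],
)
```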
It should be noted that, the server in this embodiment may be a management node, and after receiving the result data packet, the server may analyze the result data packet and send the analyzed data to other job nodes, so that other nodes perform other task jobs.
After the target acquisition data has been fed back to the sender of the intermediate data acquisition instruction and the job node has completed the job on the target task, a result data packet is constructed according to the execution result of the target task and all currently stored intermediate data of the target task, and the result data packet is sent to the server. The job node can then continue to execute other job tasks distributed by the server, and the completed task no longer occupies the resources of the job node. In this way, intermediate data can be obtained while the target task is running, and the resources of the job node are released promptly after the target task is completed.
Fig. 5 is a schematic diagram of a task query method according to an embodiment of the present application, where the embodiment is applicable to a case where intermediate data generated during a task of a job node is collected by a management node, the method may be performed by a task query device, and the device may be implemented by software and/or hardware, and specifically configured in the management node, where the management node involved in the embodiment may be a server. Specifically, referring to fig. 5, the method specifically includes the following steps:
S510, in the process of operating the target task by the first operation node, sending an intermediate data acquisition instruction aiming at the target task to the first operation node.
The intermediate data acquisition instruction is used for indicating the first operation node to acquire target acquisition data matched with the target data type in the intermediate data acquisition instruction in the intermediate data of the locally stored target task.
The first job node may be a job node of the job target task in each embodiment. In an optional implementation manner of this embodiment, the management node sends, to the first job node, an intermediate data acquisition instruction for the target task in a process of the first job node operating the target task.
Optionally, the management node may send an intermediate data acquisition instruction for the target task to the first job node at preset intervals (for example, 10 minutes, 20 minutes, or 1 hour, which is not limited in this embodiment); or, when the management node receives from a target node an acquisition instruction for the intermediate data of the target task, the management node sends an intermediate data acquisition instruction for the target task to the first job node. The target node may be any node in the distributed cluster other than the first job node, which is not limited in this embodiment.
In an alternative implementation of this embodiment, the intermediate data acquisition instruction may further include a target data sequence number or a quota instruction.
S520, receiving target acquisition data fed back by the first operation node.
In an optional implementation manner of this embodiment, after sending a data acquisition instruction for a target task to the first job node in the process of the first job node working on the target task, the target acquisition data fed back by the first job node may be acquired at a very small interval (for example, 5ms,30ms, or 1s, etc., which is not limited in this embodiment).
In the scheme of this embodiment, while the first job node is working on the target task, the management node sends an intermediate data acquisition instruction for the target task to the first job node and then receives the target acquisition data fed back by the first job node. Intermediate data generated during the task can therefore be obtained while the job node is still working, which provides a basis for running a plurality of data processing tasks in parallel.
Fig. 6 is a schematic diagram of a task query method according to an embodiment of the present application, where the technical solution in this embodiment is further refined, and the technical solution in this embodiment may be combined with each of the alternatives in one or more embodiments described above. As shown in fig. 6, the task query method includes the following:
S610, creating a third data transmission port.
The third data transmission port is used for acquiring a start instruction or a closing instruction aiming at the first operation node operation target task.
In an optional implementation manner of this embodiment, the management node (server) may create a third data transmission port, and when receiving a start instruction of the target task sent by the user side through the third data transmission port, may further send an instruction to start the job target task to the first job node; the user terminal may be a mobile client terminal or a WEB (network) terminal, which is not limited in this embodiment.
In another optional implementation manner of this embodiment, during the process of the first job node operating the target task, if the closing instruction of the target task sent by the user side is received through the third data transmission port, an instruction for ending the operation target task may be further sent to the first job node, where the first job node will pause the operation on the target task.
The method has the advantages that the starting instruction or the closing instruction aiming at the target task of the first operation node is acquired by creating the specific data transmission port, so that the target task can be timely controlled to be started or closed by the operation node, and the flexibility of the system is improved.
S620, creating a first data transmission port and a second data transmission port.
The first data transmission port is used for sending an intermediate data acquisition instruction to the first operation node; the second data transmission port is used for receiving target acquisition data fed back by the first operation node.
In an alternative implementation manner of this embodiment, a first data transmission port for sending an intermediate data acquisition instruction to the first operation node and a second data transmission port for receiving the target acquisition data fed back by the first operation node may be created in the server, respectively.
In an alternative implementation manner of this embodiment, after the first job node starts the job target task, the first data transmission port and the second data transmission port may be opened, and an intermediate data acquisition instruction may be sent to the first job node through the first data transmission port, and the target acquisition data fed back by the first job node may be received through the second data transmission port.
The advantage of this arrangement is that it provides a basis for obtaining intermediate data during the course of the job tasks of the job node.
In another alternative implementation of this embodiment, the first data transmission port and the second data transmission port may be closed after the first job node ends the job target task.
The advantage of this arrangement is that the first data transmission port and the second data transmission port are closed in time after the first job node finishes the job target task, so that the resource consumption of the server can be saved.
It can be understood that, in the present application, a fourth data transmission port corresponding to the first data transmission port and a fifth data transmission port corresponding to the second data transmission port may also be created in the first operation node; the fourth data transmission port can be used for receiving an intermediate data acquisition instruction sent by the server through the first data transmission port; the fifth data transmission port may be used to feed back target acquisition data to the second data transmission port of the server.
S630, in the process of operating the target task by the first operation node, sending an intermediate data acquisition instruction aiming at the target task to the first operation node.
S640, receiving target acquisition data fed back by the first operation node.
S650, locally storing the target acquisition data.
In an optional implementation manner of this embodiment, after receiving the target collected data fed back by the first job node, the server may store the target collected data in a storage space of the server; in a specific example of this embodiment, the target acquisition data may be stored locally according to a data type of the target acquisition data, for example, the target acquisition data of the same data type are stored in the same area of the local storage system.
S660, responding to an intermediate data query request for the target task sent by the second job node, and acquiring target query data matched with the intermediate data query request from each locally stored intermediate data of the target task; and feeding back the target query data to the second operation node.
The second operation node may be any operation node except the first operation node in the distributed cluster.
In an optional implementation manner of this embodiment, when the server receives a query request for intermediate data of the target task sent by the second job node, the target query data matched with the intermediate data query request may be obtained from each locally stored intermediate data of each target task; and further feeding back the target query data to the second job node.
For example, if the server receives a query request of the second job node for log data of the target task, the server may acquire all log data in each intermediate data of the target task stored locally, and feed back the acquired log data to the second job node.
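The storage by data type in S650 and the query in S660 can be sketched with an in-memory store. The class `IntermediateDataStore` and its methods are hypothetical; a real server would use its local file system or another storage engine instead of a dictionary.

```python
from collections import defaultdict


class IntermediateDataStore:
    """Keeps target acquisition data grouped by (task, data type)."""

    def __init__(self):
        self._by_key = defaultdict(list)

    def save(self, task_id, tag, payload):
        # Data of the same type for the same task is kept together.
        self._by_key[(task_id, tag)].append(payload)

    def query(self, task_id, tag):
        # Return a copy so callers cannot modify the stored data.
        return list(self._by_key.get((task_id, tag), []))


store = IntermediateDataStore()
store.save("task-1", "log", "job started")
store.save("task-1", "metric", "cpu=0.42")
store.save("task-1", "log", "stage 1 done")

# A second job node asks for the log data of the target task.
log_data = store.query("task-1", "log")
```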
In the scheme of the embodiment, after receiving the target collected data fed back by the first operation node, the target collected data can be locally stored; further, in response to the intermediate data query request for the target task sent by the second job node, target query data matched with the intermediate data query request can be obtained from each locally stored intermediate data of the target task; and feeding back the target query data to the second operation node, and providing basis for simultaneous operation of a plurality of operation nodes aiming at the data of the same task.
For a better understanding of the present application, FIG. 7 is a schematic diagram of a task operating system according to an embodiment of the present application; referring to fig. 7, the system includes a management node (i.e., a server) 710 and a job node 720, and it should be noted that a task job system may include a plurality of job nodes, and fig. 7 illustrates only one job node, which is not limited to this embodiment.
Wherein the management node 710 includes a routing layer 711, a job management platform 712, and a monitoring service client 713; the routing layer 711 includes a third data transmission port 7111; the monitoring service client 713 includes a first data transfer port 7131, a second data transfer port 7132, and a local file system 7133.
The job node 720 includes: monitoring server 721, message queue 722, local file system 723, production queue 724; the monitoring server 721 includes a fourth data transmission port 7211 and a fifth data transmission port 7212.
In an alternative implementation manner of this embodiment, (1) when a task is started, the proxy service opens a local port forwarding to the routing layer 711 of the management node 710 through the ssh tunnel, and if a third data transmission port 7111 is opened, the port may be forwarded to the port of the routing layer 711 of the uppermost layer of the management node 710, and the routing layer may forward to the port of the corresponding service according to the requested service; and further allows the remote job node to send a request to the management node through the opened third data transmission port 7111. The address and port of the opened job node are stored on the configuration file, and are written into the job configuration according to the submitted spark or mapreduce job.
(2) The management node 710 may start a service for monitoring jobs and may perform data acquisition on each submitted job; for example, a first data transmission port 7131 is started locally for the monitoring service client 713. Each submitted job starts a monitoring service server at the corresponding job node; in this embodiment, taking job node 720 as an example, the monitoring service server 721 is started at job node 720. The job node learns the first data transmission port 7131 of the management node 710 from the file information copied in step (1), and opens, through the ssh tunnel, a fourth data transmission port 7211 for remote port forwarding. The management node 710 may send a request to the first data transmission port 7131 via the monitoring service client 713, so that the monitoring service server 721 of the job node 720 obtains the request at the fourth data transmission port 7211. For example, the monitoring service client 713 may send the monitoring service server 721 a request to acquire job data at fixed intervals, where the request includes each data tag and the required data amount (quota instruction); the monitoring service server 721 acquires the data with the required tags, in the required amount, from the local message queue 722, performs processing such as ordering and compression, and returns the result. The monitoring service client 713 obtains the data, decompresses it, and stores it in the local file system 7133, and other services can poll the storage engine for their respective service data. In this way, remote job data can be accessed by polling the local file system 7133.
It should be noted that, in an alternative implementation of this embodiment, a proxy service in the distributed cluster may be used to proxy forward a request that needs to access a management node. When a specific running job instance of a hadoop job is scheduled to run on a different machine, if metadata of the management node needs to be accessed, a proxy port (e.g., a third data transmission port 7111 shown in fig. 7) of the master machine can be directly accessed, and the port forwards the request to a gateway of the management node, and the gateway can dispatch the request to different service modules for processing according to the request data.
For a better understanding of embodiments of the present application, FIG. 8 is a schematic diagram of a task work method according to an embodiment of the present application; the method may be performed by the job node 720, and the task job process includes the following steps:
S810, labeling the various collected data, such as job logs, audit information, index data of the job, and job state information.
S820, writing each piece of labeled data into a message queue.
S830, asynchronously persisting the data in the message queue to the local file system.
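Steps S810 to S830 can be sketched as a small producer and consumer pipeline. The use of `queue.Queue`, a background thread, and a JSON-lines file is an assumption chosen for a self-contained illustration; the embodiment does not prescribe a particular queue or file format, and the names `persist_worker` and `run_pipeline` are hypothetical.

```python
import json
import os
import queue
import tempfile
import threading

_SENTINEL = None  # marks the end of the stream for the persistence thread


def persist_worker(q, path):
    """S830: drain the message queue and append each record to a local file."""
    with open(path, "a", encoding="utf-8") as f:
        while True:
            item = q.get()
            if item is _SENTINEL:
                break
            f.write(json.dumps(item) + "\n")


def run_pipeline(raw_items, path):
    q = queue.Queue()
    worker = threading.Thread(target=persist_worker, args=(q, path))
    worker.start()
    for seq, (tag, payload) in enumerate(raw_items, start=1):
        # S810: attach a label (and a serial number); S820: write to the message queue.
        q.put({"tag": tag, "seq": seq, "payload": payload})
    q.put(_SENTINEL)
    worker.join()


path = os.path.join(tempfile.mkdtemp(), "intermediate.jsonl")
run_pipeline([("log", "job started"), ("audit", "user=alice")], path)
```

Because persistence runs on a separate thread, the job itself only pays the cost of a queue write; the file is complete once `run_pipeline` returns.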
It should be noted that, in an alternative implementation manner of this embodiment, the monitoring service client on the management node periodically polls the file data persisted by the remote node; each time, the client sends a fetchMessage http request and receives in response the data produced by the remotely running job. The content of the fetchMessage request specifies the tags whose data are requested, the requested serial number, and the requested quota. The response to the request carries data in the avro format. After obtaining the data, the management node writes it to local hbase storage classified by tag, and each consuming service polls hbase for the data of its tags to obtain the data generated by the remote job.
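The client-side polling described above can be sketched as follows. Only the three request fields (tags, serial number, quota) come from the description; the class names, the rule of advancing the serial number past the last record received, and the `fake_remote` function that stands in for the http call are assumptions. The avro decoding and the hbase write are replaced by plain Python objects and a dictionary.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FetchMessageRequest:
    tags: list        # which tags' data are requested
    seq: int          # serial number to read from
    quota_bytes: int  # requested quota


class PollingClient:
    def __init__(self, fetch, tags, quota_bytes):
        self._fetch = fetch      # callable standing in for the http request
        self._tags = tags
        self._quota = quota_bytes
        self._next_seq = 1
        self.by_tag = defaultdict(list)  # stand-in for storage classified by tag

    def poll_once(self):
        request = FetchMessageRequest(self._tags, self._next_seq, self._quota)
        records = self._fetch(request)
        for rec in records:
            self.by_tag[rec["tag"]].append(rec["payload"])
            # Assumption: the next poll resumes after the highest serial number seen.
            self._next_seq = max(self._next_seq, rec["seq"] + 1)
        return len(records)


def fake_remote(request):
    """Hypothetical remote side: returns records matching the tags and serial number."""
    data = [
        {"tag": "log", "seq": 1, "payload": "job started"},
        {"tag": "metric", "seq": 2, "payload": "cpu=0.42"},
        {"tag": "log", "seq": 3, "payload": "stage 1 done"},
    ]
    return [r for r in data if r["tag"] in request.tags and r["seq"] >= request.seq]


client = PollingClient(fake_remote, ["log"], quota_bytes=1024)
first = client.poll_once()
second = client.poll_once()
```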
Correspondingly, the monitoring service on the job node collects, sorts and returns the data generated by the job to the client. The monitoring service is an http service that is started before the specific job is started, and it receives the fetchMessage requests of the client. As an http service, the monitoring server uniformly processes the requests of the management node: according to the tag, serial number and quota in the request, it reads the data from the file system to which the message queue has been written, compresses the data into the avro format, and returns the data to the management node.
It should be noted that, in this embodiment, the monitoring service defines various states of a remote job, including STARTING, RUNNING, SUSPENDED, COMPLETED, FAILED and KILLED. The user may request that a job be closed while it is running; for example, changing some real-time jobs requires closing the job first. The user does not need to go to the job cluster and operate manually, and only needs to send a request to close the job to the monitoring service. The monitoring client can send a kill http request, and the server closes the job after receiving the request.
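The job states listed above map naturally onto an enumeration. In the sketch below, the state names come from the description, while the set of terminal states and the rule that a kill request leaves an already finished job unchanged are assumptions; `handle_kill` is a hypothetical helper, not part of the described service.

```python
from enum import Enum


class JobState(Enum):
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    SUSPENDED = "SUSPENDED"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    KILLED = "KILLED"


# Assumption: these states are final and cannot be left again.
TERMINAL_STATES = {JobState.COMPLETED, JobState.FAILED, JobState.KILLED}


def handle_kill(state):
    """Return the state after a kill request arrives for a job in the given state."""
    if state in TERMINAL_STATES:
        return state  # nothing left to close
    return JobState.KILLED


after_kill = handle_kill(JobState.RUNNING)
```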
In this embodiment, if the network connection is disconnected after the remote job is started, the local monitoring client cannot obtain a response to its requests. However, the client is not shut down merely because no data is returned; it can determine from the pid of the remote job whether the job has finished, and it proceeds to the next poll if the job has not finished. Therefore, if the network connection is interrupted, the data produced in the meantime can be retrieved once the network returns to normal, and the remote job does not stop the task because of the network interruption.
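The fault-tolerant polling behaviour can be sketched as a loop that ignores connection errors and stops only when the job is known to have finished. Both callables are assumptions: `fetch` stands in for the http request and may raise `ConnectionError`, and `job_alive` stands in for the pid-based liveness check, whose mechanism the description does not detail. The `max_rounds` bound is added only to keep the sketch from looping forever.

```python
def poll_until_finished(fetch, job_alive, max_rounds=100):
    """Keep polling while the job is alive; a network failure skips a round, nothing more."""
    collected = []
    for _ in range(max_rounds):
        try:
            collected.extend(fetch())
        except ConnectionError:
            # Network is down: do not shut the client down, just try again next round.
            pass
        if not job_alive():
            break
    return collected


# Simulated run: the second poll hits a network outage, the third returns the backlog.
_responses = iter([["a"], ConnectionError("network down"), ["b", "c"]])
_alive = iter([True, True, False])


def fake_fetch():
    item = next(_responses)
    if isinstance(item, Exception):
        raise item
    return item


result = poll_until_finished(fake_fetch, lambda: next(_alive))
```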
In this embodiment, when the remote job is normally ended, the end state is sent to the local message queue, so that the management node knows that the job has ended, and therefore informs other data consumers that the job has ended, and sends a request for turning off the monitoring service, and the request also performs the cleaning work of the job.
According to the scheme, the operation monitoring service is provided, the data information of the remote operation is acquired through polling, meanwhile, compression serialization and sequential writing of data are supported, and a convenient interface is provided for the local management node to consume the data. The operation can be monitored according to the operation state, the remote operation does not need to consider how to transmit information back by what method, and only the operation information needs to be written into a local production message queue. The proxy service for enabling the remote operation to access the management node is provided, so that the remote operation can conveniently carry out corresponding business logic processing according to the existing metadata under certain requirements, and the remote operation only needs to call the metadata interface and does not need to care about information such as a machine address port of the management node. The problem of management and acquisition of data under the distributed cluster is solved, the data processing requirement of big data operation can be met, and the use cost can be further reduced.
Fig. 9 is a block diagram of a task work device according to an embodiment of the present application, which can perform the task work method according to any of the embodiments of the present application, and which can be implemented in software and/or hardware. Specifically, referring to fig. 9, the apparatus specifically includes: a data type tag joining module 910, a target data type acquisition module 920, and a target acquisition data acquisition module 930.
The data type tag adding module 910 is configured to add a data type tag to each intermediate data generated according to the current job target task, and store each intermediate data locally;
the target data type obtaining module 920 is configured to obtain a target data type in an intermediate data acquisition instruction in response to receiving the intermediate data acquisition instruction for a target task in a target task operation process;
the target acquisition data acquiring module 930 is configured to match the target data type with a data type tag of each intermediate data of the locally stored target task, acquire target acquisition data, and feed back the target acquisition data to the sender of the intermediate data acquisition instruction.
According to the scheme of the embodiment, the operation node adds a data type label into each piece of intermediate data generated according to the current operation target task through a data type label adding module, and locally stores each piece of intermediate data; the method comprises the steps that a target data type acquisition module responds to an intermediate data acquisition instruction aiming at a target task in the process of target task operation, and the target data type in the intermediate data acquisition instruction is acquired; the target acquisition data is matched with the data type labels of the intermediate data of the target task stored locally through the target acquisition data acquisition module, the target acquisition data is acquired, and the target acquisition data is fed back to the sender of the intermediate data acquisition instruction, so that the problem that the server can acquire the intermediate data only after the task operation is finished by the operation node in the current stage is solved; the method and the device realize that intermediate data are sent to the server in the operation process of the operation node, and provide basis for subsequent simultaneous operation of multiple tasks and improvement of resource utilization rate.
Optionally, the data type tag adding module 910 is specifically configured to perform content analysis on the currently generated target intermediate data, and add a data type tag to the target intermediate data according to the content analysis result; writing target intermediate data into a message queue; and sequentially writing each intermediate data in the message queue into the local storage space.
Optionally, the intermediate data acquisition instruction further includes: a target data sequence number;
the apparatus further comprises: the data serial number adding module is used for adding the data serial number into each intermediate data according to the generation sequence of each intermediate data;
the target data type obtaining module 920 is specifically configured to match a target data type with a data type tag of each intermediate data of the target task stored locally, so as to obtain alternative collected data; and acquiring target acquisition data matched with the target data serial number from the candidate acquisition data.
Optionally, the intermediate data acquisition instruction further includes: a quota instruction;
the apparatus further comprises: and the data compression module is used for carrying out data compression on the target acquisition data according to the quota instruction contained in the intermediate data acquisition instruction.
Optionally, the apparatus further comprises: the result data packet construction module is used for constructing a result data packet according to the execution result of the target task and various intermediate data of the target task stored currently after the operation of the target task is completed; and sending the result data packet to a server.
Optionally, the data type of the intermediate data related in this embodiment includes at least one of the following: log data, audit data, or index data.
The task operation device can execute the task operation method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. Technical details not described in detail in this embodiment may refer to the task operation method provided in any embodiment of the present application.
Fig. 10 is a block diagram of a task query apparatus according to an embodiment of the present application, which may perform the task query method according to any of the embodiments of the present application, and may be implemented in software and/or hardware. Specifically, referring to fig. 10, the apparatus specifically includes: an intermediate data acquisition instruction transmitting module 100 and a target acquisition data receiving module 101.
The intermediate data acquisition instruction sending module 100 is configured to send an intermediate data acquisition instruction for a target task to a first operation node in a process of operating the target task by the first operation node;
the intermediate data acquisition instruction is used for indicating the first operation node to acquire target acquisition data matched with the target data type in the intermediate data acquisition instruction in the intermediate data of the locally stored target task;
The target acquisition data receiving module 101 is configured to receive target acquisition data fed back by the first operation node.
According to the scheme of the embodiment, the server sends an intermediate data acquisition instruction aiming at a target task to a first operation node in the process of operating the target task by the first operation node through an intermediate data acquisition instruction sending module; the target acquisition data receiving module receives the target acquisition data fed back by the first operation node, so that intermediate data generated in the task operation process can be obtained in the operation process of the operation node, and a basis is provided for parallel operation of a plurality of data processing tasks.
Optionally, the apparatus further comprises: the first data transmission port creation module is used for creating a first data transmission port and a second data transmission port; the first data transmission port is used for sending an intermediate data acquisition instruction to the first operation node; the second data transmission port is used for receiving target acquisition data fed back by the first operation node.
Optionally, the apparatus further comprises: the third data transmission port creation module is used for creating a third data transmission port; the third data transmission port is used for acquiring a start instruction or a closing instruction aiming at the first operation node operation target task.
Optionally, the apparatus further comprises: the local storage module is used for locally storing the target acquisition data; the apparatus further comprises: the target query data acquisition module is used for responding to the intermediate data query request aiming at the target task and sent by the second job node, and acquiring target query data matched with the intermediate data query request from each locally stored intermediate data of the target task; and feeding back the target query data to the second operation node.
The task query device can execute the task query method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. Technical details not described in detail in this embodiment may refer to the task operation method provided in any embodiment of the present application.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
Fig. 11 is a block diagram of an electronic device for implementing the task job method or the task query method according to the embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the applications described and/or claimed herein.
As shown in fig. 11, the electronic device includes: one or more processors 1101, memory 1102, and interfaces for connecting the various components, including a high speed interface and a low speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on memory to display graphical information of the GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories and multiple types of memory. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). In fig. 11, one processor 1101 is taken as an example.
Memory 1102 is a non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor to cause the at least one processor to execute the task job method or the task query method provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the task work method or the task query method provided by the present application.
The memory 1102 is used as a non-transitory computer readable storage medium for storing a non-transitory software program, a non-transitory computer executable program, and modules, such as program instructions/modules corresponding to the task job method in the embodiment of the present application (e.g., the data type tag adding module 910, the target data type acquiring module 920, and the target acquisition data acquiring module 930 shown in fig. 9), or program instructions/modules corresponding to the task query method in the embodiment of the present application (e.g., the intermediate data acquisition instruction transmitting module 100 and the target acquisition data receiving module 101 shown in fig. 10). The processor 1101 executes various functional applications of the server and data processing, i.e., implements the task job method or task query method in the above-described method embodiments by running non-transitory software programs, instructions, and modules stored in the memory 1102.
Memory 1102 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created through use of the electronic device for the task job method or the task query method, and the like. In addition, memory 1102 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 1102 optionally includes memory located remotely from processor 1101, and such remote memory may be connected via a network to the electronic device for implementing the task job method or the task query method of the embodiments of the present application. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for implementing the task job method or the task query method according to the embodiments of the present application may further include an input device 1103 and an output device 1104. The processor 1101, memory 1102, input device 1103, and output device 1104 may be connected by a bus or by other means; connection by a bus is taken as an example in fig. 11.
The input device 1103 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for the task job method or the task query method. Examples of the input device include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 1104 may include a display device, auxiliary lighting (e.g., LEDs), and haptic feedback devices (e.g., a vibration motor), among others. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
The technical solution provided by the embodiments of the present application solves the problem that, at the current stage, a server can acquire intermediate data only after a job node has finished running a task. Intermediate data is sent to the server while the job node is still running the task, which provides a basis for subsequently running multiple tasks simultaneously and for improving resource utilization.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
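The effect described above can be sketched in a few lines of Python: a server collects intermediate data of one type from a first job node before the task has finished, stores it, and later serves it to a second job node. `FirstJobNode`, `Server`, the task identifier `"task-1"`, and the direct method calls standing in for network transmission are hypothetical names and simplifications, not part of the application.

```python
from typing import Dict, List, Tuple


class FirstJobNode:
    """Hypothetical job node that keeps tagged intermediate data locally."""

    def __init__(self) -> None:
        self.local_store: List[Tuple[str, str]] = []  # (data type tag, content)

    def run_step(self, step: int) -> None:
        # Each step of the task produces a log entry and an index entry.
        self.local_store.append(("log", f"step {step} done"))
        self.local_store.append(("index", f"step {step} rows={step * 100}"))

    def collect(self, target_data_type: str) -> List[str]:
        # Answer an acquisition instruction while the task is still running.
        return [c for tag, c in self.local_store if tag == target_data_type]


class Server:
    """Hypothetical server that stores collected data and serves queries."""

    def __init__(self) -> None:
        self.store: Dict[Tuple[str, str], List[str]] = {}

    def collect_from(self, node: FirstJobNode, task_id: str, data_type: str) -> None:
        # Send the instruction and keep the fed-back data in local storage.
        self.store[(task_id, data_type)] = node.collect(data_type)

    def query(self, task_id: str, data_type: str) -> List[str]:
        # Serve a second job node from the server's own copy.
        return list(self.store.get((task_id, data_type), []))


first_node = FirstJobNode()
server = Server()

first_node.run_step(1)
first_node.run_step(2)
# The task has not finished, yet the server can already collect index data.
server.collect_from(first_node, "task-1", "index")
first_node.run_step(3)  # the task keeps running after the collection

# A second job node queries the server instead of waiting for task-1 to end.
seen_by_second_node = server.query("task-1", "index")
print(seen_by_second_node)
```

Because the second job node reads from the server's copy, it sees only what had been collected at the time of the last acquisition instruction; step 3 would appear after the next collection.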
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (17)

1. A task job method, applied to a job node, comprising:
adding a data type label into each piece of intermediate data generated according to the current operation target task, and locally storing each piece of intermediate data;
in response to receiving an intermediate data acquisition instruction aiming at a target task in the target task operation process, acquiring a target data type in the intermediate data acquisition instruction;
matching the target data type with a data type label of each intermediate data of the target task stored locally, acquiring target acquisition data, and feeding the target acquisition data back to a sender of the intermediate data acquisition instruction;
wherein the adding of a data type tag into each intermediate data generated according to the current job target task, and the local storing of each intermediate data, comprises:
performing content analysis on the currently generated target intermediate data, and adding a data type tag into the target intermediate data according to a content analysis result;
writing the target intermediate data into a message queue;
and writing each intermediate data in the message queue into a local storage space.
2. The method of claim 1, wherein the intermediate data acquisition instructions further comprise: a target data sequence number;
before locally storing each of the intermediate data, the method further comprises:
adding a data serial number into each intermediate data according to the generation sequence of each intermediate data;
wherein matching the target data type with the data type label of each intermediate data of the target task stored locally to obtain target acquisition data comprises:
matching the target data type with a data type label of each intermediate data of the target task stored locally to obtain alternative acquisition data;
and acquiring target acquisition data matched with the target data serial number from each piece of alternative acquisition data.
3. The method of claim 1, wherein the intermediate data acquisition instructions further comprise: a quota instruction;
before feeding back the target acquisition data to the sender of the intermediate data acquisition instruction, the method further comprises:
and carrying out data compression on the target acquisition data according to the quota instruction contained in the intermediate data acquisition instruction.
4. The method of claim 1, wherein after feeding back the target acquisition data to the sender of the intermediate data acquisition instructions, further comprising:
after the operation of the target task is completed, constructing a result data packet according to the execution result of the target task and various currently stored intermediate data of the target task;
and sending the result data packet to a server.
5. The method of any of claims 1-4, wherein the data type of the intermediate data comprises at least one of:
log data, audit data, or index data.
6. A task query method, comprising:
in the process of operating a target task by a first operation node, sending an intermediate data acquisition instruction aiming at the target task to the first operation node;
the intermediate data acquisition instruction is used for indicating the first operation node to acquire target acquisition data matched with the target data type in the intermediate data acquisition instruction in the locally stored intermediate data of the target task;
receiving the target acquisition data fed back by the first operation node;
wherein the method is performed by a server;
after receiving the target acquisition data fed back by the first operation node, the method further comprises the following steps:
locally storing the target acquisition data;
the method further comprises the steps of:
responding to an intermediate data query request for the target task sent by a second job node, and acquiring target query data matched with the intermediate data query request from each locally stored intermediate data of the target task;
and feeding the target query data back to the second job node.
7. The method of claim 6, wherein, during the operation of the target task at the first operation node, before sending the intermediate data acquisition instruction for the target task to the first operation node, further comprising:
creating a first data transmission port and a second data transmission port;
the first data transmission port is used for sending an intermediate data acquisition instruction to the first operation node; the second data transmission port is used for receiving target acquisition data fed back by the first operation node.
8. The method of claim 6, wherein, during the operation of the target task at the first operation node, before sending the intermediate data acquisition instruction for the target task to the first operation node, further comprising:
creating a third data transmission port;
the third data transmission port is used for acquiring a start instruction or a close instruction for the target task operated by the first operation node.
9. A task work device for a work node, comprising:
the data type label adding module is used for adding a data type label into each piece of intermediate data generated according to the current operation target task and locally storing each piece of intermediate data;
the target data type acquisition module is used for responding to the intermediate data acquisition instruction aiming at the target task in the target task operation process and acquiring the target data type in the intermediate data acquisition instruction;
the target acquisition data acquisition module is used for matching the target data type with the data type labels of the intermediate data of the target task stored locally, acquiring target acquisition data and feeding the target acquisition data back to a sender of the intermediate data acquisition instruction;
wherein the data type tag adding module is specifically configured to:
perform content analysis on the currently generated target intermediate data, and add a data type tag into the target intermediate data according to a content analysis result;
write the target intermediate data into a message queue;
and write each intermediate data in the message queue into a local storage space.
10. The apparatus of claim 9, wherein the intermediate data acquisition instructions further comprise: a target data sequence number;
the apparatus further comprises:
the data serial number adding module is used for adding a data serial number into each intermediate data according to the generation sequence of each intermediate data;
the target data type acquisition module is specifically used for matching a target data type with a data type label of each intermediate data of the target task stored locally to acquire alternative acquisition data;
and acquiring target acquisition data matched with the target data serial number from each piece of alternative acquisition data.
11. The apparatus of claim 9, wherein the intermediate data acquisition instructions further comprise: a quota instruction;
the apparatus further comprises:
and the data compression module is used for carrying out data compression on the target acquisition data according to the quota instruction contained in the intermediate data acquisition instruction.
12. The apparatus of claim 9, wherein the apparatus further comprises:
the result data packet construction module is used for constructing a result data packet according to the execution result of the target task and various currently stored intermediate data of the target task after the operation of the target task is completed;
and sending the result data packet to a server.
13. A task query device, comprising:
the intermediate data acquisition instruction sending module is used for sending an intermediate data acquisition instruction aiming at a target task to a first operation node in the process of operating the target task by the first operation node;
the intermediate data acquisition instruction is used for indicating the first operation node to acquire target acquisition data matched with the target data type in the intermediate data acquisition instruction in the locally stored intermediate data of the target task;
the target acquisition data receiving module is used for receiving the target acquisition data fed back by the first operation node;
the apparatus further comprises:
the local storage module is used for locally storing the target acquisition data;
the apparatus further comprises:
the target query data acquisition module is used for responding to the intermediate data query request aiming at the target task sent by the second job node and acquiring target query data matched with the intermediate data query request from the locally stored intermediate data of the target task;
and feeding the target query data back to the second job node.
14. The apparatus of claim 13, wherein the apparatus further comprises:
the first data transmission port creation module is used for creating a first data transmission port and a second data transmission port;
the first data transmission port is used for sending an intermediate data acquisition instruction to the first operation node; the second data transmission port is used for receiving target acquisition data fed back by the first operation node.
15. The apparatus of claim 13, wherein the apparatus further comprises:
the third data transmission port creation module is used for creating a third data transmission port;
the third data transmission port is used for acquiring a start instruction or a close instruction for the target task operated by the first operation node.
16. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the task job method of any one of claims 1-5 or the task query method of any one of claims 6-8.
17. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the task job method of any one of claims 1-5 or the task query method of any one of claims 6-8.
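The refinements in claims 2 and 3 can be sketched as follows: a serial number is attached in generation order before storage, candidates are first selected by data type and then by serial number, and the result is compressed according to a quota. The claims do not define how a serial number "matches" or what the quota measures, so this sketch assumes exact equality on the serial number and treats the quota as a byte budget above which `zlib` compression is applied; the function names `store` and `acquire` and the JSON payload format are likewise hypothetical.

```python
import itertools
import json
import zlib
from typing import Dict, List, Optional, Tuple

_serial = itertools.count(1)
local_store: List[Dict] = []


def store(data_type: str, content: str) -> None:
    # Claim 2: attach a serial number in generation order before storing.
    local_store.append({"seq": next(_serial), "type": data_type, "content": content})


def acquire(
    target_type: str, target_seq: int, quota_bytes: Optional[int] = None
) -> Tuple[bytes, bool]:
    # First match on the data type tag to get the candidate data ...
    candidates = [d for d in local_store if d["type"] == target_type]
    # ... then keep candidates whose serial number equals the target
    # (assumption: "matched" means exact equality).
    matched = [d for d in candidates if d["seq"] == target_seq]
    payload = json.dumps(matched).encode("utf-8")
    # Claim 3, assuming the quota is a byte budget: compress only when the
    # raw payload would exceed it, and tell the receiver whether we did.
    if quota_bytes is not None and len(payload) > quota_bytes:
        return zlib.compress(payload), True
    return payload, False


store("log", "task started")
store("audit", "x" * 500)
store("log", "task half done")

raw, raw_compressed = acquire("log", 3)
packed, packed_compressed = acquire("audit", 2, quota_bytes=64)
print(raw_compressed, packed_compressed, len(packed))
```

Returning a flag alongside the payload is a design choice of this sketch so that the sender of the instruction knows whether to decompress; the claims leave the feedback format open.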

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010997697.XA CN112099933B (en) 2020-09-21 2020-09-21 Task operation and query method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112099933A CN112099933A (en) 2020-12-18
CN112099933B true CN112099933B (en) 2023-11-07

Family

ID=73756484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010997697.XA Active CN112099933B (en) 2020-09-21 2020-09-21 Task operation and query method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112099933B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112948860A (en) * 2021-03-05 2021-06-11 华控清交信息科技(北京)有限公司 Data processing method, related node and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106843842A (en) * 2016-12-23 2017-06-13 光锐恒宇(北京)科技有限公司 The update method and device of a kind of application profiles
CN109783445A (en) * 2018-12-27 2019-05-21 北京海数宝科技有限公司 Date storage method, device, computer equipment and storage medium
CN110427779A (en) * 2019-08-13 2019-11-08 威富通科技有限公司 A kind of the Encrypt and Decrypt method and data server of database table field
CN110750563A (en) * 2018-07-20 2020-02-04 北京京东尚科信息技术有限公司 Multi-model data processing method, system, device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7395536B2 (en) * 2002-11-14 2008-07-01 Sun Microsystems, Inc. System and method for submitting and performing computational tasks in a distributed heterogeneous networked environment
US10754872B2 (en) * 2016-12-28 2020-08-25 Palantir Technologies Inc. Automatically executing tasks and configuring access control lists in a data transformation system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on an operator data management platform based on big data (基于大数据的运营商数据管理平台研究); 顾志峰; 电信快报 (No. 05); full text *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant