CN115576677A - Task flow scheduling management system and method for rapidly processing batch remote sensing data - Google Patents

Info

Publication number
CN115576677A
Authority
CN
China
Prior art keywords
task
information
processing
file
flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211568866.3A
Other languages
Chinese (zh)
Inventor
张灏
李宏益
张正
胡昌苗
唐娉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS
Priority to CN202211568866.3A
Publication of CN115576677A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data processing and provides a task flow scheduling management system and method for rapidly processing batch remote sensing data. A flow configuration module manages flow template files and provides an interface that matches the corresponding flow template according to data source information. A data leading module monitors a specified folder for newly added files, brings the image files into the system, parses the file information to create a processing task, calls the flow configuration module to obtain flow template information, and sends the task information to a message queue. A task processing module takes a piece of task information from the message queue, creates specific flow information according to the template information, monitors the task execution state, and sends the information of completed tasks to the message queue. A data filing module takes the information of completed tasks from the message queue and files, stores, and manages the output files and their meta information. The invention simplifies the orchestration logic of the processing task flow, improves the extensibility of the processing modules, and reduces the coupling between modules.

Description

Task flow scheduling management system and method for rapidly processing batch remote sensing data
Technical Field
The invention relates to the technical field of data processing, in particular to a task flow scheduling management system and method for rapidly processing batch remote sensing data, and more specifically to a system and method for rapid multi-node task distribution and processing, with corresponding flow scheduling management, for large batches of remote sensing data.
Background
With the rapid development of remote sensing and satellite technologies, the precision of remote sensing images continues to improve, the resolution and return rate of image data are gradually increasing, and the storage space required by a single-scene image is growing accordingly. At the same time, many industries and application scenarios place high timeliness requirements on remote sensing image processing results, and thematic products usually depend on batches of data. It is therefore desirable to increase the processing speed of every link in the complete data processing cycle so that data products become available quickly. The key research problem is how to process batches of remote sensing images quickly and efficiently, make full use of computer resources, and avoid the situation in which a single slow link blocks the flow and lengthens the processing time of the whole flow.
Currently, methods for increasing processing speed and efficiency mainly fall into the following categories. The first category combines multiple nodes with clusters and takes two forms. One form splits a single-scene image into blocks, computes the blocks separately, and recombines the results; it is limited to application scenarios in which the specific algorithm allows block-wise computation, and the blocking rules and the applicable algorithms depend on prior knowledge, so the approach falls short in system compatibility and extensibility. The other form is a multi-node task scheduling management method that assigns the steps of a processing task to different nodes, so that different steps are executed on differently configured computing nodes and computing resources can be fully allocated to each step. In remote sensing image processing, however, the image files are large; if network shared storage is configured so that files can be handled much like local files, the hard disk read/write speed becomes the bottleneck of processing speed. The second category realizes out-of-order polling by means of flow customization and a keyword identification library.
Disclosure of Invention
The invention provides a task flow scheduling management system and method for rapidly processing batch remote sensing data, which overcome the low efficiency of remote sensing image data processing in the prior art. The invention brings data into the system and creates task information, quickly matches a flow template and constructs a processing flow, lets processing nodes actively acquire tasks for rapid processing, and finally files and manages the task output results.
The invention provides a task flow scheduling management system for rapidly processing batch remote sensing data, which comprises:
the flow configuration module is used for creating and managing flow template files, wherein a flow template file comprises the information and dependency relationships of the processing steps in the flow, and for providing an interface that matches the corresponding flow template according to data source information;
the data leading module is used for monitoring a specified folder for newly added files so as to bring the image files into the system, parsing the file information of each image file and creating a processing task, calling the flow configuration module to obtain flow template information, and sending the task information to the message queue;
the task processing module is used for taking a piece of task information from the message queue, creating specific flow information according to the flow template information, distributing processing tasks to the task execution submodule, monitoring the task execution state, and sending the information of completed tasks to the message queue;
and the data filing module is used for taking the information of completed tasks from the message queue, and filing, storing and managing the output files and their meta information.
According to the task flow scheduling management system for rapidly processing the batch remote sensing data, the data leading module comprises a data leading submodule and a task creating submodule, wherein:
the data leading submodule is used for monitoring the specified folder, identifying newly added image files in the folder, and reading the information of each newly added image file, including time information, spatial information, data source and basic file information, and for recording and storing the information in a relational database;
and the task creating submodule is used for automatically creating a processing task, matching a flow template through the interface provided by the flow configuration module, persisting the task information to a database, and sending the task information to the message queue.
According to the task flow scheduling management system for rapidly processing the batch remote sensing data, the task processing module comprises a task obtaining submodule, an execution submodule and a management submodule, wherein:
the task acquisition submodule is used for actively acquiring a piece of task information from the message queue and removing it from the message queue at the same time, so that other task processing nodes cannot acquire the same information and the node where the processing module is located executes all processing steps in the task;
the execution submodule is used for calling the corresponding algorithm to complete the processing of the current step;
and the management submodule is used for monitoring the state of each executing step, wherein a step state, like the task state, is one of four states: waiting for processing, processing, processing complete and processing failed; a task whose state is processing complete meets the filing condition, and the information of the files to be filed is sent to the message queue.
According to the task flow scheduling management system for rapidly processing the batch remote sensing data, which is provided by the invention, the data filing module comprises a file filing submodule and a file management submodule, wherein:
the file filing submodule is used for storing the processed output file;
and the file management submodule is used for managing the meta information corresponding to the file.
The invention also provides a task flow scheduling management method for rapidly processing the batch remote sensing data, which is based on any one of the task flow scheduling management systems for rapidly processing the batch remote sensing data to realize task flow scheduling management for rapidly processing the remote sensing data.
The invention further provides electronic equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the program, the task flow scheduling management method for rapidly processing the batch remote sensing data is realized.
The invention also provides a computer readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the task flow scheduling management method for batch remote sensing data fast processing is realized.
The invention provides a task flow scheduling management system and method for rapidly processing batch remote sensing data. The flow configuration module creates and manages flow template files, each of which comprises the information and dependency relationships of the processing steps in the flow, and provides an interface that matches the corresponding flow template according to data source information. The data leading module monitors a specified folder for newly added files, brings the image files into the system, parses the file information to create a processing task, calls the flow configuration module to obtain flow template information, and sends the task information to the message queue. The task processing module takes a piece of task information from the message queue, creates specific flow information according to the template information, distributes processing tasks to the task execution submodule, monitors the task execution state, and sends the information of completed tasks to the message queue. The data filing module takes the information of completed tasks from the message queue and files, stores, and manages the output files and their meta information. The invention simplifies the orchestration logic of the processing task flow, improves the extensibility of the processing modules, and reduces the coupling among the modules.
Compared with the prior art, the invention has the following advantages. First, every module is deployed with containerization technology, which reduces the requirements on the server system environment and architecture, allows services to be deployed quickly to take in data, and ensures that processing modules behave consistently in a multi-node environment. Second, extensibility is strong: adding and shutting down processing nodes is simplified, multiple processing modules can be configured on multiple nodes in the system, and adding or shutting down a processing module does not affect other functional modules or other processing modules, which decouples the modules and improves extensibility and ease of management. Third, unordered flow orchestration is realized: only the dependent steps are configured and the processing order of the steps in the flow is not arranged, so steps with no dependency relationship can run concurrently within a single processing task, making full use of computer resources and reducing the total time consumed by the processing flow.
Drawings
In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is the first overall framework diagram of the task flow scheduling system for rapidly processing batch remote sensing data provided by the present invention;
FIG. 2 is the second overall framework diagram of the task flow scheduling system for rapidly processing batch remote sensing data provided by the present invention;
FIG. 3 is an overall flowchart of a task flow scheduling system for fast processing of batch remote sensing data provided by the present invention;
FIG. 4 is a flow chart of a task execution sub-module provided by the present invention;
FIG. 5 is a schematic diagram of an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention realizes the distribution and scheduling of remote sensing data processing tasks through four stages: bringing in external data, creating and distributing tasks, acquiring and executing tasks, and filing data. As shown in FIGS. 1-3, the system includes a flow configuration module, a data leading module, a task processing module, and a data filing module.
Referring to fig. 1, the task flow scheduling management system for rapidly processing batch remote sensing data provided by the invention comprises:
the flow configuration module is used for creating and managing a flow template file, wherein the flow template file comprises information and dependency relationship of processing steps in the flow and provides an interface matched with the corresponding flow template according to the data source information;
The template information of the template file includes the application range of the flow template, the application range of the flow, and the step information, described in detail as follows:
Application range of the flow template: defines the scenarios and output files to which the template information applies. Templates may be divided according to business requirements (computing a particular product, such as scene classification); according to data type, to produce a corresponding standard data product (L3-level data, monthly computation of an index, and the like); or according to satellite type, with specified processing for the Gaofen (high-resolution) series, the Ziyuan (resource) series, night-light data, and the like. By default no type division is made, and the template performs general processing applicable to all remote sensing data, such as pyramid construction. Executing a complete and complex flow template can generate multiple intermediate files and multiple output results.
Application range of the flow: defines the scenarios to which the configuration information applies. The configuration information is divided according to business scenario requirements; it may be divided according to data type (hyperspectral data, SAR data, multispectral data, and the like) or according to satellite type, such as the Gaofen series, the Ziyuan series, night-light data, and the like. By default no type division is made and the configuration applies to all satellites. The information corresponding to the application range can be supplied when data is brought into the system.
The step information includes step type, dependent steps required for executing the current step, input source and output form, as follows:
Step type: the step type is unique and indicates the content of the step and the corresponding algorithm, for example quality evaluation, geometric correction, or reference image matching.
Dependent steps required to perform the current step: some steps in the flow depend on others, and the precondition for executing the current step is that all of its dependent steps are complete. For example, precise geometric correction needs a reference image and control point data; image cropping needs the geospatial information of the image and the extent of the region of interest; change detection needs the geometric registration of two images from different time phases to be complete; and the computation and synthesis of monthly data can be executed only after certain processing of multiple data items is complete.
Input source: typically the input file of the current step is the output file of a dependent step; in some cases it is the output parameters produced by a dependent step.
Output form: specifies the form of the output result of the current step, which may be one or more output files or a parameter result.
It should be noted that after the system is initialized, a flow template must be created to support the operation of the other modules. A general-purpose template or a targeted business template is created according to actual needs.
The flow configuration module is responsible for configuring and managing flow template information. Templates are created and stored as local XML files, JSON files, a relational database, or the like, and the module provides a template matching interface that, once data has been brought into the system, reads the template matching the data information when the flow is orchestrated.
The process template includes a plurality of steps to be performed to complete the final product, each step further having a plurality of necessary information, including: step type, dependent step, input source, output form and the like.
The process configuration module also provides an interface for acquiring the process template file according to the data information in the data leading module.
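The template structure and matching interface described above can be illustrated with a minimal sketch. It assumes templates are stored as JSON (one of the storage forms the text mentions); the field names (`scope`, `steps`, `depends_on`), the sample templates, and the "most specific scope wins" matching rule are hypothetical choices made for illustration and are not specified by the source.

```python
import json

# Hypothetical template store; all field names and values are illustrative.
TEMPLATES_JSON = """
[
  {"name": "generic",
   "scope": {"satellite": null, "data_type": null},
   "steps": [
     {"type": "pyramid_construction", "depends_on": [],
      "input": "source_image", "output": "file"}
   ]},
  {"name": "multispectral_standard",
   "scope": {"satellite": null, "data_type": "multispectral"},
   "steps": [
     {"type": "quality_evaluation", "depends_on": [],
      "input": "source_image", "output": "parameters"},
     {"type": "reference_image_matching", "depends_on": [],
      "input": "source_image", "output": "file"},
     {"type": "geometric_correction",
      "depends_on": ["reference_image_matching"],
      "input": "reference_image_matching", "output": "file"}
   ]}
]
"""


def load_templates(text):
    """Parse templates and check that step types are unique and dependencies exist."""
    templates = json.loads(text)
    for template in templates:
        types = [step["type"] for step in template["steps"]]
        if len(types) != len(set(types)):
            raise ValueError(f"duplicate step type in {template['name']}")
        for step in template["steps"]:
            for dep in step["depends_on"]:
                if dep not in types:
                    raise ValueError(f"unknown dependency {dep!r}")
    return templates


def match_template(templates, data_info):
    """Return the matching template with the most specific scope.

    A scope field of None acts as a wildcard (the default, undivided case).
    """
    best, best_score = None, -1
    for template in templates:
        score, matches = 0, True
        for key, wanted in template["scope"].items():
            if wanted is None:
                continue
            if data_info.get(key) != wanted:
                matches = False
                break
            score += 1
        if matches and score > best_score:
            best, best_score = template, score
    return best
```

With these sample templates, multispectral data is matched to the targeted template, while any other data falls back to the general-purpose one.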
The data leading module is used for monitoring a specified folder for newly added files so as to bring the image files into the system, parsing the file information of each image file and creating a processing task, and meanwhile calling the flow configuration module to obtain flow template information and sending the task information to the message queue;
The data leading module is responsible for bringing external data into the system and creating the task and flow information.
First, the data leading module monitors the folder specified in the system configuration information and reads the information of each newly added image file, including time information, spatial information, data source, and basic file information (format, size, creation time, modification time, and the like).
Second, a corresponding processing task is created according to the image file information, the task state is initialized to waiting for processing, and a random code is generated as the unique task number. The interface provided by the flow configuration module is then called to match the corresponding flow template file and generate the flow template information.
Finally, the flow template information is merged into the task information, the task information is persisted to a database, and the task information is sent to the message queue.
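The folder-monitoring step can be sketched as a polling routine that compares the folder contents against the set of files already seen. This is a minimal illustration only: the function names are hypothetical, and a real deployment would also need to confirm that a file has finished being written before processing it, which the source does not discuss.

```python
import os


def detect_new_files(folder, seen):
    """Return the names of files added since the last call and remember them.

    `seen` is a set owned by the caller; it is updated in place.
    """
    current = {entry.name for entry in os.scandir(folder) if entry.is_file()}
    new_files = sorted(current - seen)
    seen.update(current)
    return new_files


def basic_file_info(path):
    """Collect basic file information (format, size, modification time).

    Creation time is omitted here because it is not portable across platforms.
    """
    stat = os.stat(path)
    return {
        "name": os.path.basename(path),
        "format": os.path.splitext(path)[1].lstrip(".").lower(),
        "size": stat.st_size,
        "modified": stat.st_mtime,
    }
```

Calling `detect_new_files` on a timer yields each file exactly once, which is the property the task creation step relies on.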
The task processing module is used for taking a piece of task information from the message queue, creating specific flow information according to the flow template information, distributing processing tasks to the task execution submodule, monitoring the task execution state, and sending the information of completed tasks to the message queue;
The task processing module is deployed with container technology and starts automatically. It actively acquires tasks and executes them while monitoring and recording the processing status of each step in the task; its functions comprise task acquisition, task execution, and task monitoring.
The data filing module is used for taking the information of completed tasks from the message queue, and filing, storing and managing the output files and their meta information.
The data filing module stores the processed output files and stores and manages the meta information corresponding to the files. It acquires the output file information from the message queue, copies or moves the files to the address configured for the filing module, and parses the corresponding meta information and stores it in the database.
The system is deployed and run in a multi-node containerized manner. The coupling among the four modules is weak, so they can be deployed on the same terminal or on several terminals. Modules of the same type can also be deployed and run on multiple nodes, and low coupling is achieved because the modules communicate only through interfaces and message queues. When several task processing modules exist in the system, each processing module takes a piece of task information from the message queue whenever it has spare computing resources; the message queue guarantees that each piece of pending task information is consumed by only one processing module, which prevents the same file from being processed repeatedly. Cross-node file reading can be realized either by downloading through an interface as a file stream or by configuring a network shared folder so that files can be read and written much like local files.
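The property that each task is consumed by only one processing module can be demonstrated with a small sketch. Here Python's in-process `queue.Queue` and threads stand in for the message queue and the processing nodes; this substitution, and the function name `run_workers`, are assumptions for illustration, since the source does not name a specific message queue product.

```python
import queue
import threading


def run_workers(task_ids, n_workers=3):
    """Let several workers drain one shared queue; return who processed what."""
    tasks = queue.Queue()
    for task_id in task_ids:
        tasks.put(task_id)

    processed = {}  # task_id -> list of worker names that handled it
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                # get_nowait removes the item atomically, so no other
                # worker can receive the same task.
                task_id = tasks.get_nowait()
            except queue.Empty:
                return
            with lock:
                processed.setdefault(task_id, []).append(name)

    threads = [
        threading.Thread(target=worker, args=(f"node-{i}",))
        for i in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed
```

Every task appears once in the result with exactly one worker recorded, regardless of how the threads interleave.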
Compared with the prior art, the task flow scheduling management system and method for rapidly processing batch remote sensing data provided by the invention have the following advantages. First, every module is deployed with containerization technology, which reduces the requirements on the server system environment and architecture, allows services to be deployed quickly to take in data, and ensures that processing modules behave consistently in a multi-node environment. Second, extensibility is strong: adding and shutting down processing nodes is simplified, multiple processing modules can be configured on multiple nodes in the system, and adding or shutting down a processing module does not affect other functional modules or other processing modules, which decouples the modules and improves extensibility and ease of management. Third, unordered flow orchestration is realized: only the dependent steps are configured and the processing order of the steps in the flow is not arranged, so steps with no dependency relationship can run concurrently within a single processing task, making full use of computer resources and reducing the total time consumed by the processing flow.
In one embodiment, the data leading module comprises a data leading submodule and a task creating submodule, wherein:
the data leading submodule is used for monitoring the specified folder, identifying newly added image files in the folder, and reading the information of each newly added image file, including time information, spatial information, data source and basic file information, and for recording and storing the information in a relational database;
and the task creating submodule is used for automatically creating a processing task, matching a flow template through the interface provided by the flow configuration module, persisting the task information to a database, and sending the task information to the message queue.
First, the folder specified in the system configuration information is monitored, and the information of each newly added image file is read, including time information, spatial information, data source, and basic file information (format, size, creation time, modification time, and the like). Second, a corresponding processing task is created according to the image file information, the task state is initialized to waiting for processing, and a random code is generated as the unique task number. The interface provided by the flow configuration module is then called to match the corresponding flow template file and generate the flow template information. Finally, the flow template information is merged into the task information, the task information is persisted to a database, and the task information is sent to the message queue.
The data leading submodule brings external data into the system. It monitors the folder specified in the system configuration information by polling, identifies newly added image files in the folder, reads the information of each newly added image file, including time information, spatial information, data source and basic file information, and records and stores the information in a relational database.
After a file has been acquired, the task creating submodule automatically creates a processing task, matches a flow template through the interface provided by the flow configuration module, persists the task information to a database, and at the same time sends the task information to the message queue. A processing task contains several items of necessary information, including the task state information, the unique task identification code, and the flow template information.
Task state information: task execution has several basic states, namely waiting for processing, processing complete, and processing failed. Unique task identification code: a unique identification code used to query and retrieve the task. Flow template information: the flow template information acquired through the interface provided by the flow configuration module.
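A task record carrying these three items can be sketched as a small data class. The class and field names are hypothetical. The sketch uses `uuid4` as one possible "random code", and it includes an explicit in-progress state, which is an assumption inferred from the description of four states elsewhere in the text.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from enum import Enum


class TaskState(str, Enum):
    WAITING = "waiting"        # initial state after creation
    PROCESSING = "processing"  # assumed in-progress state
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Task:
    file_info: dict   # parsed image file information
    template: dict    # flow template information from the matching interface
    state: TaskState = TaskState.WAITING
    # Random code used as the unique task identifier.
    task_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_message(self):
        """Serialize the task for persistence or for the message queue."""
        payload = asdict(self)
        payload["state"] = self.state.value
        return json.dumps(payload)
```

The same serialized form can be written to the database and published to the queue, so both views of the task stay consistent.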
In one embodiment, the task processing module includes a task obtaining submodule, a task execution submodule, and a task management submodule, wherein:
the task obtaining submodule is used for actively acquiring a piece of task information from the message queue and removing it from the message queue at the same time, so that other task processing nodes cannot acquire the same information and the node where the processing module is located executes all processing steps in the task;
the task execution submodule is used for calling the corresponding algorithm to complete the processing of the current step;
and the task management submodule is used for monitoring the state of each executing step, wherein a step state, like the task state, is one of four states: waiting for processing, processing, processing complete and processing failed; a task whose state is processing complete meets the filing condition, and the information of the files to be filed is sent to the message queue.
The task processing module comprises a task obtaining submodule, a task executing submodule and a task managing submodule.
The task obtaining submodule actively acquires a piece of task information from the message queue and removes it from the message queue at the same time, so that other task processing nodes cannot acquire the same information and the node where the processing module is located executes all processing steps in the task. The submodule first checks the computing resources available on the computing node; when the available resources are sufficient to process one piece of data, it acquires a piece of task information from the message queue and starts processing the task. The information of each step in the acquired task is then parsed and added to the task execution submodule.
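The resource check that precedes task acquisition can be sketched as follows. The source does not say how available computing resources are measured, so this sketch assumes a simple fixed number of concurrent task slots per node; the class and method names are hypothetical.

```python
import queue


class ProcessingNode:
    """A node that only pulls a task when it has a free slot."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed measure of available resources
        self.active = []

    def try_acquire(self, tasks):
        """Take one task from the queue if resources allow, else return None."""
        if len(self.active) >= self.capacity:
            return None  # no spare resources: leave the task for other nodes
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return None
        self.active.append(task)
        return task

    def finish(self, task):
        """Release the slot held by a finished task."""
        self.active.remove(task)
```

Because a busy node never dequeues, pending tasks remain in the queue where an idle node can pick them up.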
The task execution submodule uses a queue or stack data structure to call the corresponding algorithm and complete the processing of the current step. The task execution submodule integrates the processing algorithms of all processing steps; each processing algorithm has its own task execution submodule, which executes the algorithm contained in the step and returns the execution state of that algorithm. When the system starts, all task execution submodules are started and wait for newly added processing steps.
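One way to picture the execution submodule is a registry that maps each step name to its algorithm, together with a queue of pending steps. The sketch below is illustrative only: the registry `ALGORITHMS`, the decorator `register`, the function `execute_step` and the two example algorithms are hypothetical names introduced here.

```python
from collections import deque
from typing import Callable, Dict

# Registry mapping a step name to the algorithm that implements it.
ALGORITHMS: Dict[str, Callable[[dict], None]] = {}


def register(step_name: str):
    """Decorator that adds an algorithm to the registry under step_name."""
    def decorator(func):
        ALGORITHMS[step_name] = func
        return func
    return decorator


@register("read_metadata")
def read_metadata(context: dict) -> None:
    context["metadata"] = {"source": context["image"]}


@register("always_fails")
def always_fails(context: dict) -> None:
    raise RuntimeError("simulated algorithm failure")


def execute_step(step_name: str, context: dict) -> str:
    """Run the algorithm for one step and return its execution state."""
    try:
        ALGORITHMS[step_name](context)
    except Exception:
        return "processing failed"
    return "processing completed"


ctx = {"image": "scene_001.tif"}
pending = deque(["read_metadata", "always_fails"])
results = {}
while pending:
    name = pending.popleft()  # FIFO order; pending.pop() would give stack order
    results[name] = execute_step(name, ctx)
```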
As shown in Fig. 4, when a step to be processed enters the step processing module, the module determines whether the step has dependent steps, checks the processing states of those steps, and determines whether all steps on which the current step depends are finished. Once all of them are finished, the execution precondition is met and execution of the current step begins.
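The precondition check of Fig. 4 can be sketched in a few lines. The function name `ready_to_run`, the short state strings and the dependency edges in the example are assumptions chosen for illustration; only the rule itself (a step may start once every step it depends on is finished) comes from the description.

```python
def ready_to_run(step: str, depends_on: dict, states: dict) -> bool:
    """Return True when every step that `step` depends on has completed.

    A step with no recorded dependencies is always ready.
    """
    return all(states.get(dep) == "completed" for dep in depends_on.get(step, []))


# Hypothetical dependency relationships between processing steps.
depends_on = {
    "roi_crop": ["read_metadata"],
    "geometric_correction": ["download_reference", "roi_crop"],
}
states = {
    "read_metadata": "completed",
    "download_reference": "completed",
    "roi_crop": "waiting",
}

can_crop = ready_to_run("roi_crop", depends_on, states)
can_correct = ready_to_run("geometric_correction", depends_on, states)
no_deps = ready_to_run("read_metadata", depends_on, states)
```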
The task management submodule monitors the states of the executed steps by timed polling: it periodically checks all task execution submodules and records the real-time state of every processing step. The step states are similar to the task states and comprise four states, including waiting for processing, processing completed and processing failed. When all steps in the task information have reached the processing-completed state, the task meets the archiving condition, the task state is updated to completed, and the information of the files to be archived is sent to the message queue so that the data archiving module can archive and manage them.
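A single polling pass of the management submodule might look like the sketch below. It is a simplified illustration: tasks are plain dictionaries, an in-process `queue.Queue` stands in for the message queue, and the names `summarize` and `poll_once` are hypothetical. In a deployment the pass would be repeated at a fixed interval, for example from a loop that sleeps between passes.

```python
import queue


def summarize(step_states: dict) -> str:
    """Derive a task state from the states of its steps."""
    values = list(step_states.values())
    if any(v == "failed" for v in values):
        return "failed"
    if values and all(v == "completed" for v in values):
        return "completed"
    return "waiting"


def poll_once(tasks: dict, archive_queue: queue.Queue) -> None:
    """One polling pass: update task states and queue finished outputs."""
    for task in tasks.values():
        if task["state"] == "completed":
            continue  # already handed over for archiving
        task["state"] = summarize(task["steps"])
        if task["state"] == "completed":
            archive_queue.put({"task_id": task["task_id"], "files": task["outputs"]})


tasks = {
    "t1": {"task_id": "t1", "state": "waiting", "outputs": ["t1_out.tif"],
           "steps": {"crop": "completed", "correct": "completed"}},
    "t2": {"task_id": "t2", "state": "waiting", "outputs": ["t2_out.tif"],
           "steps": {"crop": "completed", "correct": "waiting"}},
}
archive_queue = queue.Queue()
poll_once(tasks, archive_queue)
poll_once(tasks, archive_queue)  # a second pass does not re-send task t1
```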
In one embodiment, the data archiving module includes a file archiving submodule and a file management submodule, wherein:
the file archiving submodule is used for storing the processed output file;
and the file management submodule is used for managing the meta information corresponding to the file.
The file archiving submodule stores the processed output file, and the file management submodule stores and manages the meta information corresponding to the file. The information of the output file is obtained from the message queue, the file is copied or moved to the address configured for the archiving module, and the corresponding meta information is parsed and stored in the database.
Through the file archiving submodule and the file management submodule, the data archiving module stores the processed output files and stores and manages their meta information; it acquires the information of the data to be archived from the message queue and thereby implements the archiving and file management functions.
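The archiving step can be illustrated with the Python standard library. This is a sketch under stated assumptions: SQLite stands in for the database, the table `file_meta` and its columns are hypothetical, and the meta information is reduced to file name, path and size.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def archive_file(src: Path, archive_dir: Path, db: sqlite3.Connection) -> Path:
    """Copy `src` into `archive_dir` and record basic meta information."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / src.name
    shutil.copy2(src, dest)  # shutil.move would relocate instead of copying
    db.execute(
        "INSERT INTO file_meta (name, path, size_bytes) VALUES (?, ?, ?)",
        (dest.name, str(dest), dest.stat().st_size),
    )
    db.commit()
    return dest


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE file_meta (name TEXT, path TEXT, size_bytes INTEGER)")

with tempfile.TemporaryDirectory() as tmp:
    output = Path(tmp) / "result.tif"
    output.write_bytes(b"\x00" * 16)  # stand-in for a processed output file
    archived = archive_file(output, Path(tmp) / "archive", db)
    archived_exists = archived.exists()

row = db.execute("SELECT name, size_bytes FROM file_meta").fetchone()
```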
For ease of understanding, Fig. 5 shows a specific embodiment. Referring to Fig. 5, when remote sensing image data is processed, the image metadata is read first, either from an attached file or from the image file itself; a reference image is then downloaded and the image pixel data is cropped to the region of interest to obtain cropped image data; after geometric correction is performed based on the reference image and the cropped image data, task processing such as ground feature classification, target detection and pyramid generation is carried out according to the template provided by the flow configuration module.
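As an illustration, the flow of Fig. 5 can be written as a template that maps each step to the steps it depends on, from which one valid execution order is derived. The step names and the exact dependency edges below are assumptions inferred from the description (for example, that downloading the reference image depends on the metadata); `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) performs the ordering.

```python
from graphlib import TopologicalSorter

# Each key is a step; its value is the set of steps it depends on.
flow_template = {
    "read_metadata": set(),
    "download_reference": {"read_metadata"},
    "roi_crop": {"read_metadata"},
    "geometric_correction": {"download_reference", "roi_crop"},
    "feature_classification": {"geometric_correction"},
    "target_detection": {"geometric_correction"},
    "pyramid_generation": {"geometric_correction"},
}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(flow_template).static_order())
```

Steps that share the same dependencies, such as the three final products, have no required order among themselves and could run in parallel on a node with enough resources.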
Fig. 6 is a schematic diagram of the physical structure of an electronic device. As shown in Fig. 6, the electronic device may include a processor 610, a communication interface 620, a memory 630 and a communication bus 640, wherein the processor 610, the communication interface 620 and the memory 630 communicate with one another via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform the task flow scheduling management method for rapidly processing batch remote sensing data.
In addition, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
In still another aspect, the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the task flow scheduling management method for rapidly processing batch remote sensing data provided above.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments or in some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. A task flow scheduling management system for rapidly processing batch remote sensing data, characterized by comprising:
a flow configuration module, used for creating and managing a flow template file, wherein the flow template file comprises the information and dependency relationships of the processing steps in the flow, and for providing an interface that matches the corresponding flow template according to data source information;
a data access module, used for monitoring newly added files in a specified folder so as to import image files into the system, parsing the file information of each image file and creating a processing task, and at the same time calling the flow configuration module to acquire flow template information and sending the task information to a message queue;
a task processing module, used for acquiring a piece of task information, creating specific flow information according to the flow template information, distributing processing tasks to a task execution submodule, monitoring the task execution state, and sending the information of processed tasks to the message queue;
and a data archiving module, used for acquiring the information of tasks whose processing is finished, and for archiving, storing and managing the output files and their meta information.
2. The task flow scheduling management system for rapidly processing batch remote sensing data according to claim 1, wherein the data access module comprises a data access submodule and a task creation submodule, wherein:
the data access submodule is used for monitoring the specified folder, identifying newly added image files in the folder, and reading the information of each newly added image file, which comprises basic information of the image such as time information, spatial information, data source and basic file information, and for recording and storing the information in a relational database;
and the task creation submodule is used for automatically creating a processing task, matching a flow template through the interface provided by the flow configuration module, persisting the task information to a database, and sending the task information to the message queue.
3. The task flow scheduling management system for rapidly processing batch remote sensing data according to claim 1, wherein the task processing module comprises a task acquisition submodule, a task execution submodule and a task management submodule, wherein:
the task acquisition submodule is used for actively acquiring a piece of task information from the message queue and simultaneously removing it from the message queue, so that other task processing nodes cannot acquire the same information and the node where the processing module is located executes all processing steps in the task;
the task execution submodule is used for calling the corresponding algorithm to complete the processing of the current step;
and the task management submodule is used for monitoring the states of the executed steps, which are similar to the task states and comprise four states, including waiting for processing, processing completed and processing failed; when the state shows that a task has been processed, the task meets the archiving condition, and the information of the files to be archived is sent to the message queue.
4. The task flow scheduling management system for rapidly processing batch remote sensing data according to claim 1, wherein the data archiving module comprises a file archiving submodule and a file management submodule, wherein:
the file archiving submodule is used for storing the processed output file;
and the file management submodule is used for managing the meta information corresponding to the file.
5. A task flow scheduling management method for rapidly processing batch remote sensing data, characterized in that task flow scheduling management for rapidly processing remote sensing data is realized based on the task flow scheduling management system for rapidly processing batch remote sensing data according to any one of claims 1 to 4.
6. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the task flow scheduling management method for rapidly processing batch remote sensing data according to claim 5.
7. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the task flow scheduling management method for rapidly processing batch remote sensing data according to claim 5.
Application number: CN202211568866.3A
Title: Task flow scheduling management system and method for rapidly processing batch remote sensing data
Priority date: 2022-12-08
Filing date: 2022-12-08
Publication: CN115576677A (en), published 2023-01-06
Legal status: Pending
Family ID: 84590408
Country: CN

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230106