CN112783627A - Batch processing method and device - Google Patents
- Publication number
- CN112783627A (application number CN202110087185.4A)
- Authority
- CN
- China
- Prior art keywords
- data block
- processed
- batch processing
- processing
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/278—Data partitioning, e.g. horizontal or vertical partitioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/4557—Distribution of virtual machine instances; Migration and load balancing
Abstract
When batch processing is executed, preloading of the data block that is later in the order begins as soon as the data block that is earlier in the order starts processing, so that the later data block is kept in a state in which its execution can begin at any time. As soon as processing of the earlier data block completes, the preloaded later data block can be processed directly. The batch processing method of this specification therefore tightly links the processing of successive data blocks, which helps to improve batch processing efficiency. Moreover, the technical solution of this specification is applicable to a variety of business processing scenarios, and is particularly suitable for financial business processing.
Description
Technical Field
The present application relates to the field of terminal technology, and in particular to a batch processing method and device.
Background
Batch processing means that, once a computer is capable of running multiple programs, the tasks of a plurality of users are submitted to the computer in batches, and the computer then schedules and processes each task until all submitted tasks are completed.
With the rapid development of the credit card industry, the volume of user data is growing explosively, and it is common for financial companies to process massive amounts of business data in batches. The diversity of business functions and the modularization of program functions make batch programs complex, create intricate dependencies between batch jobs, and lead to a large number of batch programs. A subsequent batch job can be started only after the preceding batch job has completed, so start-up operations take a long time.
Therefore, how to reduce the start-up time and improve batch processing efficiency has become an urgent problem to solve.
Disclosure of Invention
The present application provides a batch processing method and a batch processing device, which effectively shorten the start-up time between successive batch jobs and thereby improve batch processing efficiency, and adopts the following technical solutions:
in a first aspect, a batch processing method is provided, the method comprising:
determining data to be processed;
splitting the data to be processed into a plurality of data blocks according to a preset splitting rule, wherein each data block corresponds to one batch processing level, and the batch processing levels are arranged in a preset order, so that a data block whose batch processing level comes earlier in the order is processed before a data block whose batch processing level comes later;
for each batch processing level, when processing of the data block corresponding to that level begins, determining the data block corresponding to the next batch processing level in the order as the data block to be processed;
preloading the data block to be processed;
and after the data block corresponding to the batch processing level currently being processed has been processed, processing the data block to be processed, until the data blocks of all batch processing levels have been processed.
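As a minimal illustrative sketch (the function names and the thread-based concurrency are assumptions for illustration, not part of the claims), the steps above can be expressed as a loop that begins preloading the next block as soon as the current block starts processing:

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(blocks, preload, process):
    """Process level-ordered data blocks, preloading block i+1 while
    block i runs.

    blocks  - data blocks already sorted by batch processing level
    preload - callable that readies a block (container + program start)
    process - callable that executes a readied block and returns a result
    """
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(preload, blocks[0]) if blocks else None
        for i, block in enumerate(blocks):
            pending.result()  # block i is preloaded and ready
            # start preloading block i+1 before processing block i,
            # so the two phases overlap instead of running serially
            if i + 1 < len(blocks):
                pending = pool.submit(preload, blocks[i + 1])
            results.append(process(block))
    return results
```

Because preloading of block i+1 overlaps with the processing of block i, the container allocation and program start-up time is hidden behind useful work, which is exactly the linkage the first aspect describes.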
In an alternative embodiment of the present specification, the method is based on a batch processing system, the batch processing system comprises a Spring Cloud Data Flow, and the preloading of the Data block to be processed is performed by the Spring Cloud Data Flow.
In an alternative embodiment of the present description, the batch processing system runs on an OpenShift container cloud platform.
In an optional embodiment of this specification, the batch processing system includes a main system and a backup system, and when the main system operates abnormally, batch processing is switched to the backup system.
In an alternative embodiment of the present specification, the batch processing system corresponds to a plurality of fragments, and each fragment corresponds to a docker container service;
preloading a data block to be processed, comprising:
initiating, by Spring Cloud Data Flow, an invocation of the image service on the docker container service, so as to preload the data block to be processed;
after the data block to be processed is processed, the method further includes:
and destroying the image process corresponding to the data block to be processed.
In an optional embodiment of this specification, processing the data block to be processed after the data block corresponding to the batch processing level currently being processed has been processed includes:
after a first preset time period has elapsed since the data block to be processed was preloaded, determining whether there is any data block still being processed whose batch processing level precedes the level of the data block to be processed, and if not, processing the data block to be processed.
In an alternative embodiment of the present description, the method further comprises: if the time taken to process a data block exceeds a second preset time period, determining that the processing of that data block has failed.
In an optional embodiment of the present description, the method further comprises: recording the processing result of each processed data block.
In an alternative embodiment of the present description, for each data block, the state of the data block is periodically obtained after execution of the data block is initiated. If the state of the data block is "processing", its state continues to be obtained periodically; if the state is "complete", execution of the data block ends and the container resource corresponding to the data block is reclaimed. The period may be, for example, 2 seconds.
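A minimal sketch of this supervision loop (the state names follow the text; `get_status` and `release_container` are hypothetical callables standing in for the real orchestration interface):

```python
import time


def supervise_block(block_id, get_status, release_container, period=2.0):
    """Poll a data block's state every `period` seconds until it is
    "complete", then reclaim its container resource.

    block_id          - identifier of the data block being supervised
    get_status        - callable returning the block's current state
    release_container - callable that reclaims the container resource
    """
    while True:
        if get_status(block_id) == "complete":
            release_container(block_id)  # reclaim the container resource
            return
        # state is still "processing" (or "preload"): keep polling
        time.sleep(period)
```

The 2-second default mirrors the example period given above; a production system would also need a timeout, which the second-preset-time-period embodiment supplies.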
In an optional embodiment of this specification, for each fragment, a data block in the fragment that is being processed is determined, and it is determined whether a record corresponding to that data block exists in a task information database. If such a record exists, it is determined whether the current state of the data block matches the recorded state; if they do not match, the state recorded for the data block in the task information database is updated according to the current state of the data block, until the state of the data block is adjusted to "complete". If no record exists, it is determined whether the state of the data block is "preload".
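This reconciliation step can be sketched as follows (a plain dict stands in for the task information database; the state names follow the text, everything else is illustrative):

```python
def reconcile(fragment_blocks, task_db):
    """Align task-information records with the observed states of the
    in-flight data blocks of one fragment.

    fragment_blocks - list of dicts with 'id' and 'state' (observed state)
    task_db         - mapping: block id -> recorded state
    Returns ids of untracked blocks that are NOT in "preload" state and
    therefore need follow-up; a tracked block's stale record is updated
    in place.
    """
    anomalies = []
    for block in fragment_blocks:
        recorded = task_db.get(block["id"])
        if recorded is not None:
            if recorded != block["state"]:
                task_db[block["id"]] = block["state"]  # refresh stale record
        elif block["state"] != "preload":
            anomalies.append(block["id"])  # untracked and not preloading
    return anomalies
```

An untracked block in "preload" state is expected (its record has simply not been written yet), which is why only non-preloading untracked blocks are flagged.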
In a second aspect, a batch processing apparatus is provided, the apparatus comprising:
a data determination module configured to determine data to be processed;
a splitting module configured to split the data to be processed into a plurality of data blocks according to a preset splitting rule, so that each data block corresponds to one batch processing level and the batch processing levels are arranged in a preset order, whereby a data block whose batch processing level comes earlier in the order is processed before a data block whose batch processing level comes later;
a to-be-processed data block determining module configured to determine, for each batch processing level, the data block corresponding to the next level in the order as the data block to be processed when processing of the data block corresponding to that level begins;
a preloading module configured to preload the data block to be processed;
and a processing module configured to process the data block to be processed after the data block corresponding to the batch processing level currently being processed has been processed, until the data blocks of all batch processing levels have been processed.
In an alternative embodiment of the present specification, the apparatus is used in a batch processing system, the batch processing system comprises a Spring Cloud Data Flow, and the preloading of the Data block to be processed is performed by the Spring Cloud Data Flow.
In an alternative embodiment of the present description, the batch processing system runs on an OpenShift container cloud platform.
In an optional embodiment of this specification, the batch processing system includes a main system and a backup system, and when the main system operates abnormally, batch processing is switched to the backup system.
In an alternative embodiment of the present description, the batch processing system corresponds to a plurality of fragments, and each fragment corresponds to a docker container service.
In an alternative embodiment of the present disclosure, the preloading module 306 is specifically configured to initiate, via Spring Cloud Data Flow, an invocation of the image service on the docker container service, so as to preload the data block to be processed.
In an alternative embodiment of the present disclosure, the apparatus further comprises an adjustment module. The adjustment module is configured to destroy the image process corresponding to the data block to be processed.
In an optional embodiment of this specification, the processing module 308 is specifically configured to, after a first preset time period of preloading a data block to be processed, determine whether there is a data block which is being processed and whose corresponding batch processing level is before a level corresponding to the data block to be processed, and if not, process the data block to be processed.
In an optional embodiment of the present description, the apparatus further comprises a judging module. The judging module is configured to determine that the processing of a data block has failed if the time taken to process it exceeds a second preset time period.
In an alternative embodiment of the present description, the apparatus further comprises a recording module. The recording module is configured to record a processing result of each processed data block.
In an optional embodiment of the present description, the apparatus further comprises a first supervision module configured to: for each data block, periodically obtain the state of the data block after execution of the data block is initiated. If the state of the data block is "processing", continue to obtain its state periodically; if the state is "complete", end execution of the data block and reclaim the container resource corresponding to the data block.
In an optional embodiment of the present description, the apparatus further comprises a second supervision module configured to: for each fragment, determine a data block in the fragment that is being processed, and determine whether a record corresponding to that data block exists in a task information database. If such a record exists, determine whether the current state of the data block matches the recorded state, and if they do not match, update the state recorded for the data block in the task information database according to the current state of the data block, until the state of the data block is adjusted to "complete". If no record exists, determine whether the state of the data block is "preload".
In a third aspect, an electronic device is provided, which includes:
one or more processors;
a memory;
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the batch processing method of the first aspect.
In a fourth aspect, there is provided a computer-readable storage medium for storing computer instructions which, when executed on a computer, cause the computer to perform the batch processing method of the first aspect.
When batch processing is executed, preloading of the data block that is later in the order begins as soon as the data block that is earlier in the order starts processing, so that the later data block is kept in a state in which its execution can begin at any time. As soon as processing of the earlier data block completes, the preloaded later data block can be processed directly. The batch processing method of this specification therefore tightly links the processing of successive data blocks, which helps to improve batch processing efficiency. Moreover, the technical solution of this specification is applicable to a variety of business processing scenarios, and is particularly suitable for financial business processing.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic diagram of a batch processing scenario according to an embodiment of the present application;
FIG. 2 is a schematic illustration of a batch process according to an embodiment of the present application;
FIG. 3a is a schematic diagram of a batch process hierarchy according to an embodiment of the present application;
FIG. 3b is a schematic diagram of a batch processing apparatus according to an embodiment of the present disclosure;
FIG. 3c is a schematic diagram of a portion of a batch process according to an embodiment of the present application;
FIG. 3d is a schematic diagram of a portion of a batch process according to an embodiment of the present application;
FIG. 3e is a schematic diagram of a portion of a batch process according to an embodiment of the present application;
FIG. 3f is a schematic diagram of a batch processing apparatus according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
In the description of the embodiments of the present application, the words "exemplary", "for example", or "for instance" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary", "for example", or "for instance" is not to be construed as preferred or advantageous over other embodiments or designs. Rather, these words are intended to present the relevant concepts in a concrete fashion.
In the description of the embodiments of the present application, the term "and/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, B alone, or both A and B. In addition, the term "plurality" means two or more unless otherwise specified. For example, a plurality of systems refers to two or more systems, and a plurality of terminals refers to two or more terminals.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. The terms "comprising", "including", "having", and variations thereof mean "including, but not limited to", unless expressly specified otherwise.
It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present description may be combined with each other without conflict.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the preceding or following operations are not necessarily performed in the exact order in which they are performed. Rather, the various steps may be processed in reverse order or simultaneously. Meanwhile, other operations may be added to the processes, or a certain step or several steps of operations may be removed from the processes.
Illustratively, the batch process in this specification relates to a network architecture of a batch system as shown in FIG. 1. The batch processing system may include a batch processing server and at least one database. The batch server is communicatively coupled to the database (e.g., the communication may be over a network).
The batch process in this specification is performed by a batch server in a batch system, and may include one or more of the following steps.
S200: data to be processed is determined.
There may be a plurality of databases connected to the batch processing server in this specification; as shown in fig. 1, the batch processing server is connected to databases 1 to n. According to preset service requirements, the batch processing server determines the data in each database that can be batch processed, and this data serves as the data to be processed in this specification. It follows that the data to be processed need not reside in a single database.
S202: and splitting the data to be processed into a plurality of data blocks according to a preset splitting rule.
The splitting rule is not specifically limited in this specification, and any existing process for obtaining data blocks can be used.
Each data block obtained in this step corresponds to a batch processing level. The batch processing levels are arranged in a preset order to form a chain, and accordingly the data blocks are also chained in the order of their batch processing levels, as shown in fig. 3a.
The order of the batch processing levels determines the processing order of the corresponding data blocks: a data block whose batch processing level comes earlier in the order is processed before a data block whose level comes later.
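Concretely, the splitting and chaining described above might look like the following sketch (the rule that maps a record to its batch processing level is application-specific; the one used here is merely illustrative):

```python
def split_into_chain(records, rule):
    """Split records into data blocks and arrange the blocks as a chain
    ordered by batch processing level.

    records - the data to be processed
    rule    - callable mapping a record to its batch processing level
    Returns a list of blocks; blocks[0] holds the earliest level.
    """
    blocks = {}
    for rec in records:
        blocks.setdefault(rule(rec), []).append(rec)
    # one block per level, chained in ascending level order
    return [blocks[level] for level in sorted(blocks)]
```

Each element of the returned list plays the role of one data block in fig. 3a's chain, ready to be processed in order.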
S204: and for each batch processing grade, when the data block corresponding to the batch processing grade is processed, determining the data block corresponding to the next-order batch processing grade as the data block to be processed.
S206: and preloading the data block to be processed.
The data block to be processed is preloaded so that it is in a ready state; when its processing is triggered, it can be processed immediately.
In the existing batch process, a data block that is later in the order is preloaded only after processing of the earlier data block has finished. Under existing conditions, ten million records can be processed in about 5 minutes, while the preloading operation for the later data block can take up to one and a half minutes. In the existing batch process, the longest data block chain reaches 25 batch processing levels; with the existing orchestration logic, in which the data block of the next batch processing level is pulled up only after the data block currently being processed has finished, executing the whole chain takes about 100 minutes, of which container resource allocation and program start-up (i.e. pulling up the data block of the next level) account for more than 20 minutes. This significantly affects batch processing efficiency.
In contrast, in the batch process of this specification, the data block of the next batch processing level is pulled up while the data block earlier in the execution order is still being processed. This largely hides the time consumed by container resource allocation and program start-up, which helps to improve batch processing efficiency and also significantly improves container resource utilization.
S208: and after the data block corresponding to the currently processed batch processing grade is processed, processing the data block to be processed until the data block of each batch processing grade is processed.
In an alternative embodiment of the present specification, the batch processing system includes Spring Cloud Data Flow, and the preloading of the data blocks to be processed is performed by Spring Cloud Data Flow. The batch processing system runs on an OpenShift container cloud platform; through OpenShift, cross-region cluster management and cross-region task scheduling can be realized. The computing nodes are VMware virtual machines running a Linux operating system on x86 physical machines. The database connected to the batch processing system is based on an MGR cluster and likewise runs in a VMware virtual machine under a Linux operating system.
The batch processing system corresponds to a plurality of fragments, and each fragment corresponds to a docker container service. The preloading module of the batch server in this specification (as shown in the figure) is developed on the basis of Spring Cloud Data Flow and is started in an independent docker container. It externally provides a batch program scheduling page for operations such as starting, pausing, and restarting tasks, provides job sequencing within a task, and provides the scheduling execution relations and fragment parameter configuration of fragment subtasks. To preload the data block to be processed, an invocation of the image service may be initiated by Spring Cloud Data Flow on the docker container service.
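Spring Cloud Data Flow exposes a REST endpoint for launching preregistered tasks (`POST /tasks/executions` in its documented API); a sketch of building such a launch request for the preload call might look like the following. The task name and arguments here are hypothetical, and real deployments would add environment-specific deployment properties:

```python
from urllib.parse import urlencode


def build_task_launch_request(scdf_url, task_name, args=None):
    """Build (method, url, body) for launching a preregistered SCDF task.

    scdf_url  - base URL of the Spring Cloud Data Flow server
    task_name - name of the registered task to launch (illustrative)
    args      - optional command-line arguments for the task
    """
    body = {"name": task_name}
    if args:
        body["arguments"] = " ".join(args)
    return "POST", f"{scdf_url.rstrip('/')}/tasks/executions", urlencode(body)
```

Issuing the returned request (with any HTTP client) would ask SCDF to start the container image that preloads the given data block; the response carries the task execution id used for later state queries.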
The batch process in this specification is applicable to a distributed batch scenario, in which the batch processing system may also include distributed storage. The distributed storage holds the fragment files needed by the batch processing task, such as card organization transaction files from Visa and Mastercard.
In addition, the batch processing system may include a task information database. The task information database is connected to Spring Cloud Data Flow and stores task execution information, task execution state configuration, and fragment information. The processing order of the data blocks can be determined in advance and recorded in the task information database, and the data blocks are then processed in sequence according to those records. The processing-plan execution record and the batch-task execution record are logical constructs; batch-task partition execution is program execution in the true sense. One batch partition execution record means that a program segment is actually started, the corresponding container resources are requested, and the corresponding data is processed; it corresponds to one container cloud pod and to one Spring Batch Job.
In an optional embodiment of the present specification, after a first preset time period has elapsed since the data block to be processed was preloaded, it is determined whether there is a data block that is still being processed and whose batch processing level precedes the level corresponding to the data block to be processed; if there is none, the data block to be processed is processed. Further, after the data block to be processed has been processed, the image process corresponding to it is destroyed.
It can be seen that, in the batch processing process in this specification, the processing state of a data block may be one of "preloaded", "processing", and "complete". When the processing state of one data block is adjusted to "processing", the state of the data block next in order is immediately adjusted to "preloaded". When the processing state of a data block is adjusted to "complete", the data block next in order is processed immediately.
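The three states and their forward-only progression can be sketched as a small Java enum. The state names follow the description above; the class layout is illustrative and not taken from the patent.

```java
// Sketch of the three processing states named above and the forward-only
// progression between them: PRELOAD -> PROCESSING -> COMPLETE.
public class BlockState {

    public enum State { PRELOAD, PROCESSING, COMPLETE }

    // Advances a block one step; COMPLETE is terminal.
    public static State advance(State s) {
        return switch (s) {
            case PRELOAD -> State.PROCESSING;
            case PROCESSING -> State.COMPLETE;
            case COMPLETE -> State.COMPLETE;
        };
    }
}
```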
In an alternative embodiment of the present description, when determining the processing order of the data blocks, the batch processing flow is organized into three levels: the processing plan, the batch task, and the batch task partition, of which only the batch task partition execution is real batch program execution. The batch orchestration service controls the batch processing flow through six timed monitoring logics, and the batch program confirms whether a task partition can be executed by calling the batch orchestration service interface to query the state of that task partition.
In an alternative embodiment of the present specification, as shown in fig. 3b, for the batch processing level that is first in order, if a preset trigger condition is satisfied, it may be determined whether the state of the data block corresponding to that level is executable; if the determination result is positive, processing of that data block is executed. If the determination result is negative, the process waits for a preset time and determines again, repeating until the result is positive. Optionally, this periodic determination may be scheduled based on Quartz.
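The check-wait-check-again behavior can be sketched as a simple polling loop. The interval and attempt cap are illustrative; as noted above, a production deployment would drive this check from a Quartz-scheduled job rather than a blocking loop.

```java
import java.util.function.Supplier;

// Sketch of the "check, wait, check again" loop described above: poll a
// status supplier until it reports the block is executable, sleeping a
// fixed interval between checks. Interval and attempt cap are illustrative.
public class ExecutablePoller {

    public static boolean waitUntilExecutable(Supplier<Boolean> statusCheck,
                                              long intervalMillis,
                                              int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (statusCheck.get()) {
                return true;      // block may now be executed
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;     // give up if the waiting thread is interrupted
            }
        }
        return false;             // gave up after maxAttempts checks
    }
}
```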
Since the task information database in this specification records the state of each data block, those states can be queried periodically; when the state of a data block is adjusted to complete, a container Pod is immediately pulled up for the data block in the preloaded state and the batch program is started.
In addition, as shown in fig. 3c, before batch processing is performed for a preloaded data block, the process in this specification also traverses the data blocks whose order precedes that of the preloaded data block and determines whether any of them is still in an incomplete state; if none is, the processing for the preloaded data block is performed.
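The predecessor traversal can be sketched as a check over a map from batch processing level to current state. The method name and the plain string states are illustrative choices, not the patent's actual data model.

```java
import java.util.Map;

// Sketch of the predecessor check described above: before processing a
// preloaded block, confirm that no block with an earlier batch processing
// level is still incomplete. String states are illustrative.
public class PredecessorCheck {

    // levelToState maps batch processing level -> current state.
    public static boolean canProcess(int level, Map<Integer, String> levelToState) {
        for (Map.Entry<Integer, String> e : levelToState.entrySet()) {
            if (e.getKey() < level && !"COMPLETE".equals(e.getValue())) {
                return false;   // an earlier block is not yet complete
            }
        }
        return true;
    }
}
```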
The process in this specification further supervises the state of the data blocks being processed. Specifically, as shown in fig. 3d, after execution on a data block is started, the state of that data block may be acquired periodically. If the state of the data block is "processing", its state continues to be acquired periodically; if the state is "complete", execution for the data block ends and the container resources corresponding to it are reclaimed. The period may be, for example, 2 seconds.
In an optional embodiment of this specification, as shown in fig. 3e, for each shard, the data block in the shard that is being processed may be determined, and it is determined whether a record corresponding to that data block exists in the task information database. If such a record exists, it is determined whether the current state of the data block matches the recorded state; if they do not match, the state recorded for the data block in the task information database is updated according to the current state of the data block, until the state of the data block is adjusted to complete. If no such record exists, it is determined whether the state of the data block is preloaded.
If the state of the data block is preloaded, it is determined whether the duration from the time processing of the data block started to the current time exceeds a second preset time period (for example, 5 minutes). If so, the processing result of the data block is determined to be a failure; otherwise, the processing result of the data block is determined to be complete.
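The timeout rule above can be sketched as follows, using the 5-minute example as the second preset period. The result strings are illustrative labels for the two outcomes the patent describes.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the timeout rule above: if a block has been in flight longer
// than the second preset period (5 minutes in the example), its result is
// recorded as a failure; otherwise it is treated as complete.
public class TimeoutRule {

    public static final Duration SECOND_PRESET = Duration.ofMinutes(5);

    public static String resultFor(Instant startedAt, Instant now) {
        Duration elapsed = Duration.between(startedAt, now);
        return elapsed.compareTo(SECOND_PRESET) > 0 ? "FAILED" : "COMPLETE";
    }
}
```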
In an alternative embodiment of the present disclosure, after the execution of each data block is completed, the processing result of each processed data block is recorded.
The technical solution in this specification performs well in terms of security, compatibility, and the like, and is therefore suitable for a variety of business processing scenarios, particularly financial business processing scenarios.
The present specification further provides a batch processing apparatus, which can execute the batch processing process provided in the above embodiments of the present application. The apparatus is used in a server; the server belongs to a batch processing system that further comprises a plurality of terminals. As shown in fig. 3, the apparatus comprises one or more of the following modules:
a data determination module 300 configured to determine data to be processed;
the splitting module 302 is configured to split data to be processed into a plurality of data blocks according to a preset splitting rule, so that each data block corresponds to one batch processing level, and the batch processing levels are sequentially arranged according to a preset order, so that a processing order of a data block corresponding to a batch processing level in the previous order is prior to a processing order of a data block corresponding to a batch processing level in the next order;
a to-be-processed data block determining module 304, configured to determine, for each batch processing level, when starting to process a data block corresponding to the batch processing level, a data block corresponding to a next-order batch processing level as a to-be-processed data block;
a preloading module 306 configured to preload a to-be-processed data block;
the processing module 308 is configured to process the data block to be processed after the data block corresponding to the currently processed batch processing level is processed, until the data block of each batch processing level is processed.
In an alternative embodiment of the present specification, the apparatus is used in a batch processing system, the batch processing system comprises a Spring Cloud Data Flow, and the preloading of the Data block to be processed is performed by the Spring Cloud Data Flow.
In an alternative embodiment of the present description, the batch processing system runs on an Open Shift container cloud platform.
In an optional embodiment of this specification, the batch processing system includes a main system and a backup system, and when the main system is abnormally operated, the batch processing system is switched to the backup system to perform batch processing.
In an alternative embodiment of the present description, the batch processing system corresponds to a plurality of shards, each shard corresponding to a docker container service.
In an alternative embodiment of the present disclosure, the preloading module 306 is specifically configured to initiate, through the Spring Cloud Data Flow, an invocation of the image service to the docker container service, so as to preload the data block to be processed.
In an alternative embodiment of the present disclosure, the apparatus further comprises an adjustment module, configured to destroy the image process corresponding to the data block to be processed.
In an optional embodiment of this specification, the processing module 308 is specifically configured to, after a first preset time period has elapsed since the data block to be processed was preloaded, determine whether there is a data block that is being processed and whose batch processing level precedes the level corresponding to the data block to be processed, and if not, process the data block to be processed.
In an optional embodiment of the present description, the apparatus further comprises a determination module. The judging module is configured to determine that the processing result of a data block is failure if the time for processing the data block exceeds a second preset time period.
In an alternative embodiment of the present description, the apparatus further comprises a recording module. The recording module is configured to record a processing result of each processed data block.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of one or more embodiments of the present specification. One of ordinary skill in the art can understand and implement it without inventive effort.
Further, an embodiment of the present application provides an electronic device. As shown in fig. 4, the electronic device 40 includes a processor 401 and a storage device 403, where the processor 401 is connected to the storage device 403, for example via a bus 402. Further, the electronic device 40 may also include a transceiver 404, which includes a receiver and a transmitter. It should be noted that the transceiver 404 is not limited to one in practical applications, and the structure of the electronic device 40 does not limit the embodiments of the present application. The processor 401 is applied in the embodiments of the present application to implement the functions of the respective modules shown in fig. 3.
The processor 401 may be a CPU, general purpose processor, DSP, ASIC, FPGA or other programmable logic device, transistor logic device, hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 401 may also be a combination of computing functions, e.g., comprising one or more microprocessors, a combination of a DSP and a microprocessor, or the like.
The storage device 403 is used for storing application program codes for executing the scheme of the application, and the execution is controlled by the processor 401. The processor 401 is configured to execute application program code stored in the storage device 403 for implementing the functions of the various modules shown in fig. 3.
The present application provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, the method shown in the above embodiments is implemented.
It should be understood that, although the steps in the flowcharts of the figures are shown in an order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be performed at different times, and are not necessarily performed sequentially but may be performed in turns or in alternation with other steps, or with at least some of the sub-steps or stages of other steps.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The foregoing is only a partial embodiment of the present application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principles of the present application, and these modifications and improvements should also be regarded as falling within the protection scope of the present application.
Claims (11)
1. A batch processing method, the method comprising:
determining data to be processed;
splitting the data to be processed into a plurality of data blocks according to a preset splitting rule, wherein each data block corresponds to one batch processing level, and the batch processing levels are arranged in a preset order, so that the processing order of the data block corresponding to an earlier batch processing level precedes the processing order of the data block corresponding to a later batch processing level;
for each batch processing level, when processing of the data block corresponding to the batch processing level starts, determining the data block corresponding to the next batch processing level in the order as a data block to be processed;
preloading the data block to be processed;
and after the data block corresponding to the currently processed batch processing level is processed, processing the data block to be processed, until the data block of each batch processing level is processed.
2. The method of claim 1, wherein the method is based on a batch processing system, wherein the batch processing system comprises a Spring Cloud Data Flow, and wherein the preloading of the Data chunks to be processed is performed by the Spring Cloud Data Flow.
3. The method of claim 2, wherein the batch processing system runs on an Open Shift container cloud platform.
4. The method of claim 2, wherein the batch processing system comprises a primary system and a backup system, and when the primary system is abnormally operated, the batch processing system is switched to the backup system for batch processing.
5. The method of claim 2, wherein the batch processing system corresponds to a plurality of shards, each shard corresponding to a docker container service;
preloading a data block to be processed, comprising:
initiating, by the Spring Cloud Data Flow, an invocation of the image service to the docker container service, so as to preload the data block to be processed;
after the data block to be processed is processed, the method further includes:
and destroying the image process corresponding to the data block to be processed.
6. The method of claim 1, wherein processing the data block to be processed after the data block corresponding to the currently processed batch processing level is processed comprises:
after a first preset time period has elapsed since the data block to be processed was preloaded, determining whether there is a data block that is being processed and whose batch processing level precedes the level corresponding to the data block to be processed, and if not, processing the data block to be processed.
7. The method of claim 1, further comprising: and if the time for processing a data block exceeds a second preset time period, determining that the processing result of the data block is failure.
8. The method of claim 7, further comprising, after the method: and recording the processing result of each processed data block.
9. A batch processing apparatus, the apparatus comprising:
a data determination module configured to determine data to be processed;
the splitting module is configured to split the data to be processed into a plurality of data blocks according to a preset splitting rule, so that each data block corresponds to one batch processing level, and the batch processing levels are arranged in a preset order, so that the processing order of the data block corresponding to an earlier batch processing level precedes the processing order of the data block corresponding to a later batch processing level;
the data block to be processed determining module is configured to determine, for each batch processing level, a data block corresponding to a next-order batch processing level as a data block to be processed when processing of the data block corresponding to the batch processing level is started;
the pre-loading module is configured to pre-load the data blocks to be processed;
and the processing module is configured to process the data block to be processed after the data block corresponding to the currently processed batch processing level is processed, until the data block of each batch processing level is processed.
10. An electronic device, comprising: one or more processors;
a memory;
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to: performing the method of any of the preceding claims 1-8.
11. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110087185.4A CN112783627A (en) | 2021-01-22 | 2021-01-22 | Batch processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112783627A true CN112783627A (en) | 2021-05-11 |
Family
ID=75758560
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110087185.4A Pending CN112783627A (en) | 2021-01-22 | 2021-01-22 | Batch processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112783627A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106330987A (en) * | 2015-06-15 | 2017-01-11 | 交通银行股份有限公司 | Dynamic load balancing method |
CN106775985A (en) * | 2016-12-26 | 2017-05-31 | 中国建设银行股份有限公司 | A kind of batch processing task dispatching method and device |
CN107005597A (en) * | 2014-10-13 | 2017-08-01 | 七网络有限责任公司 | The wireless flow management system cached based on user characteristics in mobile device |
CN109167719A (en) * | 2018-08-16 | 2019-01-08 | 广州爽游网络科技有限公司 | A kind of super large community implementation method with content intelligence isolation features |
CN109670932A (en) * | 2018-09-25 | 2019-04-23 | 平安科技(深圳)有限公司 | Credit data calculate method, apparatus, system and computer storage medium |
CN110716744A (en) * | 2019-10-21 | 2020-01-21 | 中国科学院空间应用工程与技术中心 | Data stream processing method, system and computer readable storage medium |
CN111522630A (en) * | 2020-04-30 | 2020-08-11 | 北京江融信科技有限公司 | Method and system for executing planned tasks based on batch dispatching center |
CN111666144A (en) * | 2020-06-19 | 2020-09-15 | 中信银行股份有限公司 | Batch processing task execution method and system and machine room deployment system |
CN112131305A (en) * | 2020-06-19 | 2020-12-25 | 中信银行股份有限公司 | Account processing system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||