CN114201319A - Data scheduling method, device, terminal and storage medium - Google Patents

Info

Publication number
CN114201319A
Authority
CN
China
Prior art keywords
data
information
scheduling
subscription program
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210144350.XA
Other languages
Chinese (zh)
Inventor
郭浩哲
蒙圣光
陈星栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Fastersoft Software Co ltd
Original Assignee
Guangdong Fastersoft Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Fastersoft Software Co ltd
Priority to CN202210144350.XA
Publication of CN114201319A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/542Event management; Broadcasting; Multicasting; Notifications
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/211Schema design and management
    • G06F16/212Schema design and management with details for data modelling support
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/547Messaging middleware
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/548Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data scheduling method, a device, a terminal and a storage medium. The method comprises the following steps: acquiring first data, and storing the first data to a first position as second data; when the data volume of the second data at the first position is determined to be greater than or equal to a first judgment threshold, sending first information to a scheduling center; the scheduling center receives the first information and determines, according to the first information, whether to generate a first event; if the first event is generated, the scheduling center sends a first notification to a first subscription program subscribed to the first event, so that the first subscription program extracts the second data from the first position, processes the second data to obtain third data, stores the third data in a second position, and sends second information to the scheduling center if it determines that the data volume of the third data at the second position is greater than or equal to a second judgment threshold. The invention uses the scheduling center to schedule between different data processing steps and thereby improves the efficiency of data scheduling.

Description

Data scheduling method, device, terminal and storage medium
Technical Field
The present invention relates to the field of big data technologies, and in particular, to a data scheduling method, an apparatus, a terminal, and a storage medium.
Background
As big data is applied ever more widely, data scheduling between different events is becoming more and more frequent. At present, scheduling between events is usually achieved by processing the different events at fixed, preset times, but this approach makes data scheduling inefficient and also reduces data processing performance.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. Therefore, the embodiment of the invention provides a data scheduling method, a data scheduling device, a terminal and a storage medium, so as to improve the efficiency of data scheduling.
In one aspect, an embodiment of the present invention provides a data scheduling method, including: acquiring first data, and storing the first data to a first position as second data; when the data volume of the second data at the first position is determined to be greater than or equal to a first judgment threshold, sending first information to a scheduling center; the scheduling center receives the first information and determines, according to the first information, whether to generate a first event; if the first event is generated, the scheduling center sends a first notification to a first subscription program subscribed to the first event, so that the first subscription program extracts the second data from the first position, processes the second data to obtain third data, stores the third data in a second position, and sends second information to the scheduling center if it determines that the data volume of the third data at the second position is greater than or equal to a second judgment threshold.
The data scheduling method according to the embodiment of the invention at least has the following beneficial effects:
The scheduling center receives the first information and determines whether a first event is generated; after determining that the first event is generated, it sends a first notification to the corresponding first subscription program, and the first subscription program executes the corresponding processing. In this way, real-time scheduling among different events is realized, which improves the efficiency of data scheduling and data processing.
According to some embodiments of the invention, acquiring the first data and storing the first data to the first position as the second data comprises at least one of the following steps: acquiring first data, determining that the data volume of the first data is greater than or equal to a third judgment threshold, dividing the first data into one or more first batches of data, and storing the first batches of data to the first position as the second data, wherein the data volume of each first batch of data is less than or equal to a first division threshold, the first division threshold representing the maximum data volume that can be processed each time a first batch of data is stored to the first position; or acquiring first data, determining that the data volume of the first data is smaller than the third judgment threshold, and continuing to acquire the first data.
According to some embodiments of the invention, the method further comprises: if the data volume of the second data at the first position is determined to be smaller than the first judgment threshold, continuing to store the first data to the first position.
According to some embodiments of the invention, the first subscription program extracting the second data from the first position and processing it to obtain the third data comprises the following step: if the first subscription program subscribed to the first event is determined to be a data cleaning program, the first subscription program extracts the second data from the first position, cleans the second data to obtain the third data and stores the third data in the second position; if the first subscription program determines that the data volume of the third data at the second position is greater than the second judgment threshold, the first subscription program sends the second information to the scheduling center.
According to some embodiments of the invention, the first subscription program extracting the second data from the first position and processing it to obtain the third data comprises the following step: if the first subscription program subscribed to the first event is determined to be a data modeling program, the first subscription program extracts the second data from the first position, models the second data to obtain the third data and stores the third data in the second position; if the first subscription program determines that the data volume of the third data at the second position is greater than the second judgment threshold, the first subscription program sends the second information to the scheduling center.
According to some embodiments of the invention, the scheduling center receives the first information and determines, according to the first information, whether to generate a first event; when it is determined that the scheduling center does not generate the first event, the first subscription program that sent the first information is acquired, the third data obtained by the processing of that first subscription program is determined, and the third data is cached.
According to some embodiments of the invention, the method further comprises: if the first subscription program determines that the data volume of the third data at the second position is smaller than the second judgment threshold, the first subscription program continues to extract the second data from the first position, processes it to obtain third data, and stores the third data in the second position.
In another aspect, an embodiment of the present invention provides a data scheduling apparatus, including: a first module, configured to acquire first data and store the first data to a first position as second data; a second module, configured to send first information to a scheduling center when the data volume of the second data at the first position is determined to be greater than or equal to a first judgment threshold; and a third module, configured such that the scheduling center, after receiving the first information, determines according to the first information whether to generate a first event and, if the first event is generated, sends a first notification to a first subscription program subscribed to the first event, so that the first subscription program extracts the second data from the first position, processes the second data to obtain third data, stores the third data in a second position, and sends second information to the scheduling center if it determines that the data volume of the third data at the second position is greater than or equal to a second judgment threshold.
The data scheduling device according to the embodiment of the invention has at least the following beneficial effects: the scheduling center receives the first information and determines whether a first event is generated; after determining that the first event is generated, it sends a first notification to the corresponding first subscription program, and the first subscription program executes the corresponding processing. In this way, real-time scheduling among different events is realized, which improves the efficiency of data scheduling and data processing.
In another aspect, an embodiment of the present invention further provides a computer device, including: at least one processor; and at least one memory for storing at least one program; when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the data scheduling method according to any one of the above.
The computer device according to the embodiment of the present invention has at least the same advantageous effects as the above-described data scheduling method.
In another aspect, an embodiment of the present invention provides a storage medium in which program instructions are stored; when the program instructions are executed by a processor, the data scheduling method according to any one of the above is implemented.
The storage medium according to the embodiment of the present invention has at least the same advantageous effects as the above-described data scheduling method.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a general flowchart of the steps of a data scheduling method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of step S100 in FIG. 1;
FIG. 3 is a schematic flow chart of step S200 in FIG. 1;
FIG. 4 is a first flowchart of step S300 in FIG. 1;
FIG. 5 is a second flowchart of step S300 in FIG. 1;
FIG. 6 is a third flowchart of step S300 in FIG. 1;
FIG. 7 is a schematic block diagram of a data scheduling apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic block diagram of an apparatus of an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar function. In the following description, suffixes such as "module", "part", or "unit" used to denote elements are used only to facilitate the explanation of the present invention and have no specific meaning in themselves. Thus, "module", "part" and "unit" may be used interchangeably. "First", "second", etc. are used only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or the order of the indicated technical features. In the following description, the method steps are numbered consecutively for convenience of examination and understanding; in view of the overall technical scheme of the invention and the logical relationship among the steps, the order in which the steps are carried out may be adjusted without affecting the technical effect achieved by the technical scheme of the invention. The embodiments described below with reference to the accompanying drawings are illustrative only, serve to explain the present invention, and are not to be construed as limiting the present invention.
With the development of big data, both the efficient storage and the efficient processing of large-scale, diversified data need to be solved. In the existing method for processing big data in batches, a data collection program pushes the collected data to a receiving server in batches, and after a data access server stores the data in a database, related services such as later-stage data cleaning and data conversion have to be scheduled and executed by setting timed tasks. The state of each data processing step can only be judged by the relevant technical personnel from its execution time in order to arrange the scheduling period reasonably, so with this method data may not be scheduled in time.
Therefore, embodiments of the present invention provide a data scheduling method, apparatus, device, and storage medium, which can improve the data scheduling rate and facilitate subsequent data processing.
As shown in fig. 1, an embodiment of the present invention provides a data scheduling method, including the following steps:
Step S100, acquiring first data, and storing the first data to a first position as second data.
The first data is collected by a data collector, for example the SQL Server 2008 performance data collector. The data collector then pushes the collected data to an associated data handler, which thereby obtains the first data and writes it to the first position as the second data. The associated data handler may be, for example, a data receiving service. It is noted that the data may also be collected in other ways.
In another embodiment, step S100 of acquiring the first data and storing the first data to the first position as the second data includes at least one of the following steps, as shown in FIG. 2:
Step S110, acquiring first data, determining that the data volume of the first data is greater than or equal to a third judgment threshold, dividing the first data into one or more first batches of data, and storing the first batches of data to the first position as the second data; the data volume of each first batch of data is less than or equal to a first division threshold, and the first division threshold represents the maximum data volume that can be processed each time a first batch of data is stored to the first position;
Step S120, acquiring first data, determining that the data volume of the first data is smaller than the third judgment threshold, and continuing to acquire the first data.
Specifically, before the first data is written to the first position, it is determined whether the data volume of the first data has reached the third judgment threshold, that is, the volume at which data may be written to the first position. While the data volume of the first data is smaller than the third judgment threshold, acquisition of the first data continues until the data volume is greater than or equal to the third judgment threshold. Once the threshold is reached, the first data is divided according to the first division threshold into one or more first batches of data, each no larger than the first division threshold, and the first batches are stored to the first position batch by batch as the second data. The third judgment threshold and the first division threshold may be target results screened from multiple rounds of training, or may be chosen according to the prior knowledge or actual needs of those skilled in the relevant art. For example, records have to be received continuously and then written to the first position. If all records are received first and then written in one pass, the data volume is very large, the write is prone to problems, and the data may even fail to be transmitted. If each record is written as soon as it is received, a large amount of resources is consumed. If the data is written to the first position at fixed times, the large amount of data accumulated during each period has to be scheduled and processed at once, so the program needs substantial resources (such as memory and CPU).
Therefore, a third judgment threshold is set. Suppose the third judgment threshold is set to 100 megabits as required, and ten thousand received records correspond to a data volume of 100 megabits; the data volume of the received records is then determined to have reached the third judgment threshold, and the received records are stored to the first position. Writing all 100 megabits to the first position in one pass still carries considerable risk, however, so the records are divided into first batches of data according to the first division threshold. Assuming the first division threshold is 8 megabits, each first batch of data is less than or equal to 8 megabits, and the first batches are sent to the first position in multiple batches. It should be noted that receiving the first data, determining whether it can be transmitted, and dividing and transmitting it are all performed concurrently; that is, as soon as data is present, the corresponding operation is performed on it. In this way, the timeliness of data processing is greatly improved, and the processing period is refined to the level of seconds.
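Steps S110 and S120 can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `ingest` and `split_into_batches`, the use of a list as the first position, and the scaled-down thresholds (measured in bytes) are all assumptions made for illustration, and each record is assumed to be no larger than the division threshold.

```python
from typing import Iterable, Iterator, List

# Hypothetical, scaled-down values measured in bytes (assumptions, not from the patent).
THIRD_JUDGMENT_THRESHOLD = 100   # minimum buffered volume before anything is written
FIRST_DIVISION_THRESHOLD = 8     # maximum volume of a single first batch


def split_into_batches(records: List[bytes], max_volume: int) -> Iterator[List[bytes]]:
    """Yield consecutive batches whose total size does not exceed max_volume.

    Assumes every single record is no larger than max_volume.
    """
    batch: List[bytes] = []
    volume = 0
    for record in records:
        size = len(record)
        if batch and volume + size > max_volume:
            yield batch
            batch, volume = [], 0
        batch.append(record)
        volume += size
    if batch:
        yield batch


def ingest(stream: Iterable[bytes], first_position: List[List[bytes]]) -> List[bytes]:
    """Keep acquiring records (S120) and write them in batches once the
    third judgment threshold is reached (S110).

    Returns the records that are still buffered below the threshold.
    """
    buffer: List[bytes] = []
    buffered = 0
    for record in stream:
        buffer.append(record)
        buffered += len(record)
        if buffered >= THIRD_JUDGMENT_THRESHOLD:
            first_position.extend(split_into_batches(buffer, FIRST_DIVISION_THRESHOLD))
            buffer, buffered = [], 0
    return buffer
```

Called with thirty 4-byte records, the sketch writes the first twenty-five records to the first position in batches of at most 8 bytes and keeps the remaining five buffered until more data arrives.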
Step S200, if the data volume of the second data at the first position is determined to be greater than or equal to a first judgment threshold, sending first information to a scheduling center.
Specifically, the data volume of the second data at the first position is compared with the first judgment threshold, and when the data volume of the second data is greater than or equal to the first judgment threshold, first information is sent to the scheduling center to inform it that the work of storing the first data at the first position has been completed. It should be noted that the first judgment threshold may be a target result screened from multiple rounds of training, or may be chosen according to the prior knowledge of the relevant technical personnel or the actual requirements; it may be, for example, 100 megabits or 150 megabits. This method greatly reduces the resource consumption of the related programs and thus further optimizes their performance. Conventionally, either each piece of data is acquired and then immediately stored in the corresponding position, or the data is acquired by a data collector and pushed to a data receiving service, which writes the data to the corresponding storage position and then sets a timed schedule to process it. Both approaches are costly in performance and very inefficient. The scheduling center is therefore used instead: the first judgment threshold is set to, for example, 100 megabits, and once the second data at the first position is determined to be greater than or equal to 100 megabits, the batch of data is considered to have been pushed, so the first information is sent to the scheduling center to indicate that the system has completed the data push. The scheduling center can be implemented using message middleware, which is basic software for sending and receiving messages in a distributed system.
Message middleware, also called a message queue, uses an efficient and reliable message-passing mechanism for platform-independent data exchange and integrates distributed systems on the basis of data communication. By providing message-passing and message-queuing models, it extends inter-process communication to a distributed environment. Commonly used message middleware currently includes ActiveMQ, RabbitMQ, RocketMQ, Kafka, ZeroMQ, and the like. Besides message middleware, other software capable of scheduling message delivery may also be employed.
In another embodiment, step S200, as shown in FIG. 3, further includes:
Step S210, if it is determined that the data volume of the second data at the first position is smaller than the first judgment threshold, continuing to store the first data to the first position.
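The threshold check of steps S200 and S210 might look like the following sketch. It is an illustration under stated assumptions rather than the patented implementation: Python's standard `queue.Queue` stands in for the message middleware, the `FirstPosition` class, the message dictionary and the `notified_volume` bookkeeping (used so that the same data is not reported twice) are hypothetical, and the threshold is a scaled-down value in bytes.

```python
import queue
from dataclasses import dataclass, field
from typing import List

FIRST_JUDGMENT_THRESHOLD = 100  # hypothetical value, in bytes


@dataclass
class FirstPosition:
    """In-memory stand-in for the first position (for example a database table)."""
    records: List[bytes] = field(default_factory=list)
    notified_volume: int = 0  # volume already reported to the scheduling center

    def pending_volume(self) -> int:
        """Volume of second data stored since the last first information was sent."""
        return sum(len(r) for r in self.records) - self.notified_volume


def store_and_notify(position: FirstPosition, batch: List[bytes],
                     scheduling_center: "queue.Queue[dict]") -> bool:
    """Store a batch; send first information once the threshold is reached (S200).

    Returns False when the volume is still below the threshold, in which case
    the caller simply keeps storing data (S210).
    """
    position.records.extend(batch)
    pending = position.pending_volume()
    if pending >= FIRST_JUDGMENT_THRESHOLD:
        scheduling_center.put({"type": "first_information", "volume": pending})
        position.notified_volume += pending
        return True
    return False
```

In a deployment the `put` call would be replaced by a publish operation of whichever message middleware is chosen; the in-process queue only keeps the sketch self-contained.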
Step S300, the scheduling center receives the first information and determines, according to the first information, whether to generate the first event; if the first event is generated, the scheduling center sends a first notification to a first subscription program subscribed to the first event, so that the first subscription program extracts the second data from the first position, processes the second data to obtain third data, stores the third data in the second position, and sends second information to the scheduling center if it determines that the data volume of the third data at the second position is greater than or equal to a second judgment threshold.
Specifically, after receiving the first information, the scheduling center sends a first notification to the first subscription program subscribed to the first event if it determines that the first event is generated. Each event may correspond to a plurality of subscription programs, but each subscription program can listen to only one notification broadcast by the scheduling center, which prevents the first subscription program from being executed in a loop. After the first notification has been sent to the corresponding first subscription program, the first subscription program extracts the second data from the first position, processes the second data accordingly to obtain third data, and stores the third data in the second position; once the data volume of the third data is determined to be greater than or equal to the second judgment threshold, it sends second information to the scheduling center.
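The behaviour of the scheduling center described above can be sketched as a small in-process publish/subscribe class. The class and method names are hypothetical, and the rule used here for deciding whether an event is generated (an event is generated only if the information type is mapped to an event that has at least one subscriber) is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Handler = Callable[[dict], None]


class SchedulingCenter:
    """Toy scheduling center: maps received information to events and notifies subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)
        self._subscribed_event: Dict[str, str] = {}  # subscription program -> its single event
        self._event_for_info: Dict[str, str] = {}    # information type -> event

    def map_information(self, info_type: str, event: str) -> None:
        """Declare which event a given type of information may generate."""
        self._event_for_info[info_type] = event

    def subscribe(self, program: str, event: str, handler: Handler) -> None:
        """Register a subscription program; each program may listen to one event only."""
        if program in self._subscribed_event:
            raise ValueError(
                f"{program} already listens to {self._subscribed_event[program]}"
            )
        self._subscribed_event[program] = event
        self._subscribers[event].append(handler)

    def receive(self, info_type: str, payload: dict) -> Optional[str]:
        """Handle incoming information; return the generated event, or None if none is generated."""
        event = self._event_for_info.get(info_type)
        if event is None or not self._subscribers.get(event):
            return None
        for handler in self._subscribers[event]:
            handler(payload)  # the notification sent to each subscription program
        return event
```

Rejecting a second subscription by the same program is one way to enforce the restriction that a subscription program listens to only one notification, which keeps a program from being triggered again by an event further down the chain.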
In another embodiment, in step S300, the first subscription program extracting the second data from the first position and processing the second data to obtain the third data includes the following steps, as shown in FIG. 4:
Step S310, determining that the first subscription program subscribed to the first event is a data cleaning program; the first subscription program extracts the second data from the first position, cleans the second data to obtain the third data and stores the third data in the second position, and sends the second information to the scheduling center if it determines that the data volume of the third data at the second position is greater than the second judgment threshold.
Step S320, determining that the first subscription program subscribed to the first event is a data modeling program; the first subscription program extracts the second data from the first position, models the second data to obtain the third data and stores the third data in the second position, and sends the second information to the scheduling center if it determines that the data volume of the third data at the second position is greater than the second judgment threshold.
In another embodiment, determining in step S300 whether the scheduling center generates the first event further includes the following steps, as shown in FIG. 5:
Step S330, the scheduling center receives the first information and determines, according to the first information, whether to generate a first event;
Step S340, when it is determined that the scheduling center does not generate the first event, acquiring the first subscription program that sent the first information, determining the third data obtained by the processing of that first subscription program, and caching the third data.
Specifically, the first subscription program extracts the second data from the first location and processes it to obtain the third data. The processing includes, but is not limited to, steps such as cleaning, processing, fusing, labeling, and modeling. The different processing programs are combined according to the data relationships between them to form a scheduling chain. For example: after receiving the first information, the scheduling center generates a first event and sends a first notification to a first subscription program that has subscribed to the first event. Here the first subscription program cleans data: it extracts the second data from the first location, cleans the second data to obtain third data, and stores the third data at a second location. When it determines that the data amount of the third data at the second location is greater than the second judgment threshold, it sends second information to the scheduling center. The scheduling center receives the second information, determines to generate a second event according to the second information, and determines a second subscription program that has subscribed to the second event. The second subscription program extracts the third data from the second location, models the third data to obtain fourth data, and sends the fourth data to a third location. Once the fourth data has been sent to the third location, the second subscription program sends third information to the scheduling center to indicate that modeling of the third data is complete. After receiving the third information, the scheduling center determines that no third event is generated: no subscription program has subscribed to a related event that follows the second subscription program's data processing, that is, the second subscription program is the final execution program of the whole event processing. At this point, the fourth data obtained by the second subscription program is cached for a third-party service to call, and execution of the scheduling chain ends. To prevent the scheduling chain from being executed cyclically, each event may correspond to multiple subscription programs, but each subscription program can monitor only one notification broadcast by the scheduling center.
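The scheduling chain in this example can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the class `SchedulingCenter`, the information and event names, the in-memory `store` dictionary standing in for the first, second and third locations, and the threshold value are all assumptions made for illustration.

```python
from collections import defaultdict


class SchedulingCenter:
    """Toy scheduling center: maps received information to events and notifies subscribers."""

    def __init__(self):
        self.event_for_info = {}               # information name -> event name
        self.subscribers = defaultdict(list)   # event name -> subscription programs
        self.event_of_program = {}             # subscription program -> its single event
        self.cache = {}                        # output of the final program of a chain

    def register_event(self, info, event):
        self.event_for_info[info] = event

    def subscribe(self, event, program):
        # Each program may monitor only one notification, which rules out cycles.
        if program in self.event_of_program:
            raise ValueError("a subscription program may monitor only one event")
        self.event_of_program[program] = event
        self.subscribers[event].append(program)

    def receive(self, info, sender=None, result=None):
        event = self.event_for_info.get(info)
        if event is None:
            # No event is generated: the sender is the final program of the chain,
            # so its output is cached (e.g. for a third-party service to call).
            if sender is not None:
                self.cache[sender] = result
            return
        for program in self.subscribers[event]:
            program(self)


# Hypothetical storage locations and threshold.
store = {"first": [" a ", "b", None, " c"], "second": [], "third": {}}
SECOND_THRESHOLD = 2


def clean(center):
    # First subscription program: clean the second data into third data.
    store["second"] = [s.strip() for s in store["first"] if s is not None]
    if len(store["second"]) > SECOND_THRESHOLD:
        center.receive("second_info", sender="clean", result=store["second"])


def model(center):
    # Second subscription program: "model" the third data into fourth data.
    store["third"] = {s: len(s) for s in store["second"]}
    center.receive("third_info", sender="model", result=store["third"])


center = SchedulingCenter()
center.register_event("first_info", "first_event")
center.register_event("second_info", "second_event")
center.subscribe("first_event", clean)
center.subscribe("second_event", model)
center.receive("first_info")   # starts the chain: clean -> model -> cache
```

Because no event is registered for `third_info`, the sketch treats `model` as the final program and caches its output in `center.cache`.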
In another embodiment, in step S300, as shown in fig. 6, the method further includes:
step S350, if the first subscription program determines that the data amount of the third data at the second location is smaller than the second determination threshold, the first subscription program continues to extract the second data from the first location, and the third data is obtained through processing and stored at the second location.
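A rough sketch of the loop implied by step S350 follows. The function name `run_subscription_program`, the use of a `deque` for the first location and a list for the second location, and the `send_info` callback are hypothetical; the sketch also assumes that data amount is measured as an item count and uses the "greater than or equal to" comparison from claim 1.

```python
from collections import deque


def run_subscription_program(first_location, second_location, process, threshold, send_info):
    """Extract, process and store items until the second location reaches the threshold.

    Returns True if the second information was sent, False if the first
    location ran out of data before the threshold was reached.
    """
    while first_location and len(second_location) < threshold:
        item = first_location.popleft()          # extract second data
        second_location.append(process(item))    # store third data
    if len(second_location) >= threshold:
        send_info("second_info")                 # notify the scheduling center
        return True
    return False


first = deque([1, 2, 3, 4, 5])
second = []
sent = []
done = run_subscription_program(first, second, lambda x: x * 10, 3, sent.append)
```

In this run the program stops after three items, leaving the remaining second data at the first location for a later pass.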
In one aspect, referring to fig. 7, the present embodiment provides a data scheduling apparatus 700, which at least includes: a first module 710, a second module 720, and a third module 730.
The first module 710 acquires first data, stores the first data at a first location as second data, and outputs the second data to the second module 720. The second module 720 compares the data amount of the second data with a first judgment threshold and, upon determining that the data amount is greater than or equal to the first judgment threshold, sends first information to the scheduling center. The third module 730 is configured to receive the first information, determine whether to generate a first event according to the first information, and, if the first event is generated, send a first notification to a first subscription program that has subscribed to the first event. The first subscription program extracts the second data from the first location, processes it accordingly (for example, by modeling or cleaning) to obtain third data, and stores the third data at the second location. It then compares the data amount of the third data at the second location with a second judgment threshold; upon determining that the data amount is greater than or equal to the second judgment threshold, it sends second information to the scheduling center to indicate that the program has finished executing, and the program subscribed to the corresponding event is then acquired according to the second information.
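The division of work among the three modules might be sketched as below. The class names mirror the module names in the text, but the methods, the `first_info` and `first_notification` strings, and the item-count measure of data amount are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ThirdModule:
    """Scheduling-center side: turns first information into a first notification."""
    subscribers: List[Callable[[str], None]] = field(default_factory=list)

    def receive(self, info: str) -> None:
        if info == "first_info":                  # the first event is generated
            for notify in self.subscribers:
                notify("first_notification")


@dataclass
class SecondModule:
    """Compares the data amount at the first location with the first threshold."""
    threshold: int
    center: ThirdModule

    def check(self, first_location: list) -> bool:
        if len(first_location) >= self.threshold:
            self.center.receive("first_info")
            return True
        return False


@dataclass
class FirstModule:
    """Acquires first data and stores it at the first location as second data."""
    first_location: list = field(default_factory=list)

    def store(self, first_data) -> None:
        self.first_location.extend(first_data)


notifications = []
third = ThirdModule(subscribers=[notifications.append])
second = SecondModule(threshold=4, center=third)
first = FirstModule()

first.store([1, 2])
below = second.check(first.first_location)    # not enough data yet
first.store([3, 4])
reached = second.check(first.first_location)  # threshold reached, subscribers notified
```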
Referring to fig. 8, the present embodiment provides an electronic device, which includes a processor 810 and a memory 820 coupled to the processor 810, wherein the memory 820 stores program instructions executable by the processor 810, and the processor 810 implements the data scheduling method when executing the program instructions stored in the memory 820. The processor 810 may also be referred to as a Central Processing Unit (CPU). The processor 810 may be an integrated circuit chip having signal processing capabilities. The processor 810 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The general purpose processor may be a microprocessor or any conventional processor. The memory 820 may include various components (e.g., machine-readable media) including, but not limited to, random access memory components, read-only components, and any combination thereof. The memory 820 may also store instructions (e.g., software stored on one or more machine-readable media) that implement the data scheduling method in the above embodiment. The electronic device is capable of loading and running the software system for data scheduling provided by the embodiment of the present invention, and may be, for example, a Personal Computer (PC), a mobile phone, a smart phone, a Personal Digital Assistant (PDA), a wearable device, a Pocket PC (PPC), a tablet computer, or the like.
The present embodiment provides a computer-readable storage medium storing a program executed by a processor to implement the data scheduling method described above.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions stored in a computer readable storage medium. The processor of the computer device may read the computer instructions from the computer readable storage medium and execute them, so that the computer device executes the aforementioned data scheduling method.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for scheduling data, comprising the steps of:
acquiring first data, and storing the first data to a first position as second data;
determining that the data volume of the second data at the first position is greater than or equal to a first judgment threshold value, and sending first information to a scheduling center;
the scheduling center receives the first information, determines whether to generate a first event according to the first information, and, if the first event is generated, sends a first notification to a first subscription program that has subscribed to the first event, so that the first subscription program extracts the second data from the first position, processes the second data to obtain third data, stores the third data in a second position, and, if the first subscription program determines that the data volume of the third data in the second position is greater than or equal to a second judgment threshold, sends second information to the scheduling center.
2. The data scheduling method of claim 1, wherein the obtaining the first data and storing the first data to the first location as the second data comprises at least one of the following steps:
acquiring first data, determining that the data volume of the first data is greater than or equal to a third judgment threshold, dividing the first data into one or more first batches of data, and storing the first batches of data to a first position as second data; wherein the data volume of the first batch of data is less than or equal to a first partition threshold value, the first partition threshold value representing a maximum data volume that can be processed each time the first batch of data is deposited to a first location;
or,
acquiring first data, determining that the data volume of the first data is smaller than the third judgment threshold, and continuing to acquire the first data.
3. The data scheduling method of claim 1, wherein the method further comprises:
if it is determined that the data volume of the second data at the first position is smaller than the first judgment threshold, continuing to store the first data to the first position.
4. The data scheduling method of claim 1, wherein the first subscribing program extracts the second data from the first location and processes the second data to obtain third data, and the method comprises the following steps:
if it is determined that the first subscription program subscribed to the first event cleans data, the first subscription program extracts the second data from the first position, cleans the second data to obtain third data, and stores the third data in the second position; and if the first subscription program determines that the data volume of the third data in the second position is greater than the second judgment threshold, the first subscription program sends second information to the scheduling center.
5. The data scheduling method of claim 1, wherein the first subscribing program extracts the second data from the first location and processes the second data to obtain third data, and the method comprises the following steps:
if it is determined that the first subscription program subscribed to the first event models data, the first subscription program extracts the second data from the first position, models the second data to obtain third data, and stores the third data in the second position; and if the first subscription program determines that the data volume of the third data in the second position is greater than the second judgment threshold, the first subscription program sends second information to the scheduling center.
6. The data scheduling method of claim 1, comprising:
the scheduling center receives the first information and determines whether to generate a first event according to the first information;
when it is determined that the scheduling center does not generate the first event, acquiring the first subscription program that sent the first information, determining the third data obtained by processing of the first subscription program, and caching the third data.
7. The data scheduling method of claim 1, wherein the method further comprises:
if the first subscription program determines that the data volume of the third data at the second position is smaller than the second judgment threshold, the first subscription program continues to extract the second data from the first position, processes it to obtain third data, and stores the third data in the second position.
8. A data scheduling apparatus, comprising:
the first module is used for acquiring first data and storing the first data to a first position as second data;
a second module, configured to determine that the data amount of the second data in the first position is greater than or equal to a first judgment threshold, and to send first information to a scheduling center;
a third module, configured such that the scheduling center, after receiving the first information, determines whether to generate a first event according to the first information and, if the first event is generated, sends a first notification to a first subscription program that has subscribed to the first event, so that the first subscription program extracts the second data from the first position, processes the second data to obtain third data, stores the third data in a second position, and, if the first subscription program determines that the data volume of the third data in the second position is greater than or equal to a second judgment threshold, sends second information to the scheduling center.
9. A computer device, comprising:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the data scheduling method of any one of claims 1 to 7.
10. A storage medium having stored therein program instructions which, when executed by a processor, enable a data scheduling method as claimed in any one of claims 1 to 7 to be implemented.
CN202210144350.XA 2022-02-17 2022-02-17 Data scheduling method, device, terminal and storage medium Pending CN114201319A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210144350.XA CN114201319A (en) 2022-02-17 2022-02-17 Data scheduling method, device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210144350.XA CN114201319A (en) 2022-02-17 2022-02-17 Data scheduling method, device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN114201319A true CN114201319A (en) 2022-03-18

Family

ID=80645575

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210144350.XA Pending CN114201319A (en) 2022-02-17 2022-02-17 Data scheduling method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN114201319A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522138A (en) * 2018-11-14 2019-03-26 北京中电普华信息技术有限公司 A kind of processing method and system of distributed stream data
CN109800204A (en) * 2018-12-27 2019-05-24 深圳云天励飞技术有限公司 Data distributing method and Related product
CN110069493A (en) * 2019-02-28 2019-07-30 平安科技(深圳)有限公司 Data processing method, device, computer equipment and storage medium
WO2021015739A1 (en) * 2019-07-23 2021-01-28 Hitachi Vantara Llc Systems and methods for collecting and sending real-time data
CN112395116A (en) * 2021-01-20 2021-02-23 北京东方通软件有限公司 Adjusting and optimizing method and system for message middleware
CN113760986A (en) * 2021-01-29 2021-12-07 北京沃东天骏信息技术有限公司 Data query method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
邬贺铨著: "《数据之道 从技术到应用》", 31 August 2019 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220318