CN117573777A - Data synchronization method and device and electronic equipment - Google Patents
- Publication number
- CN117573777A (CN202311605112.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The disclosure relates to a data synchronization method, a data synchronization device and electronic equipment, in the technical field of data synchronization processing. The method comprises the following steps: acquiring preset configuration information on a schedule, wherein the preset configuration information comprises a periodic task start time and a preset processing rule; when the current time equals the periodic task start time, acquiring initial data from each data source according to corresponding execution parameters by using a plurality of data acquisition programs; and processing the initial data according to the preset processing rule by using a data processing program to obtain processed target data. In this way, data can be acquired from a plurality of data sources simultaneously, and the synchronization time can be set and the data synchronization frequency controlled flexibly.
Description
Technical Field
The present disclosure relates to the field of data synchronization processing technologies, and in particular, to a data synchronization method, a data synchronization device, and an electronic device.
Background
In the development of current large systems or internet projects, the system environment typically includes a development environment, a testing environment, and a production environment. To reduce the number of repeated configurations and the data configuration error rate, basic data or configuration item data is generally configured in the development environment, then synchronously migrated to the testing environment for testing, and finally the tested data is synchronously migrated to the production environment. In addition, some configuration data in the system must be obtained from third parties, processed, and then stored in the database for use.
Existing data synchronization methods or systems rely mainly on change detection in the source database, so they cannot acquire data from multiple data sources at the same time. Moreover, after the changed data in the source database is captured and a message is sent to notify the target database, an SQL script is executed on the target database to update its data so that the data stored in the source and target databases is consistent; as a result, the synchronization time cannot be set flexibly and the data synchronization frequency cannot be controlled flexibly.
Disclosure of Invention
In view of this, the present application provides a data synchronization method, a data synchronization device and electronic equipment, mainly aiming to solve the technical problems that the prior art cannot acquire data from multiple data sources at the same time and cannot flexibly set the synchronization time or control the data synchronization frequency.
According to a first aspect of the present disclosure, there is provided a data synchronization method, the method comprising:
acquiring preset configuration information on a schedule, wherein the preset configuration information comprises a periodic task start time and a preset processing rule;
when the current time equals the periodic task start time, acquiring initial data from each data source according to corresponding execution parameters by using a plurality of data acquisition programs;
and processing the initial data according to the preset processing rule by using a data processing program to obtain processed target data.
According to a second aspect of the present disclosure, there is provided a data synchronization apparatus, the apparatus comprising:
a first acquisition module, configured to acquire preset configuration information on a schedule, wherein the preset configuration information comprises a periodic task start time and a preset processing rule;
an acquisition module, configured to acquire, when the current time equals the periodic task start time, initial data from each data source according to corresponding execution parameters by using a plurality of data acquisition programs;
and a processing module, configured to process the initial data according to the preset processing rule by using a data processing program to obtain processed target data.
According to a third aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect described above.
According to a fourth aspect of the present disclosure there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of the preceding first aspect.
Compared with the prior art, the data synchronization method, the data synchronization device and the electronic equipment provided by the disclosure acquire preset configuration information on a schedule, wherein the preset configuration information comprises a periodic task start time and a preset processing rule; when the current time equals the periodic task start time, acquire initial data from each data source according to corresponding execution parameters by using a plurality of data acquisition programs; and process the initial data according to the preset processing rule by using a data processing program to obtain processed target data. In this way, data can be acquired from a plurality of data sources simultaneously, and the synchronization time can be set and the data synchronization frequency controlled flexibly.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the present application are more readily apparent, specific embodiments of the present application are set forth below.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, a person skilled in the art can derive other drawings from these drawings without inventive effort.
Fig. 1 is a flow chart of a data synchronization method according to an embodiment of the disclosure;
Fig. 2 is a flow chart of a data synchronization method according to an embodiment of the disclosure;
Fig. 3 is a flow chart of a data synchronization method according to an embodiment of the disclosure;
Fig. 4 is a periodic task graph provided by an embodiment of the disclosure;
Fig. 5 is a schematic structural diagram of a data synchronization device according to an embodiment of the disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
The following describes a data synchronization method, a data synchronization device and an electronic device according to an embodiment of the disclosure with reference to the accompanying drawings.
The present disclosure provides a data synchronization method, a data synchronization device and electronic equipment, which not only support data acquisition from multiple data sources but also allow the synchronization time to be set and the data synchronization frequency to be controlled flexibly.
As shown in fig. 1, an embodiment of the present disclosure provides a data synchronization method, wherein the method may include:
Step 101, acquiring preset configuration information on a schedule, wherein the preset configuration information comprises a periodic task start time and a preset processing rule.
The preset configuration information may be a series of parameters and settings set for the task in the timed task platform, where the parameters and settings determine the execution time, execution period, execution mode, and the like of the task.
The preset configuration information may include the registration of the data acquisition programs and parameter configuration such as the execution parameters of the data acquisition programs, the preset processing rules for data processing, the periodic task start time, whether to retry after a failed execution, and the maximum number of retries after failure.
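As an illustration only, the configuration items listed above could be grouped into a small data structure. The following is a minimal sketch; the class names, field names and the use of a five-field cron expression for the periodic task start time are all assumptions made for this example and are not specified by the disclosure.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class CollectorConfig:
    """Registration entry for one data acquisition program (hypothetical shape)."""
    name: str
    execution_params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SyncTaskConfig:
    """Preset configuration of one data synchronization task (hypothetical shape)."""
    start_cron: str                    # periodic task start time, e.g. "0 * * * *" (on the hour)
    processing_rules: List[str]        # names of the preset processing rules
    collectors: List[CollectorConfig]  # registered data acquisition programs
    retry_on_failure: bool = False     # whether to retry after a failed execution
    max_retries: int = 0               # maximum number of retries after failure

    def validate(self) -> None:
        """Reject configurations that contradict themselves."""
        if len(self.start_cron.split()) != 5:
            raise ValueError("start_cron must have five fields")
        if self.max_retries < 0:
            raise ValueError("max_retries must be non-negative")
        if not self.retry_on_failure and self.max_retries > 0:
            raise ValueError("max_retries is set but retry_on_failure is False")
```

A platform following this sketch would load one `SyncTaskConfig` per task and call `validate()` before scheduling it.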
For the embodiment of the present disclosure, the execution body may be a data synchronization device or apparatus. The technical solution of the present disclosure is described below taking a timed task platform as an example of such a device; this does not constitute a specific limitation on the technical solution of the present disclosure.
The timed task platform is mainly responsible for timed monitoring, interaction with the data acquisition programs, visual configuration, task program registration, and monitoring and display of the whole process. It lets the user flexibly set the synchronization time and control the data synchronization frequency to suit different scenarios and requirements.
Step 102, when the current time equals the periodic task start time, acquiring initial data from each data source according to the corresponding execution parameters by using a plurality of data acquisition programs.
For the disclosed embodiments, in a timed task platform, one or more data acquisition programs may be registered for each data synchronization task and corresponding execution parameters configured so that the platform can identify and schedule the corresponding data acquisition tasks. The data acquisition programs are independent of each other and do not affect each other, and can process different types of data sources, such as a relational database, a non-relational database or data acquired by calling a third-party program.
When all configuration is complete, once the periodic task start time set by the user is reached, the timed task platform starts the relevant data acquisition programs to execute the data acquisition tasks according to the corresponding execution parameters, and the initial data is acquired from each data source. The initial data may be the initial-version data set acquired at the periodic task start time or in the corresponding state.
The periodic task start time may be the designated time point at which the task starts executing. For example, if a timed task is started on the hour every hour, its periodic start time is the top of each hour, which ensures the accuracy of the data synchronization time point;
the execution parameters guide the data acquisition program in acquiring data, processing data and interacting with other components when the data synchronization task is executed, and may include the acquisition range, the data acquisition frequency, the address of the data source, the conditions that the data to be acquired must meet, and the like, so as to meet the acquisition requirements of different data sources;
the data acquisition program reads the execution parameters stored in the database and, according to those parameters, runs the relevant program code and acquires the qualifying data from the data source.
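The trigger-and-collect behaviour of step 102 can be sketched as follows. This is only an illustrative sketch: it assumes an hourly period expressed as a minute of the hour, models each data acquisition program as a plain Python callable, and runs the programs in parallel threads; the names `is_start_time` and `run_collectors` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime
from typing import Any, Callable, Dict, List

# A data acquisition program is modelled as a callable that takes its
# execution parameters and returns the rows it collected (an assumption).
Collector = Callable[[Dict[str, Any]], List[dict]]


def is_start_time(now: datetime, start_minute: int = 0) -> bool:
    """True when `now` falls exactly on the configured minute of the hour."""
    return now.minute == start_minute and now.second == 0


def run_collectors(collectors: Dict[str, Collector],
                   params: Dict[str, Dict[str, Any]]) -> Dict[str, List[dict]]:
    """Run every registered collector with its own execution parameters.

    The collectors are independent of each other, so they are submitted to
    separate worker threads and their results are gathered by name.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(collectors))) as pool:
        futures = {name: pool.submit(fn, params.get(name, {}))
                   for name, fn in collectors.items()}
        return {name: fut.result() for name, fut in futures.items()}
```

In a real deployment the scheduling would normally be delegated to the timed task platform itself; the explicit time comparison here only mirrors the wording of the step.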
Step 103, processing the initial data according to the preset processing rule by using a data processing program to obtain processed target data.
The preset processing rules may be the execution rules of a preset data processing program, that is, guidelines specifying how data is cleaned, converted, integrated and so on. They may include operations such as cleaning, converting and merging data, so that the acquired data is preprocessed, its consistency, accuracy and integrity are ensured, and it has higher value during storage and analysis;
the data processing program may be a computer program that cleans, converts, integrates and analyzes the original data to meet specific requirements or goals;
the target data may be the data produced by the data processing program, whose quality, format and content meet the requirements of subsequent analysis, modeling, application and the like.
In summary, compared with the prior art, the data synchronization method provided by the present disclosure acquires preset configuration information on a schedule, wherein the preset configuration information comprises a periodic task start time and a preset processing rule; when the current time equals the periodic task start time, acquires initial data from each data source according to corresponding execution parameters by using a plurality of data acquisition programs; and processes the initial data according to the preset processing rule by using a data processing program to obtain processed target data. In this way, data can be acquired from a plurality of data sources simultaneously, and the synchronization time can be set and the data synchronization frequency controlled flexibly.
Further, as a refinement and extension of the foregoing embodiments, for a complete description of a specific implementation of the method of the present disclosure, the present disclosure provides a specific method as shown in fig. 2, where the method includes:
Step 201, acquiring preset configuration information on a schedule, wherein the preset configuration information comprises a periodic task start time and a preset processing rule.
For the embodiment of the disclosure, the existing problems are solved mainly by introducing a timed task platform, message queue middleware and a database;
As shown in Fig. 3, four parts, namely the data acquisition program, the data processing program, the data storage program and the timed task platform, form the whole data synchronization system. This division of labor and cooperation enables the system to acquire data from multiple data sources simultaneously and improves the flexibility and compatibility of data synchronization.
Message queue middleware is introduced for asynchronous communication between the programs, which removes the coupling between data processing on the one hand and data acquisition and data storage on the other. This approach makes the data processing process independent of the data acquisition and storage processes and improves the concurrent processing capacity and performance of the system.
A database is introduced to store information such as the execution parameters of the data acquisition programs, the preset processing rules and the program execution results, which facilitates unified management and monitoring.
The data acquisition program is responsible for acquiring data from each data source, storing the data into a cache and notifying the data processing program; the data processing program processes the data in the cache according to a preset processing rule, and sends the processed data to the data storage program for storage.
Step 202, when the current time equals the periodic task start time, acquiring initial data from each data source according to the corresponding execution parameters by using a plurality of data acquisition programs.
The data acquisition program and the other programs can run in the timed task platform, or on other servers or devices that communicate and cooperate with the timed task platform. The timed task platform is responsible for scheduling and monitoring the execution of these programs to ensure that the data synchronization tasks complete successfully. The running environment and data processing capacity of the data acquisition program can also be selected and configured according to actual requirements to meet the data synchronization needs of different scenarios.
For the embodiment of the disclosure, under the scheduling of the timed task platform, the data acquisition program can execute the data acquisition task according to the parameters and rules set by the user, and different data acquisition programs can acquire different types of data. The data source may be data in a relational database, such as MySQL or Oracle; data in a non-relational database, such as MongoDB or Redis; or data obtained by calling a third-party program.
Each data acquisition program needs a corresponding data source configuration, which may include information such as the address, access mode and data type of the data source; through the execution parameter configuration, the data acquisition program can acquire either full data or incremental data, where the full data may be all of the data and the incremental data may be the data newly added since the last acquisition;
the data acquisition program may also set acquisition conditions through the execution parameters in order to acquire data that meets specific requirements. For example, it may be configured to collect the data of a certain field, the data of a certain period of time, and so on.
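The difference between full and incremental acquisition, and the optional restriction to certain fields, can be sketched as a filter over rows. The sketch below assumes that every row carries an `updated_at` timestamp and that "incremental" means "changed since the last acquisition time"; both the column name and the function name `select_rows` are assumptions made for illustration.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List, Optional


def select_rows(rows: Iterable[Dict[str, Any]],
                mode: str = "full",
                last_collected_at: Optional[datetime] = None,
                fields: Optional[List[str]] = None) -> List[Dict[str, Any]]:
    """Return the rows one acquisition run should pick up.

    mode="full" keeps every row; mode="incremental" keeps only rows whose
    `updated_at` is later than `last_collected_at`. If `fields` is given,
    only those columns are kept in the result.
    """
    if mode not in ("full", "incremental"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "incremental" and last_collected_at is None:
        raise ValueError("incremental mode needs last_collected_at")
    selected = []
    for row in rows:
        if mode == "incremental" and row["updated_at"] <= last_collected_at:
            continue  # already collected in an earlier run
        selected.append({k: row[k] for k in fields} if fields else dict(row))
    return selected
```

Against a real database the same conditions would normally be pushed into the query rather than applied in memory.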
Step 203, calculating the data quantity of the initial data; under the condition that the data quantity is larger than 0, storing initial data into a cache system to obtain cache data; the cached data is delivered to the data processing program in the form of a message by using message middleware.
For the presently disclosed embodiments, if the data acquisition program does not acquire any data (i.e., in the case where the data amount of the initial data is equal to 0), the flow of the program ends.
If the data acquisition program acquires data, the data may be stored in a cache system, which holds the data temporarily for subsequent processing. After caching the data, the data acquisition program can send a message; the message middleware receives the message and delivers it to the data processing program. The message wakes the data processing program to start a new task, and the data processing program processes the cached data according to the data information contained in the message. Meanwhile, the data acquisition program can store information about its own execution in the database, such as the data acquisition duration and the total amount of data acquired.
The message may contain information about the cached data, such as the data source, data type and data volume. Message middleware (a message queue, MQ) is a technology for handling message passing and communication in a distributed system; it is an asynchronous communication mechanism that decouples applications by sending messages to queues.
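The flow of step 203 (count the data, cache it only if the count is greater than 0, then notify the data processing program by message) can be sketched with in-process stand-ins. Here a dictionary plays the role of the cache system and `queue.Queue` plays the role of the message middleware; a real deployment would use external services, and the function name and message fields are hypothetical.

```python
import queue
import uuid
from typing import Dict, List, Optional

cache: Dict[str, List[dict]] = {}   # stand-in for the cache system
messages: queue.Queue = queue.Queue()  # stand-in for the message middleware


def publish_collected(source: str, initial_data: List[dict]) -> Optional[str]:
    """Cache the collected data and announce it, or do nothing if it is empty.

    Returns the cache key of the stored batch, or None when the data amount
    is 0 and the flow ends without caching or messaging.
    """
    if len(initial_data) == 0:
        return None
    key = f"{source}:{uuid.uuid4().hex}"
    cache[key] = initial_data
    # The message carries information about the cached data, not the data itself.
    messages.put({"cache_key": key, "source": source, "count": len(initial_data)})
    return key
```

The data processing program would then block on the queue, read `cache_key` from each message and fetch the corresponding batch from the cache.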
Step 204, determining the number of threads in the thread pool; when the data processing program receives the message, the cache data is subjected to data processing by utilizing threads in the thread pool according to a preset processing rule through a first preset processing algorithm, and the processed target data is obtained.
The thread pool may be a tool for allocating, scheduling and managing threads. In the data processing program, the thread pool can distribute tasks among multiple threads so that several data tasks are processed at the same time and the next task can start without waiting for the previous one to finish. This makes full use of the computer's multi-core processor, shortens the total data processing time and increases the processing speed.
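A minimal sketch of distributing data tasks over a thread pool, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The function name `process_batches` and the idea of splitting the cached data into batches are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def process_batches(batches: Sequence[list],
                    process_one: Callable[[list], list],
                    n_threads: int) -> List[list]:
    """Process each batch on a worker thread and return results in input order."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # Executor.map schedules all batches at once and yields results
        # in the same order as the inputs.
        return list(pool.map(process_one, batches))
```

Because CPython threads share one interpreter lock, this arrangement helps most when the per-batch work waits on IO, which matches the waiting-time consideration discussed for sizing the pool.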
For the embodiment of the disclosure, after the data processing program receives the message sent by the message middleware, the data processing program wakes up, reads a preset processing rule from the database, and cleans and processes the original data (i.e., the cached data) in the cache by using a predefined algorithm (i.e., a first preset processing algorithm) to obtain processed target data (i.e., target database storage format data). The first preset processing algorithm may be an algorithm set in a data processing program, and is used for converting or processing data; the first preset processing algorithm can be stored in a data processing program and can be used for realizing simple conversion from one data model to another data model, or can be used for converting the original data model into another data model through complex algorithm operation; the target data can be data which meets the storage requirement of the target database after being cleaned and processed.
The preset processing rules can be used for identifying and removing abnormal values, error values and repeated values in the cache data, so that the quality of the data is improved; the method can also be used for realizing conversion among different data formats, so that the data meets the requirements of a target database or an analysis tool; and may also be used to direct the merging, aggregation, and correlation of data from different data sources for further analysis.
In the embodiment of the present disclosure, the preset processing rule may be an operation guide, describing how to process the cached data; the first preset processing algorithm may be a specific implementation method, and may perform cleaning, conversion, integration, etc. on the buffered data according to preset processing rules. The preset processing rules are used for providing guidance for a first preset processing algorithm, and the first preset processing algorithm can execute specific operations according to the preset processing rules. In an actual application scene, the preset processing rule and the first preset processing algorithm can jointly ensure the quality and accuracy of the cache data in the processing process.
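The relationship between rules (what to do) and the processing algorithm (how it is carried out) can be sketched by modelling each rule as a function over a list of records and the algorithm as the code that applies the rules in order. The three example rules and all names below are hypothetical and only illustrate cleaning, de-duplication and format conversion.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Rule = Callable[[List[Record]], List[Record]]


def drop_missing(field: str) -> Rule:
    """Cleaning rule: remove records whose `field` is absent or None."""
    return lambda records: [r for r in records if r.get(field) is not None]


def dedupe(key: str) -> Rule:
    """Cleaning rule: keep only the first record for each value of `key`."""
    def rule(records: List[Record]) -> List[Record]:
        seen, out = set(), []
        for r in records:
            if r[key] not in seen:
                seen.add(r[key])
                out.append(r)
        return out
    return rule


def rename(old: str, new: str) -> Rule:
    """Conversion rule: rename a column to match the target storage format."""
    return lambda records: [
        {(new if k == old else k): v for k, v in r.items()} for r in records
    ]


def apply_rules(records: List[Record], rules: List[Rule]) -> List[Record]:
    """Processing algorithm: apply the configured rules in order."""
    for rule in rules:
        records = rule(records)
    return records
```

For example, applying `[drop_missing("nm"), dedupe("id"), rename("nm", "name")]` to cached records removes incomplete rows, then duplicates, then renames the column for the target table.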
For the embodiment of the disclosure, a reasonably sized thread pool (that is, the number of threads in the thread pool) may be the optimal number of threads configured for a specific application scenario according to factors such as system resources, task load and performance requirements. A reasonably sized thread pool makes full use of system resources while avoiding the problems caused by too many or too few threads.
Specifically, the data processing algorithm further includes an important algorithm for determining a reasonably sized thread pool; the algorithm formula (the second preset processing algorithm) is as follows:
where N_threads represents the number of threads in the thread pool, N_cpu represents the number of processor cores, U_cpu represents the processor resource utilization rate, W represents the program waiting time, and C represents the program processing time;
a larger number of processor (for example, CPU) cores represents greater computing power. When the processor resource utilization rate is high, the number of threads may need to be increased to raise the data processing speed; when it is low, the number of threads can be reduced appropriately to lessen contention for system resources. The program waiting time is the time the program spends waiting (for example, waiting for the result of an IO operation); the longer the waiting time, the greater the program's overhead in waiting for IO operations, and the number of threads may need to be increased to raise the data processing speed. The program processing time is the time the program actually occupies the processor for computation; the longer the computation time, the greater the program's computational overhead, and the number of threads may need to be increased to raise the data processing speed.
The monitoring data of the program can be collected with the JVisualVM tool. Calculating the ratio between the program waiting time and the program processing time makes it possible to evaluate the relation between the program's overhead in waiting for IO operations and its actual computation overhead, and the thread pool size is adjusted accordingly.
By the algorithm, the data processing program generates a thread pool with reasonable size, and can process data in batch so as to improve the data processing capacity and speed.
Accordingly, a specific process of determining the number of threads in a thread pool may include: acquiring processor data and program data of a data storage program, wherein the processor data comprises a processor core number and a processor resource utilization rate, and the program data comprises a program waiting time and a program processing time;
substituting the processor core number, the processor resource utilization rate, the program waiting time and the program processing time into a second preset processing algorithm to determine the number of threads in the thread pool.
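The formula of the second preset processing algorithm is not reproduced in the text above. A widely cited thread pool sizing rule uses exactly the four quantities defined there, N_threads = N_cpu * U_cpu * (1 + W / C), and the sketch below assumes that this is the intended relation; that assumption, the function name and the rounding choice are not confirmed by the disclosure.

```python
import math


def thread_pool_size(n_cpu: int, u_cpu: float,
                     wait_time: float, compute_time: float) -> int:
    """Estimate the thread count as N_cpu * U_cpu * (1 + W / C).

    n_cpu        -- number of processor cores (N_cpu)
    u_cpu        -- processor utilization rate as a fraction in (0, 1] (U_cpu)
    wait_time    -- program waiting time, e.g. time blocked on IO (W)
    compute_time -- program processing time actually spent on the CPU (C)

    W and C only enter as a ratio, so any common time unit works.
    The result is rounded down and never less than 1 (a choice made here).
    """
    if n_cpu < 1:
        raise ValueError("n_cpu must be at least 1")
    if not 0 < u_cpu <= 1:
        raise ValueError("u_cpu must be in (0, 1]")
    if wait_time < 0 or compute_time <= 0:
        raise ValueError("wait_time must be >= 0 and compute_time > 0")
    return max(1, math.floor(n_cpu * u_cpu * (1 + wait_time / compute_time)))
```

Under this assumed formula, 8 cores at a utilization rate of 0.5 with a waiting time nine times the processing time give 8 * 0.5 * (1 + 9) = 40 threads, while a purely computational program (W = 0) on 4 fully used cores gets 4 threads.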
Step 205, the processed target data is sent to a data storage program for data storage, and the data storage program is used for receiving the target data and updating or inserting the target data to realize data storage.
Wherein, the data storage can be a process of storing target data in a computer system, including storage, management, maintenance and the like of the data; the data storage program may be a program responsible for storing target data to a database or other storage system.
The data processing program sends the processed target data to the data storage program for storage, and the data storage program is responsible for receiving the target data processed by the data processing program and updating or inserting the target data into a corresponding data storage structure, so that the data can be efficiently stored and managed.
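The update-or-insert behaviour of the data storage program can be sketched as follows. This is an illustration under stated assumptions, not the storage program of the disclosure: the table name `target_data`, its columns, and the function `upsert_target` are hypothetical, and an in-memory SQLite database stands in for whatever storage system is actually used.

```python
import sqlite3


def upsert_target(conn: sqlite3.Connection, record_id: int, payload: str) -> str:
    """Update the row if it exists, otherwise insert it; return the action taken."""
    cur = conn.execute(
        "UPDATE target_data SET payload = ? WHERE id = ?", (payload, record_id)
    )
    if cur.rowcount == 0:
        # No existing row matched, so this is new target data.
        conn.execute(
            "INSERT INTO target_data (id, payload) VALUES (?, ?)",
            (record_id, payload),
        )
        action = "inserted"
    else:
        action = "updated"
    conn.commit()
    return action


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target_data (id INTEGER PRIMARY KEY, payload TEXT)")
first = upsert_target(conn, 1, "v1")   # row does not exist yet
second = upsert_target(conn, 1, "v2")  # same key, so the row is updated
stored = conn.execute("SELECT payload FROM target_data WHERE id = 1").fetchone()[0]
```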
After each thread task performs the storage operation, the execution duration can be stored in a database, which is helpful for recording the task execution condition and is convenient for subsequent analysis, monitoring and optimizing the system performance.
Step 206, acquiring a preset task execution period configured in a data acquisition program; and under the condition that the preset task execution period is unreasonable, carrying out early warning reminding on the preset task execution period, and optimizing the preset task execution period.
The preset task execution period may be the specified time interval between successive executions of a task, that is, the time from the previous execution of the task to the start of the next execution. For example, if a timed task is executed once every minute, its execution period is 1 minute. In practical applications, the preset task execution period can be adjusted as required to achieve more flexible task scheduling.
For the embodiment of the disclosure, as the data sources collected by each data synchronization task are different, the data collection and processing rules are different, and the total data amount of each data collection is also different, the execution period of the preset task can be continuously adjusted and optimized by analyzing and calculating the data synchronization task, so that each execution effect of each data synchronization task is optimal.
For the embodiment of the disclosure, the method for judging that the preset task execution period is unreasonable may include obtaining the number of tasks, the number of times of execution and the execution time in the preset task execution period;
calculating the overcycle of the execution cycle of the preset task corresponding to all the data acquisition programs, and calculating the deadline corresponding to the data acquisition programs according to the execution times and the execution time;
substituting the preset task execution period and the overcycle into a task number calculation formula to obtain the maximum task number in the overcycle, and substituting the preset task execution period and the execution time into a load calculation formula to obtain a system load;
if at least one of the following is satisfied: the number of tasks is greater than the maximum number of tasks, the system load is greater than 1, or the execution time is greater than the deadline, the preset task execution period is determined to be unreasonable.
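The three-part test above can be sketched as a small function. The function name `unreasonable_reasons` and the wording of the returned messages are hypothetical; only the three comparisons come from the description.

```python
from typing import List


def unreasonable_reasons(task_count: int, max_task_count: int,
                         system_load: float, execution_time: float,
                         deadline: float) -> List[str]:
    """Return the conditions that make the preset task execution period unreasonable."""
    reasons = []
    if task_count > max_task_count:
        reasons.append("task count exceeds maximum task count")
    if system_load > 1:
        reasons.append("system load exceeds 1")
    if execution_time > deadline:
        reasons.append("execution time exceeds deadline")
    return reasons


# The period is unreasonable as soon as at least one condition is met.
overloaded = unreasonable_reasons(10, 19, 1.275, 10.0, 20.0)
healthy = unreasonable_reasons(10, 19, 0.8, 10.0, 20.0)
```

Returning the list of violated conditions, rather than a single boolean, lets an early-warning reminder say which condition triggered it.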
In a specific application scenario, assume that the timing task platform registers N data acquisition programs T1, T2, …, TN, and that the preset task execution period configured for the i-th data acquisition program Ti is Pi. The start time of the j-th run of the data acquisition program is then (j-1)·Pi (j = 1, 2, 3, …, N″), and the relative deadline of that run is Dij = j·Pi. The least common multiple of the execution periods of all the data acquisition programs, that is, the super period (overcycle) of the task set, is denoted SLCM. Because the situation of the data acquisition programs is the same in every super period, only the range [0, SLCM] needs to be studied. The maximum number of tasks in a super period is:
wherein the overcycle (super period) may be the least common multiple of the preset task execution periods of all the data acquisition programs; the least common multiple is the smallest positive integer that is a multiple of each of two or more integers.
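The task number calculation formula itself does not survive in this text. As an assumption consistent with the definitions above, the sketch below takes the maximum number of tasks to be the sum of SLCM / Pi over all data acquisition programs, since a program with period Pi runs SLCM / Pi times in one super period. The function names are hypothetical, and the periods of 20, 40 and 50 ms are illustrative inputs.

```python
from functools import reduce
from math import gcd
from typing import Sequence


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def super_period(periods: Sequence[int]) -> int:
    """Least common multiple of all preset task execution periods."""
    return reduce(lcm, periods)


def max_task_count(periods: Sequence[int]) -> int:
    """Assumed formula: sum over i of SLCM / Pi (runs per super period)."""
    s_lcm = super_period(periods)
    return sum(s_lcm // p for p in periods)


periods_ms = [20, 40, 50]
s_lcm = super_period(periods_ms)         # one super period, in ms
n_max = max_task_count(periods_ms)       # runs released within it
```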
In a specific application scenario, assume that n data acquisition programs execute n acquisition tasks Ti (i = 1, …, n), where the execution time of Ti is Ci and its period is Pi. Without regard to other system overhead, the acquisition programs can be scheduled on the processor by the deadline priority algorithm if the following condition is satisfied: ρ = C1/P1 + C2/P2 + … + Cn/Pn ≤ 1.
Wherein the execution time may be the time required to complete a particular task, here a data acquisition program, that is, the time elapsed from the start of execution of the task to its completion. It may be expressed as: execution time = total time actually spent completing the task / number of task executions. ρ represents the system load; for example, if the calculated system load is 1.275 > 1, the task set is not schedulable (overloaded), and the system cannot guarantee that all tasks are completed before their deadlines.
In a specific application scenario, t is the current time of the system, Cr is the estimated execution time of the data acquisition program Ti, and Dij is the deadline of the j-th run of the data acquisition program Ti. If d = Dij - (t + Cr) ≥ 0, the deadline of the current run of the acquisition program can be met. Thus, at dispatch time, d is calculated for the ready task to be dispatched; if the condition holds, the task is executed, and otherwise it is not.
Here, the deadline priority algorithm (Earliest Deadline First, EDF) is an algorithm for scheduling tasks with deadlines in real-time systems, in which task priorities are assigned dynamically according to deadlines: the earlier the deadline, the higher the priority. Without regard to other system overhead, an acquisition program can be executed by the processor if:
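The load check can be illustrated with a short sketch. It assumes the system load is the sum of Ci / Pi over all acquisition tasks, which is the standard utilization condition for earliest-deadline-first scheduling and reproduces the 1.275 figure above for execution times of 10, 15 and 20 ms with periods of 20, 40 and 50 ms. The function name `system_load` is hypothetical; exact fractions are used to avoid floating-point rounding in the comparison with 1.

```python
from fractions import Fraction
from typing import Sequence


def system_load(execution_times: Sequence[int], periods: Sequence[int]) -> Fraction:
    """Sum of Ci / Pi over all acquisition tasks (assumed load formula)."""
    if len(execution_times) != len(periods):
        raise ValueError("each task needs one execution time and one period")
    return sum((Fraction(c, p) for c, p in zip(execution_times, periods)),
               Fraction(0))


load = system_load([10, 15, 20], [20, 40, 50])  # 1/2 + 3/8 + 2/5
schedulable = load <= 1                         # False means overloaded
```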
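The dispatch-time check d = Dij - (t + Cr) ≥ 0 can be sketched as below. The function names and the sample times are hypothetical; the deadline of the j-th run is taken as j·Pi, as defined earlier in the description.

```python
def relative_deadline(j: int, period: float) -> float:
    """Deadline Dij = j * Pi of the j-th run (j starts at 1)."""
    return j * period


def can_meet_deadline(now: float, estimated_execution_time: float,
                      deadline: float) -> bool:
    """True when d = Dij - (t + Cr) is non-negative."""
    d = deadline - (now + estimated_execution_time)
    return d >= 0


# Second run of a program with a 20 ms period: deadline at 40 ms.
deadline_ms = relative_deadline(2, 20.0)
on_time = can_meet_deadline(now=25.0, estimated_execution_time=10.0,
                            deadline=deadline_ms)
too_late = can_meet_deadline(now=35.0, estimated_execution_time=10.0,
                             deadline=deadline_ms)
```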
the execution time of the acquisition program does not exceed the deadline of the acquisition program, and the acquisition program completes execution before the deadline, so that the real-time requirement is met.
In processing real-time tasks, deadline priority algorithms prioritize those tasks that are near deadlines, ensuring that they complete execution within a specified time. The scheduling strategy can improve the real-time performance of the system and ensure that each task is executed according to the expected time sequence.
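The priority rule of the deadline priority algorithm can be sketched with a min-heap keyed on deadline. This is a simplified, non-preemptive illustration that assumes all tasks are ready at the same moment; the function name `edf_order` and the sample deadlines are hypothetical.

```python
import heapq
from typing import List, Tuple


def edf_order(ready_tasks: List[Tuple[float, str]]) -> List[str]:
    """Return task names in dispatch order, earliest deadline first.

    Each entry is (deadline, task_name); the heap always yields the
    entry with the smallest deadline next.
    """
    heap = list(ready_tasks)
    heapq.heapify(heap)
    order = []
    while heap:
        _deadline, name = heapq.heappop(heap)
        order.append(name)
    return order


order = edf_order([(50.0, "C"), (20.0, "A"), (40.0, "B")])
```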
In a specific application scenario, for example, the timing task platform registers three data acquisition programs A, B and C. Their configured task criticalities Ki are medium, low and high, respectively; their configured execution periods Pi are 20 ms, 40 ms and 50 ms, respectively; and, according to the execution parameters stored in the database, the longest durations Ci of their past executions are 10 ms, 15 ms and 20 ms, respectively, as shown in fig. 4.
For the embodiment of the present disclosure, the following can be calculated from the number of data acquisition programs registered on the timing task platform, the preset task execution period and task criticality configured for each data acquisition program, and the execution durations of past runs of each data acquisition program stored in the database: the maximum number of tasks in the overcycle, whether the programs can be executed by the processor, and whether a given acquisition task can be executed. If at least one of the following is satisfied: the number of tasks is greater than the maximum number of tasks, the system load is greater than 1, or the execution time is greater than the deadline, the configured preset task execution period is determined to be unreasonable. The system then sends a reminder to modify the task period execution time, which can be modified after the reminder is received, so that the timing task period execution time is continuously optimized.
For the disclosed embodiments, the present application proposes the concept of supporting multiple data sources, which essentially is a set of different data collection, processing, and storage procedures, supporting the requirement of diversified collection, where the data sources may be data in a relational database, data in a cache database, or data obtained by requesting a certain interface, etc.
The method integrates data synchronization and timing tasks, and centralized management is carried out on the data synchronization program through the timing task management platform. The platform provides visual configuration, and can configure the execution parameters of the acquisition program, the processing rules of the data processing program, the execution period of the program and the like.
The data processing algorithm and the thread pool size generation algorithm are added in the data processing step, so that data with different requirements and different capacity sizes can be processed.
In the method, whether the timing task execution time for data synchronization is reasonable can be determined by calculation according to an algorithm; if it is not reasonable, a modification reminder is sent, and the execution time is then modified to achieve optimization.
In a traditional data synchronization system, a single system can support only a single data source acquisition requirement. The present method and system introduce a timing task management platform on which acquisition programs for different data sources can be registered. This solves the single data source problem, allows different acquisition programs to be managed uniformly, and allows the platform to report, according to an algorithm, whether the timing task execution time for data synchronization is reasonable.
In summary, compared with the prior art, the data synchronization method provided by the present disclosure acquires preset configuration information at regular intervals, where the preset configuration information includes a period starting task time and a preset processing rule; when the current time is equal to the period starting task time, acquires initial data from each data source according to corresponding execution parameters by using a plurality of data acquisition programs; and processes the initial data according to the preset processing rule by using a data processing program to obtain processed target data. In this way, data acquisition from a plurality of data sources is supported simultaneously, and the synchronization time can be set flexibly and the data synchronization frequency controlled.
Based on the specific implementation of the method shown in fig. 1 and fig. 2, this embodiment provides a data synchronization device, as shown in fig. 5, including: a first obtaining module 31, an acquisition module 32 and a processing module 33;
the first obtaining module 31 is configured to obtain preset configuration information at regular intervals, where the preset configuration information includes: a period starting task time and a preset processing rule;
the acquisition module 32 is configured to acquire initial data in each data source according to corresponding execution parameters by using a plurality of data acquisition programs when the current time is equal to the period starting task time;
and a processing module 33, configured to process the initial data according to the preset processing rule by using a data processing program, so as to obtain processed target data.
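The cooperation of the three modules can be sketched as a single function. This is a simplified illustration under stated assumptions: the function `run_sync_once`, the collector callables, the source names and the normalization rule are all hypothetical stand-ins for the data acquisition programs, data sources and preset processing rule.

```python
from datetime import datetime
from typing import Callable, Dict, List


def run_sync_once(now: datetime, start_time: datetime,
                  collectors: Dict[str, Callable[[], List[dict]]],
                  rule: Callable[[dict], dict]) -> List[dict]:
    """Collect from every data source and apply the rule once the start time is reached."""
    if now != start_time:
        return []  # not yet the period starting task time
    initial_data = []
    for source_name, collect in collectors.items():
        for record in collect():
            # Tag each record with the data source it came from.
            initial_data.append({"source": source_name, **record})
    return [rule(record) for record in initial_data]


# Hypothetical data sources and a hypothetical processing rule.
collectors = {
    "relational_db": lambda: [{"id": 1, "name": " alice "}],
    "http_interface": lambda: [{"id": 2, "name": "BOB"}],
}
rule = lambda r: {**r, "name": r["name"].strip().lower()}
start = datetime(2023, 11, 28, 0, 0)
target = run_sync_once(start, start, collectors, rule)
```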
In a specific application scenario, as shown in fig. 5, the apparatus further includes: a calculation module 34, a storage module 35, a transfer module 36, and a termination module 37;
a calculation module 34 for calculating a data amount of the initial data;
a storage module 35, configured to store the initial data into a cache system to obtain cache data when the data size is greater than 0;
a transfer module 36 for transferring the buffered data to the data processing program in the form of a message using message middleware;
a termination module 37, configured to terminate the processing flow if the data amount of the initial data is equal to 0.
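The hand-off performed by modules 34 to 37 can be sketched as follows. A dictionary stands in for the cache system and `queue.Queue` stands in for the message middleware; both are assumptions for illustration, as are the function name `hand_off` and the choice to send the cache key as the message body.

```python
import queue
from typing import Dict, List

cache: Dict[str, List[dict]] = {}            # stand-in for the cache system
messages: "queue.Queue[str]" = queue.Queue()  # stand-in for message middleware


def hand_off(batch_id: str, initial_data: List[dict]) -> bool:
    """Cache non-empty data and notify the processing program; otherwise stop."""
    data_amount = len(initial_data)
    if data_amount == 0:
        return False  # nothing collected: terminate the processing flow
    cache[batch_id] = initial_data
    messages.put(batch_id)  # the message tells the consumer which batch to read
    return True


sent = hand_off("batch-1", [{"id": 1}, {"id": 2}])
skipped = hand_off("batch-2", [])
received = messages.get_nowait()
```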
In a specific application scenario, the processing module 33 may be configured to determine the number of threads in the thread pool; when the data processing program receives the message, the data processing program processes the cached data according to the preset processing rule by utilizing threads in the thread pool through a first preset processing algorithm, and the processed target data is obtained.
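Processing cached data with threads from a pool can be sketched with the standard `concurrent.futures` module. The first preset processing algorithm is not specified in this text, so a plain callable stands in for the preset processing rule; the function name `process_cached` and the sample rule are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_cached(cached_data: List[T], rule: Callable[[T], R],
                   thread_count: int) -> List[R]:
    """Apply the processing rule to every cached record using a thread pool."""
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # map() returns results in the same order as the input records.
        return list(pool.map(rule, cached_data))


target_data = process_cached([1, 2, 3], lambda x: x * 2, thread_count=2)
```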
In a specific application scenario, the processing module 33 may be configured to obtain processor data and program data of the data storage program, where the processor data includes a processor core number and a processor resource utilization rate, and the program data includes a program waiting duration and a program processing duration;
substituting the processor core number, the processor resource utilization rate, the program waiting time and the program processing time into a second preset processing algorithm to determine the number of threads in the thread pool.
In a specific application scenario, as shown in fig. 5, the apparatus further includes: a storage module 38, a second acquisition module 39, and a reminder module 40;
the storage module 38 is configured to send the processed target data to a data storage program for data storage, where the data storage program is configured to receive the target data, and perform an update or insert operation on the target data to implement data storage;
a second obtaining module 39, configured to obtain a preset task execution period configured in the data acquisition program;
the reminding module 40 is configured to perform early warning reminding on the preset task execution period and optimize the preset task execution period when the preset task execution period is not reasonable.
In a specific application scenario, the reminding module 40 may be configured to obtain the number of tasks, the number of execution times and the execution time in the preset task execution period;
calculating the overcycles of the execution cycles of the preset tasks corresponding to all the data acquisition programs, and calculating the deadlines corresponding to the data acquisition programs according to the execution times and the execution time;
substituting the preset task execution period and the overcycle into a task number calculation formula to obtain the maximum task number in the overcycle, and substituting the preset task execution period and the execution time into a load calculation formula to obtain a system load;
and if at least one of the following is satisfied: the number of tasks is greater than the maximum number of tasks, the system load is greater than 1, or the execution time is greater than the deadline, determining that the preset task execution period is unreasonable.
It should be noted that, for other corresponding descriptions of each functional unit of the data synchronization device provided in this embodiment, reference may be made to the corresponding descriptions of the methods in fig. 1 and fig. 2, which are not repeated herein.
Based on the above-described methods as shown in fig. 1 and 2, the present disclosure also provides a computer-readable storage medium having a computer program stored thereon, which when executed by a processor, implements the above-described methods as shown in fig. 1 and 2.
Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing a computer device (may be a personal computer, a server, or a network device, etc.) to execute the method of each implementation scenario of the present disclosure.
Based on the methods shown in fig. 1 and fig. 2 and the virtual device embodiment shown in fig. 5, in order to achieve the above objects, the embodiment of the disclosure further provides an electronic device, which may be configured on an end side of a vehicle (such as an electric automobile), and the device includes a storage medium and a processor; a storage medium storing a computer program; a processor for executing a computer program to implement the method as shown in fig. 1 and 2 described above.
Optionally, the physical device may further include a user interface, a network interface, a camera, radio frequency (Radio Frequency, RF) circuitry, sensors, audio circuitry, a Wi-Fi module, and the like. The user interface may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (for example, a Wi-Fi interface), and the like.
It will be appreciated by those skilled in the art that the above-described physical device structure provided by the present disclosure is not limiting of the physical device, and may include more or fewer components, or may combine certain components, or a different arrangement of components.
The storage medium may also include an operating system, a network communication module. The operating system is a program that manages the physical device hardware and software resources described above, supporting the execution of information handling programs and other software and/or programs. The network communication module is used for realizing communication among all components in the storage medium and communication with other hardware and software in the information processing entity equipment.
From the above description of embodiments, it will be apparent to those skilled in the art that the present disclosure may be implemented by means of software plus a necessary general hardware platform, or may be implemented by hardware. Compared with the prior art, the data synchronization method, the data synchronization device and the electronic equipment provided by the disclosure acquire preset configuration information at regular intervals, where the preset configuration information includes a period starting task time and a preset processing rule; when the current time is equal to the period starting task time, acquire initial data from each data source according to corresponding execution parameters by using a plurality of data acquisition programs; and process the initial data according to the preset processing rule by using a data processing program to obtain processed target data. In this way, data acquisition from a plurality of data sources is supported simultaneously, and the synchronization time can be set flexibly and the data synchronization frequency controlled.
It should be noted that in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A method of data synchronization, the method comprising:
acquiring preset configuration information at regular intervals, wherein the preset configuration information comprises: a period starting task time and a preset processing rule;
when the current time is equal to the period starting task time, acquiring initial data in each data source according to corresponding execution parameters by utilizing a plurality of data acquisition programs;
and processing the initial data according to the preset processing rule by using a data processing program to obtain processed target data.
2. The method of claim 1, wherein after said acquiring initial data in each data source in accordance with the corresponding execution parameters using a plurality of data acquisition programs, the method further comprises:
calculating the data quantity of the initial data;
storing the initial data into a cache system to obtain cache data under the condition that the data volume is larger than 0;
and using message middleware to transfer the cache data to the data processing program in the form of a message.
3. The method according to claim 2, wherein the method further comprises:
in the case where the data amount of the initial data is equal to 0, the process flow is terminated.
4. The method according to claim 2, wherein the processing the initial data by the data processing program according to the preset processing rule to obtain the processed target data includes:
determining the number of threads in a thread pool;
when the data processing program receives the message, the data processing program processes the cached data according to the preset processing rule by utilizing threads in the thread pool through a first preset processing algorithm, and the processed target data is obtained.
5. The method of claim 4, wherein determining the number of threads in the thread pool comprises:
acquiring processor data and program data of the data storage program, wherein the processor data comprises a processor core number and a processor resource utilization rate, and the program data comprises a program waiting time and a program processing time;
substituting the processor core number, the processor resource utilization rate, the program waiting time and the program processing time into a second preset processing algorithm to determine the number of threads in the thread pool.
6. The method of claim 4, wherein after said data processing is performed on said buffered data to obtain said target data after processing, said method further comprises:
and sending the processed target data to a data storage program for data storage, wherein the data storage program is used for receiving the target data and carrying out updating or inserting operation on the target data to realize data storage.
7. The method according to claim 1, wherein the method further comprises:
acquiring a preset task execution period configured in the data acquisition program;
and under the condition that the preset task execution period is unreasonable, carrying out early warning reminding on the preset task execution period, and optimizing the preset task execution period.
8. The method of claim 7, wherein the determining that the preset task execution period is unreasonable comprises:
acquiring the task number, the execution times and the execution time in the preset task execution period;
calculating the overcycles of the execution cycles of the preset tasks corresponding to all the data acquisition programs, and calculating the deadlines corresponding to the data acquisition programs according to the execution times and the execution time;
substituting the preset task execution period and the overcycle into a task number calculation formula to obtain the maximum task number in the overcycle, and substituting the preset task execution period and the execution time into a load calculation formula to obtain a system load;
and if at least one of the following is satisfied: the number of tasks is greater than the maximum number of tasks, the system load is greater than 1, or the execution time is greater than the deadline, determining that the preset task execution period is unreasonable.
9. A data synchronization device, the device comprising:
the first obtaining module is used for acquiring preset configuration information at regular intervals, wherein the preset configuration information comprises: a period starting task time and a preset processing rule;
the acquisition module is used for acquiring initial data in each data source according to corresponding execution parameters by utilizing a plurality of data acquisition programs when the current time is equal to the period starting task time;
and the processing module is used for processing the initial data according to the preset processing rule by utilizing a data processing program to obtain processed target data.
10. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311605112.5A CN117573777A (en) | 2023-11-28 | 2023-11-28 | Data synchronization method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311605112.5A CN117573777A (en) | 2023-11-28 | 2023-11-28 | Data synchronization method and device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117573777A (en) | 2024-02-20 |
Family
ID=89893461
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311605112.5A Pending CN117573777A (en) | 2023-11-28 | 2023-11-28 | Data synchronization method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117573777A (en) |
- 2023-11-28: CN application CN202311605112.5A (patent/CN117573777A/en), active, Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||