CN112749197B - Data fragment refreshing method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112749197B
Authority
CN
China
Prior art keywords
data
refreshing
code
date
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110036533.5A
Other languages
Chinese (zh)
Other versions
CN112749197A (en)
Inventor
周兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202110036533.5A
Publication of CN112749197A
Application granted
Publication of CN112749197B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24552Database cache management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • G06F16/24554Unary operations; Data partitioning operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention relates to the field of big data and discloses a data fragment refreshing method, device, equipment and storage medium. The method comprises the following steps: configuring a slice refreshing parameter and defining an initialization task, wherein the slice refreshing parameter comprises an initialization task name, a mechanism code and a refreshing date; when a data refreshing request of a client is monitored, judging whether the data refreshing request points to the initialization task; if it does, executing single-mechanism data refreshing, and reading the mechanism code and the refreshing date in the slice refreshing parameter; transmitting the mechanism code and the refreshing date into a preset running number script; and executing the running number script to load the mechanism data corresponding to the mechanism code and the refreshing date into a cache of the corresponding service system. The present invention also relates to blockchain technology, the mechanism data being stored in the blockchain. The invention realizes data refreshing for a single mechanism and reduces the waste of computer resources.

Description

Data fragment refreshing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of big data, and in particular, to a method, an apparatus, a device, and a storage medium for refreshing a data slice.
Background
To cope with changes in the external market environment or the internal organizational environment, organizational adjustments often need to be made, and business systems must then keep the data of every organization up to date. Therefore, a running business system often needs to refresh its data in order to obtain each organization's latest data. The existing refreshing mechanism completely clears the cached data in the service system and then rewrites the new full data. In practical applications, however, the full data does not need to be refreshed every time, so this approach wastes a large amount of computing resources.
Disclosure of Invention
The invention mainly aims to solve the technical problem that a service system cannot support single-mechanism data refreshing.
The first aspect of the present invention provides a data slice refreshing method, including:
configuring a slice refreshing parameter and defining an initialization task for data exchange, wherein the slice refreshing parameter comprises: an initialization task name, a mechanism code and a refreshing date;
when a data refreshing request of a client is monitored, judging whether the data refreshing request points to the initialization task;
if the data refreshing request points to the initialization task, refreshing the single-mechanism data based on the initialization task, and reading the mechanism code and the refreshing date in the slice refreshing parameter;
and transmitting the mechanism code and the refreshing date into a preset running number script, executing the running number script, and loading the mechanism data corresponding to the mechanism code and the refreshing date into a cache of a corresponding service system.
Optionally, in a first implementation manner of the first aspect of the present invention, after the judging whether the data refreshing request points to the initialization task when the data refreshing request of the client is monitored, the method further includes:
if the data refreshing request does not point to the initialization task, refreshing the full-mechanism data, and setting the mechanism code to a preset code;
and transmitting the preset code and the refreshing date into the running number script, and executing the running number script to load all mechanism data corresponding to the refreshing date into a cache of a corresponding service system.
Optionally, in a second implementation manner of the first aspect of the present invention, before the configuring the slice refresh parameter and defining the initialization task, the method further includes:
modifying the table structure used for storing the mechanism data in the cache so that it is partitioned by secondary mechanism;
and modifying the processing code so that it supports receiving the mechanism code parameter and stores the data into the secondary mechanism partition, so that the running number script is compatible with both single-mechanism data refreshing and full-mechanism data refreshing.
Optionally, in a third implementation manner of the first aspect of the present invention, before the configuring the slice refresh parameter and defining the initialization task, the method further includes:
an organization code configuration table is created in the metadata base for storing organization code parameters that need to be refreshed.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the executing the running number script and loading the mechanism data corresponding to the mechanism code and the refreshing date into a cache of the corresponding service system includes:
executing the running number script, and searching a preset database for the cache node and the memory fragment that correspond to the mechanism code and the refreshing date;
determining index information of the mechanism data to be loaded according to the cache node, and acquiring corresponding mechanism data from the corresponding memory fragments according to the index information;
and loading the organization data into a cache of a corresponding service system.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the data slice refreshing method further includes:
determining the data type of each organization data according to the field header of the organization data;
generating a monitoring index during refreshing of the mechanism data according to the data type;
performing statistics on the designated fields in the corresponding mechanism data according to the monitoring indexes to obtain monitoring values when the mechanism data are refreshed;
and comparing the monitoring value with a preset result value, determining mechanism data for refreshing errors, and generating a data refreshing report for display.
Optionally, in a sixth implementation manner of the first aspect of the present invention, the counting, according to the monitoring indicator, the designated field in the corresponding institution data, to obtain the monitoring value when the institution data is refreshed includes:
extracting a field value of a specified field in the mechanism data, and judging the type of the monitoring index, wherein the type of the monitoring index comprises a first type monitoring index and a second type monitoring index;
if the monitoring index is a first type monitoring index, accumulating field values of specified fields in corresponding mechanism data, and taking the accumulated field values as monitoring values when the mechanism data are refreshed;
if the monitoring index is a second type monitoring index, calculating an average value of the field values of the designated field in the corresponding mechanism data, and taking the average value as the monitoring value when the mechanism data is refreshed.
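The monitoring steps above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the function names `monitoring_value` and `find_refresh_errors`, the labels `"sum"` (standing in for the first type monitoring index) and `"avg"` (standing in for the second type), the field name `premium`, the sample data and the tolerance are all hypothetical.

```python
from statistics import mean


def monitoring_value(records, field, index_type):
    """Compute the monitoring value for one mechanism's refreshed records.

    "sum" stands in for the first type monitoring index (accumulate the
    field values); "avg" stands in for the second type (average them).
    Both labels are hypothetical.
    """
    values = [row[field] for row in records]
    if index_type == "sum":
        return sum(values)
    if index_type == "avg":
        return mean(values)
    raise ValueError(f"unknown monitoring index type: {index_type}")


def find_refresh_errors(data_by_code, field, index_type, expected, tol=1e-9):
    """Return the mechanism codes whose monitoring value differs from the preset result value."""
    errors = []
    for code, records in data_by_code.items():
        actual = monitoring_value(records, field, index_type)
        if abs(actual - expected[code]) > tol:
            errors.append(code)
    return errors


# Hypothetical refreshed data for two mechanisms.
refreshed = {
    "D001": [{"premium": 10.0}, {"premium": 20.0}],
    "D002": [{"premium": 5.0}, {"premium": 7.0}],
}
preset_results = {"D001": 30.0, "D002": 15.0}
report = find_refresh_errors(refreshed, "premium", "sum", preset_results)
print(report)
```

The returned list plays the role of the data refreshing report: it names the mechanisms whose refreshed data does not match the preset result value.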
The second aspect of the present invention provides a data slice refreshing device, including:
the configuration module is used for configuring the slice refreshing parameters and defining an initialization task for data exchange, wherein the slice refreshing parameters comprise: an initialization task name, a mechanism code and a refreshing date;
the judging module is used for judging whether the data refreshing request points to the initialization task when the data refreshing request of the client is monitored;
the shunting module is used for refreshing the single-mechanism data based on the initialization task if the data refreshing request points to the initialization task, and reading the mechanism code and the refreshing date in the slice refreshing parameters;
and the execution module is used for transmitting the mechanism code and the refreshing date into a preset running number script, executing the running number script and loading the mechanism data of the mechanism code corresponding to the refreshing date into a cache of a corresponding service system.
Optionally, in a first implementation manner of the second aspect of the present invention, the data slice refreshing device further includes:
the shunting module is also used for refreshing the full-mechanism data and setting the mechanism code to a preset code if the data refreshing request does not point to the initialization task;
and transmitting the preset code and the refreshing date into the running number script, and executing the running number script to load all mechanism data corresponding to the refreshing date into a cache of a corresponding service system.
Optionally, in a second implementation manner of the second aspect of the present invention, the data slice refreshing device further includes:
the first modification module is used for modifying the table structure used for storing the mechanism data in the cache so that it is partitioned by secondary mechanism;
and the second modification module is used for modifying the processing code so that it supports receiving the mechanism code parameter and stores the data into the secondary mechanism partition, so that the running number script is compatible with both single-mechanism data refreshing and full-mechanism data refreshing.
Optionally, in a third implementation manner of the second aspect of the present invention, the data slice refreshing device further includes:
the creation module is used for creating an organization code configuration table in the metadata base for storing the organization code parameters that need to be refreshed.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the execution module is further configured to:
executing the running number script, and searching a preset database for the cache node and the memory fragment that correspond to the mechanism code and the refreshing date;
determining index information of the mechanism data to be loaded according to the cache node, and acquiring corresponding mechanism data from the corresponding memory fragments according to the index information;
and loading the organization data into a cache of a corresponding service system.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the data slice refreshing device further includes:
the determining module is used for determining the data type of each organization data according to the field head of the organization data;
the generation module is used for generating monitoring indexes during the refreshing of the mechanism data according to the data types;
the statistics module is used for carrying out statistics on specified fields in corresponding mechanism data according to the monitoring indexes to obtain monitoring values when the mechanism data are refreshed;
and the comparison module is used for comparing the monitoring value with a preset result value, determining mechanism data for refreshing errors, and generating a data refreshing report for display.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the statistics module includes:
the extraction unit is used for extracting a field value of a specified field in the organization data and judging the type of the monitoring index, wherein the type of the monitoring index comprises a first type monitoring index and a second type monitoring index;
the accumulation unit is used for accumulating the field value of the designated field in the corresponding mechanism data if the monitoring index is the first type monitoring index, and taking the accumulated field value as the monitoring value when the mechanism data is refreshed;
and the average value calculation unit is used for calculating the average value of the field values of the designated field in the corresponding mechanism data if the monitoring index is the second type monitoring index, and taking the average value as the monitoring value when the mechanism data is refreshed.
A third aspect of the present invention provides a data slice refreshing apparatus, comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the data slice refreshing device to perform the data slice refreshing method described above.
A fourth aspect of the present invention provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the data slice refresh method described above.
According to the technical scheme provided by the invention, corresponding mechanism codes are set for different mechanisms, and the mechanism codes of the data needing to be refreshed are written into the configuration table and carried by the initialization task, wherein the configuration table structure is presented in a secondary mechanism partition mode so as to support the refreshing of the data of a single mechanism; when the calling program executes the data refreshing task, determining whether to carry out single-mechanism data refreshing by judging whether the data refreshing task is the initializing task; when the single-mechanism data refreshing is carried out, the corresponding mechanism codes in the configuration table are transmitted into the running number script in the program to execute the data refreshing task of the corresponding mechanism, so that the single-mechanism data is generated, the data refreshing of the single mechanism is realized, and the waste of computer resources is reduced.
Drawings
FIG. 1 is a schematic diagram of a first embodiment of a data slice refreshing method according to an embodiment of the present invention;
FIG. 2 is a diagram showing a second embodiment of a data slice refreshing method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a third embodiment of a data slice refreshing method according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a fourth embodiment of a data slice refreshing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of a data slice refreshing apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another embodiment of a data slice refreshing apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an embodiment of a data slice refreshing apparatus according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a data fragment refreshing method, device, equipment and storage medium. Fragment refreshing parameters are configured and an initialization task is defined, wherein the fragment refreshing parameters comprise an initialization task name, a mechanism code and a refreshing date; when a program is called to execute a data refreshing task, whether the data refreshing task executed by the program is the initialization task is judged; if it is the initialization task, it is determined that the program executes single-mechanism data refreshing, and the mechanism code and the refreshing date in the fragment refreshing parameters are read; the mechanism code and the refreshing date are transmitted into the running number script of the program; and the running number script is executed to load the mechanism data corresponding to the mechanism code and the refreshing date into the cache of the corresponding service system. The present invention also relates to blockchain technology, the mechanism data being stored in the blockchain. The invention realizes data refreshing for a single mechanism and reduces the waste of computer resources.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, a specific flow of an embodiment of the present invention is described below. Referring to FIG. 1, a first embodiment of the data slice refreshing method in the embodiment of the present invention includes:
101. configuring a slice refreshing parameter and defining an initialization task for data exchange, wherein the slice refreshing parameter comprises: an initialization task name, a mechanism code and a refreshing date;
It is to be understood that the execution body of the present invention may be a data slice refreshing device, and may also be a terminal or a server, which is not limited herein. The embodiment of the invention is described by taking a server as an execution main body as an example. It is emphasized that to further ensure the privacy and security of the organization data, the organization data may also be stored in nodes of a blockchain.
In this embodiment, the initialization task may be an ETL initialization task, where ETL (Extract-Transform-Load) refers to obtaining data from a source end and then extracting, transforming and loading it to a destination end. Among the slice refreshing parameters, the ETL initialization task name is a developer-defined task name for data acquisition, which may contain information such as a mechanism number and a development category number; the mechanism code is used as a limiting condition for acquiring target data from the source database, so as to realize data slice refreshing for a single mechanism; the refreshing date sets the specific execution time of the ETL initialization task, which on the one hand matches the sliced acquisition and use of data, and on the other hand schedules the timing of database queries when data is refreshed in slices, thereby improving the extraction efficiency of the database.
In addition, the ETL initialization task defines the rules for extracting, transforming and loading data, together with a newly added operation type parameter for data refreshing, which distinguishes a single-mechanism data refreshing type from a full-mechanism data refreshing type. Specifically, defining the initialization task includes the following steps:
(1) Configuring a Linkdo task, namely an initialization task with operation type parameters, for which a cyclic execution time can be set; for example, the data of AUTOPRM_ETL_DEPT_PARAMETER is truncated at 11:30 every day;
(2) Using common_RUN.sh to receive the operation type parameter task_freq transmitted by Linkdo, and defining task_freq = 'O' as data slice refreshing of a single mechanism;
(3) Judging in common_run:
1. when task_freq = 'O' (the value 'O' may be set according to the specific situation), reading the mechanism codes of one or more single mechanisms, and assigning them to the parameter AUTOPRM_DEPT_CODE;
2. when task_freq <> 'O', directly assigning '2' to the parameter AUTOPRM_DEPT_CODE, where '2' indicates that data refreshing of the full mechanism is carried out;
3. when task_freq = 'O' and no mechanism code of any single mechanism is recorded, the program exits with an error.
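The three-way judgment in step (3) can be sketched as follows. This is a minimal illustration in Python rather than the shell script named in the text; the function name `resolve_dept_code`, the exception `MissingDeptCodeError`, and the comma-separated encoding of several mechanism codes are assumptions. The sketch also assumes that any value of task_freq other than 'O' means a full-mechanism refresh.

```python
FULL_REFRESH_CODE = "2"    # preset code from the text: refresh all mechanisms
SINGLE_REFRESH_FREQ = "O"  # operation type from the text: single-mechanism refresh


class MissingDeptCodeError(RuntimeError):
    """Hypothetical error for case 3; a real script would exit with a non-zero status."""


def resolve_dept_code(task_freq, configured_codes):
    """Return the value to assign to AUTOPRM_DEPT_CODE.

    configured_codes is a hypothetical list of mechanism codes read from
    the configuration table.
    """
    if task_freq != SINGLE_REFRESH_FREQ:
        # Case 2: any other operation type means a full-mechanism refresh.
        return FULL_REFRESH_CODE
    if not configured_codes:
        # Case 3: single-mechanism refresh requested but nothing is configured.
        raise MissingDeptCodeError("task_freq='O' but no mechanism code is recorded")
    # Case 1: pass the configured mechanism code(s) on to the running number script.
    return ",".join(configured_codes)


print(resolve_dept_code("O", ["D001", "D002"]))
print(resolve_dept_code("D", []))
```

Raising an exception instead of calling `sys.exit` keeps the function easy to test; the caller decides how to terminate.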
102. When a data refreshing request of a client is monitored, judging whether the data refreshing request points to the initialization task or not;
In this embodiment, the ETL program is a conventional program for data extraction, transformation and loading to which a discrimination rule has been added for choosing between single-mechanism slice processing and full-mechanism processing. Specifically, after a data refreshing request is received, the incoming parameter task_freq is received through common_run.sh. When the data refreshing request points to the preset initialization task, the data refreshing task executed by the ETL program is the initialization task, that is, single-mechanism data refreshing is performed; otherwise, full-mechanism data refreshing is performed directly.
103. If the data refreshing request points to the initialization task, refreshing the single-mechanism data based on the initialization task, and reading the mechanism code and the refreshing date in the slice refreshing parameter;
In this embodiment, when it is determined that the data refreshing task executed by the current ETL program is the initialization task, data refreshing is performed for the one or more single mechanisms of the initialization task. The initialization task is provided with a configuration table in which the mechanism code and the refreshing date for the data refreshing are recorded. The mechanism code may be an organization code registered with the government, a custom code assigned to each organization within an enterprise, or the like. The refreshing date may be that of a single refreshing mode, in which a fixed year-month-day hour:minute:second is set, and when the refreshing date is reached, the ETL program refreshes the data of the corresponding mechanism once. The refreshing date may also be that of a multiple fixed refreshing mode, in which the data refreshing times are recorded in sequence in the same form as a single refreshing date. In addition, the refreshing date may be that of a cyclic refreshing mode, in which a start refreshing time, a refreshing period and an end refreshing time are recorded, where the refreshing period may be a day, a week, a month, or the like.
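The three refreshing date modes can be illustrated with a short sketch. The function names and sample timestamps are hypothetical, and the cyclic mode here only handles periods expressible as a fixed `timedelta` (days or weeks); a monthly period would need calendar arithmetic that this sketch does not attempt.

```python
from datetime import datetime, timedelta


def single_refresh(when):
    """Single refreshing mode: one fixed year-month-day hour:minute:second."""
    return [when]


def fixed_refreshes(times):
    """Multiple fixed refreshing mode: several fixed times, kept in order."""
    return sorted(times)


def cyclic_refreshes(start, end, period):
    """Cyclic refreshing mode: every `period` from start to end, inclusive."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += period
    return times


daily = cyclic_refreshes(
    datetime(2021, 1, 12, 11, 30, 0),
    datetime(2021, 1, 14, 11, 30, 0),
    timedelta(days=1),
)
for trigger in daily:
    print(trigger.isoformat())
```

Each returned timestamp is a trigger time at which the ETL program would start the refresh for the configured mechanism.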
104. Transmitting the mechanism code and the refreshing date into a preset running number script, executing the running number script, and loading the mechanism data corresponding to the mechanism code and the refreshing date into a cache of a corresponding service system.
In this embodiment, the ETL program uses the running number script to refresh the data of the corresponding mechanism according to the mechanism code and the refreshing date recorded in the configuration table of the initialization task. The refreshing date is the trigger time: when it is reached, the running number script is started, the mechanism code is used as an index value, and the corresponding data is obtained from the source database. The running number script then refreshes the data according to the extraction, transformation and loading rules defined by the initialization task, stores the refreshed mechanism data in the cache of the configured target service system, and identifies it with the mechanism code. This completes the directional storage of the refreshed mechanism data and makes it easy to look up the mechanism data and check its validity and integrity.
Read and write plug-ins can be configured with the mechanism code as an index, where each read plug-in corresponds to a data source and each write plug-in corresponds to the cache of a service system. Specifically, the data can be read and written through a topology in a Storm cluster, and the persistence of the data and the refreshing of the mechanism data can be realized using Hadoop, either in HDFS (Hadoop Distributed File System) or in the Hadoop database HBase (a distributed database).
In the embodiment of the invention, corresponding mechanism codes are set for different mechanisms, and the mechanism codes of the data needing to be refreshed are written into a configuration table and carried by an ETL initialization task, wherein the configuration table structure is presented in a secondary mechanism partition mode so as to support the data refreshing of a single mechanism; when the ETL program is called to execute a data refreshing task, determining whether to carry out single-mechanism data refreshing by judging whether the data refreshing task is the initialization task; when the single-mechanism data refreshing is carried out, the corresponding mechanism codes in the configuration table are transmitted into the running number script in the ETL program so as to execute the data refreshing task of the corresponding mechanism, generate the single-mechanism data, realize the data refreshing of the single mechanism and reduce the waste of computer resources.
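The overall flow of this embodiment can be sketched end to end. This is a toy model under stated assumptions, not the ETL program itself: the source database and the service system cache are plain dictionaries keyed by mechanism code, `run_script` stands in for the running number script, and the request and task field names are hypothetical.

```python
FULL_REFRESH_CODE = "2"  # preset code from the text: refresh all mechanisms


def run_script(source, cache, dept_code, refresh_date):
    """Stand-in for the running number script: reload matching mechanisms into the cache."""
    for code, rows in source.items():
        if dept_code == FULL_REFRESH_CODE or code == dept_code:
            cache[code] = [row for row in rows if row["date"] == refresh_date]


def handle_refresh_request(request, init_task, source, cache):
    """Branch between single-mechanism and full-mechanism refreshing."""
    if request["task_name"] == init_task["task_name"]:
        # The request points to the initialization task: refresh one mechanism only.
        run_script(source, cache, init_task["dept_code"], init_task["refresh_date"])
    else:
        # Any other request: refresh every mechanism using the preset code.
        run_script(source, cache, FULL_REFRESH_CODE, request["refresh_date"])


source = {
    "D001": [{"date": "2021-01-12", "v": 1}, {"date": "2021-01-11", "v": 0}],
    "D002": [{"date": "2021-01-12", "v": 2}],
}
cache = {"D001": ["stale"], "D002": ["stale"]}
init_task = {"task_name": "ETL_INIT", "dept_code": "D001", "refresh_date": "2021-01-12"}

handle_refresh_request({"task_name": "ETL_INIT"}, init_task, source, cache)
print(cache)
```

After the single-mechanism request only the `D001` entry is replaced; the `D002` entry is left untouched, which is the resource saving the embodiment describes.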
Referring to fig. 2, a second embodiment of a data slice refreshing method according to an embodiment of the present invention includes:
201. configuring a slice refreshing parameter and defining an initialization task for data exchange, wherein the slice refreshing parameter comprises: an initialization task name, a mechanism code and a refreshing date;
202. when a data refreshing request of a client is monitored, judging whether the data refreshing request points to the initialization task or not;
203. If the data refreshing request does not point to the initialization task, refreshing the full-mechanism data, and setting the mechanism code to a preset code;
In this embodiment, only the initialization task is configured with the single-mechanism data refreshing function. For the other tasks, conventional full-mechanism data refreshing is performed, and the refreshing is controlled by a preset code configured in advance, so that single-mechanism data refreshing is distinguished from full-mechanism data refreshing.
204. Transmitting the preset code and the refreshing date into the running number script, and executing the running number script to load all mechanism data corresponding to the refreshing date into a cache of a corresponding service system;
In this embodiment, the preset code controls the running number script of the ETL program to execute full-mechanism data refreshing, and the refreshing date controls the execution time of the running number script.
In this embodiment, the preset code is an execution code for refreshing the data of all mechanisms; through the preset code, the running number script of the ETL program can recognize that the data of all mechanisms needs to be refreshed. Full-mechanism data refreshing can be carried out as a daily task, in which case the refreshing date is usually set in the cyclic refreshing mode with a period in units of days, weeks or months. Alternatively, full-mechanism data refreshing can be tied to a stage of data modeling or to the release of a new product version, in which case the refreshing date is usually set as a condition, that is, full-mechanism data refreshing is executed after the data modeling stage is finished or the new product version is released.
205. If the data refreshing request points to the initialization task, refreshing the single-mechanism data based on the initialization task, and reading the mechanism code and the refreshing date in the slice refreshing parameter;
206. and transmitting the mechanism code and the refreshing date into a preset running number script, executing the running number script, and loading the mechanism data of the mechanism code corresponding to the refreshing date into a cache of a corresponding service system.
In this embodiment, the mechanism data to be refreshed in slices is stored in a preset database. For each mechanism, the cache node of the mechanism data to be loaded can be found through the mechanism code; the cache node stores the index information of the mechanism data to be loaded and a pointer to the memory fragment. The index information or the pointer points to the memory fragment that holds the mechanism data to be loaded, that is, to the region of the preset database in which that mechanism data is stored. The specific implementation steps are as follows:
(1) Executing the running number script, and searching a preset database for the cache node and the memory fragment that correspond to the mechanism code and the refreshing date;
(2) Determining index information of the mechanism data to be loaded according to the cache node, and acquiring corresponding mechanism data from the corresponding memory fragments according to the index information;
(3) And loading the organization data into a cache of a corresponding service system.
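Steps (1) to (3) above can be sketched roughly as follows. This is a minimal illustration only: the `CacheNode` class, the dictionaries standing in for the preset database, and the sample rows are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheNode:
    """Hypothetical cache node: index information plus a pointer to a memory fragment."""
    index_key: str    # index information of the mechanism data to be loaded
    fragment_id: int  # pointer to the memory fragment (an area of the preset database)


# Hypothetical in-memory stand-ins for the preset database.
CACHE_NODES = {
    ("201", "2021-01-12"): CacheNode(index_key="201/2021-01-12", fragment_id=0),
    ("205", "2021-01-12"): CacheNode(index_key="205/2021-01-12", fragment_id=1),
}
MEMORY_FRAGMENTS = {
    0: {"201/2021-01-12": [{"policy_id": "P1", "premium": 100.0}]},
    1: {"205/2021-01-12": [{"policy_id": "P2", "premium": 250.0}]},
}


def load_mechanism_data(dept_code, refresh_date, business_cache):
    """Load one mechanism's data for one refresh date into the business cache."""
    # (1) Find the cache node and memory fragment for this code and date.
    node = CACHE_NODES[(dept_code, refresh_date)]
    fragment = MEMORY_FRAGMENTS[node.fragment_id]
    # (2) Use the node's index information to read the data from the fragment.
    rows = fragment[node.index_key]
    # (3) Load the data into the cache of the corresponding service system.
    business_cache[(dept_code, refresh_date)] = rows
    return rows
```

In this sketch a missing mechanism code or date raises `KeyError`; a real running number script would presumably report such a case rather than fail silently.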
In the embodiment of the invention, the process of executing the full-mechanism data refreshing by the ETL is introduced, the full-mechanism data refreshing is executed by the ETL program under the control of the special mechanism code for the full-mechanism data refreshing, and the data is stored in the corresponding partition in the modified cache space, so that the opening of a separate channel for the full-mechanism refreshing is realized.
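The branch between single-mechanism and full-mechanism refreshing described in this embodiment can be sketched as below. The function names and the dictionary layout of the fragment refreshing parameters are assumptions made for illustration; the preset code value "2" follows the AUTOPRM_DEPT_CODE example given in the third embodiment.

```python
# Preset code for full-mechanism refresh (value assumed from the AUTOPRM_DEPT_CODE example).
FULL_REFRESH_CODE = "2"


def resolve_refresh_parameters(request_task, slice_params):
    """Return the (mechanism code, refresh date) pair to pass to the running number script."""
    if request_task == slice_params["init_task_name"]:
        # The request points to the initialization task: single-mechanism refresh.
        return slice_params["dept_code"], slice_params["refresh_date"]
    # Otherwise: full-mechanism refresh, with the mechanism code set to the preset code.
    return FULL_REFRESH_CODE, slice_params["refresh_date"]


def run_refresh(request_task, slice_params, run_script):
    """Resolve the parameters and hand them to a caller-supplied script runner."""
    dept_code, refresh_date = resolve_refresh_parameters(request_task, slice_params)
    return run_script(dept_code, refresh_date)
```

Passing the script runner in as a callable keeps the dispatch logic independent of how the ETL program is actually invoked, which the text does not specify.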
Referring to fig. 3, a third embodiment of a data slice refreshing method according to an embodiment of the present invention includes:
301. modifying a table structure for storing the mechanism data in the cache to be partitioned according to a secondary mechanism;
In this embodiment, the table in the cache for storing the mechanism data may be a Hive table in Hadoop. The structure of the mechanism data may be mapped into the table and displayed in the form of a table structure, where the primary-mechanism level may be identified by the enterprise code, and the secondary-mechanism partitions use the mechanism code as their identifier.
302. Modifying the processing code so that it supports receiving a mechanism coding parameter, and storing the processing code into the secondary mechanism partition, so that the running number script is compatible with both single-mechanism data refreshing and full-mechanism data refreshing;
In this embodiment, when the program does not need to refresh a single mechanism, the parameter 2 is passed in as AUTOPRM_DEPT_CODE, so that the first condition of the where clause is always true. When the program needs to refresh a single mechanism, the first condition is not met, and the second condition restricts the rows to the given mechanism code, as follows: where ('{AUTOPRM_DEPT_CODE}' = '2' or sec_department_code in ('{AUTOPRM_DEPT_CODE}')).
303. Creating a mechanism coding configuration table in the metadata base for storing mechanism coding parameters to be refreshed;
In addition, the mechanism coding configuration table stores the mechanism coding parameters to be refreshed, with different mechanism coding parameters separated by English (half-width) commas. For example, if the mechanism coding parameter of mechanism A is 201, that of mechanism B is 205, and that of mechanism C is 208, the mechanism coding configuration table may be presented as {205,201,208}, and the data corresponding to mechanism B, mechanism A, and mechanism C is then refreshed in that order according to the mechanism coding configuration table.
304. Configuring a slice refreshing parameter and defining an initialization task for data exchange, wherein the slice refreshing parameter comprises: initializing a task name, a mechanism code and a refreshing date;
305. when a data refreshing request of a client is monitored, judging whether the data refreshing request points to the initialization task or not;
306. if the initialization task is pointed, refreshing the single-mechanism data based on the initialization task, and reading the mechanism code and the refreshing date in the fragment refreshing parameter;
307. and transmitting the mechanism code and the refreshing date into a preset running number script, executing the running number script, and loading the mechanism data of the mechanism code corresponding to the refreshing date into a cache of a corresponding service system.
In the embodiment of the invention, the table structure of the storage mechanism data is modified into a secondary mechanism partition, the modified processing codes for adapting single mechanism refreshing and full mechanism refreshing are stored into the secondary mechanism partition, and a mechanism coding configuration table is created so as to realize the logic that the ETL can simultaneously support single mechanism data refreshing and full mechanism data refreshing.
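The compatibility logic of step 302 can be illustrated with a short sketch. The column name `sec_department_code`, the parameter name `AUTOPRM_DEPT_CODE`, and the preset value "2" come from the embodiment; the helper functions, the digit check, and the sample rows are assumptions added for illustration.

```python
FULL_REFRESH_CODE = "2"  # passing this value makes the first condition always true

# Template of the where clause, with the parameter substituted before execution.
WHERE_TEMPLATE = (
    "where ('{AUTOPRM_DEPT_CODE}' = '2' "
    "or sec_department_code in ('{AUTOPRM_DEPT_CODE}'))"
)


def render_where(dept_code):
    """Substitute the mechanism code into the where-clause template."""
    # Assumed safeguard: accept digits only, so the code cannot alter the SQL text.
    if not dept_code.isdigit():
        raise ValueError("mechanism code must be numeric: %r" % dept_code)
    return WHERE_TEMPLATE.format(AUTOPRM_DEPT_CODE=dept_code)


def row_matches(row, dept_code):
    """Python equivalent of the two-condition predicate."""
    return dept_code == FULL_REFRESH_CODE or row["sec_department_code"] == dept_code


def select_rows(rows, dept_code):
    """Keep the rows that the where clause would keep."""
    return [row for row in rows if row_matches(row, dept_code)]
```

With the preset code every row passes the filter, while any other code keeps only the rows of that secondary mechanism, so one script serves both refresh modes.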
Referring to fig. 4, a fourth embodiment of a data slice refreshing method in an embodiment of the present invention includes:
401. determining the data type of each organization data according to the field header of the organization data;
In this embodiment, for the data refreshing process of an Oracle database, the data processing task is usually implemented in procedure code, and many lexical-analysis and syntax-analysis tools can analyze such code, for example lex and yacc for the C language, and JavaCC and ANTLR for Java. JavaCC is selected in this embodiment, and the process may include: running the analysis program; parsing the designated procedure code with the generated class files; extracting the description information (such as a table identifier or a table name) of the input table (i.e., the input data table) and of the output table (i.e., the output data table) designated by the data refreshing task; and obtaining the field description information (such as field name and field data type) of the input table and the output table by querying the meta information (i.e., metadata) of Oracle.
402. Generating a monitoring index during refreshing of the mechanism data according to the data type;
In this step, the monitoring index is determined according to the parsed output data table and the data types of its fields. Which kinds of fields in the output table carry a total-amount statistical meaning, and which carry an average-value statistical meaning, may be defined in advance. For example, the total may be computed for a field named "sales", and the average may be computed for a field named "temperature" or "air temperature".
403. Extracting a field value of a specified field in the mechanism data, and judging the type of the monitoring index, wherein the type of the monitoring index comprises a first type monitoring index and a second type monitoring index;
In this embodiment, the monitoring indexes generated for the output table may include the number of data records of the output table. The monitoring indexes are classified into two types: the first type specifies summation over numerical fields that have a total-amount statistical meaning, and the second type specifies averaging over numerical fields that have an average-value statistical meaning.
404. If the monitoring index is a first type monitoring index, accumulating field values of specified fields in corresponding mechanism data, and taking the accumulated field values as monitoring values when the mechanism data are refreshed;
405. If the monitoring index is a second type monitoring index, calculating an average value of field values of designated fields in corresponding mechanism data, and taking the average value as a monitoring value when the mechanism data is refreshed;
In this embodiment, after the monitoring index is obtained, the refresh data output by the ETL program is read, and the corresponding fields of the refresh data are counted and/or computed in the statistical manner specified by the monitoring index to obtain the result value of the monitoring index. Depending on the generated monitoring indexes, the result values may include: the total number of data records in the refresh data, the sum of a specified numerical field, the average of a specified numerical field, and the like.
406. Comparing the monitoring value with a preset result value, identifying the mechanism data whose refresh was erroneous, and generating a data refreshing report for display.
In this embodiment, the result value of the monitoring index may be output as a monitoring result in the form of a report. The result value may also be compared with an expected result value according to a preset quality evaluation policy in order to evaluate the quality of the data refresh, and the evaluation result may be displayed in the output report. The data processing of the ETL program is expected to produce the expected result value; if the result value of the monitoring index is inconsistent with the expected result value, the data refreshing process of the ETL program is faulty and needs further correction.
In the embodiment of the invention, the monitoring flow in the process of refreshing the ETL program data is described in detail, corresponding monitoring indexes are generated through different data types, the monitoring value of data refreshing is calculated according to the preset calculation mode of the different types of monitoring indexes, and the monitoring value is compared with the preset result value to determine whether an error occurs in the data refreshing process or not, so that the ETL program is optimized.
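The monitoring flow of steps 401 to 406 can be sketched as follows. The field names "sales" and "temperature" follow the examples in the text; the name-to-type mapping, the "number" type label, the tolerance, and all function names are hypothetical choices for this illustration.

```python
from statistics import mean

# Hypothetical pre-defined mapping from field names to statistical meaning.
SUM_FIELDS = {"sales"}        # first type: total-amount meaning
AVG_FIELDS = {"temperature"}  # second type: average-value meaning


def build_indices(field_types):
    """Generate a monitoring index ('sum' or 'avg') for each qualifying numeric field."""
    indices = {}
    for name, dtype in field_types.items():
        if dtype != "number":
            continue
        if name in SUM_FIELDS:
            indices[name] = "sum"
        elif name in AVG_FIELDS:
            indices[name] = "avg"
    return indices


def compute_monitor_values(rows, indices):
    """Compute the record count plus a sum or average per monitored field.

    Assumes at least one row whenever an 'avg' index is present.
    """
    values = {"record_count": len(rows)}
    for field, kind in indices.items():
        column = [row[field] for row in rows]
        values[field] = sum(column) if kind == "sum" else mean(column)
    return values


def find_mismatches(values, expected, tolerance=1e-9):
    """Return the names of monitoring values that are missing or differ from expectation."""
    return [
        name for name, want in expected.items()
        if name not in values or abs(values[name] - want) > tolerance
    ]
```

The list returned by `find_mismatches` is the kind of information a data refreshing report would display; how the report itself is rendered is left open here.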
The method for refreshing the data slices in the embodiment of the present invention is described above, and the device for refreshing the data slices in the embodiment of the present invention is described below, referring to fig. 5, one embodiment of the device for refreshing the data slices in the embodiment of the present invention includes:
a configuration module 501, configured to configure a slice refresh parameter, and define an initialization task for data exchange, where the slice refresh parameter includes: initializing a task name, a mechanism code and a refreshing date;
a judging module 502, configured to judge, when a data refresh request of a client is monitored, whether the data refresh request points to the initialization task;
the splitting module 503 is configured to perform single-mechanism data refresh based on the initialization task if the initialization task is pointed to, and read a mechanism code and a refresh date in the fragment refresh parameter;
And the execution module 504 is configured to transmit the mechanism code and the refresh date into a preset running number script, execute the running number script, and load the mechanism data of the mechanism code corresponding to the refresh date into a cache of a corresponding service system.
In the embodiment of the invention, corresponding mechanism codes are set for different mechanisms, and the mechanism codes of the data needing to be refreshed are written into a configuration table and carried by an ETL initialization task, wherein the configuration table structure is presented in a secondary mechanism partition mode so as to support the data refreshing of a single mechanism; when the ETL program is called to execute a data refreshing task, determining whether to carry out single-mechanism data refreshing by judging whether the data refreshing task is the initialization task; when the single-mechanism data refreshing is carried out, the corresponding mechanism codes in the configuration table are transmitted into the running number script in the ETL program so as to execute the data refreshing task of the corresponding mechanism, generate the single-mechanism data, realize the data refreshing of the single mechanism and reduce the waste of computer resources.
Referring to fig. 6, another embodiment of the data slice refreshing apparatus according to the present invention includes:
a configuration module 501, configured to configure a slice refresh parameter, and define an initialization task for data exchange, where the slice refresh parameter includes: initializing a task name, a mechanism code and a refreshing date;
A judging module 502, configured to judge, when a data refresh request of a client is monitored, whether the data refresh request points to the initialization task;
the splitting module 503 is configured to perform single-mechanism data refresh based on the initialization task if the initialization task is pointed to, and read a mechanism code and a refresh date in the fragment refresh parameter;
and the execution module 504 is configured to transmit the mechanism code and the refresh date into a preset running number script, execute the running number script, and load the mechanism data of the mechanism code corresponding to the refresh date into a cache of a corresponding service system.
Specifically, the data slice refreshing device further includes:
if the initialization task is not pointed, refreshing the whole mechanism data, and setting the mechanism code as a preset code;
and transmitting the preset code and the refreshing date into the running number script, and executing the running number script to load all mechanism data corresponding to the refreshing date into a cache of a corresponding service system.
Specifically, before the configuration module, the method further includes:
a first modifying module 505, configured to modify a table structure in the cache for storing mechanism data into partitions according to a second level mechanism;
and a second modification module 506, configured to modify the processing code so that it supports receiving a mechanism coding parameter, and to store the processing code into the secondary mechanism partition, so that the running number script is compatible with both single-mechanism data refreshing and full-mechanism data refreshing.
Specifically, before the configuration module, the method further includes:
a creation module 507, configured to create a mechanism code configuration table in the metadata base, for storing mechanism code parameters that need to be refreshed.
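How the mechanism coding configuration table drives the refresh order can be sketched as below, using the comma-separated form {205,201,208} from the method embodiment. The function names and the callable used to stand in for the running number script are assumptions for illustration.

```python
def parse_config_codes(raw):
    """Parse a configuration value such as '{205,201,208}' into an ordered list of codes."""
    inner = raw.strip().strip("{}")
    return [code.strip() for code in inner.split(",") if code.strip()]


def refresh_in_config_order(raw, refresh_date, run_script):
    """Run the (caller-supplied) script once per mechanism code, in the configured order."""
    return [run_script(code, refresh_date) for code in parse_config_codes(raw)]
```

Because the list preserves the order in which the codes are written, `{205,201,208}` refreshes mechanism B, then A, then C, matching the order stated in the embodiment.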
Specifically, the data slice refreshing device further includes:
a determining module 508, configured to determine a data type of each organization data according to a field header of the organization data;
a generating module 509, configured to generate, according to the data type, a monitoring indicator when the institution data is refreshed;
the statistics module 510 is configured to perform statistics on specified fields in corresponding organization data according to the monitoring index, so as to obtain a monitoring value when the organization data is refreshed;
the comparison module 511 is configured to compare the monitoring value with a preset result value, identify the mechanism data whose refresh was erroneous, and generate a data refreshing report for display.
Specifically, the execution module is further configured to:
executing the running number script, and searching a cache node and a memory fragment corresponding to the refreshing date corresponding to the mechanism code in a preset database;
Determining index information of the mechanism data to be loaded according to the cache node, and acquiring corresponding mechanism data from the corresponding memory fragments according to the index information;
and loading the organization data into a cache of a corresponding service system.
Specifically, the statistics module includes:
the extracting unit 5101 is configured to extract a field value of a specified field in the organization data, and determine a type of the monitoring index, where the type of the monitoring index includes a first type of monitoring index and a second type of monitoring index;
the accumulating unit 5102 is configured to accumulate field values of specified fields in corresponding mechanism data if the monitoring index is a first type monitoring index, and take the accumulated field values as monitoring values when the mechanism data is refreshed;
and an average value calculating unit 5103, configured to calculate an average value of field values of specified fields in corresponding organization data if the monitoring index is a second type monitoring index, and use the average value as a monitoring value when the organization data is refreshed.
Specifically, the single-mechanism data is stored in a blockchain.
In the embodiment of the invention, the process by which the ETL executes full-mechanism data refreshing is introduced: under the control of the special mechanism code for full-mechanism data refreshing, the ETL program executes the full-mechanism data refresh, and the data is stored in the corresponding partition of the modified cache space, so that a separate channel is opened for full-mechanism refreshing. The table structure for storing mechanism data is modified into secondary mechanism partitions, the processing code modified to support both single-mechanism refreshing and full-mechanism refreshing is stored into the secondary mechanism partition, and a mechanism coding configuration table is created, so that the ETL can support single-mechanism data refreshing and full-mechanism data refreshing at the same time. The monitoring flow during data refreshing by the ETL program is also described in detail: monitoring indexes are generated according to the data types, the monitoring value of the data refresh is calculated in the preset manner for each type of monitoring index, and the monitoring value is compared with the preset result value to determine whether an error occurred during data refreshing, so that the ETL program can be optimized.
The data slice refreshing device in the embodiment of the present invention is described in detail above in terms of the modularized functional entity in fig. 5 and fig. 6, and the data slice refreshing device in the embodiment of the present invention is described in detail below in terms of hardware processing.
Fig. 7 is a schematic structural diagram of a data slice refreshing device according to an embodiment of the present invention. The data slice refreshing device 700 may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 710, a memory 720, and one or more storage media 730 (e.g., one or more mass storage devices) storing application programs 733 or data 732. The memory 720 and the storage medium 730 may be transitory or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations for the data slice refreshing device 700. Still further, the processor 710 may be configured to communicate with the storage medium 730 and to execute, on the data slice refreshing device 700, the series of instruction operations stored in the storage medium 730.
The data slice refreshing device 700 may also include one or more power supplies 740, one or more wired or wireless network interfaces 750, one or more input/output interfaces 760, and/or one or more operating systems 731, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc. It will be appreciated by those skilled in the art that the structure shown in fig. 7 does not limit the data slice refreshing device, which may include more or fewer components than shown, may combine certain components, or may arrange the components differently.
The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, and may also be a volatile computer readable storage medium, where instructions are stored in the computer readable storage medium, which when executed on a computer, cause the computer to perform the steps of the data slice refreshing method.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
A blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. The data slice refreshing method is characterized by comprising the following steps of:
configuring a slice refreshing parameter and defining an initialization task for data exchange, wherein the slice refreshing parameter comprises: initializing a task name, a mechanism code and a refreshing date; the initialization task refers to the rule of extracting, converting and loading definition data and the operation type parameter of newly added definition data refreshing;
when a data refreshing request of a client is monitored, judging whether the data refreshing request points to the initialization task or not;
if the initialization task is pointed, refreshing the single-mechanism data based on the initialization task, and reading the mechanism code and the refreshing date in the fragment refreshing parameter;
And transmitting the mechanism code and the refreshing date into a preset running number script, executing the running number script, and loading the mechanism data of the mechanism code corresponding to the refreshing date into a cache of a corresponding service system.
2. The method according to claim 1, wherein when the data refresh request of the client is monitored, determining whether the data refresh request is directed to the initialization task further comprises:
if the initialization task is not pointed, refreshing the whole mechanism data, and setting the mechanism code as a preset code;
and transmitting the preset code and the refreshing date into the running number script, and executing the running number script to load all mechanism data corresponding to the refreshing date into a cache of a corresponding service system.
3. The data burst refresh method of claim 1, further comprising, prior to said configuring the burst refresh parameters and defining the initialization tasks for data exchange:
modifying a table structure for storing the mechanism data in the cache to be partitioned according to a secondary mechanism;
and modifying the processing code so that it supports receiving a mechanism coding parameter, and storing the processing code into the secondary mechanism partition, so that the running number script is compatible with both single-mechanism data refreshing and full-mechanism data refreshing.
4. The data burst refresh method of claim 1, further comprising, prior to said configuring the burst refresh parameters and defining the initialization tasks for data exchange:
an organization code configuration table is created in the metadata base for storing organization code parameters that need to be refreshed.
5. The method of claim 1, wherein executing the running script, loading the organization data of the organization code corresponding to the refresh date into a cache of the corresponding business system comprises:
executing the running number script, and searching a cache node and a memory fragment corresponding to the refreshing date corresponding to the mechanism code in a preset database;
determining index information of the mechanism data to be loaded according to the cache node, and acquiring corresponding mechanism data from the corresponding memory fragments according to the index information;
and loading the organization data into a cache of a corresponding service system.
6. The data slice refresh method of any one of claims 1-5, further comprising:
determining the data type of each organization data according to the field header of the organization data;
Generating a monitoring index during refreshing of the mechanism data according to the data type;
counting appointed fields in corresponding mechanism data according to the monitoring indexes to obtain monitoring values when the mechanism data are refreshed;
and comparing the monitoring value with a preset result value, determining mechanism data for refreshing errors, and generating a data refreshing report for display.
7. The method of refreshing data fragments according to claim 6, wherein the counting specified fields in corresponding organization data according to the monitoring index to obtain the monitoring value when refreshing the organization data comprises:
extracting a field value of a specified field in the mechanism data, and judging the type of the monitoring index, wherein the type of the monitoring index comprises a first type monitoring index and a second type monitoring index;
if the monitoring index is a first type monitoring index, accumulating field values of specified fields in corresponding mechanism data, and taking the accumulated field values as monitoring values when the mechanism data are refreshed;
if the monitoring index is the second type monitoring index, calculating an average value of field values of designated fields in corresponding organization data, and taking the average value as a monitoring value when the organization data is refreshed.
8. A data slice refreshing apparatus, characterized in that the data slice refreshing apparatus comprises:
the configuration module is used for configuring the slice refreshing parameters and defining an initialization task for data exchange, wherein the slice refreshing parameters comprise: initializing a task name, a mechanism code and a refreshing date; the initialization task refers to the rule of extracting, converting and loading definition data and the operation type parameter of newly added definition data refreshing;
the judging module is used for judging whether the data refreshing request points to the initialization task or not when the data refreshing request of the client is monitored;
the shunting module is used for refreshing the single-mechanism data based on the initialization task if the shunting module points to the initialization task, and reading the mechanism code and the refreshing date in the fragmentation refreshing parameter;
and the execution module is used for transmitting the mechanism code and the refreshing date into a preset running number script, executing the running number script and loading the mechanism data of the mechanism code corresponding to the refreshing date into a cache of a corresponding service system.
9. A data slice refresh device, the data slice refresh device comprising: a memory and at least one processor, the memory having instructions stored therein;
The at least one processor invoking the instructions in the memory to cause the data burst refresh device to perform the data burst refresh method of any of claims 1-7.
10. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the data fragment refreshing method according to any of claims 1-7.
CN202110036533.5A 2021-01-12 2021-01-12 Data fragment refreshing method, device, equipment and storage medium Active CN112749197B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110036533.5A CN112749197B (en) 2021-01-12 2021-01-12 Data fragment refreshing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112749197A CN112749197A (en) 2021-05-04
CN112749197B true CN112749197B (en) 2024-04-05

Family

ID=75650782

Country Status (1)

Country Link
CN (1) CN112749197B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111367860A (en) * 2018-12-26 2020-07-03 北京奇虎科技有限公司 File refreshing method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101540726A (en) * 2009-04-27 2009-09-23 华为技术有限公司 Method, client, server and system of synchronous data
CN110727738A (en) * 2019-12-19 2020-01-24 北京江融信科技有限公司 Global routing system based on data fragmentation, electronic equipment and storage medium
CN111264079A (en) * 2019-11-14 2020-06-09 深圳市汇顶科技股份有限公司 Data transmission method, electronic device, system and storage medium
CN111414392A (en) * 2020-03-25 2020-07-14 浩鲸云计算科技股份有限公司 Cache asynchronous refresh method, system and computer readable storage medium
CN111460024A (en) * 2020-04-29 2020-07-28 上海东普信息科技有限公司 Real-time service system based on Elasticissearch
CN111913989A (en) * 2020-06-15 2020-11-10 东风日产数据服务有限公司 Distributed application cache refreshing system and method, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10437708B2 (en) * 2017-01-26 2019-10-08 Bank Of America Corporation System for refreshing and sanitizing testing data in a low-level environment

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101540726A (en) * 2009-04-27 2009-09-23 华为技术有限公司 Method, client, server and system of synchronous data
CN111264079A (en) * 2019-11-14 2020-06-09 深圳市汇顶科技股份有限公司 Data transmission method, electronic device, system and storage medium
CN110727738A (en) * 2019-12-19 2020-01-24 北京江融信科技有限公司 Global routing system based on data fragmentation, electronic equipment and storage medium
CN111414392A (en) * 2020-03-25 2020-07-14 浩鲸云计算科技股份有限公司 Cache asynchronous refresh method, system and computer readable storage medium
CN111460024A (en) * 2020-04-29 2020-07-28 上海东普信息科技有限公司 Real-time service system based on Elasticsearch
CN111913989A (en) * 2020-06-15 2020-11-10 东风日产数据服务有限公司 Distributed application cache refreshing system and method, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112749197A (en) 2021-05-04

Similar Documents

Publication Publication Date Title
US20230126005A1 (en) Consistent filtering of machine learning data
US11379755B2 (en) Feature processing tradeoff management
US20220391763A1 (en) Machine learning service
US10713589B1 (en) Consistent sort-based record-level shuffling of machine learning data
US11182691B1 (en) Category-based sampling of machine learning data
US10366053B1 (en) Consistent randomized record-level splitting of machine learning data
US11100420B2 (en) Input processing for machine learning
US10318882B2 (en) Optimized training of linear machine learning models
US9886670B2 (en) Feature processing recipes for machine learning
EP3161635B1 (en) Machine learning service
US10339465B2 (en) Optimized decision tree based models
US10824758B2 (en) System and method for managing enterprise data
US9411712B2 (en) Generating test data
US9772890B2 (en) Sophisticated run-time system for graph processing
CN101405728B (en) Relational database architecture with dynamic load capability
CN110275861B (en) Data storage method and device, storage medium and electronic device
US10324710B2 (en) Indicating a trait of a continuous delivery pipeline
US20150254474A1 (en) Generation of analysis reports using trusted and public distributed file systems
CA3167981A1 (en) Offloading statistics collection
CN112749197B (en) Data fragment refreshing method, device, equipment and storage medium
CN112100219B (en) Report generation method, device, equipment and medium based on database query processing
CN110928941B (en) Data fragment extraction method and device
US8538792B1 (en) Method and system for determining total cost of ownership
CN112364007B (en) Mass data exchange method, device, equipment and storage medium based on database
CN117539830A (en) Log generation method and device for APS system and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant