CN116910075A - Data warehouse source table changing method, system, equipment and storage medium - Google Patents

Info

Publication number
CN116910075A
Authority
CN
China
Prior art keywords
comparison
buffer layer
content
source
determining
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310884052.9A
Other languages
Chinese (zh)
Inventor
谢冬玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Application filed by Bank of China Ltd
Priority to CN202310884052.9A
Publication of CN116910075A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2315 Optimistic concurrency control
    • G06F16/2322 Optimistic concurrency control using timestamps
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data warehouse source table changing method, system, equipment and storage medium, which can be applied to the technical field of big data or to the financial field. The method comprises the following steps: determining a corresponding comparison scheduling policy according to the priority of each technical buffer layer table, the policy comprising a comparison frequency and comparison content; obtaining the table structure information of the source data issuing platform at the corresponding time node according to the comparison frequency of each technical buffer layer table; comparing the comparison content in the table structure information of each technical buffer layer table with the corresponding content in the table structure information of the source data issuing platform to obtain a comparison result; generating a corresponding execution script according to the comparison result; and executing the execution script to complete the change of the data warehouse source table. Under the comparison policy assigned to each table, the method acquires the table structure information of the source data issuing platform on a schedule, automatically compares it with the source tables of the technical buffer layer, and updates the source tables in time.

Description

Data warehouse source table changing method, system, equipment and storage medium
Technical Field
The present application relates to the field of big data technologies, and in particular, to a method, a system, an apparatus, and a storage medium for changing a source table of a data warehouse.
Background
With the development of science and technology, data lakes are increasingly used as a data repository for storing and managing large-scale data in the production tasks of various enterprises.
When a large number of production tasks run on the data lake, a source data issuing platform (Data Center Data Source, DCDS) issues source tables to the data warehouse. The modeling hierarchy of the data warehouse is divided into a technical buffer layer, an integration model layer and a result layer (comprising a common processing layer, an application computing layer and an application interface layer), and the technical buffer layer tables are the source tables that the data warehouse ingests.
However, the source data issuing platform frequently changes its table structures, and synchronizing those changes is handled manually, which is inefficient. When a table structure changes, the source table structure accessed by the technical buffer layer of the data warehouse often cannot be updated synchronously in time, so subsequent data processing fails, a great deal of manpower and time is consumed, and the production efficiency of the data lake is low.
Disclosure of Invention
In view of these problems, the application provides a method, a system, equipment and a storage medium for changing a data warehouse source table, so that the data warehouse source table can be changed in time to stay synchronized with the table structure of the source data issuing platform, which improves the production efficiency of data lake tasks.
The application discloses the following technical scheme:
A first aspect of the present application provides a method for changing a source table of a data warehouse, comprising:
determining a corresponding comparison scheduling policy according to the priority of each technical buffer layer table; the comparison scheduling policy comprises a comparison frequency and comparison content;
obtaining the table structure information of the source data issuing platform at the corresponding time node according to the comparison frequency of each technical buffer layer table;
comparing the comparison content in the table structure information of each technical buffer layer table with the corresponding content in the table structure information of the source data issuing platform to obtain a comparison result;
generating a corresponding execution script according to the comparison result;
and executing the execution script to complete the change of the source table of the data warehouse.
In one possible implementation manner, the determining a corresponding comparison scheduling policy according to the priority of each technical buffer layer table includes:
when the priority of a technical buffer layer table is high, determining the comparison scheduling policy to be: performing the first-granularity content comparison at any two preset time nodes in each small period, and performing the second-granularity content comparison in each unit time;
when the priority of a technical buffer layer table is medium, determining the comparison scheduling policy to be: performing the first-granularity content comparison at any one preset time node in each small period, and performing the second-granularity content comparison in each unit time;
when the priority of a technical buffer layer table is low, determining the comparison scheduling policy to be: performing the first-granularity content comparison at any one preset time node in each large period, and performing the second-granularity content comparison in each unit time; the first-granularity content comparison is more detailed than the second-granularity content comparison;
each small period includes a plurality of unit times, and each large period includes a plurality of small periods.
In one possible implementation, the first-granularity content comparison includes:
field name comparison and field type comparison;
the second-granularity content comparison includes: comparison of the number of fields.
In one possible implementation, the method further includes: determining the priority of each technical buffer layer table according to the application type of each result layer table and the interlayer relations of the tables; the interlayer relations of the tables are the data association relations among tables of different layers.
In one possible implementation manner, the determining the priority of each technical buffer layer table according to the application type of each result layer table and the interlayer relations of the tables includes:
determining the application type of each integrated model layer table according to the application type of each result layer table and the interlayer relations of the tables; the integrated model layer is the layer immediately preceding the result layer; the technical buffer layer is the layer immediately preceding the integrated model layer;
determining the use frequency of each technical buffer layer table according to the application type of each integrated model layer table and the interlayer relations of the tables;
and determining the corresponding priority according to the use frequency of each technical buffer layer table.
In one possible implementation manner, the generating a corresponding execution script according to the comparison result includes:
when the comparison result indicates a table structure modification, generating an execution script that modifies the table structure;
and when the comparison result indicates a newly added table structure, generating an execution script that creates the new table structure.
In one possible implementation, the method further includes: sending the comparison result to a user by email.
A second aspect of the present application provides a data warehouse source table changing system, comprising:
a comparison scheduling policy determining module, used for determining a corresponding comparison scheduling policy according to the priority of each technical buffer layer table; the comparison scheduling policy comprises a comparison frequency and comparison content;
an acquisition module, used for acquiring the table structure information of the source data issuing platform at the corresponding time node according to the comparison frequency of each technical buffer layer table;
a comparison module, used for comparing the comparison content in the table structure information of each technical buffer layer table with the corresponding content in the table structure information of the source data issuing platform to obtain a comparison result;
a script generation module, used for generating a corresponding execution script according to the comparison result;
and a change module, used for executing the execution script to complete the change of the data warehouse source table.
A third aspect of the present application provides data warehouse source table changing equipment, comprising: a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the data warehouse source table changing method according to the first aspect of the application when executing the computer program.
A fourth aspect of the present application provides a computer readable storage medium having instructions stored therein which, when executed on a terminal device, cause the terminal device to perform a data warehouse source table modification method according to the first aspect of the present application.
Compared with the prior art, the application has the following beneficial effects:
According to the method, the table structure information of the source data issuing platform is acquired on a schedule under the comparison scheduling policy that corresponds to each priority and is automatically compared with the source tables of the technical buffer layer; a corresponding execution script is generated and executed according to the comparison result, so the change of the data warehouse source table is completed automatically. The source tables accessed by the data warehouse are thus synchronized in time with table structure changes on the source data issuing platform, and the production efficiency of the data lake is improved.
Drawings
In order to illustrate the embodiments of the application or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a diagram of a data processing link according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for changing a source table of a data warehouse according to an embodiment of the present application;
FIG. 3 is a diagram of a data processing link within a data warehouse in accordance with an embodiment of the present application;
FIG. 4 is a schematic diagram of a comparison policy according to an embodiment of the present application;
FIG. 5 is a diagram of a system for changing a source table of a data warehouse according to an embodiment of the present application;
FIG. 6 is a diagram of another system for changing a source table of a data warehouse according to an embodiment of the present application;
FIG. 7 is a block diagram of a computer device according to an embodiment of the present application.
Detailed Description
As described above, with the development of science and technology, data lakes are increasingly used as a data repository for storing and managing large-scale data in the production tasks of various enterprises.
When a data lake performs a large number of production tasks, the source data issuing platform issues source tables to the data warehouse. FIG. 1 is a data processing link diagram according to an embodiment of the present application. As shown in FIG. 1, the data warehouse is divided by level into a technical buffer layer, an integrated model layer and a result layer (comprising a common processing layer, an application computing layer and an application interface layer). The technical buffer layer is used for collecting source tables into the data warehouse; the integrated model layer is used for organizing the source table data of the technical buffer layer by subject and normalizing it; the common processing layer is used for data preprocessing; the application computing layer is used for performing derivative calculations on each kind of service data using the data preprocessed by the common processing layer; the application interface layer is used for providing the data calculated by the application computing layer to each corresponding application scenario.
However, the source data issuing platform frequently changes its table structures, and synchronizing those changes is handled manually, which is inefficient. When a table structure changes, the source table structure accessed by the technical buffer layer of the data warehouse often cannot be updated synchronously in time, so subsequent data processing fails, a great deal of manpower and time is consumed, and the production efficiency of the data lake is low.
In view of the above, the embodiment of the application provides a method for changing a source table of a data warehouse. The method determines the corresponding comparison frequency and comparison content according to the priority of each technical buffer layer table; acquires the DCDS table structure information at the corresponding time node according to the comparison frequency of each technical buffer layer table; compares the comparison content of each technical buffer layer table with the corresponding content in the DCDS table structure information to obtain a comparison result; generates a corresponding execution script according to the comparison result; and executes the execution script. In this way, the table structure information of the source data issuing platform is acquired on a schedule under the comparison policy that corresponds to each priority, the technical buffer layer tables are automatically compared with the source tables, and corresponding execution scripts are generated and executed according to the comparison results. The change of the data warehouse source table is completed automatically, the data warehouse source table is synchronized in time with table structure changes on the source data issuing platform, and the production efficiency of the data lake is improved.
In order to make the present application better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Referring to fig. 2, a flowchart of a method for changing a source table of a data warehouse according to an embodiment of the present application is shown. As shown in fig. 2, the method includes:
S201, determining a corresponding comparison policy according to the priority of each technical buffer layer table; the comparison policy comprises a comparison frequency and comparison content;
Priorities may be classified as low, medium and high, and each priority adopts a different comparison policy. The comparison frequency is the number of comparisons performed within a period of time, and the comparison content is the specific set of items that are compared. Because the comparison frequency and comparison content differ between priorities, the computing resources used for comparison can be scheduled reasonably.
In one possible implementation manner, the priority of each technical buffer layer table is determined according to the application type of each result layer table and the interlayer relations of the tables.
Determining the priority of each technical buffer layer table specifically comprises the following steps:
determining the application type of each integrated model layer table according to the application type of each result layer table and the interlayer relations of the tables; determining the use frequency of each technical buffer layer table according to the application type of each integrated model layer table and the interlayer relations of the tables; and determining the corresponding priority according to the use frequency of each technical buffer layer table. The integrated model layer is the layer immediately preceding the result layer; the technical buffer layer is the layer immediately preceding the integrated model layer; the interlayer relations of the tables are the data association relations among tables of different layers.
Fig. 3 is a diagram of the data links between the tables of each layer in a data warehouse according to an embodiment of the present application. As shown in fig. 3, technical buffer layer table 1 is used most frequently, technical buffer layer table 2 is used second most frequently, and technical buffer layer table 4 is used least frequently. The frequency with which each technical buffer layer table is used has to be determined from its downstream tables, that is, from the types of the associated tables in the result layer and the integrated model layer.
The results layer table can be divided into three types:
type one: the table structure applies to products or functions that are already online and requires the data of the table structure to be provided to an associated external platform (e.g., an associated enterprise or product).
Type two: the table structure applies to products or functions that are already online, but does not need to provide data of the table structure to an associated external platform.
Type three: the table structure is applied to products or functions that are not on-line (in development or after development is completed and not in use), and the data of the table structure does not need to be provided to an associated external platform.
According to the association relations between the result layer tables and the integrated model layer tables, the integrated model layer tables can be divided into the same three types by the same division principle. According to the association relations between the technical buffer layer tables (T tables) and the integrated model layer tables, the technical buffer layer tables can likewise be divided into the same three types. The type of a technical buffer layer table determines its use frequency, and different priorities can be defined according to the use frequency; for example, three priorities, high, medium and low, are defined in descending order of use frequency. This embodiment employs the MRU (Most Recently Used) algorithm to support the comparison policy.
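The type propagation described above can be illustrated with a minimal Python sketch. The source does not state how a table with several downstream tables of different types is classified, nor the exact mapping from type to priority, so the sketch assumes that a table takes the most critical (smallest-numbered) type among its consumers and that types one, two and three map to high, medium and low. All table names and the helper `propagate_types` are hypothetical.

```python
from typing import Dict, List

# Assumed mapping from application type to priority (not stated in the source).
PRIORITY_BY_TYPE = {1: "high", 2: "medium", 3: "low"}


def propagate_types(downstream: Dict[str, List[str]],
                    known_types: Dict[str, int]) -> Dict[str, int]:
    """Assign each upstream table the most critical type of its consumers.

    `downstream` maps an upstream table to the tables that read from it.
    Assumption: the smallest type number wins, and a table with no typed
    consumers is treated as type three.
    """
    result = {}
    for table, consumers in downstream.items():
        types = [known_types[c] for c in consumers if c in known_types]
        result[table] = min(types) if types else 3
    return result


# Hypothetical lineage: result layer types are given, the rest are derived.
result_types = {"r_report": 1, "r_dashboard": 2, "r_draft": 3}
model_to_result = {"m_customer": ["r_report", "r_dashboard"],
                   "m_trial": ["r_draft"]}
buffer_to_model = {"t_cust_info": ["m_customer"],
                   "t_trial_log": ["m_trial"],
                   "t_shared": ["m_customer", "m_trial"]}

model_types = propagate_types(model_to_result, result_types)
buffer_types = propagate_types(buffer_to_model, model_types)
priorities = {t: PRIORITY_BY_TYPE[k] for t, k in buffer_types.items()}
```

In this sketch `t_shared` inherits the high priority of `m_customer` even though it also feeds a type-three table; a deployment that weighs consumers differently would change only `propagate_types`.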
In one possible implementation, S201 includes:
when the priority of a technical buffer layer table is high, determining the comparison scheduling policy to be: performing the first-granularity content comparison at any two preset time nodes in each small period, and performing the second-granularity content comparison in each unit time;
when the priority of a technical buffer layer table is medium, determining the comparison scheduling policy to be: performing the first-granularity content comparison at any one preset time node in each small period, and performing the second-granularity content comparison in each unit time;
when the priority of a technical buffer layer table is low, determining the comparison scheduling policy to be: performing the first-granularity content comparison at any one preset time node in each large period, and performing the second-granularity content comparison in each unit time; the first-granularity content comparison is more detailed than the second-granularity content comparison;
each small period includes a plurality of unit times, and each large period includes a plurality of small periods.
In one possible implementation, the first-granularity content comparison includes:
field name comparison and field type comparison;
the second-granularity content comparison includes: comparison of the number of fields.
Fig. 4 is a schematic diagram of a comparison policy according to an embodiment of the present application. As shown in fig. 4, the unit time is one day, the small period is one week, and the large period is one month. For technical buffer layer tables of all priorities, the number of fields is compared every day. A high-priority technical buffer layer table is additionally compared twice a week, for example on Tuesday and Saturday, with field name comparison and field type comparison. A medium-priority technical buffer layer table is compared once a week, for example on Tuesday, with field name comparison and field type comparison. A low-priority technical buffer layer table is compared once a month, for example on the first Sunday of each month, with field name comparison and field type comparison.
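The schedule of fig. 4 can be sketched as a small Python function that reports which comparisons are due for a table on a given day. The function name `comparisons_due` and the item labels are illustrative assumptions; the days follow the example values in the text (Tuesday and Saturday for high priority, Tuesday for medium priority, the first Sunday of the month for low priority).

```python
from datetime import date
from typing import List

# date.weekday() numbering: Monday=0 ... Sunday=6.
# Example days taken from the fig. 4 description; other preset days are possible.
WEEKLY_DAYS = {"high": {1, 5}, "medium": {1}}  # Tuesday and Saturday / Tuesday


def comparisons_due(priority: str, day: date) -> List[str]:
    """Return the comparison items due on `day` for a table of `priority`."""
    if priority not in ("high", "medium", "low"):
        raise ValueError(f"unknown priority: {priority}")
    # Second granularity: the field count is compared every unit time (day).
    due = ["field_count"]
    if priority == "low":
        # First Sunday of the month: a Sunday whose day-of-month is 1..7.
        first_granularity = day.weekday() == 6 and day.day <= 7
    else:
        first_granularity = day.weekday() in WEEKLY_DAYS[priority]
    if first_granularity:
        # First granularity: field names and field types.
        due += ["field_name", "field_type"]
    return due
```

For example, `comparisons_due("high", date(2023, 8, 1))` falls on a Tuesday and returns all three items, while the same call for `date(2023, 8, 2)` returns only `["field_count"]`.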
S202, obtaining the table structure information of the source data issuing platform (DCDS) at the corresponding time node according to the comparison frequency of each technical buffer layer table;
In one example, a high-priority technical buffer layer table is compared on two preset days of each week, for example the Tuesday and Saturday used in fig. 4; the table structure information of the source data issuing platform (DCDS) is then acquired on each of those days for field name comparison and field type comparison.
In this embodiment, the table structure information of the source data issuing platform (DCDS) is acquired on a schedule according to the comparison frequency, so that updates to the DCDS can be detected in time.
S203, comparing the comparison content of each technical buffer layer table with the corresponding content in the table structure information of the DCDS to obtain a comparison result;
In one example, the baselines of the data warehouse technical buffer layer are merged to make the technical buffer layer tables easier to retrieve.
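As an illustration of step S203, the following Python sketch compares one technical buffer layer table with its DCDS counterpart at both granularities. It assumes that each table structure is available as a mapping from field name to field type; the function `compare_structures`, the result keys and the sample fields are hypothetical and are not taken from the source.

```python
from typing import Dict


def compare_structures(buffer_cols: Dict[str, str],
                       dcds_cols: Dict[str, str]) -> dict:
    """Compare field count, field names and field types of two structures."""
    # Field name comparison: names present on only one side.
    added = [n for n in dcds_cols if n not in buffer_cols]
    removed = [n for n in buffer_cols if n not in dcds_cols]
    # Field type comparison: shared names whose types differ (case-insensitive).
    type_changed = [n for n in dcds_cols
                    if n in buffer_cols
                    and dcds_cols[n].lower() != buffer_cols[n].lower()]
    return {
        # Second granularity: number of fields.
        "count_matches": len(buffer_cols) == len(dcds_cols),
        # First granularity: names and types.
        "added": added,
        "removed": removed,
        "type_changed": type_changed,
    }


# Hypothetical structures for one table.
buffer_table = {"cust_id": "varchar(20)", "cust_name": "varchar(100)"}
dcds_table = {"cust_id": "varchar(32)", "cust_name": "varchar(100)",
              "open_date": "date"}
diff = compare_structures(buffer_table, dcds_table)
```

Here `diff` reports one added field (`open_date`) and one type change (`cust_id`), which is the kind of comparison result that the later script generation step consumes.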
In one possible implementation, the method further includes: sending the comparison result to a user by email.
In one example, the names of the tables that fail the comparison are exported to an Excel workbook, the detailed comparison result of each inconsistent table is written to its own sheet, and the resulting Excel workbook is sent to the user by email.
S204, generating a corresponding execution script according to the comparison result;
In one possible implementation, S204 includes:
when the comparison result indicates a table structure modification, generating an execution script that modifies the table structure;
and when the comparison result indicates a newly added table structure, generating an execution script that creates the new table structure.
In one example, when the comparison result indicates a newly added table structure, an embedded ETL development engine based on the Python technology stack regenerates the required script configuration from the new DCDS table structure information. When the comparison result indicates a table structure modification, new fields are generally appended at the end of the table structure, so only the newly added fields need to be identified, and the alter statement is generated adaptively in the following form: "alter table <table name> add column <DCDS field name> <DCDS field type> comment '<DCDS field Chinese name>' after <last field name of the technical buffer layer table>".
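A minimal Python sketch of the adaptive alter statement generation follows. The `Field` class and `build_alter_statements` function are hypothetical. The sketch assumes that when several fields are new, each one is placed after the previously added field so that the DCDS order is preserved; the source only describes placing a new field after the last field of the technical buffer layer table. The `comment ... after ...` clause follows the statement form quoted above, which resembles MySQL-style syntax; the target database dialect is not stated in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Field:
    name: str      # DCDS field name
    type: str      # DCDS field type
    comment: str   # DCDS field Chinese name, used as the column comment


def build_alter_statements(table: str,
                           buffer_field_names: List[str],
                           dcds_fields: List[Field]) -> List[str]:
    """Generate one alter statement per field that exists only in the DCDS."""
    existing = set(buffer_field_names)
    last = buffer_field_names[-1]  # last field of the technical buffer layer table
    statements = []
    for f in dcds_fields:
        if f.name in existing:
            continue
        comment = f.comment.replace("'", "''")  # escape single quotes
        statements.append(
            f"alter table {table} add column {f.name} {f.type} "
            f"comment '{comment}' after {last};"
        )
        last = f.name  # assumption: chain new fields to keep the DCDS order
    return statements


# Hypothetical table with two newly issued fields.
stmts = build_alter_statements(
    "t_cust_info",
    ["cust_id", "cust_name"],
    [Field("cust_id", "varchar(32)", "customer number"),
     Field("open_date", "date", "account opening date"),
     Field("branch", "varchar(10)", "branch code")],
)
```

The sketch interpolates identifiers directly into the statement text, so it is only suitable when table and field names come from trusted metadata such as the DCDS table structure information.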
S205, executing the execution script to complete the change of the data warehouse source table.
The embodiment of the application determines the corresponding comparison scheduling policy based on the priority of each technical buffer layer table and periodically compares the technical buffer layer table structure information with the DCDS table structure information according to that policy. Changes to the DCDS table structure information are identified from the comparison result, the execution script is generated, and the change is synchronized in time, which achieves rapid change and synchronization and improves the production efficiency of the data lake.
In one example, the core SQL pseudocode for the second-granularity comparison, that is, the comparison of the number of fields, is:
-- count the fields of each table on the DCDS side (d) and on the technical buffer layer side (t)
select d.table_name, d.num as dcds_num, t.num as t_num
from (select p1.table_name, count(*) as num from <library storing the DCDS table structure> p1 where p1.table_name like '<table name>' group by 1) d
full join
(select p2.table_name, count(*) as num from <library storing the T table structure> p2 where p2.table_name like '<table name>' group by 1) t
on d.table_name = t.table_name;
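The field-count comparison can be tried end to end with the following Python sketch, which uses an in-memory SQLite database. The metadata tables `dcds_columns` and `t_columns`, with one row per field, are hypothetical stand-ins for the libraries that store the DCDS and T table structures. The full join is emulated with a union of table names and two left joins, because older SQLite versions do not support `full join`; this is a substitution for portability, not the statement used in the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical structure libraries: one row per (table, field).
conn.executescript("""
    create table dcds_columns (table_name text, field_name text);
    create table t_columns (table_name text, field_name text);
""")
conn.executemany("insert into dcds_columns values (?, ?)", [
    ("cust", "cust_id"), ("cust", "cust_name"), ("cust", "open_date"),
    ("acct", "acct_id"), ("acct", "balance"),
    ("new_tbl", "id"),
])
conn.executemany("insert into t_columns values (?, ?)", [
    ("cust", "cust_id"), ("cust", "cust_name"),
    ("acct", "acct_id"), ("acct", "balance"),
])

# Tables whose field counts differ; a table missing on one side counts as 0.
QUERY = """
select n.table_name, coalesce(d.num, 0) as dcds_num, coalesce(t.num, 0) as t_num
from (select table_name from dcds_columns
      union
      select table_name from t_columns) n
left join (select table_name, count(*) as num
           from dcds_columns group by table_name) d
       on d.table_name = n.table_name
left join (select table_name, count(*) as num
           from t_columns group by table_name) t
       on t.table_name = n.table_name
where coalesce(d.num, 0) <> coalesce(t.num, 0)
order by n.table_name
"""
mismatches = conn.execute(QUERY).fetchall()
```

With the sample rows above, `mismatches` lists `cust` (three fields on the DCDS side, two in the technical buffer layer) and `new_tbl` (present only on the DCDS side), while `acct` is omitted because its counts agree.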
Referring to fig. 5, which is a structure diagram of a data warehouse source table changing system provided by the present application, the system includes:
a comparison scheduling policy determining module 510, configured to determine a corresponding comparison scheduling policy according to the priority of each technical buffer layer table; the comparison scheduling policy comprises a comparison frequency and comparison content;
an obtaining module 520, configured to obtain the DCDS table structure information at the corresponding time node according to the comparison frequency of each technical buffer layer table;
a comparison module 530, configured to compare the comparison content of each technical buffer layer table with the corresponding content in the DCDS table structure information to obtain a comparison result;
a script generating module 540, configured to generate a corresponding execution script according to the comparison result;
a change module 550, configured to execute the execution script to complete the change of the source table of the data warehouse.
The embodiment of the application determines the corresponding comparison scheduling policy based on the priority of each technical buffer layer table and periodically compares the technical buffer layer table structure information with the DCDS table structure information according to that policy. Changes to the DCDS table structure information are identified from the comparison result, the execution script is generated, and the change is synchronized in time, which achieves rapid change and synchronization and improves the production efficiency of the data lake.
In one possible implementation manner, the comparison scheduling policy determining module 510 is specifically configured to: when the priority of a technical buffer layer table is high, determine the comparison scheduling policy to be performing the first-granularity content comparison at any two preset time nodes in each small period and the second-granularity content comparison in each unit time; when the priority of a technical buffer layer table is medium, determine the comparison scheduling policy to be performing the first-granularity content comparison at any one preset time node in each small period and the second-granularity content comparison in each unit time; when the priority of a technical buffer layer table is low, determine the comparison scheduling policy to be performing the first-granularity content comparison at any one preset time node in each large period and the second-granularity content comparison in each unit time. Each small period includes a plurality of unit times, and each large period includes a plurality of small periods.
In one possible implementation, the content first-granularity comparison includes:
a field name comparison and a field type comparison;
the content second-granularity comparison includes: a comparison of the number of fields.
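The two comparison granularities can be illustrated with a short sketch. It assumes that a table structure is represented as a mapping from field name to field type; that representation, the function names, and the sample fields are hypothetical and are not taken from the application.

```python
def second_granularity_diff(buffer_cols: dict, source_cols: dict) -> bool:
    """Coarse check: report whether the two tables differ in field count."""
    return len(buffer_cols) != len(source_cols)


def first_granularity_diff(buffer_cols: dict, source_cols: dict) -> dict:
    """Fine check: compare field names and field types."""
    # Fields present in the source platform table but not in the buffer table.
    added = {n: t for n, t in source_cols.items() if n not in buffer_cols}
    # Fields present in the buffer table but no longer in the source table.
    removed = sorted(n for n in buffer_cols if n not in source_cols)
    # Fields present in both whose type changed: name -> (old type, new type).
    retyped = {
        n: (buffer_cols[n], t)
        for n, t in source_cols.items()
        if n in buffer_cols and buffer_cols[n] != t
    }
    return {"added": added, "removed": removed, "retyped": retyped}


# Hypothetical table structures used only for illustration.
buffer_table = {"cust_id": "VARCHAR(20)", "balance": "DECIMAL(18,2)"}
source_table = {
    "cust_id": "VARCHAR(32)",
    "balance": "DECIMAL(18,2)",
    "open_date": "DATE",
}
```

The field-count check is cheap but misses a renamed or retyped field, because the count stays the same; the name and type comparison catches those cases, which is consistent with running it less often than the count check.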
Referring to fig. 6, which is a structural diagram of another data warehouse source table changing system according to an embodiment of the present application, the system further includes:
the priority determining module, configured to determine the priority of each technical buffer layer table according to the application type of each result layer table and the inter-layer relationship of the tables, where the inter-layer relationship of the tables is the data association relationship between tables of different layers;
the result sending module, configured to send the comparison result to the user by mail.
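As a rough illustration of the result sending step, the sketch below builds a mail message from a comparison result using Python's standard `email.message.EmailMessage`. The addresses, the subject wording, and the shape of the `diff` dictionary are assumptions made for this example only.

```python
from email.message import EmailMessage


def build_result_mail(sender: str, recipient: str, table: str, diff: dict) -> EmailMessage:
    """Build a plain-text mail that summarizes one table's comparison result."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Table structure comparison result: {table}"
    # List only the kinds of difference that actually occurred.
    lines = [f"{kind}: {detail}" for kind, detail in diff.items() if detail]
    msg.set_content("\n".join(lines) or "No structural differences found.")
    return msg
```

The message could then be handed to `smtplib.SMTP(...).send_message(msg)`; the mail server host and any authentication are deployment-specific and are not described in the source.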
In one possible implementation, the priority determining module is specifically configured to: determine the application type of each integrated model layer table according to the application type of each result layer table and the inter-layer relationship of the tables; determine the use frequency of each technical buffer layer table according to the application type of each integrated model layer table and the inter-layer relationship of the tables; and determine the corresponding priority according to the use frequency of each technical buffer layer table. The inter-layer relationship of the tables is the data association relationship between tables of different layers.
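The three steps above (propagate application types to the integrated model layer, derive a use frequency for each technical buffer layer table, map the frequency to a priority) might look like the following sketch. The source does not say how the use frequency is computed, so the weights per application type, the thresholds, and all table names here are purely hypothetical.

```python
from collections import defaultdict

# Hypothetical weights: how heavily each application type uses its inputs.
APP_TYPE_WEIGHT = {"regulatory_report": 3, "daily_report": 2, "ad_hoc": 1}


def integrated_app_types(result_app_type: dict, result_to_integrated: dict) -> dict:
    """Propagate application types from result layer tables to the
    integrated model layer tables that feed them."""
    types = defaultdict(set)
    for result_table, upstream in result_to_integrated.items():
        for integrated_table in upstream:
            types[integrated_table].add(result_app_type[result_table])
    return dict(types)


def buffer_use_frequency(integrated_types: dict, integrated_to_buffer: dict) -> dict:
    """Score each technical buffer layer table by the application types of
    the integrated model layer tables it feeds (assumed scoring rule)."""
    freq = defaultdict(int)
    for integrated_table, upstream in integrated_to_buffer.items():
        score = sum(APP_TYPE_WEIGHT[t] for t in integrated_types.get(integrated_table, ()))
        for buffer_table in upstream:
            freq[buffer_table] += score
    return dict(freq)


def priority_of(frequency: int, high: int = 4, medium: int = 2) -> str:
    """Map a use frequency to a priority using assumed thresholds."""
    if frequency >= high:
        return "high"
    if frequency >= medium:
        return "medium"
    return "low"


# Hypothetical inter-layer relationships (downstream table -> upstream tables).
result_app_type = {"r_reg": "regulatory_report", "r_daily": "daily_report", "r_adhoc": "ad_hoc"}
result_to_integrated = {"r_reg": ["i_cust"], "r_daily": ["i_cust", "i_txn"], "r_adhoc": ["i_misc"]}
integrated_to_buffer = {"i_cust": ["b_cust"], "i_txn": ["b_txn"], "i_misc": ["b_misc"]}

types = integrated_app_types(result_app_type, result_to_integrated)
frequency = buffer_use_frequency(types, integrated_to_buffer)
priorities = {table: priority_of(f) for table, f in frequency.items()}
```

With these sample relationships, `b_cust` feeds both a regulatory and a daily report and ends up with high priority, `b_txn` with medium priority, and `b_misc` with low priority.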
In one possible implementation, the script generation module is specifically configured to: generate an execution script for modifying the table structure according to a comparison result indicating a table structure modification; and generate an execution script for creating a new table structure according to a comparison result indicating a new table structure.
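A minimal sketch of the two script kinds is shown below. The source does not name a database or SQL dialect, so the PostgreSQL-style `ALTER TABLE ... ALTER COLUMN ... TYPE` syntax, the function names, and the sample tables are assumptions; a real generator would also need to validate or quote identifiers.

```python
def alter_table_script(table: str, added: dict, retyped: dict) -> str:
    """Build a script for a modified table: add new fields, change field types.

    `added` maps new field names to types; `retyped` maps existing field
    names to their new types. PostgreSQL-style syntax is assumed.
    """
    statements = [
        f"ALTER TABLE {table} ADD COLUMN {name} {col_type};"
        for name, col_type in added.items()
    ]
    statements += [
        f"ALTER TABLE {table} ALTER COLUMN {name} TYPE {new_type};"
        for name, new_type in retyped.items()
    ]
    return "\n".join(statements)


def create_table_script(table: str, columns: dict) -> str:
    """Build a script for a table that is new on the source platform."""
    body = ",\n".join(f"    {name} {col_type}" for name, col_type in columns.items())
    return f"CREATE TABLE {table} (\n{body}\n);"
```

Keeping the two generators separate mirrors the text: a comparison result that reports a modified structure yields `ALTER TABLE` statements, while a result that reports a new table yields a `CREATE TABLE` statement.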
An embodiment of the application provides a computer-readable storage medium in which instructions are stored. When the instructions run on a terminal device, the terminal device is caused to execute the data warehouse source table changing method provided by the embodiments of the application.
In practical applications, the computer-readable storage medium may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this embodiment, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Fig. 7 is a schematic structural diagram of a computer device provided in an embodiment of the present application. The computer device can serve as the device that performs the data warehouse source table changing method. The computer device 12 shown in fig. 7 is only an example and should not be construed as limiting the functionality and scope of use of embodiments of the application.
As shown in fig. 7, the computer device 12 is in the form of a general purpose computing device. Components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 7, commonly referred to as a "hard disk drive"). Although not shown in fig. 7, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the application.
A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 20. As shown in fig. 7, the network adapter 20 communicates with other modules of the computer device 12 via the bus 18. It should be appreciated that although not shown in fig. 7, other hardware and/or software modules may be used in connection with computer device 12, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, to implement the data warehouse source table changing method provided by the embodiments of the present application.
The method, system, device, and storage medium for changing a data warehouse source table provided by the present application may be used in the big data field or the financial field. These fields are only examples, and the application field is not limited.
It should be noted that the embodiments in this specification are described in a progressive manner: identical and similar parts of the embodiments may be referred to across embodiments, and each embodiment focuses on its differences from the others. In particular, because the apparatus and system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, and they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present application without creative effort.
The foregoing is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the technical scope of the present application should be included in the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (10)

1. A method for changing a source table of a data warehouse, comprising:
determining a corresponding comparison scheduling policy according to the priority of each technical buffer layer table; the comparison scheduling policy comprises a comparison frequency and comparison content;
obtaining table structure information of a source data issuing platform of a corresponding time node according to the comparison frequency of each technical buffer layer table;
comparing the comparison content in the table structure information of each technical buffer layer with the corresponding content in the table structure information of the source data issuing platform to obtain a comparison result;
generating a corresponding execution script according to the comparison result;
and executing the execution script to complete the change of the source table of the data warehouse.
2. The method of claim 1, wherein said determining a corresponding comparison scheduling policy according to the priority of each technical buffer layer table comprises:
when the priority of the technical buffer layer table is high, determining that the comparison scheduling policy is to perform the content first-granularity comparison at any two preset time nodes in each small period and the content second-granularity comparison in each unit time;
when the priority of the technical buffer layer table is medium, determining that the comparison scheduling policy is to perform the content first-granularity comparison at any one preset time node in each small period and the content second-granularity comparison in each unit time;
when the priority of the technical buffer layer table is low, determining that the comparison scheduling policy is to perform the content first-granularity comparison at any one preset time node in each large period and the content second-granularity comparison in each unit time; the content first-granularity comparison is more extensive than the content second-granularity comparison;
the small period includes a plurality of unit times, and each large period includes a plurality of small periods.
3. The method of claim 2, wherein the content first-granularity comparison comprises:
a field name comparison and a field type comparison;
the content second-granularity comparison comprises: a comparison of the number of fields.
4. The method according to claim 1, wherein the method further comprises: determining the priority of each technical buffer layer table according to the application type of each result layer table and the interlayer relation of the table; the interlayer relation of the tables is the data association relation among tables of different layers.
5. The method of claim 4, wherein determining the priority of each of the technology buffer layer tables according to the application type of each of the result layer tables and the inter-layer relationship of the tables, comprises:
determining the application type of each integrated model layer table according to the application type of each result layer table and the interlayer relation of the tables; the integrated model layer is a front layer of the result layer; the technical buffer layer is a front layer of the integrated model layer;
determining the use frequency of each technical buffer layer table according to the application type of each integrated model layer table and the interlayer relation of the table;
and determining the corresponding priority according to the use frequency of each technical buffer layer table.
6. The method of claim 1, wherein generating the corresponding execution script according to the comparison result comprises:
generating an execution script for modifying the table structure according to a comparison result indicating a table structure modification; and/or,
generating an execution script for creating a new table structure according to a comparison result indicating a new table structure.
7. The method according to any one of claims 1-6, further comprising: sending the comparison result to a user by mail.
8. A data warehouse source table modification system, comprising:
the comparison scheduling policy determining module is used for determining a corresponding comparison scheduling policy according to the priority of each technical buffer layer table; the comparison scheduling policy comprises a comparison frequency and comparison content;
the acquisition module is used for acquiring the table structure information of the source data issuing platform of the corresponding time node according to the comparison frequency of the technical buffer layer tables;
the comparison module is used for comparing the comparison content in the table structure information of each technical buffer layer with the corresponding content in the table structure information of the source data issuing platform to obtain a comparison result;
the script generation module is used for generating a corresponding execution script according to the comparison result;
and the change module is used for executing the execution script to complete the change of the data warehouse source table.
9. A data warehouse source table changing device, comprising: a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is for storing one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform the data warehouse source table altering method of any of claims 1-6.
10. A computer readable storage medium having instructions stored therein which, when executed on a terminal device, cause the terminal device to perform the data warehouse source table altering method of any of claims 1-6.
CN202310884052.9A 2023-07-18 2023-07-18 Data warehouse source table changing method, system, equipment and storage medium Pending CN116910075A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310884052.9A CN116910075A (en) 2023-07-18 2023-07-18 Data warehouse source table changing method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116910075A true CN116910075A (en) 2023-10-20

Family

ID=88364304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310884052.9A Pending CN116910075A (en) 2023-07-18 2023-07-18 Data warehouse source table changing method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116910075A (en)

Similar Documents

Publication Publication Date Title
US20210374610A1 (en) Efficient duplicate detection for machine learning data sets
US11816100B2 (en) Dynamically materialized views for sheets based data
EP3161635B1 (en) Machine learning service
CN103608809B (en) Recommending data is enriched with
US8677366B2 (en) Systems and methods for processing hierarchical data in a map-reduce framework
US9274782B2 (en) Automated computer application update analysis
CN111709527A (en) Operation and maintenance knowledge map library establishing method, device, equipment and storage medium
CN106980669A (en) A kind of storage of data, acquisition methods and device
US20120158742A1 (en) Managing documents using weighted prevalence data for statements
US20110271145A1 (en) Efficient failure detection for long running data transfer jobs
US20170212930A1 (en) Hybrid architecture for processing graph-based queries
CN112860777B (en) Data processing method, device and equipment
CN113204425B (en) Method, device, electronic equipment and storage medium for process management internal thread
CN113962597A (en) Data analysis method and device, electronic equipment and storage medium
CN112363914B (en) Parallel test resource allocation optimizing method, computing device and storage medium
CN116910075A (en) Data warehouse source table changing method, system, equipment and storage medium
CN115391361A (en) Real-time data processing method and device based on distributed database
US20120159247A1 (en) Automatically changing parts in response to tests
CN112148461A (en) Application scheduling method and device
CN106844242B (en) A kind of method for interchanging data and system
CN112799954B (en) Method, apparatus and computer readable medium for quickly constructing test environment
US11914598B2 (en) Extended synopsis pruning in database management systems
US20230401055A1 (en) Contextualization of code development
CN107885834A (en) A kind of Hadoop big datas component uniformly verifies system
US20220100750A1 (en) Data shape confidence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination