CN107480235A - A kind of database framework of data platform - Google Patents

Info

Publication number: CN107480235A
Authority: CN (China)
Prior art keywords: data, layer, model, application, database
Legal status: Pending
Application number: CN201710670614.4A
Other languages: Chinese (zh)
Inventor: 周杰
Current Assignee: Sichuan Changhong Electric Co Ltd
Original Assignee: Sichuan Changhong Electric Co Ltd
Priority date: 2017-08-08
Filing date: 2017-08-08
Publication date: 2017-12-15
Application filed by Sichuan Changhong Electric Co Ltd
Priority to CN201710670614.4A
Publication of CN107480235A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/211: Schema design and management

Abstract

The present invention relates to databases and discloses a database architecture for a data platform. It addresses a problem in conventional database design: application-oriented models are too tightly coupled to the source data models, so that when the user table model changes, every program in the data platform that computes directly on the user table must be modified, which entails a huge amount of computation. The database architecture of the invention comprises a data buffer layer, a data integration layer, and a data application layer. The data buffer layer serves as a temporary storage layer for raw data; it is responsible for ingesting source-system data as-is and does not process the data in any way. The data integration layer provides standard, stable data models, standardized data, and summarized data. The data application layer is an application-oriented data processing and storage layer that can be customized on demand.

Description

A database architecture for a data platform
Technical Field
The present invention relates to databases, and in particular to a database architecture for a data platform.
Background
Traditional database design strictly follows the three normal forms (1NF, 2NF, 3NF), which aim to reduce redundancy in the database and ensure data consistency. However, when the data volume is large and the dimensions of the data are scattered across different data models, obtaining metrics over those dimensions becomes very difficult. For example, suppose we want a trend chart of the total transaction volume of an e-commerce website across regions. This requires accessing the user table and the order table, and we design a table model with three fields: date, province, order_cnt. The user table must first be joined with the order table. A very popular e-commerce website has more than 1,000,000 users, and its order table holds more than 100 times as many rows as the user table. The database must therefore join 1,000,000 rows with 100,000,000 (100 × 1,000,000) rows and then group and aggregate by region and date. This computation takes a very long time and has a large impact on database performance.
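The join and aggregation described above can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module; the table names users and orders and their columns are assumptions made for the example and are not taken from the original text. Only the output fields date, province, and order_cnt come from the description.

```python
import sqlite3


def regional_order_counts(conn):
    """Join orders to users, then count orders per (date, province)."""
    return conn.execute(
        """
        SELECT o.order_date AS date, u.province, COUNT(*) AS order_cnt
        FROM orders AS o
        JOIN users AS u ON u.user_id = o.user_id
        GROUP BY o.order_date, u.province
        ORDER BY o.order_date, u.province
        """
    ).fetchall()


# Tiny in-memory stand-in for the production user and order tables (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, province TEXT, age INTEGER);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, order_date TEXT);
    INSERT INTO users VALUES (1, 'Sichuan', 30), (2, 'Guangdong', 25);
    INSERT INTO orders VALUES
        (1, 1, '2017-08-01'), (2, 1, '2017-08-01'),
        (3, 2, '2017-08-01'), (4, 2, '2017-08-02');
    """
)
result = regional_order_counts(conn)
```

Every report that needs a user dimension repeats this join against the full order table, which is the cost the background section is pointing at.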
On the other hand, many different applications sit on top of the data, and their requirements are diverse. For example, an application may need to analyze how order transactions vary across regions and age groups in order to find valuable information. We must then design yet another table model (age, province, order_cnt) and once again perform an almost identical join of the order table with the user table, followed by aggregation.
When the data volume is small, this storage and computation pattern seems able to meet such requirements. There remains, however, a major problem: the application-oriented models are too tightly coupled to the source data models. Almost every application-layer model computes directly on the user table ingested as-is from the source system. If the production user system is upgraded one day and the user table model changes, the user table model ingested into the data platform changes accordingly, and every program in the data platform that computes directly on the user table must be modified.
Summary of the Invention
The technical problem to be solved by the invention is to provide a database architecture for a data platform that addresses a problem in conventional database design: because application-oriented models are too tightly coupled to the source data models, a change to the user table model forces every program in the data platform that computes directly on the user table to be modified, which entails a huge amount of computation.
The technical solution adopted by the invention to solve this problem is as follows:
A database architecture for a data platform, comprising a data buffer layer, a data integration layer, and a data application layer;
the data buffer layer serves as a temporary storage layer for raw data, is responsible for ingesting source-system data as-is, and does not process the data in any way;
the data integration layer provides standard, stable data models, standardized data, and summarized data;
the data application layer is an application-oriented data processing and storage layer that can be customized on demand.
As a further optimization, the table structures of the data buffer layer are kept consistent with the source system table structures; when a source system table structure changes, the table structure of the data buffer layer changes accordingly.
As a further optimization, the data integration layer consolidates the attributes associated with each model across the systems into a single model, standardizes the data, and aggregates dimensions from fine granularity to coarse granularity.
As a further optimization, the application layer extracts, for each different application, the corresponding data from the integration layer into the application layer, and stores the result sets that the applications access and use directly.
As a further optimization, the data models in the data integration layer cover all business data of the data buffer layer.
The beneficial effects of the invention are as follows:
Designing the data platform with the three-layer structure of data buffer layer, data integration layer, and data application layer makes it possible to deliver a more convenient and faster data experience with less storage, which saves storage cost, improves data utilization, and improves productivity.
Brief description of the drawings
Fig. 1 is a diagram of the database architecture of the data platform of the present invention;
Fig. 2 is a schematic example of the relationship between the data buffer layer and a source system table structure;
Fig. 3 is a schematic example of the relationship among the data integration layer, the data buffer layer, and a source system table structure;
Fig. 4 is a schematic example of the data integration layer applying a standardization mapping to data from the data buffer layer;
Fig. 5 is a schematic example of data summarization in the data integration layer;
Fig. 6 is a schematic example of the data application layer extracting data from the data integration layer;
Fig. 7 is a diagram of the database architecture designed for the e-commerce website data platform built in the embodiment of the present invention.
Detailed Description
The present invention aims to provide a database architecture for a data platform that addresses a problem in conventional database design: because application-oriented models are too tightly coupled to the source data models, a change to the user table model forces every program in the data platform that computes directly on the user table to be modified, which entails a huge amount of computation.
As shown in Fig. 1, the database architecture proposed by the invention has a three-layer structure: a data buffer layer, a data integration layer, and a data application layer.
1. Data buffer layer: the ingestion layer for raw data. It stores the raw data of each system, its table structures are consistent with the source system table structures, and the data is not processed in any way. It is a temporary data storage layer.
This layer mainly prepares data for processing in the integration layer. The systems in production may use a variety of storage systems, whereas the data platform serves as a unified data processing and storage center, so the data of each production system must be extracted into the platform's unified storage and computing system. Only then can joins and aggregations be performed efficiently.
The table structures in this layer are consistent with the source system table structures; if a source system table structure changes, the buffer layer table structure changes accordingly. As shown in Fig. 2, the production system has a table named "address". The ETL developer first creates a table "t_address" in the buffer layer of the data platform with the same structure as "address", and then uses a data exchange tool to extract the data of the "address" table into "t_address". Apart from the table name, the "t_address" table matches the production "address" table in table structure, field names, field types, data, and so on.
If the production system model changes, for example the "province" and "city" fields of the "address" table are merged into a single "addr" field, the buffer layer table "t_address" is changed correspondingly: "t_address" is dropped and rebuilt so that it stays consistent with the production "address" table.
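The drop-and-rebuild behaviour of the buffer layer can be sketched roughly as follows. This is an illustrative sketch, not the patent's implementation: two in-memory SQLite databases stand in for the production system and the data platform, and the helper name refresh_buffer_table, the cust_id column, and the sample rows are assumptions. Only the table names address and t_address and the fields province, city, and addr come from the description.

```python
import sqlite3


def refresh_buffer_table(source, platform, table):
    """Drop and rebuild t_<table> in the platform so it mirrors the source table."""
    # Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
    cols = source.execute(f'PRAGMA table_info("{table}")').fetchall()
    col_defs = ", ".join(f'"{name}" {ctype}' for _, name, ctype, *_ in cols)
    target = f"t_{table}"
    platform.execute(f'DROP TABLE IF EXISTS "{target}"')
    platform.execute(f'CREATE TABLE "{target}" ({col_defs})')
    # Copy the rows unchanged: the buffer layer does no processing.
    rows = source.execute(f'SELECT * FROM "{table}"').fetchall()
    placeholders = ", ".join("?" for _ in cols)
    platform.executemany(f'INSERT INTO "{target}" VALUES ({placeholders})', rows)
    platform.commit()
    return [name for _, name, *_ in cols]


source = sqlite3.connect(":memory:")    # stand-in for the production system
platform = sqlite3.connect(":memory:")  # stand-in for the data platform
source.execute("CREATE TABLE address (cust_id INTEGER, province TEXT, city TEXT)")
source.execute("INSERT INTO address VALUES (13, 'Sichuan', 'Mianyang')")
cols_before = refresh_buffer_table(source, platform, "address")

# The production model changes: province and city are merged into addr.
source.execute("DROP TABLE address")
source.execute("CREATE TABLE address (cust_id INTEGER, addr TEXT)")
source.execute("INSERT INTO address VALUES (13, 'Sichuan,Mianyang')")
cols_after = refresh_buffer_table(source, platform, "address")
buffered = platform.execute("SELECT * FROM t_address").fetchall()
```

Because the helper reads the column list from the source each time, the buffer table follows the source schema without any per-table code.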
Without a buffer layer, the data platform would have to process data directly on the production systems, which may cause the following problems:
1) It severely impacts the performance of the production systems.
2) Because production data changes in real time, if the data platform makes a data processing error, the earlier data can no longer be recomputed.
3) With multiple data source systems and multiple data storage formats, it is very difficult for the data platform to use production data directly.
By contrast, setting up a data buffer layer has the following advantages:
1) Data first enters the buffer layer through a plain data export, which shortens the time spent connected to the production systems and avoids running resource-intensive computations online.
2) If the data platform makes a data processing error, the data can be saved as text files before being purged, so that the processing can be rerun.
3) Once the source data has been ingested into the buffer layer, ETL data processing techniques can be applied to standardize it.
2. Integration layer: provides standard, stable data models, standardized data, and summarized data.
In the integration layer, the models are standard. As shown in Fig. 3, the integration layer table model "i_customer" should contain all dimension information about a user: user ID, age, gender, province, city, and so on. Whenever integration-layer or application-layer processing needs user information, it should obtain it from this table. In the integration layer, the models are also stable: they do not change when the source system table structures change. For the change to the production "address" table described above, the integration layer table model does not change; only the field mapping rules in the data loading program change. In Fig. 3, the buffer layer model "t_address" changes along with the fields of the source system "address" table, but the integration layer table model "i_customer" does not need to change. Only the data processing and loading program for "i_customer" needs to be adjusted: it splits the "addr" field of the buffer layer table "t_address" and maps it to "province" and "city", which can be expressed simply in SQL.
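As a rough sketch of such a loading statement, the following uses SQLite through Python's sqlite3 module. The description does not say how the merged "addr" value is formatted, so the comma-separated "province,city" form is an assumption, as are the columns cust_id, age, and gender and the sample row.

```python
import sqlite3

# Assumption: the merged addr value has the form "province,city".
LOAD_I_CUSTOMER = """
INSERT INTO i_customer (cust_id, age, gender, province, city)
SELECT c.cust_id, c.age, c.gender,
       trim(substr(a.addr, 1, instr(a.addr, ',') - 1)) AS province,
       trim(substr(a.addr, instr(a.addr, ',') + 1))    AS city
FROM t_customer AS c
JOIN t_address AS a ON a.cust_id = c.cust_id
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Buffer layer tables (hypothetical columns), mirroring the source after the change.
    CREATE TABLE t_customer (cust_id INTEGER, age INTEGER, gender TEXT);
    CREATE TABLE t_address (cust_id INTEGER, addr TEXT);
    -- Integration layer model: unchanged by the source schema change.
    CREATE TABLE i_customer (cust_id INTEGER, age INTEGER, gender TEXT,
                             province TEXT, city TEXT);
    INSERT INTO t_customer VALUES (13, 28, 'F');
    INSERT INTO t_address VALUES (13, 'Sichuan,Mianyang');
    """
)
conn.execute(LOAD_I_CUSTOMER)
loaded = conn.execute("SELECT * FROM i_customer").fetchall()
```

Only the SELECT list had to absorb the source change; anything that reads i_customer keeps seeing province and city.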
In the integration layer, the data is standardized. Raw data always has some quality problems; for example, addresses are not standardized. As shown in Fig. 4, the shipping address of the user with cust_id=13 in the raw data is "Xinjiang province, Urumqi City"; in the integration layer, "Xinjiang province" should be mapped to "Xinjiang Uygur Autonomous Region" as defined in the standard three-level administrative divisions.
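One simple way to picture this standardization step is a lookup from non-standard names to standard ones. The sketch below is an assumption about how such a mapping might be held (a plain Python dictionary); the names PROVINCE_ALIASES and standardize_province are hypothetical, and only the Xinjiang example comes from the description.

```python
# Hypothetical alias table: non-standard province name -> standard name.
PROVINCE_ALIASES = {
    "Xinjiang province": "Xinjiang Uygur Autonomous Region",
}


def standardize_province(raw):
    """Return the standard name for a raw province value, or the value itself if no alias is known."""
    name = raw.strip()
    return PROVINCE_ALIASES.get(name, name)


mapped = standardize_province(" Xinjiang province ")
unchanged = standardize_province("Sichuan")
```

In a real platform the alias table would more likely live in a reference table maintained alongside the ETL rules than in code.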
In the integration layer, the data includes some primary dimension summaries. Besides detail data, the integration layer also holds certain dimension summaries, so that upper-layer applications do not need to aggregate from the detail data again; they can extract data directly from the summary tables, or aggregate further on top of them, which improves efficiency. As shown in Fig. 5, the integration layer table "i_cust_trade_sum_m" is produced by joining and aggregating the "i_trade" and "i_goods" tables, and stores monthly summaries of each user's transactions. If an upper-layer application needs to analyze monthly transactions by region, it can extract data directly from the integration layer table "i_cust_trade_sum_m" and aggregate it, which is easier and more efficient than extracting and aggregating data directly from the transaction detail table.
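The monthly summary could be built along the following lines. The description names the tables i_trade, i_goods, and i_cust_trade_sum_m but not their columns, so every column here (quantity, price, trade_date, month, trans_cnt, trans_amt) is an assumption chosen for illustration, and amounts are kept as integers to keep the arithmetic exact.

```python
import sqlite3

# Hypothetical schema: amount = quantity * unit price, summed per customer and month.
BUILD_MONTHLY_SUMMARY = """
INSERT INTO i_cust_trade_sum_m (cust_id, month, trans_cnt, trans_amt)
SELECT t.cust_id,
       strftime('%Y-%m', t.trade_date) AS month,
       COUNT(*)                        AS trans_cnt,
       SUM(t.quantity * g.price)       AS trans_amt
FROM i_trade AS t
JOIN i_goods AS g ON g.goods_id = t.goods_id
GROUP BY t.cust_id, month
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE i_goods (goods_id INTEGER PRIMARY KEY, price INTEGER);
    CREATE TABLE i_trade (trade_id INTEGER PRIMARY KEY, cust_id INTEGER,
                          goods_id INTEGER, quantity INTEGER, trade_date TEXT);
    CREATE TABLE i_cust_trade_sum_m (cust_id INTEGER, month TEXT,
                                     trans_cnt INTEGER, trans_amt INTEGER);
    INSERT INTO i_goods VALUES (1, 10), (2, 25);
    INSERT INTO i_trade VALUES
        (1, 13, 1, 2, '2017-07-03'),
        (2, 13, 2, 1, '2017-07-20'),
        (3, 13, 1, 1, '2017-08-01'),
        (4, 7,  2, 2, '2017-07-15');
    """
)
conn.execute(BUILD_MONTHLY_SUMMARY)
summary = conn.execute(
    "SELECT * FROM i_cust_trade_sum_m ORDER BY cust_id, month"
).fetchall()
```

The detail join runs once here; upper layers then read the much smaller summary table.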
3. Application layer: an application-oriented data processing and storage layer that can be customized on demand. For each different application, the corresponding data is extracted from the integration layer into the application layer, which stores the result sets that the applications access and use directly.
This layer faces application requirements directly: it customizes data models and extracts, processes, and stores integration-layer data so that applications can quickly use the data they need. As shown in Fig. 6, a data analysis system needs to analyze monthly transactions by region. A model "rpt_trade_regon_m" is first created with the fields id, date, province, city, and trans_amt. The integration layer already has the summary model "i_cust_trade_sum_m" (monthly customer transaction summary), so the application layer can quickly extract the date, province, city, and trans_amt data from the "i_cust_trade_sum_m" table and then group and aggregate by province and city to obtain the data for "rpt_trade_regon_m". This is far more convenient and faster than extracting and aggregating data from a transaction detail table that receives millions of rows per day.
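The extraction into "rpt_trade_regon_m" might look roughly like this. The field names id, date, province, city, and trans_amt come from the description; the cust_id column, the sample rows, and the choice to group by date as well as province and city (so that each month gets its own row) are assumptions for this sketch.

```python
import sqlite3

# Aggregate the monthly per-customer summary up to (date, province, city).
BUILD_REGION_REPORT = """
INSERT INTO rpt_trade_regon_m (date, province, city, trans_amt)
SELECT date, province, city, SUM(trans_amt)
FROM i_cust_trade_sum_m
GROUP BY date, province, city
ORDER BY date, province, city
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE i_cust_trade_sum_m (cust_id INTEGER, date TEXT, province TEXT,
                                     city TEXT, trans_amt INTEGER);
    CREATE TABLE rpt_trade_regon_m (id INTEGER PRIMARY KEY AUTOINCREMENT,
                                    date TEXT, province TEXT, city TEXT,
                                    trans_amt INTEGER);
    INSERT INTO i_cust_trade_sum_m VALUES
        (13, '2017-07', 'Sichuan', 'Mianyang', 45),
        (7,  '2017-07', 'Sichuan', 'Mianyang', 50),
        (9,  '2017-07', 'Sichuan', 'Chengdu',  30),
        (13, '2017-08', 'Sichuan', 'Mianyang', 10);
    """
)
conn.execute(BUILD_REGION_REPORT)
report = conn.execute(
    "SELECT date, province, city, trans_amt FROM rpt_trade_regon_m "
    "ORDER BY date, province, city"
).fetchall()
```

The report never touches the user or transaction detail tables, so a change to either affects only the integration-layer loaders.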
Embodiment:
Take building a data platform for an e-commerce website as an example, as shown in Fig. 7.
First, the buffer layer models are created: t_customer (user table), t_address (address table), t_goods (goods table), and t_trade (transaction table). The collected raw data is stored in the buffer layer without any processing.
Next, models are created in the integration layer: i_customer (user table), i_goods (goods table), i_trade (transaction table), i_cust_trade_sum_d (daily customer transaction summary table), and i_cust_trade_sum_m (monthly customer transaction summary table). Data is then extracted from the buffer layer into the integration layer. The integration layer model design must cover all business data in the buffer layer. The buffer layer data undergoes ETL processing and is stored in the integration layer's base data tables i_customer, i_goods, and i_trade; on top of these base data tables, some primary summarization is performed and the results are stored in the primary summary tables i_cust_trade_sum_d and i_cust_trade_sum_m.
Finally, models are designed in the application layer according to user requirements, and data is extracted from the integration layer and processed. In Fig. 7, the reporting system needs to view a table of transactions by region, and the user behavior analysis system needs to analyze the purchasing power of different provinces. The tables rpt_trade_regon_m and ana_trade_regon_m are therefore designed in the application layer, and data is extracted from the integration layer table i_cust_trade_sum_m and aggregated.

Claims (5)

  1. A database architecture for a data platform, characterized by comprising: a data buffer layer, a data integration layer, and a data application layer;
    the data buffer layer serves as a temporary storage layer for raw data, is responsible for ingesting source-system data as-is, and does not process the data in any way;
    the data integration layer provides standard, stable data models, standardized data, and summarized data;
    the data application layer is an application-oriented data processing and storage layer that can be customized on demand.
  2. The database architecture for a data platform as claimed in claim 1, characterized in that the table structures of the data buffer layer are consistent with the source system table structures, and when a source system table structure changes, the table structure of the data buffer layer changes accordingly.
  3. The database architecture for a data platform as claimed in claim 1, characterized in that the data integration layer consolidates the attributes associated with each model across the systems into a single model, standardizes the data, and aggregates dimensions from fine granularity to coarse granularity.
  4. The database architecture for a data platform as claimed in claim 1, characterized in that the application layer extracts, for each different application, the corresponding data from the integration layer into the application layer, and stores the result sets that the applications access and use directly.
  5. The database architecture for a data platform as claimed in claim 1, characterized in that the data models in the data integration layer cover all business data of the data buffer layer.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710670614.4A CN107480235A (en) 2017-08-08 2017-08-08 A kind of database framework of data platform

Publications (1)

Publication Number Publication Date
CN107480235A true CN107480235A (en) 2017-12-15

Family

ID=60599047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710670614.4A Pending CN107480235A (en) 2017-08-08 2017-08-08 A kind of database framework of data platform

Country Status (1)

Country Link
CN (1) CN107480235A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20171215)